AI4LAM Community Call: AI Evaluation Working Group

In the second part of our next Community Call, we will focus on AI evaluation.

Second part of Our Next Community Call!

Tuesday, March 17, 2026

8:00 AM California | 11:00 AM Washington DC | 16:00 UK | 17:00 Oslo & Paris | 01:00 Sydney (Wednesday, March 18)

2026-03-17 ai4lam Community Call

The AI Evaluation group assesses tools that incorporate artificial intelligence in the context of library, archive, and museum (LAM) operations. Our evaluation framework combines quantitative benchmarking against a control or gold standard with qualitative assessments. Last year we completed a use case on scholarly article summarization; our current study focuses on AI-enabled literature search tools.

We are documenting the literature sources these tools draw on, along with key product features, and comparing their performance against published systematic reviews. We aim to create a shared, updatable resource that can support ongoing comparison as these tools mature. The evaluation process often raises new questions and sometimes gives us a glimpse into how humans think.

Presenter: William Thorne