A fresh set of benchmarks could help specialists better understand artificial intelligence.
Artificial intelligence (AI) models can perform as well as humans on law exams when answering multiple-choice, short-answer and essay questions, as a 2025 preprint posted on SSRN shows. However, they struggle with real-world legal tasks.
Some lawyers have learnt that the hard way: they have been fined for filing AI-generated court briefs that misstated legal principles and cited non-existent cases.
Chaudhri is a principal scientist at Knowledge Systems Research in Sunnyvale, California.