Humanity's Last Exam

I was one of the top contributors to Humanity’s Last Exam!

A total of 14 of my submissions were accepted: 12 public and 2 private. Many are from computational game theory, and some are in algebra and general relativity.

At least one of my submitted questions was selected for HLE's top 550, which earned me a $500 prize from Scale AI.

Thanks to Richard Stanley, my Erdős number is now 3.

What is HLE?

Humanity’s Last Exam is a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark with broad subject coverage. It consists of 3,000 questions across dozens of subjects, including mathematics, humanities, and the natural sciences. The benchmark was developed globally by subject-matter experts and published in Nature.

State-of-the-art LLMs demonstrate low accuracy and calibration on HLE, highlighting a significant gap between current LLM capabilities and the expert human frontier on closed-ended academic questions.
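To make "accuracy and calibration" concrete, here is a minimal sketch of how the two could be measured on any closed-ended benchmark. It is not HLE's official scoring code: the function names, the 10-bin setup, and the toy data are my own assumptions for illustration. A model is well calibrated when its stated confidence matches how often it is actually right.

```python
def accuracy(correct):
    """Fraction of questions answered correctly (correct is a list of 0/1)."""
    return sum(correct) / len(correct)


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned calibration error: weighted average gap between the model's
    stated confidence and its actual accuracy within each confidence bin.
    The equal-width binning is an assumption, not HLE's exact procedure."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Confidence 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        avg_acc = sum(ok for _, ok in b) / len(b)
        ece += (len(b) / n) * abs(avg_conf - avg_acc)
    return ece


# Toy data (made up): a model that claims 90% confidence on every answer
# but gets only one of four right is both inaccurate and overconfident.
confidences = [0.9, 0.9, 0.9, 0.9]
correct = [1, 0, 0, 0]
print(accuracy(correct))
print(expected_calibration_error(confidences, correct))
```

A low accuracy together with a large calibration error means the model is not only wrong often, but also does not know when it is wrong.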