Benchmarks are a key indicator of progress in the AI field, and great progress has been made. Large models are evaluated on large test sets of questions, images, and tasks. As a control, humans take the same tests to set a baseline against which AI models are compared. Over time, one of…