We consider the problem of learning a set of ground-truth labels using an LLM (a judge) and human annotators (a jury). We formally show an interpolation-extrapolation trade-off: LLMs are more reliable for problems commonly observed in training data.