- huggingface/evaluation-guidebook (GitHub): Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
- Evaluating the Effectiveness of LLM-Evaluators (aka LLM-as-Judge)
- LLM Evaluation doesn’t need to be complicated