Show HN: LLM Benchmarking Suite (github.com/dhyaneesh)
2 points by Dhyaneesh 5 days ago
A comprehensive benchmarking suite for evaluating Gemma and other language models on benchmarks such as MMLU (Massive Multitask Language Understanding) and GSM8K (Grade School Math 8K).




