Lesson 6 (Homework & Notes): OpenCompass LLM Evaluation

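With the rapid growth of large language models, evaluating their capabilities comprehensively and systematically has become an urgent problem.

Let's start learning how to evaluate LLMs.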

Time to get hands-on!

Performance results of the InternLM2-Chat-7B model on the C-Eval dataset, as evaluated with OpenCompass.
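
To make the result table easier to read, here is a minimal sketch of how a multiple-choice benchmark such as C-Eval can be scored: pull the chosen option letter out of each model reply, compare it with the gold answer, and report accuracy per subject plus an unweighted average. This is an illustration only, not OpenCompass's actual implementation; the helper names `extract_choice` and `accuracy_by_subject`, the subject names, and the toy records are all hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

def extract_choice(reply: str) -> Optional[str]:
    """Return the first standalone option letter (A-D) in a reply, or None.

    Simplification: a reply that starts with the English article "A"
    would be misread as option A.
    """
    match = re.search(r"\b([ABCD])\b", reply)
    return match.group(1) if match else None

def accuracy_by_subject(
    records: Iterable[Tuple[str, str, str]]
) -> Dict[str, float]:
    """Compute percentage accuracy per subject.

    Each record is (subject, model_reply, gold_letter).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for subject, reply, gold in records:
        total[subject] += 1
        if extract_choice(reply) == gold:
            correct[subject] += 1
    return {s: 100.0 * correct[s] / total[s] for s in total}

# Hypothetical toy data, not real C-Eval questions or model outputs.
records = [
    ("college_physics", "B", "B"),
    ("college_physics", "The answer is C.", "A"),
    ("law", "D", "D"),
    ("law", "A", "A"),
]

scores = accuracy_by_subject(records)
# Unweighted mean over subjects (an assumption made for this sketch).
average = sum(scores.values()) / len(scores)

for subject, score in scores.items():
    print(f"{subject}: {score:.1f}")
print(f"average: {average:.1f}")
```

The per-subject breakdown mirrors the shape of the result table: one accuracy figure per subject, followed by summary rows.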

The run took about 30 minutes. Done.

More to learn; to be continued...