Table of Contents - F.748.44 (03/2025) - Assessment criteria for foundation models – Benchmark

1 Scope
2 References
3 Definitions
3.1 Terms defined elsewhere
3.2 Terms defined in this Recommendation
4 Abbreviations and acronyms
5 Conventions
6 Overview of benchmark for foundation models
6.1 General
6.2 Testing capabilities
6.3 Testing datasets
6.4 Testing method
6.5 Testing tool
7 Requirements of foundation models
7.1 Understanding ability
7.2 Generation ability
7.3 Reasoning ability
7.4 Knowledge ability
7.5 Reliability
7.6 Robustness
8 Evaluation methods of foundation models
8.1 Automated evaluation
8.2 Manual evaluation
Appendix I – Use cases of benchmark testing