SuperGLUE is a benchmark designed to pose a more rigorous test of language understanding than GLUE. It consists of a public leaderboard built around eight challenging language understanding tasks, a single-number performance metric, and an analysis toolkit. SuperGLUE improves on GLUE with more difficult tasks, more diverse task formats, comprehensive human baselines, and improved code support.
from benchthing import Bench

# Create a client for the SuperGLUE benchmark
bench = Bench("super-glue")

# Run a single SuperGLUE task against your models
bench.run(
    benchmark="super-glue",
    task_id="1",
    models=yourLanguageModels  # the language models you want to evaluate
)

# Fetch the result for task 1
result = bench.get_result("1")
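Since SuperGLUE comprises eight tasks, you will typically want results for all of them. The sketch below is illustrative only: it assumes the task IDs are the strings "1" through "8" and that yourLanguageModels is defined elsewhere; check the task identifiers exposed by the benchmark before relying on them.

from benchthing import Bench

bench = Bench("super-glue")

# Illustrative sketch: assumes SuperGLUE task IDs are "1" through "8"
# and that yourLanguageModels is defined elsewhere.
results = {}
for task_id in [str(i) for i in range(1, 9)]:
    bench.run(
        benchmark="super-glue",
        task_id=task_id,
        models=yourLanguageModels,
    )
    results[task_id] = bench.get_result(task_id)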