Apertus is an open-source Large Language Model (LLM) developed in Switzerland. This documentation shows you how to get started with the LLM, whether as a user, researcher, or advanced contributor: we maintain this knowledge base for you, and would ✉️ welcome your feedback.
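
As a first taste of "getting started", here is a minimal sketch of querying Apertus with the Hugging Face transformers library. The model identifier below is an assumption based on the Swiss AI organization name; check the official Apertus Model Card for the exact ID and hardware requirements.

```python
# Minimal sketch of text generation with Apertus via Hugging Face transformers.
# MODEL_ID is an assumption; verify it against the official model card.
MODEL_ID = "swiss-ai/Apertus-8B-2509"  # assumed Hugging Face model ID

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Complete `prompt` with Apertus (downloads the weights on first use)."""
    # Imported lazily so the sketch can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Example call (commented out: needs the weights and substantial GPU memory):
# print(generate("Which languages are spoken in Switzerland?"))
```

The 8B model is the practical starting point on a single GPU; the 70B model needs a multi-GPU setup.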

About the project

The model development team is part of the Swiss AI Initiative, which started in late 2023 and serves as a platform for over 80 data science projects, including the LLM development. Key highlights of the LLM project, as announced in July, include:

  • Multilingualism: Trained on more than 15 trillion tokens across 1,500+ languages, 40% non-English - equal usage cost across languages - see @epfml
  • Performance: A large model in two sizes (8 billion and 70 billion parameters), trained on roughly 15 trillion tokens, and actively optimized on an ongoing basis.
  • Open & Transparent: Published under Apache-2.0 license - including source code, weights, and open training data.
  • Data Privacy: Compliant with GDPR, the EU AI Act, and Swiss data protection laws - see Fan et al. 2025
  • Infrastructure: Developed on the new Alps supercomputer at CSCS with over 10,000 NVIDIA GH200 Grace-Hopper chips
  • Global Reach: Designed with research and borderless applications in mind, for sovereign and international public-interest AI.

Tech specs

The Swiss LLM was trained on the Alps supercomputer, operational at CSCS since September 2024.

The Swiss LLM was trained on approximately 15 trillion tokens. Particularly noteworthy is the high proportion of non-English data (40%) and coverage of over 1,500 languages, including rare ones like Romansh or Zulu. The data was ethically sourced - without illegal scraping, respecting robots.txt and copyright requirements. While this limits access to certain specialized information, CSCS emphasizes: «For general tasks, this doesn’t lead to measurable performance losses.»
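The "respecting robots.txt" point refers to the standard crawler-exclusion protocol: before fetching a page, an ethical crawler checks the site's robots.txt rules. The sketch below illustrates that check with Python's standard library only; the rules and URLs are made up for the example and are not the project's actual crawling code.

```python
# Illustrative robots.txt check using only the Python standard library.
# The rules below are invented for the example; a real crawler fetches each
# site's /robots.txt and honors it the same way.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

print(rp.can_fetch("*", "https://example.org/private/page"))  # False: disallowed
print(rp.can_fetch("*", "https://example.org/public/page"))   # True: allowed
```

A disallowed path is simply never downloaded, which is why some specialized content is absent from the training data.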

For more technical references, see the Sources further below.

Initial benchmarks

See the Evaluation section of the Apertus Model Card, and Section 5 of the Tech Report for more data. This is an initial independent evaluation, and we expect more to come soon:

| Model | MMLU (Knowledge) | Global-MMLU (Multilingual) | GSM8K (Math) | HumanEval (Code) | RULER @32k (Long Context) |
|---|---|---|---|---|---|
| Claude 3.5 Sonnet | 88.7% | - | 96.4% | 92.0% | - |
| Llama 3.1 70B | 83.6% | - | 95.1% | 80.5% | - |
| Apertus-70B | 69.6% | 62.7% | 77.6% | 73.0% | 80.6% |
| Apertus-8B | 60.9% | 55.7% | 62.9% | 67.0% | 69.5% |

Performance comparison

| Model | Parameters | Openness | Language Coverage | Training Hardware | Strengths |
|---|---|---|---|---|---|
| Swiss LLM | 8B / 70B | Open source, weights, data | >1,500 | Alps: 10,752 GH200 GPUs | Linguistic diversity, data privacy, transparency |
| GPT-4.5 | ~2T (estimated) | Proprietary | ~80 - 120 | Azure: ~25,000 A100 GPUs | Creativity, natural conversation, agentic planning |
| Claude 4 | Not published | Proprietary | ? | Anthropic: ? | Adaptive reasoning, coding |
| Llama 4 | 109B / 400B | Open weight | 12, with 200+ in training | Meta: ~20,000 H100 GPUs | Multimodality, large community, agentic tasks |
| Grok 4 | ~1.8T MoE | Proprietary | ? | Colossus: 200,000 H100 GPUs | Reasoning, real-time data, humor… |

Source: effektiv.ch

Sources

For further information: