Sparse attention – definition
• A way for AI models to look at only the most useful parts of text instead of everything at once.
• It skips less relevant tokens to save time and memory (see the sketch after this list).
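As a rough illustration, here is a minimal sliding-window ("local") sparse attention sketch in NumPy. The function name, sequence length, and window size are illustrative choices, not any specific model's or library's implementation; the point is simply that each token only attends to a small neighborhood instead of every other token.

```python
# Minimal sketch of local (sliding-window) sparse attention, assuming toy
# sizes chosen only for demonstration.
import numpy as np

def local_attention(q, k, v, window=2):
    """Each query attends only to keys within `window` positions (sparse),
    rather than to all positions (dense)."""
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)                 # raw attention scores
    idx = np.arange(n)
    # Keep only pairs within the local window; mask out the rest.
    mask = np.abs(idx[:, None] - idx[None, :]) <= window
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the window
    return weights @ v                              # weighted sum of values

# Toy usage: 8 tokens, 4-dim embeddings, window of 2 neighbors on each side.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(8, 4)) for _ in range(3))
out = local_attention(q, k, v, window=2)
print(out.shape)  # (8, 4), same output shape as dense attention
```

With a window of 2, each token computes scores against at most 5 others instead of all 8; at long context lengths that gap is what saves time and memory.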
Why it matters
- Sparse attention can make long-context tasks faster and cheaper without a large drop in quality. Summarizing a 100-page PDF, for example, can finish sooner and use less GPU or device memory. Many modern long-context models rely on it to scale beyond 100K tokens. The trade-off: if the selection is too sparse, the model may miss details or nuance.
Also called:
- sparse transformer, block-sparse attention, local-global attention
Short excerpt:
- Sparse attention speeds up long-context AI by focusing compute on the most relevant tokens.