
    Cohere’s Tiny Aya: The 3.35B-Parameter Model Bringing 70+ Languages to Your Pocket


    Key Takeaways

    • Cohere Tiny Aya packs 3.35 billion parameters and supports 70+ languages including Hindi, Tamil, Bengali, and Marathi
    • Five variants (Base, Global, Earth, Fire, Water), each optimized for a specific regional or deployment context
    • Runs entirely offline on standard laptops and phones; no internet or cloud subscription required
    • Outperforms Gemma 3-4B and Qwen 3-4B on low-resource language benchmarks (GlobalMGSM)

    Most AI models demand English fluency, stable broadband, and a cloud server to function. Cohere’s Tiny Aya removes all three requirements at once. Launched February 17, 2026, at the India AI Summit, this open-weight model handles 70+ languages fully offline on a standard laptop, no internet required. What the specs, benchmark results, and real deployment data actually show is more consequential than the announcement itself.

    What Is Cohere Tiny Aya?

    Cohere Labs, Cohere’s dedicated research division, released Tiny Aya on February 17, 2026, as a family of open-weight small language models. The base model contains 3.35 billion parameters and handles more than 70 languages, with particular strength in underserved regional languages across Africa, South Asia, Asia-Pacific, West Asia, and Europe. The term “open-weight” means the model’s underlying code and weights are publicly available for anyone to use and modify.

    Tiny Aya was built specifically to run on consumer hardware without an internet connection. Cohere engineered the underlying software to require less computing power than most comparable models, making offline translation and local language applications viable even on everyday devices. A technical report detailing the full training methodology is forthcoming from Cohere.

    Five Tiny Aya Variants and What Each Does

    Cohere released five models in the Tiny Aya family, each built for a different deployment scope. The core design principle across all variants is consistent: strong regional linguistic grounding while retaining broad multilingual coverage, making every variant a flexible starting point for further adaptation and research.

    Cohere explained the rationale directly: “This approach allows each model to develop stronger linguistic grounding and cultural nuance, creating systems that feel more natural and reliable for the communities they are meant to serve. At the same time, all Tiny Aya models retain broad multilingual coverage, making them flexible starting points for further adaptation and research.”

    Variant | Regional Focus | Primary Use Case
    Tiny Aya Base | All 70+ languages (pretrained foundation) | Fine-tuning, research, custom app development
    Tiny Aya Global | Balanced across all supported languages, instruction-tuned | Apps requiring broad cross-language command-following
    TinyAya-Earth | Africa | African language apps and offline deployments
    TinyAya-Fire | South Asia: Hindi, Bengali, Punjabi, Urdu, Gujarati, Tamil, Telugu, Marathi | India-focused apps and vernacular services
    TinyAya-Water | Asia-Pacific, West Asia, and Europe | Regional deployments across APAC, Middle East, Europe

    All five variants retain full cross-lingual capability; regional specialization sharpens local performance without removing universal coverage.

    Architecture: What Cohere Has Confirmed

    Cohere has confirmed a limited set of architecture and training details, with a full technical report still pending. The verified facts are:

    • Total parameters: 3.35 billion
    • Training hardware: Single cluster of 64 Nvidia H100 GPUs
    • Training scale: Described as “relatively modest computing resources” by Cohere
    • On-device optimization: Software built specifically to require less computing power than most comparable models

    No further architectural specifications, layer counts, vocabulary sizes, attention mechanisms, or pretraining token counts have been verified from primary sources as of this publication. This article will be updated when the official technical report is released.
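    The confirmed parameter count alone allows a rough estimate of the on-device memory footprint. The sketch below is a back-of-envelope calculation, not a Cohere specification; the bytes-per-parameter values are generic assumptions, since the company has not published precision or quantization details:

    ```python
    # Back-of-envelope weight-memory estimate for a 3.35B-parameter model.
    # Bytes-per-parameter values are generic assumptions, not Cohere specs.
    PARAMS = 3.35e9

    def weight_footprint_gb(bytes_per_param: float) -> float:
        """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
        return PARAMS * bytes_per_param / 1e9

    fp16 = weight_footprint_gb(2.0)   # ~6.7 GB: too large for most phones
    int4 = weight_footprint_gb(0.5)   # ~1.7 GB: a plausible on-device target

    print(f"fp16: {fp16:.1f} GB, int4: {int4:.1f} GB")
    ```

    Numbers like these explain why small models in the 3-4B range are the sweet spot for phones and laptops: at common quantization levels the weights fit comfortably in consumer RAM.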

    How was Cohere Tiny Aya trained?

    Cohere trained Tiny Aya on a single cluster of 64 Nvidia H100 GPUs, which the company describes as relatively modest computing resources for a model of its capability. The full training methodology, including data and architecture details, will be covered in a forthcoming technical report from Cohere.

    Tiny Aya vs. Gemma 3: Where It Wins and Where It Falls Short

    Tiny Aya and Gemma 3 target different priorities: Tiny Aya optimizes for multilingual depth and offline accessibility, while Gemma 3 prioritizes context length and multimodal capability.

    Feature | Tiny Aya | Google Gemma 3-4B
    Parameters | 3.35B | 4B
    Context window | 8K tokens | 128K tokens
    Languages | 70+ (deep regional coverage) | 140+ (broad)
    Low-resource language performance | Best on GlobalMGSM | Moderate
    Regional variants | Yes (Earth/Fire/Water) | No
    Multimodal (image + text) | No | Yes
    Offline capable | Yes | Yes
    Open-weight | Yes | Yes
    Release | February 2026 | March 2025

    On the GlobalMGSM benchmark, which tests translation and mathematical reasoning across diverse languages, Tiny Aya outperforms both Gemma 3-4B and Qwen 3-4B specifically on low-resource African and West Asian languages. Gemma 3 retains a significant lead on context length (128K vs. 8K tokens) and is the stronger option for tasks involving image-text inputs or long documents.

    On-Device Performance: Real Hardware Results

    Cohere designed Tiny Aya’s software from the ground up to minimize compute requirements for on-device inference. Independent testing by Futurum Group recorded Tiny Aya achieving 32 tokens per second on an iPhone 17 Pro, a benchmark that confirms practical usability on current consumer mobile hardware.

    This performance figure is significant for India specifically. A developer building a Hindi or Tamil language app can run TinyAya-Fire locally on a mid-range device without GPU acceleration, cloud API costs, or dependence on consistent broadband access. Cohere highlighted India’s linguistic diversity and uneven connectivity infrastructure as a primary motivator for the model’s offline-first architecture.
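    The measured 32 tokens per second translates directly into user-facing latency. A quick sketch of what that throughput means in practice (the reply lengths below are illustrative assumptions, not benchmark data):

    ```python
    # Latency estimate from the measured figure of 32 tokens/second.
    # Reply lengths are illustrative assumptions.
    TOKENS_PER_SECOND = 32

    def seconds_for(tokens: int) -> float:
        """Time to generate a reply of the given token length."""
        return tokens / TOKENS_PER_SECOND

    print(f"short answer (~50 tokens): {seconds_for(50):.2f} s")   # 1.56 s
    print(f"paragraph (~200 tokens):   {seconds_for(200):.2f} s")  # 6.25 s
    ```

    A one-sentence translation appears in under two seconds; a full paragraph takes roughly six. That is comfortably interactive for chat-style and translation apps, the workloads Tiny Aya targets.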

    Why the India AI Summit Launch Matters

    Cohere chose the India AI Summit as Tiny Aya’s global launch venue, a deliberate signal about the model’s target market. TinyAya-Fire directly addresses eight of India’s most widely spoken languages: Bengali, Hindi, Punjabi, Urdu, Gujarati, Tamil, Telugu, and Marathi. No prior small language model had covered this combination with offline capability at this parameter scale.

    Cohere’s business trajectory reinforces the strategic focus. The company posted $240 million in annual recurring revenue at the end of 2025, with 50% quarter-over-quarter growth, and CEO Aidan Gomez has confirmed plans for a public offering. Tiny Aya positions Cohere as the leading multilingual AI infrastructure provider for sovereign and emerging market deployments ahead of that IPO.

    How to Access and Deploy Tiny Aya

    Tiny Aya is available across four platforms, with no paywall for model weights:

    1. Hugging Face: Download model weights and access training and evaluation datasets
    2. Kaggle: Local deployment and notebook-based experimentation
    3. Ollama: Single-command local setup for offline use on personal hardware
    4. Cohere Platform: Managed API access for production-scale enterprise integrations

    Developers can use any of the regional variants as a starting point for fine-tuning toward more specific tasks or narrower language domains.
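    Picking a variant programmatically can be as simple as a region lookup before loading weights. A minimal sketch: the repo ids in the commented-out loading step are hypothetical placeholders, since official Hugging Face identifiers are not confirmed here:

    ```python
    # Map a deployment region to a Tiny Aya variant name, then (optionally)
    # load it with Hugging Face transformers. Repo ids are HYPOTHETICAL.
    VARIANT_BY_REGION = {
        "africa": "TinyAya-Earth",
        "south_asia": "TinyAya-Fire",
        "apac_west_asia_europe": "TinyAya-Water",
    }

    def pick_variant(region: str) -> str:
        """Fall back to the instruction-tuned Global variant elsewhere."""
        return VARIANT_BY_REGION.get(region, "Tiny Aya Global")

    # Loading (uncomment to run; downloads weights from Hugging Face):
    # from transformers import AutoModelForCausalLM, AutoTokenizer
    # repo = "CohereLabs/tiny-aya-fire"  # hypothetical repo id
    # tokenizer = AutoTokenizer.from_pretrained(repo)
    # model = AutoModelForCausalLM.from_pretrained(repo)

    print(pick_variant("south_asia"))  # TinyAya-Fire
    ```

    Starting from the regional variant closest to your target languages, rather than the Base model, means fine-tuning begins from stronger local grounding.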

    Limitations to Know Before Deploying

    • 8K context window: Unsuitable for long-document tasks: legal texts, extended reports, or multi-turn conversations beyond roughly 6,000 words
    • Text-only: No multimodal capability; cannot process images, audio, or video
    • Reasoning ceiling: Optimized for edge inference; complex multi-step logical reasoning favors larger models such as Cohere’s own Command R+
    • Language depth variance: 70+ languages does not mean uniform quality; major Indic and African languages are primary targets, while smaller dialects may perform inconsistently
    • Pending technical report: Full architecture, training data, and benchmark methodology details are not yet publicly available; performance claims beyond GlobalMGSM should be treated as provisional
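    The 8K ceiling is workable for long documents if input is chunked before inference. A minimal sketch, assuming an 8,000-token budget and a crude 4-characters-per-token heuristic (real tokenizer counts vary considerably by language and script):

    ```python
    # Split long text into chunks that fit an 8K-token context window.
    # The ~4 chars/token heuristic is an assumption; use the model's own
    # tokenizer for accurate counts, especially for non-Latin scripts.
    CONTEXT_TOKENS = 8000
    RESERVED_FOR_OUTPUT = 1000      # leave room for the model's reply
    CHARS_PER_TOKEN = 4             # rough heuristic, language-dependent

    def chunk_text(text: str) -> list[str]:
        """Greedily slice text into context-sized character chunks."""
        budget = (CONTEXT_TOKENS - RESERVED_FOR_OUTPUT) * CHARS_PER_TOKEN
        return [text[i:i + budget] for i in range(0, len(text), budget)]

    doc = "x" * 100_000             # a ~100k-character document
    print(len(chunk_text(doc)))     # 4 chunks of at most 28,000 chars
    ```

    Chunking sacrifices cross-chunk context, so summarization or legal review across chunk boundaries remains a genuine weakness relative to Gemma 3’s 128K window.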

    Frequently Asked Questions (FAQs)

    What is Cohere Tiny Aya?

    Cohere Tiny Aya is a family of open-weight small language models released by Cohere Labs on February 17, 2026. With 3.35 billion parameters, it supports over 70 languages and runs fully offline on everyday devices like laptops and phones, with no internet connection required.

    How many languages does Tiny Aya support?

    Tiny Aya supports over 70 languages across African, South Asian, Asia-Pacific, West Asian, and European regions. The TinyAya-Fire regional variant covers eight South Asian languages: Hindi, Bengali, Punjabi, Urdu, Gujarati, Tamil, Telugu, and Marathi.

    Can Tiny Aya run offline on a smartphone?

    Yes. Cohere built Tiny Aya’s underlying software specifically for on-device use, requiring less computing power than most comparable models. Independent testing recorded 32 tokens per second on an iPhone 17 Pro, confirming practical offline performance on current consumer mobile hardware.

    What are the five Tiny Aya model variants?

    The five variants are: Tiny Aya Base (open pretrained multilingual foundation), Tiny Aya Global (instruction-tuned for broad cross-language use), TinyAya-Earth (African languages), TinyAya-Fire (South Asian languages including all eight major Indian languages), and TinyAya-Water (Asia-Pacific, West Asia, and Europe).

    How does Tiny Aya compare to Gemma 3-4B?

    Tiny Aya outperforms Gemma 3-4B on GlobalMGSM benchmark scores for low-resource African and West Asian languages. Gemma 3 leads on context length at 128K tokens versus Tiny Aya’s 8K, and supports multimodal image-text inputs that Tiny Aya does not.

    Is Cohere Tiny Aya free to use?

    Tiny Aya is open-weight and freely available on Hugging Face, Kaggle, and Ollama. The model’s underlying code and weights are publicly available for anyone to use and modify. Enterprise API access via the Cohere Platform is also available for production deployments.

    Why was Tiny Aya launched at the India AI Summit?

    Cohere selected the India AI Summit as the launch venue because Tiny Aya directly addresses India’s scale of linguistic diversity. TinyAya-Fire covers eight major Indian languages with full offline capability, enabling developers in connectivity-limited areas to build native language applications without cloud access.

    Where can I download Cohere Tiny Aya?

    Cohere Tiny Aya is free to download on Hugging Face, Kaggle, and Ollama for local offline deployment. It is also available via the Cohere Platform for API-based enterprise use. Training and evaluation datasets are published on Hugging Face alongside all five model weight variants.

    How does Tiny Aya compare to Gemma 3?

    Tiny Aya (3.35B) outperforms Gemma 3-4B on GlobalMGSM benchmark scores for low-resource African and West Asian languages. However, Gemma 3-4B leads on context length (128K tokens vs. Tiny Aya’s 8K) and supports multimodal image-text tasks that Tiny Aya does not offer. Both are open-weight and offline-capable.

    Mohammad Kashif
    Senior Technology Analyst and Writer at AdwaitX, specializing in the convergence of Mobile Silicon, Generative AI, and Consumer Hardware. Moving beyond spec sheets, his reviews rigorously test "real-world" metrics analyzing sustained battery efficiency, camera sensor behavior, and long-term software support lifecycles. Kashif’s data-driven approach helps enthusiasts and professionals distinguish between genuine innovation and marketing hype, ensuring they invest in devices that offer lasting value.

