Cohere just shipped a 3.35-billion-parameter model that speaks 70+ languages and runs on your laptop without internet. It was trained on just 64 H100 GPUs. It’s called Tiny Aya, and the LocalLLaMA community is buzzing.
TL;DR
- What: Cohere Labs released Tiny Aya — a family of open-weight multilingual models at 3.35B parameters
- Why it’s big: Covers 70+ languages including underserved South Asian and African languages, runs fully offline on consumer hardware
- Four variants: TinyAya-Global (broad), TinyAya-Earth (African), TinyAya-Fire (South Asian), TinyAya-Water (Asia-Pacific/Europe)
- Available now on HuggingFace, Kaggle, and Ollama for local deployment
A compact multilingual model family that punches above its weight class. 3.35B parameters, 70+ languages, runs offline on a laptop. Four regional variants let you optimize for your target market. Open-weight, free to use and modify.
What Just Happened
Cohere Labs — the research arm of enterprise AI company Cohere — announced Tiny Aya at the India AI Summit on February 17, 2026. TechCrunch covered the launch within the hour.
The model family includes:
| Model | Focus | Use Case |
|---|---|---|
| TinyAya (base) | 70+ languages | Research, fine-tuning |
| TinyAya-Global | Instruction-tuned, broad | General multilingual apps |
| TinyAya-Earth | African languages | Translation, local content |
| TinyAya-Fire | South Asian languages | Hindi, Bengali, Tamil, etc. |
| TinyAya-Water | Asia-Pacific, West Asia, Europe | Regional deployment |
Key technical facts:
- 3.35 billion parameters (small enough for consumer GPUs)
- Trained on 64 H100 GPUs — modest by frontier model standards
- Open-weight — download, modify, deploy however you want
- Offline-capable — no cloud dependency, no API costs
- Supports Bengali, Hindi, Punjabi, Urdu, Gujarati, Tamil, Telugu, Marathi, and dozens more
Why This Matters
Most multilingual models are too big to run locally. Meta’s LLaMA models support many languages but start at 7B+ parameters. Google’s Gemma is efficient but English-dominant. Tiny Aya hits a sweet spot: small enough for a laptop, broad enough for real multilingual work.
The regional variant strategy is smart. Instead of one model that’s mediocre everywhere, Cohere built specialized versions. If you serve the Indian market, TinyAya-Fire doesn’t waste capacity on Finnish. If you serve Africa, TinyAya-Earth focuses there. This is the kind of practical optimization that matters for deployment.
Offline means zero marginal cost. No API calls. No per-token pricing. No data leaving the device. For businesses in regions with unreliable internet — which is exactly where multilingual support matters most — this is a genuine unlock.
Cohere’s timing is strategic. The company hit $240M ARR in 2025 with 50% quarter-over-quarter growth. An IPO is coming. Open-sourcing Tiny Aya builds developer goodwill and ecosystem lock-in before they go public.
What This Means for You
If you’re a developer building multilingual products: Tiny Aya is your new baseline for offline translation, content moderation, and language-aware features. Download TinyAya-Global via Ollama, test it against your target languages, and benchmark against whatever you’re currently using. The regional variants mean you can ship a smaller, faster model tailored to your users.
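The spot-check described above can be sketched as a small harness against a locally running Ollama server. This is a sketch under assumptions: the model tag `tinyaya` is hypothetical (check `ollama list` for the actual tag once you’ve pulled it), and the endpoint shown is Ollama’s standard local HTTP API.

```python
import json
from urllib import request

# Languages to spot-check -- pick the ones your product actually serves.
TARGET_LANGUAGES = ["Hindi", "Bengali", "Tamil", "Swahili", "Yoruba"]

def build_probe_prompts(languages):
    """One simple translation probe per language. Extend with
    domain-specific prompts before trusting any benchmark."""
    return [
        f"Translate to {lang}: 'The clinic opens at nine tomorrow morning.'"
        for lang in languages
    ]

def query_ollama(prompt, model="tinyaya", host="http://localhost:11434"):
    """Send one prompt to a locally running Ollama server.
    The model tag 'tinyaya' is an assumption, not a confirmed name."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    req = request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    for prompt in build_probe_prompts(TARGET_LANGUAGES):
        print(prompt)
        # print(query_ollama(prompt))  # uncomment with Ollama running
```

Keep the probes close to your real traffic: a model that translates news-register sentences well can still stumble on clinical or legal phrasing.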
If you’re a business serving non-English markets: This is the cheapest path to multilingual AI. No API costs, no data leaving your infrastructure, no dependency on cloud providers. Deploy on edge devices, embed in mobile apps, or run on local servers. Particularly relevant for healthcare, education, and government applications in linguistically diverse countries like India, Nigeria, or Indonesia.
What to Do Next
- Try it now: Pull the model via Ollama (`ollama run tinyaya`) or download from HuggingFace. Test with your target languages.
- Benchmark against your stack: If you’re paying for multilingual API calls (Google Translate, DeepL, GPT-4), run a comparison. At 3.35B parameters, inference is cheap enough that the ROI math might surprise you.
- Watch for the technical report: Cohere said they’ll release detailed training methodology. That’s where you’ll find benchmark numbers, dataset composition, and known limitations. Don’t deploy in production until you’ve read it.
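The ROI math from the benchmarking step above is easy to sketch. The numbers below are illustrative placeholders, not real quotes; substitute your actual monthly token volume, per-token API rates, and hardware cost.

```python
def monthly_api_cost(tokens_per_month, usd_per_million_tokens):
    """Cost of routing all traffic through a paid per-token API."""
    return tokens_per_month / 1_000_000 * usd_per_million_tokens

def local_breakeven_months(hardware_cost_usd, tokens_per_month,
                           usd_per_million_tokens):
    """Months until a one-time hardware purchase beats per-token
    API pricing (ignores electricity, which is small at 3.35B scale)."""
    saved_per_month = monthly_api_cost(tokens_per_month, usd_per_million_tokens)
    return hardware_cost_usd / saved_per_month

# Illustrative numbers only: 50M tokens/month at a hypothetical $2 per
# million tokens, against a hypothetical $600 consumer GPU.
api = monthly_api_cost(50_000_000, 2.0)                 # $100.00/month
months = local_breakeven_months(600, 50_000_000, 2.0)   # 6.0 months
```

At moderate volume the break-even arrives in months, not years, which is why the zero-marginal-cost framing matters for offline deployment.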
Tiny Aya is available now on HuggingFace, Kaggle, and Ollama. The models are open-weight, meaning you can use, modify, and deploy them freely.
Related reads:
- DeepSeek AI 2026: Complete Guide to the $5.9M Model — Another open-source model shaking up the industry
- Best AI Tools for Small Business Under $50/Month — Where Tiny Aya could fit in your stack