AMD and Liquid AI announced a breakthrough collaboration on January 5, 2026, that brings cloud-quality AI meeting summarization to consumer laptops, completely offline. The partnership showcases Liquid AI’s LFM2-2.6B model running on AMD Ryzen AI 400 Series processors, delivering production-grade summaries in under 16 seconds while using less than 3GB of RAM. This marks the first time a specialized AI model runs across all three compute engines (CPU, GPU, and NPU) on a mainstream AI PC.
What’s New in This Partnership
AMD and Liquid AI developed a custom 2.6-billion-parameter model fine-tuned specifically for meeting transcript summarization. The model was deployed from concept to production in under two weeks, demonstrating the speed advantage of Liquid’s LFM2 architecture over traditional transformer-based models.
The collaboration targets AMD Ryzen AI 400 Series and Ryzen AI Max+ 395 processors, both optimized for on-device workloads. AMD became the first and only AI PC platform to verify full tri-engine inference support for Liquid Foundation Models, meaning the same model can run on CPU, GPU, or NPU depending on battery, latency, or performance needs.
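The tri-engine claim is about flexibility at inference time. As a rough illustration of what engine selection could look like in practice, the sketch below uses ONNX Runtime execution providers to prefer the NPU, GPU, or CPU with a CPU fallback. The provider list, the model filename, and the assumption that the summarizer ships as an ONNX model are illustrative, not details confirmed by AMD or Liquid AI.

```python
# Illustrative only: selecting a compute engine via ONNX Runtime execution providers.
# Provider availability depends on the installed onnxruntime build and drivers;
# the model path is a placeholder, not an official Liquid AI artifact.
import onnxruntime as ort

def make_session(model_path: str, target: str = "npu") -> ort.InferenceSession:
    """Create an inference session pinned to a preferred engine, with CPU fallback."""
    providers_by_target = {
        "npu": ["VitisAIExecutionProvider", "CPUExecutionProvider"],  # AMD Ryzen AI NPU
        "gpu": ["DmlExecutionProvider", "CPUExecutionProvider"],      # DirectML (GPU)
        "cpu": ["CPUExecutionProvider"],
    }
    return ort.InferenceSession(model_path, providers=providers_by_target[target])

# e.g. prefer the NPU on battery, the GPU when plugged in
session = make_session("lfm2-2.6b-summarizer.onnx", target="npu")
print(session.get_providers())
```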
Liquid AI released its LFM2.5-1.2B model family on January 4, 2026, further expanding its on-device AI portfolio. The timing aligns with AMD’s broader push into “AI everywhere” computing, announced at CES 2026.
Why This Matters for Privacy and Performance
On-device AI eliminates the need to upload sensitive meeting data to cloud servers. The specialized LFM2-2.6B model processes up to 10,000 tokens (roughly a 60-minute meeting) entirely on a 16GB RAM laptop, with zero internet dependency.
This approach solves two critical problems with cloud AI: data privacy and ongoing API costs. Organizations can now run unlimited meeting summaries without paying per-request fees or worrying about compliance issues related to third-party data handling. With zero network latency, the model also runs faster than cloud alternatives, summarizing an hour-long meeting in 16 seconds on the Ryzen AI Max+ 395 and 41 seconds on the Ryzen AI 9 HX 470.
Liquid AI’s hybrid architecture keeps attention to roughly 20% of the network and relies on fast 1D short convolutions for most of the computation, dramatically reducing its memory footprint compared to traditional LLMs. This design allows the quantized Q4_K_M version of LFM2-2.6B to use just 2.7GB of RAM at the full 10K-token context, versus 15.2GB for the comparable Qwen3-30B.
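To see what that footprint looks like in practice, here is a minimal sketch of loading a Q4_K_M-quantized GGUF build with llama-cpp-python at a roughly 10K-token context. The filename, chat roles, and generation settings are assumptions, not an official Liquid AI release.

```python
# A minimal sketch of running a Q4_K_M-quantized summarizer fully on-device with
# llama-cpp-python. The GGUF filename is hypothetical; actual memory use varies by build.
from llama_cpp import Llama

llm = Llama(
    model_path="LFM2-2.6B-Q4_K_M.gguf",  # placeholder path for a 4-bit quantized model
    n_ctx=10240,                          # room for a ~10K-token meeting transcript
    n_threads=8,
)

transcript = open("meeting_transcript.txt", encoding="utf-8").read()
result = llm.create_chat_completion(
    messages=[
        {"role": "system",
         "content": "Summarize the meeting transcript into key decisions and action items."},
        {"role": "user", "content": transcript},
    ],
    max_tokens=512,
    temperature=0.2,
)
print(result["choices"][0]["message"]["content"])
```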
How LFM2 Compares to Cloud Models
AMD tested the fine-tuned LFM2-2.6B against open-source and cloud models using its GAIA Eval-Judge framework. On short 1,000-token meeting transcripts, the 2.6B model achieved an 86% accuracy rating, nearly matching Claude Sonnet 4 (90%) and Qwen3-30B (88%) despite being more than 10x smaller.
| Model | Size | Short Transcript Accuracy | Long Transcript Accuracy | RAM Usage (10K tokens) |
|---|---|---|---|---|
| Claude Sonnet 4 | Cloud | 90% | 93% | N/A (cloud-only) |
| Qwen3-30B | 30B | 88% | 92% | 15.2 GB |
| LFM2-2.6B | 2.6B | 86% | 77% | 2.7 GB |
| GPT-OSS-20B | 20B | 83% | 71% | 9.7 GB |
| Qwen3-8B | 8B | 65% | 72% | 6.2 GB |
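AMD’s GAIA Eval-Judge framework itself is not detailed in this announcement, but the accuracy figures above follow the familiar LLM-as-judge pattern: a stronger model scores each candidate summary against the source transcript. The sketch below illustrates that pattern in the abstract only; the judge callable, prompt wording, and 0–100 scale are assumptions, not AMD’s actual evaluation code.

```python
# Generic LLM-as-judge sketch, not AMD's GAIA Eval-Judge implementation.
from statistics import mean
from typing import Callable

def score_summaries(
    transcripts: list[str],
    summaries: list[str],
    judge: Callable[[str], str],  # any callable that sends a prompt to a stronger model
) -> float:
    """Ask a 'judge' model to rate each candidate summary, then average the scores."""
    scores = []
    for transcript, summary in zip(transcripts, summaries):
        prompt = (
            "Rate the summary below for factual accuracy against the transcript "
            "on a scale of 0 to 100. Reply with the number only.\n\n"
            f"TRANSCRIPT:\n{transcript}\n\nSUMMARY:\n{summary}"
        )
        scores.append(float(judge(prompt).strip()))
    return mean(scores)
```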
The specialized LFM2 model also runs 30–63% faster than larger baseline models on Ryzen AI hardware, completing 60-minute summaries in seconds rather than minutes. Speed gains come from both the efficient architecture and hardware-aware optimization across AMD’s AI PC stack.
Technical Architecture and Deployment
Liquid AI’s LFM2 represents a fundamental shift from cloud-first AI design. Rather than compressing large models through quantization, LFM2 was architected from the ground up for efficiency. The hybrid model uses 10 double-gated short-range convolution blocks and only 6 grouped query attention blocks, reducing memory overhead compared to pure transformer architectures.
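To make that layer mix concrete, the sketch below stacks ten gated short-convolution blocks and six attention blocks in plain PyTorch. It is illustrative only: the hidden size, kernel width, block ordering, and the use of standard multi-head attention in place of grouped-query attention are placeholders, not Liquid AI’s implementation.

```python
# Illustrative hybrid stack: 10 gated short-convolution blocks + 6 attention blocks,
# matching the block counts cited above. Dimensions and ordering are placeholders.
import torch
import torch.nn as nn

class GatedShortConvBlock(nn.Module):
    def __init__(self, dim: int, kernel_size: int = 3):
        super().__init__()
        self.in_proj = nn.Linear(dim, 2 * dim)  # produces value and gate
        self.conv = nn.Conv1d(dim, dim, kernel_size,
                              padding=kernel_size - 1, groups=dim)  # causal depthwise conv
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, dim)
        value, gate = self.in_proj(x).chunk(2, dim=-1)
        value = self.conv(value.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        return x + self.out_proj(value * torch.sigmoid(gate))

class AttentionBlock(nn.Module):
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        # standard multi-head attention as a stand-in for grouped-query attention
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.attn(x, x, x, need_weights=False)
        return x + out

dim = 512
blocks = [GatedShortConvBlock(dim) for _ in range(10)] + [AttentionBlock(dim) for _ in range(6)]
model = nn.Sequential(*blocks)
print(model(torch.randn(1, 128, dim)).shape)  # torch.Size([1, 128, 512])
```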
The model family spans 350M to 2.6B parameters for text, plus an 8B mixture-of-experts (MoE) variant with just 1B active parameters. Liquid also offers multimodal models for vision and audio, along with ultra-compact “nano” models for constrained environments.
Fine-tuning speed is a key advantage: LFM2 delivers 300% better GPU efficiency during training compared to the first-generation LFM1 architecture. This allows teams to specialize models for specific workflows in hours or days rather than weeks.
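Liquid’s own fine-tuning pipeline is not public in this announcement, but the general workflow of specializing a small open model for a task like transcript summarization looks roughly like the LoRA sketch below, using Hugging Face transformers and peft. The model ID and target module names are assumptions for illustration.

```python
# A hedged sketch of specializing a small model with LoRA adapters via transformers + peft.
# The model ID and target_modules are assumptions, not details from AMD or Liquid AI.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "LiquidAI/LFM2-2.6B"  # assumed Hugging Face model ID
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # module names depend on the architecture
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the small adapter weights are trained
```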
What’s Next for On-Device AI
AMD Ryzen AI 400 and PRO 400 Series processors began shipping in January 2026, with broader OEM availability throughout Q1 2026. The processors deliver up to 60 NPU TOPS for Copilot+ PCs and other AI experiences.
Liquid AI has not announced public availability of the meeting summarization model, but the company’s LFM2.5-1.2B family launched January 4, 2026, for edge AI deployment. Developers can access Liquid’s LEAP platform for model customization and deployment.
The broader industry trend points toward specialized, application-specific models rather than one-size-fits-all general LLMs. On-device AI eliminates cloud dependency for sensitive workflows like legal document review, medical transcription, and financial analysis, all of them scenarios where privacy and cost control matter more than broad general knowledge.
AMD’s demonstration proves that 16GB RAM laptops can handle production-grade AI workloads without requiring 32GB+ developer machines or cloud infrastructure. This opens enterprise deployment to standard business hardware rather than specialized AI workstations.
Frequently Asked Questions
How does AMD Ryzen AI run meeting summaries without the internet?
AMD Ryzen AI processors include dedicated NPU, GPU, and CPU compute engines that run Liquid AI’s fine-tuned LFM2-2.6B model entirely on-device. The model processes meeting transcripts up to 10,000 tokens (60 minutes) using under 3GB of RAM, fitting within typical 16GB laptop configurations. No cloud connection or API calls are required.
What makes Liquid AI LFM2 different from ChatGPT or Claude?
LFM2 uses a hybrid architecture with 80% short convolutions and only 20% attention mechanisms, versus pure transformer models like GPT. At a 10K-token context, the quantized model uses roughly 82% less memory than similar-quality models (2.7 GB versus 15.2 GB for Qwen3-30B). LFM2 is also fine-tuned for specific tasks rather than serving as a general-purpose chatbot, delivering specialized performance at a fraction of the size.
Can AMD Ryzen AI 400 Series work offline for AI tasks?
Yes. AMD Ryzen AI 400 Series and Max+ processors are designed for fully offline AI workloads. The tri-engine architecture (CPU, GPU, NPU) allows models to run without the internet, making them suitable for privacy-sensitive environments like legal, healthcare, and finance. Battery life and performance can be optimized by selecting the appropriate compute engine.
How much RAM is needed for on-device AI meeting summaries?
Liquid AI’s LFM2-2.6B model requires just 2.7GB of RAM to process 10,000-token meeting transcripts, making it practical for standard 16GB RAM laptops. In contrast, comparable-quality models like Qwen3-30B need 15.2GB of RAM for the same workload. General-purpose AI PCs with 32GB or more RAM can run larger models or multiple concurrent tasks.

