
On Saturday, Meta introduced its next generation of language models: Llama 4 Scout and Llama 4 Maverick. Available on WhatsApp, Messenger, Instagram Direct, and Meta.ai, they are the first two open-source models in the Llama 4 family, with two more expected to launch at a later date.
“Llama 4 is a milestone for Meta AI and for open source. For the first time, the best small, mid-sized, and potentially soon, frontier models will be open source,” said Mark Zuckerberg, Meta CEO, in an Instagram Reel published on Saturday.
Of the two, Llama 4 Scout is the smaller: a multimodal model with a 10-million-token context window that can run on a single GPU such as the Nvidia H100. It uses a Mixture of Experts (MoE) architecture with 17 billion active parameters and 16 experts, and Meta describes it as the highest-performing model in its class.
In the same Instagram Reel, Zuckerberg described Llama 4 Maverick as a “workhorse” that outperforms GPT-4o and Gemini Flash 2 on all of Meta's benchmarks. He also said it is smaller and more efficient than DeepSeek V3. Maverick is another model with 17 billion active parameters, this one using 128 experts, and it is designed to run on a single host for easy inference.