French AI startup Mistral has released its first model that can process images as well as text.
Called Pixtral 12B, the 12-billion-parameter model is roughly 24GB in size. (Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer parameters.) Available on GitHub as well as the AI and machine learning development platform Hugging Face, the model can be downloaded, fine-tuned and used under Mistral’s standard license, which requires a paid license for any commercial applications but not research and academic ones.
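For readers wondering how 12 billion parameters translates to roughly 24GB, here is a back-of-the-envelope sketch. It assumes each parameter is stored as a 16-bit (2-byte) value, which the announcement does not confirm; the function name is purely illustrative.

```python
def estimate_weights_size_gb(num_parameters: int, bytes_per_parameter: int = 2) -> float:
    """Approximate size of a model's weights in decimal gigabytes.

    bytes_per_parameter=2 is an assumption (16-bit weights), not a
    figure stated by Mistral.
    """
    return num_parameters * bytes_per_parameter / 1e9


# 12 billion parameters at an assumed 2 bytes each
size_gb = estimate_weights_size_gb(12_000_000_000)
print(f"{size_gb:.0f} GB")  # prints "24 GB"
```

Under the same arithmetic, 32-bit weights would double the estimate to about 48GB, so the reported size is at least consistent with 16-bit storage.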
Built on Mistral’s text model Nemo 12B, Pixtral 12B can answer questions about an arbitrary number of images of an arbitrary size given either image URLs or images encoded using the binary-to-text encoding scheme base64. Like other multimodal models (e.g. Anthropic’s Claude family, GPT-4o and so on), Pixtral 12B should, at least in theory, be able to perform tasks like captioning images and counting the number of objects in a photo.
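To illustrate what base64-encoding an image involves, here is a minimal sketch using Python's standard library. Wrapping the result in a data URL is a common convention and an assumption here; the article does not say what exact format Mistral expects, and the helper name is hypothetical.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as base64 and wrap them in a data URL.

    The data URL wrapper is an assumed convention, not a documented
    requirement of Pixtral 12B.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Stand-in for a real file: the 8-byte signature that begins every PNG.
# In practice you would read the bytes with open(path, "rb").read().
png_signature = b"\x89PNG\r\n\x1a\n"
data_url = to_data_url(png_signature)
print(data_url)  # prints "data:image/png;base64,iVBORw0KGgo="
```

Base64 turns every three bytes into four text characters, so an encoded image is about a third larger than the original file. That overhead is one reason passing an image URL can be preferable for large images.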
This writer wasn’t able to take Pixtral 12B for a spin, unfortunately, because there weren’t any working web demos as of publication time. In a post on X, Sophia Yang, head of developer relations at Mistral, said that Pixtral 12B will be available for testing on Mistral’s chatbot and API-serving platforms, Le Chat and La Plateforme, “soon.”
It is unclear which image data Mistral might have used to develop Pixtral 12B.
Most generative AI models, including Mistral’s other models, are trained on vast quantities of public (and often copyrighted) data from around the web. Some model vendors argue that fair use entitles them to scrape any public data. Many copyright holders disagree, and have filed lawsuits against the larger vendors, including OpenAI and Midjourney, to attempt to put a stop to the practice.
The release of Pixtral 12B comes shortly after Mistral closed a $645 million funding round led by General Catalyst that valued the company at $6 billion. Just over a year old, Mistral is seen by many in the AI community as Europe’s answer to OpenAI; its strategy thus far has involved releasing free “open” models, charging for managed versions of those models and providing consulting services to corporate customers.
Source: TechCrunch