Apple Introduces Open Source Multimodal LLM, Ferret

The multimodal LLM can use parts of images as queries and is trained with the GRIT dataset, which consists of around 1.1 million examples.

Apple Inc., in collaboration with AI researchers at Columbia University, has quietly introduced an open-source multimodal large language model named “Ferret.” The model, released on GitHub in October, gained significant attention from the AI research community despite the absence of an official announcement.

Ferret was trained on eight A100 GPUs, each with 80GB of memory. The dataset used in the project is governed by the CC BY-NC 4.0 licence, which permits non-commercial use only. The project’s key contributions are the Ferret model, the GRIT dataset, and Ferret-Bench.

The Ferret model combines a hybrid region representation with a spatial-aware visual sampler to enable fine-grained and open-vocabulary referring and grounding within a multimodal large language model (MLLM). This capability enhances the model’s ability to understand and respond to complex queries that involve both text and images.
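
As a rough illustration of what a hybrid region representation could look like, the sketch below pairs a discrete part (the region's bounding-box coordinates) with a continuous part (a feature vector pooled over the region). It is a simplified stand-in, not Apple's implementation: the names `RegionRepresentation` and `hybrid_region` are hypothetical, and plain mean pooling is assumed here in place of Ferret's spatial-aware visual sampler.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class RegionRepresentation:
    # Discrete part: (x_min, y_min, x_max, y_max) of the selected region.
    box: Tuple[int, int, int, int]
    # Continuous part: one feature vector summarising the region.
    feature: List[float]


def hybrid_region(
    feature_map: Sequence[Sequence[Sequence[float]]],
    mask: Sequence[Sequence[int]],
) -> RegionRepresentation:
    """Build a toy hybrid representation (hypothetical helper).

    feature_map[y][x] is a feature vector for that location;
    mask[y][x] is 1 inside the user-selected region, 0 outside.
    """
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        raise ValueError("mask selects no locations")

    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    box = (min(xs), min(ys), max(xs), max(ys))

    # Assumption: mean pooling over the masked locations. Ferret's actual
    # sampler is more elaborate; this only shows the coordinates + features idea.
    dim = len(feature_map[0][0])
    pooled = [
        sum(feature_map[y][x][d] for x, y in points) / len(points)
        for d in range(dim)
    ]
    return RegionRepresentation(box=box, feature=pooled)
```

Because the region is given as a mask rather than a rectangle, the same sketch would accept free-form selections such as scribbles, not only boxes.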

The project introduces the GRIT Dataset, which consists of approximately 1.1 million examples. This dataset is designed to support large-scale, hierarchical, and robust instruction tuning for grounding and referring tasks. It serves as a valuable resource for training and evaluating AI models in tasks related to understanding and responding to instructions.

Ferret-Bench is a multimodal evaluation benchmark created as part of the project. It is designed to assess the performance of AI models across various dimensions, including Referring/Grounding, Semantics, Knowledge, and Reasoning. This benchmark provides a comprehensive testing ground for evaluating the capabilities of models like Ferret in real-world scenarios.

Ferret is described as a model that can use parts of images as queries, making it a powerful multimodal AI system. It examines a specific region of an image, identifies elements within that region that could be relevant to a query, and draws bounding boxes around them. It then uses the identified elements as part of the query and responds in the manner of a conventional language model.

For example, if a user highlights an animal within a larger image and asks what it is, Ferret can identify the species and use context from other elements in the image to provide further information.
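
The flow described above (select a region, find the elements whose bounding boxes fall in it, then fold them into the query) can be sketched in a few lines. This is a toy illustration only: `elements_in_region` and `build_region_query` are hypothetical helpers, the element boxes are assumed to come from some upstream detector, and the prompt layout is invented for the example rather than taken from Ferret.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def overlaps(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes share any interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def elements_in_region(region: Box, elements: Dict[str, Box]) -> List[str]:
    """Names of detected elements whose boxes overlap the selected region."""
    return [name for name, box in elements.items() if overlaps(region, box)]


def build_region_query(question: str, region: Box, elements: Dict[str, Box]) -> str:
    """Combine the question, the region, and the elements found in it.

    The text format is an assumption made for this sketch.
    """
    found = elements_in_region(region, elements)
    listed = ", ".join(f"{name} {list(elements[name])}" for name in found) or "none"
    return f"{question} Region: {list(region)}. Elements in region: {listed}."


# Hypothetical detections for an image containing a dog and a tree.
detections = {"dog": (10, 10, 50, 50), "tree": (200, 0, 260, 120)}
print(build_region_query("What animal is this?", (0, 0, 60, 60), detections))
```

In a real system the resulting query would be passed to the language model along with the image features; here it is simply printed.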

The release of Ferret is seen as significant because it represents an unexpected level of openness from Apple, a company known for its secrecy. This open-source approach contrasts with Apple’s traditional practices.

One reason for this openness may be Apple’s need to compete in the AI industry, where it faces challenges from rivals like Microsoft and Google. Apple’s infrastructure is not optimised for serving large language models (LLMs) at scale, which puts it at a disadvantage. To address this, Apple must choose between partnering with cloud hyperscalers for AI or sharing its work with the open-source community, a strategy similar to what Meta Platforms Inc. (formerly Facebook) has adopted.

Ferret’s release demonstrates Apple’s willingness to collaborate and contribute to the AI research community, reflecting a shift in its approach to AI development.
