opensource.google.com

Showing posts with label Gemma. Show all posts

Gemma 4: Expanding the Gemmaverse with Apache 2.0

Thursday, April 2, 2026

For over 20 years, Google has maintained an unwavering commitment to the open-source community. Our belief has been simple: open technology is good for our company, good for our users, and good for our world. This commitment to fostering collaborative learning and rigorous testing has consistently proven more effective than pursuing isolated improvements. It's been our approach ever since the 2005 launch of Google Summer of Code, and through our open-sourcing of Kubernetes, Android, and Go, and it remains central to our ongoing, daily work alongside maintainers and organizations.

Today, we are taking a significant step forward in that journey. Since Gemma's first launch, the community has downloaded Gemma models over 400 million times and built a vibrant universe of over 100,000 inspiring variants, known in the community as the Gemmaverse.

The release of Gemma 4 under the Apache 2.0 license provides cutting-edge AI models for this community of developers: our most capable open models yet, in sizes ranging from edge-device-friendly variants up to 31B parameters. The industry-standard Apache license broadens the horizon for Gemma 4's applicability and usefulness, providing well-understood terms for modification, reuse, and further development.

A long legacy of open research

We are committed to making helpful, accessible AI technology and research so that everyone can innovate and grow. That's why many of our innovations are freely available, easy to deploy, and useful to developers across the globe. We have a long history of making our foundational machine-learning research, including word2vec, JAX, and the seminal Transformer paper, publicly available for anyone to use and study.

We accelerated this commitment last year. By sharing models that interpret complex genomic data and identify tumor variants, we contributed to the "magic cycle" of research breakthroughs that translate into real-world impact. This week, however, marks a pivotal moment — Gemma 4 models are the first in the Gemmaverse to be released under the OSI-approved Apache 2.0 license.

Empowering developers and researchers to deliver breakthrough innovations

Since we first launched Gemma in 2024, the community of early adopters has grown into a vast ecosystem of builders, researchers, and problem solvers. Gemma is already supporting sovereign digital infrastructure, from automating state licensing in Ukraine to scaling Project Navarasa across India's 22 official languages. And we know that developers need autonomy, control, and clarity in licensing for further AI innovation to reach its full potential.

Gemma 4 brings three essential elements of free and open-source software directly to the community:

  • Autonomy: By letting people build on and modify the Gemma 4 models, we are empowering researchers and developers with the freedom to advance their own breakthrough innovations however they see fit.
  • Control: We understand that many developers require precise control over their development and deployment environments. Gemma 4 allows for local, private execution that doesn't rely on cloud-only infrastructure.
  • Clarity: By applying the industry-standard Apache 2.0 license terms, we are providing clarity about developers' rights and responsibilities so that they can build freely and confidently from the ground up without the need to navigate prescriptive terms of service.

Building together to drive real-world impact

Gemma 4 is an invitation. Whether you are a scientific researcher exploring the language of dolphins, an industry developer building the next generation of open AI agents, or a public institution looking to provide more effective, efficient, and localized services to your citizens, Google is excited to continue building with you. The Gemmaverse is your playground, and with Apache 2.0, the possibilities are broader than ever.

We can't wait to see what you build.

Empowering app developers: Fine-tuning Gemma 3 for mobile with Tunix in Google Colab

Thursday, December 11, 2025

In the rapidly evolving world of AI models for mobile devices, a persistent challenge is how to bring state-of-the-art (SOTA) LLMs to smartphones without compromising on privacy or requiring app developers to be machine learning engineers.

Today, we are excited to share how Cactus, a startup building a next-generation inference engine for mobile devices, fine-tuned the open-source Gemma 3 model. By leveraging Tunix, the LLM post-training library in the JAX ML ecosystem, they achieved this entirely on Google Colab's Free Tier.

The Challenge: Making Small Models "Expert"

For app developers, running Large Language Models (LLMs) in the cloud isn't always an option due to privacy concerns (such as GDPR compliance) and latency requirements. The solution lies in running models locally on the device. However, most smartphones globally lack specialized NPUs (Neural Processing Units), meaning developers need highly efficient, smaller models.

While compact models like Gemma (270M or 1B parameters) are incredibly efficient, they are often "generalists." To be useful for specific mobile applications—such as a medical imaging assistant or a legal document analyzer—they need to be fine-tuned to become domain experts.

The problem? Most app developers are not ML infrastructure experts. Setting up complex training pipelines, managing dependencies, and navigating steep learning curves creates too much friction.

The Solution: SFT via Tunix on Google Colab

To solve this, Cactus created a simplified, low-friction workflow: a Python script that uses Tunix's Supervised Fine-Tuning (SFT) APIs and runs in a Colab notebook.

1. The Engine: Tunix

Cactus utilized Tunix, Google's lightweight and modular LLM post-training library, which supports both SFT and leading RL algorithms and executes natively on TPUs. Tunix strips away the complexity of heavier frameworks, offering a simplified path to SFT.
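Tunix's actual training API is best taken from its documentation, but the core of an SFT step is easy to see in plain JAX: next-token cross-entropy computed only over the response tokens, followed by a gradient update. The sketch below is illustrative only; the toy embedding model, vocabulary size, token ids, and learning rate are stand-ins, not Tunix or Gemma specifics.

```python
import jax
import jax.numpy as jnp

# Toy stand-in for an LLM: an embedding table plus an output projection.
# Real SFT fine-tunes Gemma, but the loss is the same idea: cross-entropy
# on the target (response) tokens, with the prompt masked out.
VOCAB, DIM = 16, 8

def init_params(key):
    k1, k2 = jax.random.split(key)
    return {
        "embed": jax.random.normal(k1, (VOCAB, DIM)) * 0.1,
        "out": jax.random.normal(k2, (DIM, VOCAB)) * 0.1,
    }

def logits_fn(params, tokens):
    h = params["embed"][tokens]        # (seq, DIM)
    return h @ params["out"]           # (seq, VOCAB)

def sft_loss(params, tokens, loss_mask):
    # Predict token t+1 from token t; the mask selects response positions.
    logits = logits_fn(params, tokens[:-1])
    targets = tokens[1:]
    logp = jax.nn.log_softmax(logits, axis=-1)
    nll = -jnp.take_along_axis(logp, targets[:, None], axis=-1)[:, 0]
    mask = loss_mask[1:]
    return (nll * mask).sum() / mask.sum()

@jax.jit
def sgd_step(params, tokens, loss_mask, lr=0.1):
    loss, grads = jax.value_and_grad(sft_loss)(params, tokens, loss_mask)
    params = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
    return params, loss

params = init_params(jax.random.PRNGKey(0))
tokens = jnp.array([1, 2, 3, 4, 5, 6])        # prompt + response token ids
loss_mask = jnp.array([0, 0, 0, 1, 1, 1.0])   # train only on the response
losses = []
for _ in range(25):
    params, loss = sgd_step(params, tokens, loss_mask)
    losses.append(float(loss))
```

A library like Tunix wraps this loop (plus checkpointing, sharding, and TPU execution) behind a higher-level API, which is exactly the complexity an app developer shouldn't have to manage by hand.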

2. The Access: Google Colab Free Tier

Accessibility was a key requirement. Instead of requiring developers to set up complex cloud billing and project IDs immediately, the workflow operates entirely within a Google Colab Notebook. By utilizing the free tier of Colab, developers can:

  • Load the Gemma 3 model.
  • Upload their specific dataset (e.g., medical data or customer service logs).
  • Run an SFT (Supervised Fine-Tuning) job using Tunix.
  • Export the weights for conversion.
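Of these steps, preparing the dataset is the one developers do themselves. The sketch below shows one common way to turn raw prompt/response records into token sequences with a loss mask, so SFT only trains on the model's responses. The whitespace tokenizer and the `<user>`/`<model>` template markers are illustrative stand-ins, not Gemma's real tokenizer or chat template.

```python
def tokenize(text, vocab):
    """Toy whitespace tokenizer; a real pipeline would use Gemma's tokenizer."""
    ids = []
    for word in text.split():
        if word not in vocab:
            vocab[word] = len(vocab)  # grow the vocabulary on the fly
        ids.append(vocab[word])
    return ids

def to_sft_example(record, vocab):
    """Concatenate prompt and response; mask the loss to response tokens only."""
    prompt_ids = tokenize("<user> " + record["prompt"] + " <model>", vocab)
    response_ids = tokenize(record["response"] + " <end>", vocab)
    tokens = prompt_ids + response_ids
    # 0 = ignore (prompt tokens), 1 = train (response tokens)
    loss_mask = [0] * len(prompt_ids) + [1] * len(response_ids)
    return tokens, loss_mask

vocab = {}
record = {"prompt": "Where is my order?",
          "response": "Let me check the tracking number."}
tokens, mask = to_sft_example(record, vocab)
```

Records shaped this way (a domain-specific prompt paired with the desired response) are what turn a generalist model into the domain expert described above.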

3. The Deployment: Cactus

Once tuned, the model is converted into the Cactus graph format. This allows the now-specialized Gemma 3 model to be deployed directly into a Flutter or native mobile app with just a few lines of code, running efficiently on a wide range of smartphone hardware.

Why This Matters

"Our users are app developers, not ML engineers," explains Henry Ndubuaku, co-founder of Cactus. "They want to pick a model, upload data, and click 'tune.' By using Tunix and Colab, we can give them a 'clone-and-run' experience that removes the intimidation factor from fine-tuning."

This workflow represents the "lowest hanging fruit" in democratizing AI:

  • No complex local environment setup.
  • No upfront infrastructure costs.
  • A high-performance, JAX-native library (Tunix) for tuning a leading open model (Gemma).

What's Next?

While the Colab notebook provides an immediate, accessible solution, Cactus is exploring a full GUI-based portal for fine-tuning and quantizing LLMs, with Google Cloud TPUs as the backend compute. This would allow scalable training of larger models and even more seamless integration into the mobile development lifecycle.

Get Started

Ready to turn your mobile app into an AI powerhouse? Check out the Tunix SFT Notebook for Cactus and start fine-tuning Gemma 3 for your device today:

You can explore Tunix sample scripts, documentation, and the repo at:

Get ready for Google I/O: Program lineup revealed

Wednesday, April 23, 2025

The Google I/O agenda is live. We're excited to share Google's biggest announcements across AI, Android, Web, and Cloud on May 20-21. Tune in to learn how we're making development easier so you can build faster.

We'll kick things off with the Google Keynote at 10:00 AM PT on May 20th, followed by the Developer Keynote at 1:30 PM PT. This year, we're livestreaming two days of sessions directly from Mountain View, bringing more of the I/O experience to you, wherever you are.

Here’s a sneak peek of what we’ll cover:

    • AI advancements: Learn how Gemini models enable you to build new applications and unlock new levels of productivity. Explore the flexibility offered by options like our Gemma open models and on-device capabilities.
    • Build excellent apps, across devices with Android: Crafting exceptional app experiences across devices is now even easier with Android. Dive into sessions focused on building intelligent apps with Google AI and boosting your productivity, alongside creating adaptive user experiences and leveraging the power of Google Play.
    • Powerful web, made easier: Exciting new features continue to accelerate web development, helping you to build richer, more reliable web experiences. We’ll share the latest innovations in web UI, Baseline progress, new multimodal built-in AI APIs using Gemini Nano, and how AI in DevTools streamlines building innovative web experiences.

Plan your I/O

Join us online for livestreams May 20-21, followed by on-demand sessions and codelabs on May 22. Register today and explore the full program for sessions like these:

We're excited to share what's next and see what you build!

By the Google I/O team

The Power of Open Source

Thursday, April 11, 2024


At the day 1 keynote of Open Source Summit North America, Timothy Jordan, Director of Developer Relations and Open Source at Google, will talk about the landscape of open source and AI, the importance of a responsible approach, and the transformative impact of community collaboration. In anticipation of this talk, let’s break down the AI open source ecosystem, and how Google approaches it.

Google believes in the power of open technology to drive innovation and benefit everyone. It fosters creativity and collaboration, while ensuring technology access for developers and allowing customization to fit unique use cases. Open source licenses give developers full creative autonomy without restriction. It is this ecosystem of open source and open technology, shaped by ML frameworks like TensorFlow, Keras, and JAX, that has enabled so many incredible advances in AI in recent years.

The open source community has been discussing how to carry forward the principles of the Open Source Definition (OSD) while addressing concepts like derived works and author attribution in AI. During Timothy’s keynote, he’ll speak to his own philosophy on open source and AI, and share how his assumptions about applying open source to AI have evolved. The immediate availability of AI models, powered by the open source ecosystem of ML frameworks, makes it more important than ever to establish a shared definition of open source for AI.

While that definition is in development, at Google we’re using precise language to describe our openly available models like Gemma. The definition and license are only one part of this open ML/AI future; advancements in safety tooling, policies, and developer knowledge are all part of creating a responsible and open future for AI. Those advancements are all fueled by a dedication to collaboration. Whether sharing innovations and improvements with the community, or having conversations with policymakers and open source leaders, collaboration is key to a responsible approach to AI in the open ecosystem. AI can only be safe and responsible if everyone’s experiences and perspectives are brought to the forefront as it’s built.

To demonstrate how open source has made AI readily available, Timothy will also take the audience through a “low code” demo of how to run large language models in-browser for web applications. Using MediaPipe, the LLM Inference API, and Gemma, users can quickly add genAI capabilities like document summarization and text generation.

Join us at Open Source Summit North America for this keynote, and visit opensource.google to learn more.

By the Google Open Source team
