Sakana AI’s evolutionary algorithm discovers new architectures for generative models

A new technique developed by Sakana AI, a much-hyped startup based in Tokyo, Japan, automatically creates generative models. The technique, called Evolutionary Model Merge, is inspired by natural selection and combines parts of existing models to create more capable ones.

Sakana AI first announced its existence in August 2023. It was co-founded by esteemed AI researchers, including former Googlers David Ha and Llion Jones, a co-author of “Attention Is All You Need,” the paper that launched the current generative AI era.

Sakana’s new Evolutionary Model Merge technique enables developers and organizations to create and discover new models cost-effectively, without spending huge amounts to train and fine-tune their own models from scratch.

Sakana has released a large language model (LLM) and a vision-language model (VLM) created through Evolutionary Model Merge.

“Introducing Evolutionary Model Merge: A new approach bringing us closer to automating foundation model development. We use evolution to find great ways of combining open-source models, building new powerful foundation models with user-specified abilities!” the company announced on X.


Model merging

Training generative models is an expensive and complicated process that most organizations can’t afford. But with the release of open models such as Llama 2 and Mistral, developers have found innovative ways to improve them at low cost.

One of these methods is “model merging,” in which components of two or more pre-trained models are combined to create a new one. If done correctly, the merged model can inherit the strengths and capabilities of its source models.
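In its simplest form, parameter-space merging is just a weighted average of the weights of models that share an architecture. The following is a minimal, hypothetical sketch of that idea, not code from any particular merging library:

```python
# Minimal, hypothetical sketch of parameter-space model merging:
# a weighted average of two models' weights. Both models must share
# the same architecture so that tensor names and shapes line up.
import torch

def merge_state_dicts(sd_a, sd_b, alpha=0.5):
    """Return alpha * A + (1 - alpha) * B for every parameter tensor."""
    return {name: alpha * sd_a[name] + (1.0 - alpha) * sd_b[name]
            for name in sd_a}

# Usage (illustrative): merge two fine-tunes of the same base model.
# merged = merge_state_dicts(model_a.state_dict(), model_b.state_dict(), 0.5)
# model_a.load_state_dict(merged)
```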

Interestingly, merged models do not need additional training, which makes the approach very cost-effective. In fact, many of the top-performing models on the Open LLM Leaderboard are merged versions of popular base models.

“What we are witnessing is a large community of researchers, hackers, enthusiasts and artists alike going about their own ways of developing new foundation models by fine-tuning existing models on specialized datasets, or merging existing models together,” Sakana AI’s researchers write on the company’s blog.

With more than 500,000 models available on Hugging Face, model merging offers vast possibilities for researchers, developers, and organizations to explore and create new models at a very low cost. However, model merging relies heavily on intuition and domain knowledge. 

Evolutionary Model Merge

Sakana AI’s new technique aims to provide a more systematic approach to discovering efficient model merges.

“We believe evolutionary algorithms, inspired by natural selection, can unlock more effective merging solutions,” Sakana AI’s researchers write.

Evolutionary algorithms are population-based optimization techniques inspired by biological evolution processes. They iteratively create candidate solutions by combining elements of the existing population and selecting the best solutions through a fitness function. Evolutionary algorithms can explore a vast space of possibilities, discovering novel and unintuitive combinations that traditional methods and human intuition might miss.
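In code, that loop is compact. The toy sketch below evolves a vector of numbers toward a target purely to show the evaluate-select-mutate cycle; every name in it is illustrative, and none of it comes from Sakana’s implementation:

```python
import random

def evolve(fitness, dim=8, pop_size=20, generations=100, sigma=0.1):
    """Tiny evolutionary loop: evaluate, keep the fittest, mutate."""
    population = [[random.uniform(-1, 1) for _ in range(dim)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Rank candidates by the fitness function and keep the best half.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Refill the population with noisy (mutated) copies of parents.
        offspring = [[gene + random.gauss(0, sigma)
                      for gene in random.choice(parents)]
                     for _ in range(pop_size - len(parents))]
        population = parents + offspring
    return max(population, key=fitness)

# Toy fitness: negative squared distance to an arbitrary target vector.
target = [0.3] * 8
best = evolve(lambda cand: -sum((g - t) ** 2 for g, t in zip(cand, target)))
```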

“The ability to evolve new models with new emergent capabilities, from a large variety of existing, diverse models with various capabilities have important implications,” Sakana AI founder David Ha told VentureBeat. “With the rising costs and resource requirement for training foundation models, by leveraging the rich variety of foundation models in the rich open-source ecosystem, large institutions or governments may consider the cheaper evolutionary approach for developing proof-of-concept prototype models quickly, before committing substantial capital or tapping into the nation’s resources to develop entirely custom models from scratch, if that is even needed at all.”

Sakana AI’s Evolutionary Model Merge is a general method that uses evolutionary techniques to discover the best ways to combine different models. Instead of relying on human intuition, Evolutionary Model Merge automatically combines the layers and weights of existing models to create and evaluate new architectures.

“By working with the vast collective intelligence of existing open models, our method is able to automatically create new foundation models with desired capabilities specified by the user,” according to Sakana’s blog.
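Combining the two sketches above gives the gist of the idea: the “genome” the algorithm evolves can be a vector of merge coefficients, for example one per layer, and the fitness function a benchmark score for the merged model. The hypothetical sketch below illustrates that framing only; Sakana’s actual recipe reportedly uses more sophisticated merging operations and also searches over how layers from different models are stacked.

```python
# Hypothetical illustration: per-layer merge coefficients as the genome
# an evolutionary search (like the loop above) would optimize.
import torch

def merge_per_layer(sd_a, sd_b, coeffs, layer_of):
    """Interpolate each parameter tensor with its layer's coefficient.

    layer_of maps a parameter name to an index into coeffs, e.g. by
    parsing names like 'model.layers.12.mlp.down_proj.weight'.
    """
    merged = {}
    for name in sd_a:
        a = coeffs[layer_of(name)]  # coefficient for this tensor's layer
        merged[name] = a * sd_a[name] + (1.0 - a) * sd_b[name]
    return merged

def fitness(coeffs, sd_a, sd_b, layer_of, build_model, benchmark):
    """Score a candidate: build the merged model, run the benchmark.

    build_model and benchmark are stand-ins for loading the merged
    weights and evaluating them, e.g. on a Japanese math test set.
    """
    model = build_model(merge_per_layer(sd_a, sd_b, coeffs, layer_of))
    return benchmark(model)
```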

Evolutionary merging in action

Given the impressive advances in manually created merged models, the researchers wanted to see how far an evolutionary algorithm can go in finding new ways to combine the large pool of open-source foundation models.

They found that Evolutionary Model Merge discovered non-trivial ways to combine models from vastly different domains, such as non-English language and math, or non-English language and vision.

“To test our approach, we initially tested our method to automatically evolve for us a Japanese Large Language Model (LLM) capable of Math reasoning, and a Japanese Vision-Language Model (VLM),” the researchers write.

The resulting models achieved state-of-the-art performance on several LLM and vision benchmarks without being explicitly optimized for them. For the LLM, the researchers used the evolutionary algorithm to merge the Japanese LLM Shisa-Gamma and the math-specific LLMs WizardMath and Abel.

EvoLLM-JP, their 7-billion-parameter Japanese math LLM, achieved high performance on several Japanese LLM benchmarks, and even outperformed some state-of-the-art 70-billion-parameter Japanese LLMs.

“We believe our experimental Japanese Math LLM is good enough to be a general purpose Japanese LLM,” the researchers write.

For the Japanese VLM, they used LLaVA-1.6-Mistral-7B, a popular open-source VLM, and Shisa-Gamma 7B. EvoVLM-JP, the resulting model, scored higher than not only LLaVA-1.6-Mistral-7B but also JSVLM, an existing Japanese VLM. They released both models on Hugging Face and GitHub.
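Trying the released checkpoints should look like loading any other Hugging Face model. The snippet below is a usage sketch only: the repository ID is an assumption based on the announced model name, so check Sakana AI’s Hugging Face page for the exact identifier.

```python
# Usage sketch; the repo ID is assumed from the announced model name
# and may differ; see Sakana AI's Hugging Face organization.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "SakanaAI/EvoLLM-JP-v1-7B"  # assumed identifier
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

prompt = "2x + 3 = 11 のとき、x を求めよ。"  # a Japanese math question
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```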

The team is also making progress in applying evolutionary model merging to image-generation diffusion models. It is working on a new version of Stable Diffusion XL that produces high-quality results for Japanese prompts and can generate images very quickly.

“We just had the EvoSDXL-JP results a few days before release, so we haven’t done a proper release / writeup for that model. Hopefully we can release that one in the next 1-2 months,” Ha said.

Sakana AI’s vision

Ha, the former head of research at Stability AI and a former Google Brain researcher, founded Sakana AI with Llion Jones, one of the co-authors of the seminal 2017 research paper that introduced the Transformer architecture used in generative models.

Sakana AI focuses on applying nature-inspired ideas, such as evolution and collective intelligence, to create new foundation models.

“The future of AI will not consist of a single, gigantic, all-knowing AI system that requires enormous energy to train, run, and maintain, but rather a vast collection of small AI systems–each with their own niche and specialty, interacting with each other, with newer AI systems developed to fill a particular niche,” the researchers wrote.


Author: Ben Dickson
Source: VentureBeat
Reviewed By: Editorial Team
