
Real-time generative AI art is here thanks to LCM-LoRA

Generative AI art has quickly emerged as one of the most interesting and popular applications of the new technology, with models such as Stable Diffusion and Midjourney claiming millions of users, not to mention OpenAI’s move to bundle its DALL-E 3 image generation model directly into its popular ChatGPT service earlier this fall. Simply by typing in a description and waiting a few short moments, users can see an image from their imagination rendered on screen by AI algorithms trained to do exactly that.

Yet the fact that users have to wait those “few short moments,” anywhere from a second or two to several minutes, for the AI to generate their image is not ideal for our fast-paced, instant-gratification world.

That’s why this week, the online AI art community is collectively freaking out about a new machine learning technique: LCM-LoRA, short for “Latent Consistency Model Low-Rank Adaptation.” Developed by researchers at the Institute for Interdisciplinary Information Sciences (IIIS) at Tsinghua University in China and the AI code-sharing platform Hugging Face, and described in a paper published on the open-access preprint site arXiv.org, the technique finally brings generative AI art creation into real time.

What does this mean, in a practical sense? Well, take a look at some of the videos shared by AI artists on X and LinkedIn below, and you’ll get an idea.

LCMs are insane.

moving the sun in real-time with AI pic.twitter.com/PGidr5iz3O

So, Generative AI in REAL TIME is here?

Only 4 days have passed and nothing will ever be the same again. All the design tools, all the work processes, EVERYTHING is going to change.

Here is everything you need to know about LCM-LoRA: pic.twitter.com/eK3uoxXBlC

real-time changes everything.

launching soon @krea_ai pic.twitter.com/idw4n3bDTP

i’ve only had access for a short while, but the chokehold this new real-time LCM tool has on my dopamine supply right now is intense.

a whole new era of generative AI is about to be unleashed. pic.twitter.com/M701YoWBpp

you can now use @krea_ai with any 2D, 3D, animation, vector, or painting software in the world

Photoshop, Blender, Figma… pic.twitter.com/Rkw5giuDO8

Essentially, thanks to the LCM-LoRA technique, users can paint simple, almost stick-figure-like drawings or drop in a few basic shapes alongside descriptive text, and AI art creation applications such as Krea.AI and Fal.AI will instantly render new generated artwork from them. The apps even swap out the imagery in fractions of a second as the user moves shapes around or paints new lines on the digital canvas.
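Under the hood, this kind of live re-rendering amounts to running an accelerated image-to-image pipeline on every edit. Here is a minimal sketch of such a loop, assuming Hugging Face’s diffusers library and the published LCM-LoRA weights; the rerender function and its canvas input are hypothetical stand-ins for however a given app reads the user’s drawing.

```python
import torch
from diffusers import AutoPipelineForImage2Image, LCMScheduler

# Load an image-to-image Stable Diffusion pipeline once, at app startup.
pipe = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

def rerender(canvas_image, prompt):
    # With LCM-LoRA, 4 steps at strength 0.5 means only ~2 denoising passes
    # actually run, fast enough to call on every brush stroke or shape move.
    return pipe(
        prompt,
        image=canvas_image,
        num_inference_steps=4,
        strength=0.5,
        guidance_scale=1.0,
    ).images[0]
```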

You can try it for yourself here at Fal.AI (assuming the demo stays up under the increased load).

The technique works not only for flat, 2D images but for 3D assets as well, meaning artists could theoretically create immersive environments on the fly for use in mixed reality (AR/VR/XR), computer and video games, and other experiences. It could also be used in films, drastically speeding up production and reducing its costs.

“Everything is going to change,” commented one startup founder and former Google AI engineer about LCM-LoRA on LinkedIn, a sentiment echoed by many in the AI arts community.

“A whole new era of generative AI is about to be unleashed,” commented another user on X.

Wharton School of the University of Pennsylvania professor Ethan Mollick, one of the most active and vocal proponents of generative AI, opined that “we are going to see a lot of new user experiences soon” thanks to the technique.

Text is only one interface with AI, we are going to see a lot of new user experiences soon. I had some fun with this demo that lets you edit images by just drawing (badly). https://t.co/zkPSxpyWw1 pic.twitter.com/1enJk7bv3K

What is LCM-LoRA and how does it work?

The early demos of LCM-LoRA integrations are undeniably captivating, and to this author (a VentureBeat writer and AI artist) they suggest a new watershed moment for generative AI in the visual arts.

But what is the technological advance at the heart of LCM-LoRA, and can it scale across apps and use cases, as the early users imply?

According to the paper describing the technique, published by researchers at the IIIS at Tsinghua University and Hugging Face, LCM-LoRA is ultimately a “universal training-free acceleration module that can be directly plugged into various Stable Diffusion fine-tuned models or SD LoRAs.”

That’s a mouthful for anyone not in the machine learning community, but decoded into plain English, it is essentially an algorithm that speeds up the process of turning text or source imagery into new AI-generated artwork using the popular open-source Stable Diffusion model and its fine-tuned, or altered, variants.

LCM-LoRA does this by reducing the number of “required sampling steps,” that is, the denoising passes the AI model must run to transform the source text or image (whether a description or a stick figure) into a higher-quality, more detailed image based on what the Stable Diffusion model learned from millions of images.

This means LCM-LoRA allows Stable Diffusion models to work faster and with fewer computational resources, so they don’t take up as much working memory or as many compute cycles on a person’s machine. That is what enables them to produce eye-popping results in real time.
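To make the step reduction concrete, here is a minimal sketch using Hugging Face’s diffusers library, assuming the LCM-LoRA weights the researchers published on the Hugging Face Hub (the exact repo ids may have moved). A stock Stable Diffusion run typically takes 25 to 50 sampling steps; with the LCM scheduler and the LCM-LoRA module plugged in, around four steps suffice.

```python
import torch
from diffusers import DiffusionPipeline, LCMScheduler

# Load a standard Stable Diffusion v1.5 pipeline.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Swap in the LCM scheduler and plug in the LCM-LoRA acceleration module.
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

# 4 sampling steps instead of the usual 25-50; LCM favors low or no
# classifier-free guidance, hence guidance_scale=1.0.
image = pipe(
    "an astronaut riding a horse, watercolor",
    num_inference_steps=4,
    guidance_scale=1.0,
).images[0]
image.save("astronaut.png")
```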

The fact that it is “universal” means it can be plugged into a variety of apps that rely on Stable Diffusion or its variants to generate imagery. Whether it can be extended beyond Stable Diffusion to proprietary models like OpenAI’s DALL-E 3 or Midjourney remains to be seen.
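That universality also covers stacking: because LCM-LoRA is itself just a LoRA, it can be combined with an ordinary style LoRA on a different base model such as SDXL. A sketch of that combination, again assuming the diffusers library, with repo ids drawn from its LCM-LoRA examples (treat the exact ids as assumptions if they have moved):

```python
import torch
from diffusers import DiffusionPipeline, LCMScheduler

# Same recipe, different base model: SDXL instead of SD v1.5.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

# Plug in the LCM-LoRA acceleration module alongside a papercut-style LoRA,
# then weight the two adapters.
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl", adapter_name="lcm")
pipe.load_lora_weights(
    "TheLastBen/Papercut_SDXL",
    weight_name="papercut.safetensors",
    adapter_name="papercut",
)
pipe.set_adapters(["lcm", "papercut"], adapter_weights=[1.0, 0.8])

image = pipe(
    "papercut, a fox in a snowy forest",
    num_inference_steps=4,
    guidance_scale=1.0,
).images[0]
```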

We’ve reached out to one of the LCM-LoRA paper authors and will update this piece with more information when we hear back.




Author: Carl Franzen
Source: Venturebeat
Reviewed By: Editorial Team
