
Adobe Photoshop is among the most recognizable pieces of software ever created, used by more than 90% of the world’s creative professionals, according to Photutorial.
So the fact that a new open source AI model — Qwen-Image-Edit, released yesterday by the Qwen Team of AI researchers at Chinese e-commerce giant Alibaba — can now accomplish a huge number of Photoshop-like editing jobs with text inputs alone is a notable achievement.
Built on the 20-billion-parameter Qwen-Image foundation model released earlier this month, Qwen-Image-Edit extends the system’s unique strengths in text rendering to cover a wide spectrum of editing tasks, from subtle appearance changes to broader semantic transformations.
Simply upload a starting image — I tried one of myself from VentureBeat’s last annual Transform conference in San Francisco — then type instructions describing what you want to change, and Qwen-Image-Edit will return a new image with those edits applied.
Input image example:
Output image example with prompt: “Make the man wearing a tuxedo.”
The model is available now across several platforms, including Qwen Chat, Hugging Face, ModelScope, GitHub, and the Alibaba Cloud application programming interface (API), the latter of which allows any third-party developer or enterprise to integrate the new model into their own applications and workflows.
I created my examples above on Qwen Chat, the Qwen Team’s rival to OpenAI’s ChatGPT. Aspiring users should note, however, that free generations are limited to about eight jobs (input/output pairs) per 12-hour period before the quota resets; paying users get access to more jobs.
With support for both English and Chinese inputs, and a dual focus on both semantic meaning and visual fidelity, Qwen-Image-Edit aims to lower barriers to professional-grade visual content creation.
And because the model is released as open source under the Apache 2.0 license, enterprises can freely download it and run it on their own hardware or virtual machines in the cloud, potentially yielding huge cost savings over proprietary software like Photoshop.
As Junyang Lin, a Qwen Team researcher wrote on X, “it can remove a strand of hair, very delicate image modification.”
Finally, the final piece of Qwen-Image, Qwen-Image-Edit. OMG it can remove a strand of hair, very delicate image modification! Come and play with it on Qwen Chat! https://t.co/iYWR5u7KUI https://t.co/Fu3ofwsHqL
— Junyang Lin (@JustinLin610) August 18, 2025
The team’s announcement echoes this sentiment, presenting Qwen-Image-Edit not as an entirely new system, but as a natural extension of Qwen-Image that applies its unique text rendering and dual-encoding approach directly to editing tasks.
Dual encodings allow edits that preserve the style and content of the original image
Qwen-Image-Edit builds on the foundation established by Qwen-Image, which was introduced earlier this month as a large-scale model specializing in both image generation and text rendering.
Qwen-Image’s technical report highlighted its ability to handle complex tasks like paragraph-level text rendering, Chinese and English characters, and multi-line layouts with accuracy.
The report also emphasized a dual-encoding mechanism, feeding images simultaneously into Qwen2.5-VL for semantic control and a variational autoencoder (VAE) for reconstructive detail. This approach allows edits that remain faithful to both the intent of the prompt and the look of the original image.
Those same architectural choices underpin Qwen-Image-Edit. By leveraging dual encodings, the model can adjust at two levels: semantic edits that change the meaning or structure of a scene, and appearance edits that introduce or remove elements while keeping the rest untouched.
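The dual-encoding idea can be illustrated with a minimal conceptual sketch. This is not the actual Qwen-Image-Edit code — the encoders below are stand-ins for Qwen2.5-VL and the VAE, and the shapes are illustrative assumptions — but it shows how a single input image yields two complementary conditioning signals:

```python
import numpy as np

# Conceptual sketch of the dual-encoding mechanism described above.
# The input image is encoded twice: into a global "semantic" descriptor
# (standing in for Qwen2.5-VL) and into a spatial reconstructive latent
# (standing in for the VAE). The editing backbone conditions on both, so
# edits can follow the prompt's intent while staying faithful to the
# original pixels.

rng = np.random.default_rng(0)

def semantic_encoder(image: np.ndarray) -> np.ndarray:
    # Stand-in: pool the image into a coarse global descriptor.
    return image.mean(axis=(0, 1))      # shape: (channels,)

def vae_encoder(image: np.ndarray) -> np.ndarray:
    # Stand-in: downsample into a latent that preserves spatial layout.
    return image[::8, ::8, :]           # shape: (H/8, W/8, channels)

image = rng.random((512, 512, 3))
semantic = semantic_encoder(image)      # drives semantic edits
latent = vae_encoder(image)             # preserves appearance detail

# Both signals are passed to the editing model together.
conditioning = {"semantic": semantic, "appearance": latent}
print(semantic.shape, latent.shape)     # (3,) (64, 64, 3)
```

The split mirrors the two editing modes described next: the semantic pathway supports meaning-level changes, while the reconstructive latent anchors pixel-level fidelity.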
Semantic editing includes creating new intellectual property, rotating objects 90 or 180 degrees to reveal different views, or transforming an input into another style such as Studio Ghibli-inspired art. These edits typically modify many pixels but preserve the underlying identity of objects.
Here’s an example of semantic editing from Shridhar Athinarayanan, an engineer at AI applications platform Replicate, who used a Replicate-hosted deployment of the model to reskin a photo of Manhattan to look like a toy Lego set.
Qwen has solved image editing
$0.03, 3 seconds per edit on Replicate
Everyone go find a different problem to solve lolhttps://t.co/5VKBiWBEqj pic.twitter.com/Xgw8UJxgOh
— Shridhar (@shridharathi) August 18, 2025
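A Replicate-hosted edit like the one above can be invoked from the official Python client. The model slug and input field names below are assumptions based on Replicate’s usual conventions, not confirmed from the release, so treat this as a sketch:

```python
# Hypothetical sketch of calling a Replicate-hosted Qwen-Image-Edit.
# The model slug and input keys are assumptions for illustration.

def build_edit_input(image_url: str, prompt: str) -> dict:
    """Assemble the input payload for a hosted image-editing model."""
    return {"image": image_url, "prompt": prompt}

payload = build_edit_input(
    "https://example.com/manhattan.jpg",
    "Make this photo look like a toy Lego set",
)

# With the official client installed (`pip install replicate`) and an
# API token configured, the call would look roughly like:
# import replicate
# output = replicate.run("qwen/qwen-image-edit", input=payload)
```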
Appearance editing focuses on precise, local changes. In these cases, most of the image remains unchanged while specific objects are altered. Demonstrations include adding a signboard that generates a reflection in water, removing stray hair strands from a portrait, and changing the color of a single letter in a text image.
One good example of appearance editing with Qwen-Image-Edit comes from AnswerAI co-founder and CEO Thomas Hill, who posted a side-by-side on X: one image showing his wife in her wedding dress beneath an archway, and another with the same archway covered in graffiti:
Wife asked me to make a wedding photo more edgy. This is fun! @replicate https://t.co/BcuvDIUhfA pic.twitter.com/0reVjLVH3E
— Thomas Hill (@TomAnswerAi) August 19, 2025
Combined with Qwen’s established strength in rendering Chinese and English text, the editing-focused system is positioned as a flexible tool for creators who need more than simple generative imagery.
The dual control over semantic scope and appearance fidelity means the same tool can serve very different needs, from creative IP development to production-level photo retouching.
Adding or removing text in images
Another standout capability is bilingual text editing. Qwen-Image-Edit allows users to add, remove, or modify text in both Chinese and English while preserving font, size, and style.
This expands on Qwen-Image’s reputation for strong text rendering, particularly in challenging scenarios like intricate Chinese characters.
In practice, this allows for accurate editing of posters, signs, T-shirts, or calligraphy artworks where small text details matter, as seen in another example from Replicate below.
We updated Qwen-Image-Edit to have better quality and consistency and by popular demand aspect ratios https://t.co/0D0JaJMu68
— John (@johnrachwan) August 19, 2025
One demonstration involved correcting errors in a piece of generated Chinese calligraphy through a step-by-step chained editing process.
Users could highlight incorrect regions, instruct the system to fix them, and then further refine details until the correct characters were rendered. This iterative approach shows how the model can be applied to high-stakes editing tasks where precision is essential.
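The chained, step-by-step workflow just described can be sketched as a simple loop. The `apply_edit` function below is a stub standing in for a real model call — its field names and the region/instruction interface are assumptions for illustration — but the control flow matches the iterative refinement process:

```python
# Minimal sketch of chained, iterative editing. In practice each step
# would be a real Qwen-Image-Edit call; here `apply_edit` just records
# the instruction so the workflow's structure is clear.

def apply_edit(image: dict, region: str, instruction: str) -> dict:
    # Stub: a real implementation would send the image, the highlighted
    # region, and the instruction to the model and return the edited image.
    history = image["edits"] + [(region, instruction)]
    return {"pixels": image["pixels"], "edits": history}

image = {"pixels": "calligraphy.png", "edits": []}

# Chain successive corrections until the characters render correctly.
steps = [
    ("top-left character", "fix the incorrect stroke"),
    ("same character", "refine the stroke thickness"),
]
for region, instruction in steps:
    image = apply_edit(image, region, instruction)

print(len(image["edits"]))  # 2
```

Because each round takes the previous output as its input, errors can be narrowed down region by region rather than regenerating the whole image.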
Applications and use cases
The Qwen team has highlighted a range of potential applications for the model.
By bridging fine-grained editing with broader creative transformations, Qwen-Image-Edit caters to professionals who need control while remaining approachable for casual experimentation.
Benchmarking and performance
According to the Qwen team, evaluations across public benchmarks indicate that Qwen-Image-Edit delivers state-of-the-art performance in image editing.
This follows from the broader Qwen-Image technical evaluations, where the base model achieved leading results in both general image generation and text rendering tasks.
While specific editing benchmark figures were not detailed in the release, Qwen-Image itself ranked highly in independent evaluations such as AI Arena, where human raters compared outputs across models from different providers.
API pricing and availability
Through Alibaba Cloud Model Studio, developers can access Qwen-Image-Edit as an API. Pricing is set at $0.045 per image, with a free quota of 100 images valid for 180 days after activation.
The service is initially available in the Singapore region, with a rate limit of five requests per second and up to two concurrent tasks per account.
To use the API, developers must obtain a Model Studio API key and can call the model via HTTP or through the DashScope SDK in Python or Java.
Images can be submitted as URLs or in Base64 format, with supported resolutions ranging from 512 to 4,096 pixels and file sizes up to 10 MB. Output images are hosted on Alibaba Cloud Object Storage with links valid for 24 hours, requiring users to download and save results promptly.
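Putting those constraints together, a request might be prepared as follows. This is a hedged sketch: the model identifier and payload layout are illustrative assumptions rather than confirmed field names, so consult the Alibaba Cloud Model Studio documentation for the exact schema.

```python
import base64

# Sketch of preparing a Model Studio API request over HTTP. Per the
# release notes, images may be submitted as URLs or Base64-encoded, at
# resolutions from 512 to 4,096 pixels and file sizes up to 10 MB.

def build_request(api_key: str, image_bytes: bytes, prompt: str):
    """Assemble headers and a JSON body with a Base64-encoded image."""
    image_b64 = base64.b64encode(image_bytes).decode("ascii")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": "qwen-image-edit",  # assumed model identifier
        "input": {
            "image": f"data:image/png;base64,{image_b64}",
            "prompt": prompt,
        },
    }
    return headers, body

# The actual POST would target the Model Studio endpoint for your region
# (Singapore at launch), e.g. with the `requests` library:
# headers, body = build_request(api_key, open("photo.png", "rb").read(),
#                               "Remove the stray hair")
# resp = requests.post(ENDPOINT_URL, headers=headers, json=body)
```

Since output links expire after 24 hours, a production integration should download and persist the result as soon as the response arrives.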
What’s next for Qwen?
Qwen positions Image-Edit as a step toward lowering barriers for visual content creation. By making precise, style-consistent editing more accessible, the model could support applications from design studios to casual users refining personal projects.
The system also signals a broader trend in AI development: moving beyond single-purpose generation toward tools that integrate editing, correction, and refinement.
With both semantic flexibility and appearance-level precision, Qwen-Image-Edit reflects this shift, blending the generative strengths of large models with the reliability required for professional editing.
Author: Carl Franzen
Source: VentureBeat
Reviewed By: Editorial Team