Facebook today introduced TextStyleBrush, an AI research project that can copy the style of text in a photo from just a single word. The company claims that TextStyleBrush, which can edit and replace arbitrary text in images, is the first “unsupervised” system of its kind that can recognize both typefaces and handwriting.
AI-generated images have been advancing at a breakneck pace, and they have obvious business applications, like photorealistic translation of languages in augmented reality (AR). (The AR market was anticipated to be worth $18.8 billion by the end of 2020, according to Statista.) But building a system that’s flexible enough to understand the nuances of text and handwriting is difficult, because it means comprehending styles not just for typography and calligraphy but also for transformations like rotations, curved text, deformations, background clutter, and image noise.
TextStyleBrush works similarly to the style brush tools in word processors, but for text aesthetics in images, according to Facebook. Unlike previous approaches, which define specific parameters such as typeface or target style supervision, it takes a more holistic training approach and disentangles the content of a text image from all aspects of its appearance.
Unsupervised learning
The “unsupervised” part of the system refers to unsupervised learning, the process by which the system was subjected to “unknown” data for which no previously defined categories or labels existed. TextStyleBrush had to teach itself to classify data, processing the unlabeled data to learn from its inherent structure.
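To make the idea of learning from unlabeled data concrete, here is a minimal sketch of a classic unsupervised technique, two-group k-means clustering on plain numbers. It is not TextStyleBrush's method, which the article does not detail; the function name `two_means` and the sample "stroke width" values are hypothetical and chosen only for illustration.

```python
# Illustrative only: 1-D k-means with two groups. This is NOT how
# TextStyleBrush works; it just shows grouping data without labels.

def two_means(values, iterations=20):
    """Split unlabeled numbers into two groups by alternating two steps."""
    # Start the two centers at the smallest and largest values.
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        labels = [
            0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            for v in values
        ]
        # Update step: move each center to the mean of its group.
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:  # keep the old center if a group is empty
                centers[k] = sum(members) / len(members)
    return labels, centers


# Hypothetical unlabeled measurements (for example, stroke widths in pixels).
stroke_widths = [1.0, 1.2, 0.9, 1.1, 4.8, 5.1, 5.0, 4.9]
labels, centers = two_means(stroke_widths)
print(labels)   # prints [0, 0, 0, 0, 1, 1, 1, 1]
print(centers)  # two centers, near 1.05 and 4.95
```

No one tells the function which values are "thin" or "thick"; the two groups emerge from the structure of the data alone, which is the sense of "unsupervised" used here.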
As Facebook notes, systems like TextStyleBrush are typically trained on annotated data that teaches them to classify individual pixels as either “foreground” or “background.” But it’s tough to apply this approach to images captured in the real world. Handwriting can be one pixel in width or less, and collecting high-quality training data requires labeling the foregrounds and backgrounds.
By contrast, given a detected “text box” containing a source style, TextStyleBrush renders new content in the style of the source text using a single sample. While it occasionally struggles with text on metallic objects and with characters in different colors, Facebook says that TextStyleBrush proves it’s possible to build systems that can learn to transfer text aesthetics with more flexibility than was previously possible.
“We hope this work will continue to lower barriers to photorealistic translation [and] creative self-expression,” Facebook said in a blog post. “While this technology is research, it can power a variety of useful applications in the future, like translating text in images to different languages, creating personalized messaging and captions, and maybe one day facilitating real-world translation of street signs using AR.”
The capabilities, methods, and results of the work on TextStyleBrush are available on Facebook’s developer portal. The company says it plans to submit the work to a peer-reviewed journal in the future.
Author: Kyle Wiggers
Source: VentureBeat