Austria-based Mostly AI, a startup that generates synthetic data for AI model training and testing, today announced it has raised $25 million in a series B round from Molten Ventures. The company plans to use the investment to accelerate its work on laying the groundwork for responsible and unbiased AI, hire fresh talent, and strengthen its presence across Europe and North America.
For any modern-day enterprise, the biggest challenge in leveraging data for AI/ML is ensuring the privacy of its consumers, the original source of the data, and eliminating bias that stems from historical or social inequities in that data. Organizations often have a hard time dealing with these two problems and end up either facing fines for privacy violations (under regulations such as GDPR) or training a model that is unfair on one or more parameters.
Mostly AI synthetic data generator
To tackle these challenges, data scientists Michael Platzer, Klaudius Kalcher, and Roland Boubela started Mostly AI in 2017. The startup uses AI to create a realistic and representative synthetic dataset, one that retains the information required across the data value chain, from AI model training and advanced analytics to software testing, but contains no original personal data points.
This gives data scientists and engineers as-good-as-real yet fully anonymous data to work with.
The solution works by leveraging a state-of-the-art generative deep neural network with an in-built privacy mechanism. It learns valuable statistical patterns, structures, and variations from the original data and recreates these patterns using a population of fictional characters to produce a synthetic copy that is privacy compliant, de-biased, and just as useful as the original dataset, reflecting behaviors and patterns with up to 99% accuracy.
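To make the idea of learning statistical patterns and then sampling fictional records concrete, here is a minimal Python sketch. It is not Mostly AI's method: it swaps the company's generative deep neural network for a simple multivariate Gaussian fit, it has no privacy mechanism, and the function names and the toy age/income columns are hypothetical, chosen only for illustration.

```python
import numpy as np


def make_toy_real_data(n_rows: int, seed: int = 42) -> np.ndarray:
    """Build a hypothetical 'real' table with two correlated columns: age, income."""
    rng = np.random.default_rng(seed)
    age = rng.normal(40.0, 10.0, size=n_rows)
    income = 1000.0 * age + rng.normal(0.0, 5000.0, size=n_rows)
    return np.column_stack([age, income])


def fit_gaussian(real: np.ndarray):
    """Learn the statistical pattern: column means and the covariance matrix."""
    mean = real.mean(axis=0)
    cov = np.cov(real, rowvar=False)
    return mean, cov


def sample_synthetic(mean, cov, n_rows: int, seed: int = 0) -> np.ndarray:
    """Draw brand-new rows from the fitted distribution (no real row is copied)."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=n_rows)


if __name__ == "__main__":
    real = make_toy_real_data(2000)
    mean, cov = fit_gaussian(real)
    synthetic = sample_synthetic(mean, cov, 5000)

    # Compare how well the age/income relationship is preserved.
    print("real correlation:     ", np.corrcoef(real, rowvar=False)[0, 1])
    print("synthetic correlation:", np.corrcoef(synthetic, rowvar=False)[0, 1])
```

The sketch also shows why synthetic data can be produced in abundance: once the model is fitted, any number of rows can be sampled. A production system would need a far richer model for mixed and non-Gaussian data, plus explicit safeguards against memorizing individual records.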
In addition to ensuring privacy-safe and fair AI/ML projects, Mostly AI’s platform also shortens enterprises’ overall time-to-data. This is because, unlike original datasets, synthetic data can be generated quickly and in abundance. The company claims its technology has already been proven to reduce time-to-data by 90%, saving larger companies more than $10 million annually on data provisioning and internal overhead and boosting available data by 85%.
With the latest funding, which also saw the participation of Earlybird, 42CAP, and Citi Ventures, Mostly AI plans to expand its footprint in the U.S. and Europe and build out its customer base in the banking and insurance sectors. The company says it already counts multiple Fortune 100 banks and insurers as customers.
However, it is not the only player in this space. Tonic.ai, Synthesis AI, Hazy, and Gretel are a few other startups that are working to create fake datasets to help enterprises accelerate their AI projects.
Demand for synthetic data
Given the growing concerns and regulations around data privacy and the surging need for data-driven solutions, synthetic data is expected to be a major driver for enterprises in the near future. According to Gartner, by 2024, 60% of the data used for the development of AI and analytics projects will be synthetically generated. In a separate study, nine out of ten technical decision-makers using vision data said synthetic data is a new and innovative technology and critical to staying ahead of the curve.
“2022 will be the year of synthetic data. Synthetic data helps solve some of the industry’s most vexing issues when it comes to AI. It eliminates concerns about data privacy, it can be freely shaped and formed in order to accelerate AI initiatives, and it enables enterprises to augment and de-bias their data sets,” Tobias Hann, CEO of Mostly AI, said in a statement. “We’re extremely excited about the future of synthetic data, and to partner with Molten Ventures, which shares our vision for fundamentally changing how companies work with data.”
Author: Shubham Sharma