Last year, researchers at Data Science Nigeria noted that engineers looking to train computer-vision algorithms could choose from a wealth of data sets featuring Western clothing, but there were none for African clothing. The team addressed the imbalance by using AI to generate artificial images of African fashion—a whole new data set from scratch.
Such synthetic data sets (computer-generated samples with the same statistical characteristics as the genuine article) are increasingly common in the data-hungry world of machine learning. These fakes can be used to train AIs in areas where real data is scarce or too sensitive to use, as in the case of medical records or personal financial data.
The idea of synthetic data isn’t new: driverless cars have been trained on virtual streets. But in the last year the technology has become widespread, with a raft of startups and universities offering such services. Datagen and Synthesis AI, for example, supply digital human faces on demand. Others provide synthetic data for finance and insurance. And the Synthetic Data Vault, a project launched in 2021 by MIT’s Data to AI Lab, provides open-source tools for creating a wide range of data types.