Elon Musk claims AI has exhausted real-world training data

Elon Musk: AI has consumed humanity's knowledge; synthetic data is the future (Image source: Dall-E 3)
Elon Musk claims AI exhausted the available real-world training data in 2024 and advocates synthetic data generation as the future of AI development. Major tech companies have already embraced this approach, though researchers warn of potential risks like model collapse and bias amplification.

In a recent interview at CES, Elon Musk said that artificial intelligence has essentially used up all the real-world training data available, pointing to synthetic data generation as the primary way forward. This idea aligns with what former OpenAI chief scientist Ilya Sutskever said about hitting "peak data" in AI development.

Musk believes we ran out of human-produced data back in 2024. As the CEO of Tesla and the owner of xAI, he stressed that getting AI to create its own training data is the most practical solution for moving AI ahead. With this method, an AI system generates data, evaluates its own output, and learns as it goes.

Plenty of big tech companies have already hopped on the synthetic data train. Microsoft’s newly open-sourced Phi-4 model, for instance, relies on a combo of synthetic and real-world information, while Google is using a similar strategy for its Gemma models. Anthropic’s Claude 3.5 Sonnet and Meta’s latest Llama series also rely on AI-generated data.

Meanwhile, analysts at Gartner predicted that by 2024, around 60 percent of the data used in AI and analytics projects would be synthetic. One big reason for the shift is cost. AI startup Writer says it spent about $700,000 developing its Palmyra X 004 model, far less than the estimated $4.6 million needed to build a comparable OpenAI model.

But synthetic data isn’t without its issues. Researchers warn about the risk of “model collapse,” where AI can become less inventive and more biased. This problem might crop up if any biases in the original dataset get amplified when the AI starts churning out fresh data on its own.

Source(s)

Fast Technology (in Chinese)

Nathan Ali, 2025-01-13 (Update: 2025-01-13)