Schools brief | Artificial intelligence

AI firms will soon exhaust most of the internet’s data

Can they create more?

Illustration: a mining train going into a mine, full of 0s and 1s. Image: Mike Haddad

Jul 23rd 2024

In 2006 Fei-Fei Li, then at the University of Illinois, now at Stanford University, saw how mining the internet might help to transform AI research. Linguistic research had identified 80,000 “noun synonym sets”, or synsets: groups of synonyms that described the same sort of thing. The billions of images on the internet, Dr Li reckoned, must offer hundreds of examples of each synset. Assemble enough of them and you would have an AI training resource far beyond anything the field had ever seen. “A lot of people are paying attention to models,” she said. “Let’s pay attention to data.” The result was ImageNet.

This article appeared in the Schools brief section of the print edition under the headline “Mining the net”

From the July 27th 2024 edition
