
Davos 2024: Global AI experts warn of depleting internet data resources

After the first boom of generative models and generative artificial intelligence (AI) over the past couple of years, the world is running out of internet data to capture and process, but that is not the end of the road, because visual data has yet to be understood, AI experts said at the World Economic Forum in Davos on January 16.

The experts were speaking on a panel discussion titled ‘The Expanding Universe of Generative Models’. The panel included Yann LeCun, the Turing Award-winning computer scientist often called one of the godfathers of AI; Daphne Koller, founder and CEO of Insitro Inc and a co-founder of Coursera; Andrew Ng, founder of DeepLearning.AI and co-founder of Google Brain and Coursera; Kai-Fu Lee, computer scientist and founder of 01.AI Pte; and Aidan Gomez, co-founder and CEO of Cohere Inc. The panel was moderated by Nicholas Thompson, CEO of The Atlantic.

When Thompson opened by asking whether data has begun to saturate and what that would mean for the next revolution of generative models, LeCun said there was “no question of saturating”.

Indeed, the world is running out of data, having used up essentially all the public data on the internet; it would take a person 150,000 to 200,000 years to read it all, LeCun said.

“Data has been the single biggest enabler of the incredible amount of Gen AI progress we have seen so far. We have just scratched the surface of the data that is going to become available. We have agents in place with augmented reality. There is data hidden in silos; we have to make it more interactive,” Koller added.

According to Andrew Ng, the ceiling will get harder and harder to break in the next phase. “We saw the text revolution happen last year. I think this year we will see the image processing revolution take place. I am seeing a lot of revolution happening in automation, rather than just ‘prompt an LLM, get a response’ and so on,” he said.

He added, “We are used to running LLMs (large language models) in the cloud, but now, with more open source, we will be running a lot of LLMs on our own devices. We have all of these vectors of innovation, and I am optimistic about the opportunity.”

Lee believes the pace of innovation seen in generative AI last year will likely slow down a little, but it will still continue to grow. After adding more compute and data, there is scope for more fine-tuning across different aspects as more and more entrepreneurs get into the game, he said.

He added that one shouldn’t miss out on the incredible commercial value text-based LLMs have.

“They give an incredible pretence of logical reasoning and common sense, and dramatically improve our productivity. They are getting deployed everywhere; so, as I put on my entrepreneur hat, there is so much value to be reaped,” he said.

Gomez concurred, adding, “We have huge bottlenecks today; we already know the limitations of the architecture we have and the methods we are using.”

“That’s going to get easier, and at the same time hardware is getting better. The next generation of GPUs is going to be a big step over the generation we have today; it will unlock a much bigger scale. I am optimistic,” Gomez said.

LLMs not as smart as a toddler

While one might conclude that the data on the internet has saturated and been fully accessed, LeCun cited the example of a four-year-old child who sees and gathers data through vision; no one has really put a number on how much data that child has gathered in its short lifetime.

“It’s about 20 MB per second of data going through the optic nerve, for 16,000 waking hours in the first four years of life. Now, that is 3,600 seconds per hour, and then calculate that,” LeCun said.

“It tells you that a four-year-old child has seen 50 times more information than the biggest LLMs that we have. The amount of knowledge accumulated looks small because it’s in a different form. Ultimately, there is an enormous amount of data the kid has about how the world works,” he added.
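For readers who want to do the calculation LeCun invites, here is a rough back-of-the-envelope sketch. It assumes the 20 MB figure is a per-second rate through the optic nerve, which is what the seconds-per-hour conversion in the quote implies; the implied size of today’s largest LLM training corpora is then simply the visual total divided by the panel’s factor of 50, not a number given on stage.

```python
# Back-of-the-envelope version of LeCun's optic-nerve arithmetic.
# Assumption: the "20 MB" figure is a per-second rate, as the
# seconds-per-hour conversion in the quote implies.

bytes_per_second = 20e6        # ~20 MB/s of visual data (assumed rate)
waking_hours = 16_000          # waking hours in the first four years of life
seconds_per_hour = 3_600

visual_bytes = bytes_per_second * waking_hours * seconds_per_hour
print(f"Visual data seen by a 4-year-old: ~{visual_bytes:.1e} bytes")
# -> ~1.2e+15 bytes, i.e. on the order of a petabyte

# The panel's "50 times more than the biggest LLMs" then implies a text
# training corpus of roughly visual_bytes / 50, i.e. about 2e13 bytes.
print(f"Implied LLM training-text size: ~{visual_bytes / 50:.1e} bytes")
```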

There is no known LLM, and no science or architecture yet in place, to capture this day-to-day sensory information; it will require more scientific and technological breakthroughs, and may take another five, 10 or more years to accomplish, LeCun said.

According to LeCun, the models that will work best to manage and capture video data “will not be generative.”

“They do not generate images, reconstruct, or predict at the pixel level. What they do is predict, but in a space of abstract representations. What’s necessary is to make predictions in abstract space rather than pixel space,” he said. Moneycontrol
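To picture what “predictions in abstract space rather than pixel space” might look like in code, here is a minimal, hypothetical sketch: an encoder maps video frames to latent vectors, and a predictor is trained to match the latent of the next frame, so the training loss never touches pixels. The module names, sizes, and training setup are illustrative assumptions for this article, not the architecture LeCun’s team actually uses.

```python
import torch
import torch.nn as nn

# Hypothetical encoder: maps a video frame to an abstract representation.
encoder = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(64, 128),            # 128-dim latent (illustrative size)
)

# Hypothetical predictor: given the latent of frame t, predict the latent of frame t+1.
predictor = nn.Sequential(
    nn.Linear(128, 256),
    nn.ReLU(),
    nn.Linear(256, 128),
)

frame_t = torch.randn(8, 3, 64, 64)    # batch of current frames (dummy data)
frame_t1 = torch.randn(8, 3, 64, 64)   # batch of next frames (dummy data)

z_t = encoder(frame_t)                 # abstract representation of frame t
with torch.no_grad():                  # target latent; no gradient through the target
    z_t1 = encoder(frame_t1)

# The loss compares predicted and actual latents, never reconstructed pixels.
loss = nn.functional.mse_loss(predictor(z_t), z_t1)
loss.backward()
print(float(loss))
```

A real system would need extra machinery, for instance a separate target encoder and measures to keep the representations from collapsing, but even this toy captures the point of the quote: the prediction error lives in the abstract space, not in pixels.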
