LeCun argued that the technology industry's current emphasis on language models and textual data falls short of producing advanced artificial intelligence systems that truly match human capabilities, a longstanding dream among researchers.
He pointed out that “text is an extremely limited source of information,” estimating that a human would need roughly 20,000 years to read the volume of text used to train contemporary language models. Yet even after training on the equivalent of 20,000 years of reading, LeCun said, such a system still fails to grasp basic logical relationships, for example that if A is the same as B, then B is also the same as A. He stressed that fundamental information about how the world works is missing from this kind of training.
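As a rough sanity check on that figure, here is a back-of-envelope calculation in Python. The corpus size (~2 trillion words) and reading habits (~250 words per minute, 12 hours a day) are illustrative assumptions, not numbers from the article:

```python
# Back-of-envelope: how long would a human take to read an LLM's training corpus?
# All figures below are illustrative assumptions, not reported numbers.
WORDS_IN_CORPUS = 2e12        # ~2 trillion words of training text
WORDS_PER_MINUTE = 250        # a brisk adult reading speed
HOURS_PER_DAY = 12            # reading half of every day, without rest

words_per_year = WORDS_PER_MINUTE * 60 * HOURS_PER_DAY * 365
years_to_read = WORDS_IN_CORPUS / words_per_year
print(f"{years_to_read:,.0f} years")  # ~30,000 years, the same order as LeCun's 20,000
```

Under these assumptions the answer lands in the tens of thousands of years, the same order of magnitude as the figure LeCun cites.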
LeCun, alongside other Meta AI executives, has conducted extensive research on designing transformer models, the architecture behind applications such as ChatGPT, so that they can work seamlessly with diverse data types, including audio, images, and video.
Transformers play a vital role in enabling artificial intelligence systems to perform more complex tasks by uncovering hidden connections across different data types.
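To make that idea concrete, here is a minimal sketch: embeddings from different modalities are concatenated into a single token sequence, and self-attention can then relate any token to any other, regardless of modality. The dimensions, modality mix, and shared projection below are illustrative assumptions, not Meta's actual architecture:

```python
import numpy as np

def self_attention(tokens: np.ndarray) -> np.ndarray:
    """Single-head scaled dot-product self-attention over a token sequence."""
    d = tokens.shape[-1]
    # For brevity, queries, keys, and values share one representation here;
    # a real transformer learns separate Q, K, V projection matrices.
    scores = tokens @ tokens.T / np.sqrt(d)          # similarity of every token pair
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ tokens                          # each token mixes in all others

d_model = 64
rng = np.random.default_rng(0)
# Hypothetical per-modality embeddings (in practice produced by learned encoders).
text_tokens  = rng.normal(size=(10, d_model))   # e.g. 10 word embeddings
image_tokens = rng.normal(size=(16, d_model))   # e.g. 16 image-patch embeddings
audio_tokens = rng.normal(size=(8, d_model))    # e.g. 8 audio-frame embeddings

# Concatenating the modalities into one sequence lets attention link, say,
# a word to an image patch -- the "hidden connections" across data types.
sequence = np.concatenate([text_tokens, image_tokens, audio_tokens], axis=0)
out = self_attention(sequence)
print(out.shape)  # (34, 64): every token now carries cross-modal context
```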
Meta's research projects include a program designed to help people improve their tennis skills using the company's augmented reality glasses, which overlay digital graphics onto the real-world environment.
In Meta's demonstration, a person wearing the augmented reality glasses during a tennis match received visual cues on how to hold and swing the racket properly.
To train its Llama model, Meta used 16,000 Nvidia A100 GPUs. The models needed to run a digital tennis assistant of this kind must integrate three-dimensional visual data, text, and sound.
The frontier of artificial intelligence lies in multimodal systems, but developing them comes at considerable cost. That cost could hand Nvidia a significant advantage, particularly if no competing alternative emerges. Nvidia has already benefited greatly from generative artificial intelligence, with its high-end graphics processing units becoming the standard for training large language models, as Meta's 16,000-GPU Llama training run illustrates.
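A rough back-of-envelope shows why the cost is considerable. It uses the widely cited rule of thumb that training compute is about 6 × parameters × tokens; the parameter count, token count, and per-GPU throughput are illustrative assumptions, not figures from the article:

```python
# Back-of-envelope training cost, using the common FLOPs ~= 6 * N * D rule.
# All figures below are illustrative assumptions, not reported numbers.
N_PARAMS = 70e9           # a hypothetical 70B-parameter model
N_TOKENS = 2e12           # trained on ~2 trillion tokens
GPU_FLOPS = 150e12        # ~150 TFLOP/s sustained per A100 (optimistic, mixed precision)
N_GPUS = 16_000           # the cluster size mentioned in the article

total_flops = 6 * N_PARAMS * N_TOKENS
seconds = total_flops / (GPU_FLOPS * N_GPUS)
print(f"{total_flops:.1e} FLOPs, ~{seconds / 86_400:.0f} days on {N_GPUS:,} GPUs")
```

Even with 16,000 top-end GPUs running flat out, a single training run under these assumptions takes days of cluster time, which is exactly the kind of demand that keeps Nvidia's hardware in short supply.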