A close look reveals that the newest systems, including DeepMind’s much-hyped Gato, are still...
To the average person, it must seem as if the field of artificial intelligence is making immense progress.
According to the press releases, and some of the more gushing media accounts, OpenAI's DALL-E 2 can seemingly create spectacular images from any text; another OpenAI system called GPT-3 can talk about just about anything; and a system called Gato that was released in May by DeepMind, a division of Alphabet, seemingly worked well on every task the company could throw at it.
One of DeepMind's high-level executives even went so far as to brag that in the quest for artificial general intelligence, AI that has the flexibility and resourcefulness of human intelligence, "The Game is Over!" And Elon Musk said recently that he would be surprised if we didn't have artificial general intelligence by 2029.
To be sure, there are indeed some ways in which AI truly is making progress (synthetic images look more and more realistic, and speech recognition can often work in noisy environments), but we are still light-years away from general-purpose, human-level AI that can understand the true meanings of articles and videos, or deal with unexpected obstacles and interruptions.
Take the recently celebrated Gato, an alleged jack of all trades, and how it captioned an image of a pitcher hurling a baseball.
The system returned three different answers: "A baseball player pitching a ball on top of a baseball field," "A man throwing a baseball at a pitcher on a baseball field" and "A baseball player at bat and a catcher in the dirt during a baseball game." The first response is correct, but the other two answers include hallucinations of other players that aren't seen in the image.
The system has no idea what is actually in the picture as opposed to what is typical of roughly similar images.
A newer version of the system, released in May, couldn't tell the difference between an astronaut riding a horse and a horse riding an astronaut.
Gato worked well on all the tasks DeepMind reported, but rarely as well as other contemporary systems.
It's time for artificial intelligence researchers to look up.