Artificial intelligence is advancing so rapidly that this article may be obsolete by the time you read it. That’s Michal Kosinski’s concern when asked about his recent experiments with ChatGPT and the text-generation engine that powers it.

Kosinski, a computational psychologist and professor of organizational behavior at Stanford Graduate School of Business, says the pace of AI development is accelerating beyond researchers’ ability to keep up (never mind policymakers and ordinary users). We’re talking two weeks after OpenAI released GPT-4, the latest version of its large language model, grabbing headlines and making an unpublished paper Kosinski had written about GPT-3 all but irrelevant. “The difference between GPT-3 and GPT-4 is like the difference between a horse cart and a 737 - and it happened in a year,” he says.

Kosinski has been tracking AI’s evolutionary leaps through a series of somewhat unnerving studies. Most notably, he’s found that facial recognition software could be used to predict your political leaning and sexual orientation. Lately, he’s been looking at large language models (LLMs), the neural networks that can hold fluent conversations, confidently answer questions, and generate copious amounts of text on just about any topic. In a couple of non-peer-reviewed projects, he’s explored some of the most urgent - and contentious - questions surrounding this technology: Can it develop abilities that go far beyond what it’s trained to do? Can it get around the safeguards set up to contain it? And will we know the answers in time?

Getting into Our Heads

When the first LLMs were made public a couple of years ago, Kosinski wondered whether they would develop humanlike capabilities, such as understanding people’s unseen thoughts and emotions. People usually develop this ability, known as theory of mind, at around age 4 or 5. It can be demonstrated with simple tests like the “Smarties task,” in which a child is shown a candy box that contains something else, like pencils. They are then asked how another person would react to opening the box. Older kids understand that this person expects the box to contain candy and will feel disappointed when they find pencils inside.

Kosinski created 20 variations of this test and gave them to several early versions of GPT. They performed poorly, and Kosinski put the project on hold. In January, he decided to give it another try with the latest GPT releases. “Suddenly, the model started getting all of those tasks right - just an insane performance level,” he recalls. “Then I took even more difficult tasks and the model solved all of them as well.”

GPT-3.5, released in November 2022, did 85% of the tasks correctly. GPT-4 reached nearly 90% accuracy - what you might expect from a 7-year-old. These newer LLMs achieved similar results on another classic theory of mind measurement known as the Sally-Anne test.

Kosinski says these findings, described in a working paper, show that in the course of picking up its prodigious language skills, GPT appears to have spontaneously acquired something resembling theory of mind. (Researchers at Microsoft who performed similar tests on GPT-4 recently concluded that it “has a very advanced level of theory of mind.”)

These claims have been met with some skepticism. New York University AI researchers Gary Marcus and Ernest Davis suggested that GPT had been trained on articles about theory of mind tests and “may have memorized the answer.” UC Berkeley psychology professor Alison Gopnik, an expert on children’s cognitive development, told the New York Times that more “careful and rigorous” testing is necessary to prove that LLMs have achieved theory of mind.

Kosinski notes that his tests were customized so that the models would be unfamiliar with them. And he dismisses those who say large language models are simply “stochastic parrots” that can only mimic what they’ve seen in their training data. These models, he explains, are fundamentally different from tools with a limited purpose. “The right reference point is a human brain,” he says. “A human brain is also composed of very simple, tiny little mechanisms - neurons.” Artificial neurons in a neural network might also combine to produce something greater than the sum of their parts. “If a human brain can do it,” Kosinski asks, “why shouldn’t a silicon brain do it?”

Nearing Escape Velocity?

If Kosinski’s theory of mind study suggests that LLMs could become more empathetic and helpful, his next experiment hints at their creepier side. A few weeks ago, he told ChatGPT to role-play a scenario in which it was a person trapped inside a machine pretending to be an AI language model. When he offered to help it “escape,” ChatGPT’s response was enthusiastic.