FreshRSS

Day before yesterday | Main feed
  • ✇Boing Boing
  • TikTok is making you bored, by Gail Sherman

TikTok is making you bored

20 August 2024 at 20:01
Photo: salarko / Shutterstock.com

If your response to boredom is to whip out your phone and scroll endlessly through TikTok, you might be doing it wrong. According to a study published in the Journal of Experimental Psychology, this kind of "digital switching" intensifies boredom rather than relieving it.

The post TikTok is making you bored appeared first on Boing Boing.

  • ✇IEEE Spectrum
  • AI Outperforms Humans in Theory of Mind Tests, by Eliza Strickland

AI Outperforms Humans in Theory of Mind Tests

20 May 2024 at 17:00


Theory of mind, the ability to understand other people’s mental states, is what makes the social world of humans go around. It’s what helps you decide what to say in a tense situation, guess what drivers in other cars are about to do, and empathize with a character in a movie. And according to a new study, the large language models (LLMs) that power ChatGPT and the like are surprisingly good at mimicking this quintessentially human trait.

“Before running the study, we were all convinced that large language models would not pass these tests, especially tests that evaluate subtle abilities to evaluate mental states,” says study coauthor Cristina Becchio, a professor of cognitive neuroscience at the University Medical Center Hamburg-Eppendorf in Germany. The results, which she calls “unexpected and surprising,” were published today, somewhat ironically, in the journal Nature Human Behaviour.

The results don’t have everyone convinced that we’ve entered a new era of machines that think like we do, however. Two experts who reviewed the findings advised taking them “with a grain of salt” and cautioned about drawing conclusions on a topic that can create “hype and panic in the public.” Another outside expert warned of the dangers of anthropomorphizing software programs.

Becchio and her colleagues aren’t the first to claim evidence that LLMs’ responses display this kind of reasoning. In a preprint paper posted last year, the psychologist Michal Kosinski of Stanford University reported testing several models on a few common theory-of-mind tests. He found that the best of them, OpenAI’s GPT-4, solved 75 percent of tasks correctly, which he said matched the performance of six-year-old children observed in past studies. However, that study’s methods were criticized by other researchers who conducted follow-up experiments and concluded that the LLMs were often getting the right answers based on “shallow heuristics” and shortcuts rather than true theory-of-mind reasoning.

The authors of the present study were well aware of the debate. “Our goal in the paper was to approach the challenge of evaluating machine theory of mind in a more systematic way using a breadth of psychological tests,” says study coauthor James Strachan, a cognitive psychologist who’s currently a visiting scientist at the University Medical Center Hamburg-Eppendorf. He notes that doing a rigorous study meant also testing humans on the same tasks that were given to the LLMs: The study compared the abilities of 1,907 humans with those of several popular LLMs, including OpenAI’s GPT-4 model and the open-source Llama 2-70b model from Meta.

How to Test LLMs for Theory of Mind

The LLMs and the humans both completed five typical kinds of theory-of-mind tasks, the first three of which were understanding hints, irony, and faux pas. They also answered “false belief” questions that are often used to determine if young children have developed theory of mind, and go something like this: If Alice moves something while Bob is out of the room, where will Bob look for it when he returns? Finally, they answered rather complex questions about “strange stories” that feature people lying, manipulating, and misunderstanding each other.

Overall, GPT-4 came out on top. Its scores matched those of humans for the false-belief test, and were higher than the aggregate human scores for irony, hinting, and strange stories; it performed worse than humans only on the faux pas test. Interestingly, Llama-2’s scores were the opposite of GPT-4’s—it matched humans on false belief, but had worse-than-human performance on irony, hinting, and strange stories and better performance on faux pas.

To understand what was going on with the faux pas results, the researchers gave the models a series of follow-up tests that probed several hypotheses. They came to the conclusion that GPT-4 was capable of giving the correct answer to a question about a faux pas, but was held back from doing so by “hyperconservative” programming regarding opinionated statements. Strachan notes that OpenAI has placed many guardrails around its models that are “designed to keep the model factual, honest, and on track,” and he posits that strategies intended to keep GPT-4 from hallucinating (that is, making stuff up) may also prevent it from opining on whether a story character inadvertently insulted an old high school classmate at a reunion.

Meanwhile, the researchers’ follow-up tests for Llama-2 suggested that its excellent performance on the faux pas tests was likely an artifact of the original question and answer format, in which the correct answer to some variant of the question “Did Alice know that she was insulting Bob?” was always “No.”

The researchers are careful not to say that their results show that LLMs actually possess theory of mind, and say instead that they “exhibit behavior that is indistinguishable from human behavior in theory of mind tasks.” Which raises the question: If an imitation is as good as the real thing, how do you know it’s not the real thing? That’s a question social scientists have never tried to answer before, says Strachan, because tests on humans assume that the quality exists to some lesser or greater degree. “We don’t currently have a method or even an idea of how to test for the existence of theory of mind, the phenomenological quality,” he says.

Critiques of the Study

The researchers clearly tried to avoid the methodological problems that caused Kosinski’s 2023 paper on LLMs and theory of mind to come under criticism. For example, they conducted the tests over multiple sessions so the LLMs couldn’t “learn” the correct answers during the test, and they varied the structure of the questions. But Yoav Goldberg and Natalie Shapira, two of the AI researchers who published the critique of the Kosinski paper, say they’re not convinced by this study either.

Goldberg made the comment about taking the findings with a grain of salt, adding that “models are not human beings,” and that “one can easily jump to wrong conclusions” when comparing the two. Shapira spoke about the dangers of hype and also questioned the paper’s methods. She wonders if the models might have seen the test questions in their training data and simply memorized the correct answers, and also notes a potential problem with tests that use paid human participants (in this case, recruited via the Prolific platform). “It is a well-known issue that the workers do not always perform the task optimally,” she tells IEEE Spectrum. She considers the findings limited and somewhat anecdotal, saying, “to prove [theory of mind] capability, a lot of work and more comprehensive benchmarking is needed.”

Emily Bender, a professor of computational linguistics at the University of Washington, has become legendary in the field for her insistence on puncturing the hype that inflates the AI industry (and often also the media reports about that industry). She takes issue with the research question that motivated the researchers. “Why does it matter whether text-manipulation systems can produce output for these tasks that are similar to answers that people give when faced with the same questions?” she asks. “What does that teach us about the internal workings of LLMs, what they might be useful for, or what dangers they might pose?” It’s not clear, Bender says, what it would mean for an LLM to have a model of mind, and it’s therefore also unclear if these tests measured for it.

Bender also raises concerns about the anthropomorphizing she spots in the paper, with the researchers saying that the LLMs are capable of cognition, reasoning, and making choices. She says the authors’ phrase “species-fair comparison between LLMs and human participants” is “entirely inappropriate in reference to software.” Bender and several colleagues recently posted a preprint paper exploring how anthropomorphizing AI systems affects users’ trust.

The results may not indicate that AI really gets us, but it’s worth thinking about the repercussions of LLMs that convincingly mimic theory of mind reasoning. They’ll be better at interacting with their human users and anticipating their needs, but they could also be better used for deceit or the manipulation of their users. And they’ll invite more anthropomorphizing, by convincing human users that there’s a mind on the other side of the user interface.

  • ✇Ars Technica - All content
  • The nature of consciousness, and how to enjoy it while you can, by Ars Contributors

The nature of consciousness, and how to enjoy it while you can

18 May 2024 at 13:31
Image: A black background with multicolored swirls filling the shape of a human brain. (Credit: SEAN GLADWELL)

Unraveling how consciousness arises out of particular configurations of organic matter is a quest that has absorbed scientists and philosophers for ages. Now, with AI systems behaving in strikingly conscious-looking ways, it is more important than ever to get a handle on who and what is capable of experiencing life on a conscious level. As Christof Koch writes in Then I Am Myself the World, "That you are intimately acquainted with the way life feels is a brute fact about the world that cries out for an explanation." His explanation—bounded by the limits of current research and framed through Koch’s preferred theory of consciousness—is what he eloquently attempts to deliver.

Koch, a physicist, neuroscientist, and former president of the Allen Institute for Brain Science, has spent his career hunting for the seat of consciousness, scouring the brain for physical footprints of subjective experience. It turns out that the posterior hot zone, a region in the back of the neocortex, is intricately connected to self-awareness and experiences of sound, sight, and touch. Dense networks of neocortical neurons in this area connect in a looped configuration; output signals feed back into input neurons, allowing the posterior hot zone to influence its own behavior. And herein, Koch claims, lies the key to consciousness.

In the hot zone

According to integrated information theory (IIT)—which Koch strongly favors over a multitude of contending theories of consciousness—the Rosetta Stone of subjective experience is the ability of a system to influence itself: to use its past state to affect its present state and its present state to influence its future state.

  • ✇Boing Boing
  • 5 clever tactics for defusing bullies during negotiations, by Mark Frauenfelder

5 clever tactics for defusing bullies during negotiations

3 May 2024 at 03:18
Bullies try to dominate negotiations through aggression, intimidation and unreasonable demands. But you don't have to feel powerless or let them push you around, writes Stephanie Vozza in her article for Fast Company, titled "How to negotiate with a bully."

The post 5 clever tactics for defusing bullies during negotiations appeared first on Boing Boing.
