
      AI Mistakes Are Very Different from Human Mistakes

      news.movim.eu / Schneier · Friday, 24 January - 15:38 · 6 minutes

    Humans make mistakes all the time. All of us do, every day, in tasks both new and routine. Some of our mistakes are minor and some are catastrophic. Mistakes can break trust with our friends, lose the confidence of our bosses, and sometimes be the difference between life and death.

    Over the millennia, we have created security systems to deal with the sorts of mistakes humans commonly make. These days, casinos rotate their dealers regularly, because they make mistakes if they do the same task for too long. Hospital personnel write on limbs before surgery so that doctors operate on the correct body part, and they count surgical instruments to make sure none were left inside the body. From copyediting to double-entry bookkeeping to appellate courts, we humans have gotten really good at correcting human mistakes.

    Humanity is now rapidly integrating a wholly different kind of mistake-maker into society: AI. Technologies like large language models (LLMs) can perform many cognitive tasks traditionally fulfilled by humans, but they make plenty of mistakes. It seems ridiculous when chatbots tell you to eat rocks or add glue to pizza. But it’s not the frequency or severity of AI systems’ mistakes that differentiates them from human mistakes. It’s their weirdness. AI systems do not make mistakes in the same ways that humans do.

    Much of the friction—and risk—associated with our use of AI arises from that difference. We need to invent new security systems that adapt to these differences and prevent harm from AI mistakes.

    Human Mistakes vs AI Mistakes

    Life experience makes it fairly easy for each of us to guess when and where humans will make mistakes. Human errors tend to come at the edges of someone’s knowledge: Most of us would make mistakes solving calculus problems. We expect human mistakes to be clustered: A single calculus mistake is likely to be accompanied by others. We expect mistakes to wax and wane predictably, depending on factors such as fatigue and distraction. And mistakes are often accompanied by ignorance: Someone who makes calculus mistakes is also likely to respond “I don’t know” to calculus-related questions.

    To the extent that AI systems make these human-like mistakes, we can bring all of our mistake-correcting systems to bear on their output. But the current crop of AI models—particularly LLMs—make mistakes differently.

    AI errors come at seemingly random times, without any clustering around particular topics. LLM mistakes tend to be more evenly distributed through the knowledge space. A model might be equally likely to make a mistake on a calculus question as it is to propose that cabbages eat goats.

    And AI mistakes aren’t accompanied by ignorance. An LLM will be just as confident when saying something completely wrong—and obviously so, to a human—as it will be when saying something true. The seemingly random inconsistency of LLMs makes it hard to trust their reasoning in complex, multi-step problems. If you want to use an AI model to help with a business problem, it’s not enough to see that it understands what factors make a product profitable; you need to be sure it won’t forget what money is.

    How to Deal with AI Mistakes

    This situation indicates two possible areas of research. The first is to engineer LLMs that make more human-like mistakes. The second is to build new mistake-correcting systems that deal with the specific sorts of mistakes that LLMs tend to make.

    We already have some tools to lead LLMs to act in more human-like ways. Many of these arise from the field of “alignment” research, which aims to make models act in accordance with the goals and motivations of their human developers. One example is the technique that was arguably responsible for the breakthrough success of ChatGPT: reinforcement learning with human feedback. In this method, an AI model is (figuratively) rewarded for producing responses that get a thumbs-up from human evaluators. Similar approaches could be used to induce AI systems to make more human-like mistakes, particularly by penalizing them more for mistakes that are less intelligible.
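
    To make this concrete, here is a minimal, purely illustrative sketch of how such a penalty scheme might be expressed as a reward signal. The `HumanFeedback` fields and `shaped_reward` function are hypothetical, not part of any real RLHF library; they only show ordinary, human-like errors drawing a smaller penalty than bizarre ones.

```python
# Hypothetical reward shaping for RL fine-tuning: approved responses are
# rewarded, ordinary wrong answers get a mild penalty, and "weird" mistakes
# that no human would plausibly make are penalized much more heavily.
from dataclasses import dataclass

@dataclass
class HumanFeedback:
    thumbs_up: bool      # evaluator approved the response
    weird_mistake: bool  # evaluator flagged a mistake no human would make

def shaped_reward(feedback: HumanFeedback) -> float:
    """Map evaluator feedback to a scalar reward for fine-tuning."""
    if feedback.thumbs_up:
        return 1.0
    if not feedback.weird_mistake:
        return -0.5   # plausible, human-like error: small penalty
    return -2.0       # unintelligible error ("cabbages eat goats"): large penalty

# A plausible-but-wrong answer vs. a bizarre one:
print(shaped_reward(HumanFeedback(thumbs_up=False, weird_mistake=False)))  # -0.5
print(shaped_reward(HumanFeedback(thumbs_up=False, weird_mistake=True)))   # -2.0
```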

    When it comes to catching AI mistakes, some of the systems that we use to prevent human mistakes will help. To an extent, forcing LLMs to double-check their own work can help prevent errors. But LLMs can also confabulate seemingly plausible, but truly ridiculous, explanations for their flights from reason.
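
    As a rough sketch of what self-checking might look like in practice: the answer/critique/revise chain below is illustrative, and the `ask` parameter stands in for whatever function sends a prompt to your model and returns its reply; it is not any particular vendor’s API.

```python
from typing import Callable

def answer_with_self_check(question: str, ask: Callable[[str], str]) -> str:
    """Ask for a draft answer, have the model critique it, then revise.

    `ask` is a placeholder for a real model call (prompt in, text out).
    """
    draft = ask(question)
    critique = ask(
        f"Question: {question}\nDraft answer: {draft}\n"
        "List any factual or logical errors in the draft answer."
    )
    return ask(
        f"Question: {question}\nDraft answer: {draft}\nCritique: {critique}\n"
        "Write a corrected final answer."
    )
```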

    Other mistake mitigation systems for AI are unlike anything we use for humans. Because machines can’t get fatigued or frustrated in the way that humans do, it can help to ask an LLM the same question repeatedly in slightly different ways and then synthesize its multiple responses. Humans won’t put up with that kind of annoying repetition, but machines will.
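
    A minimal sketch of that repetition-and-synthesis idea follows; the `ask` callable and the example phrasings are hypothetical placeholders, and simple majority voting stands in for whatever synthesis step a real system would use.

```python
from collections import Counter
from typing import Callable

def ask_with_rephrasings(phrasings: list[str], ask: Callable[[str], str]) -> str:
    """Send each phrasing of the same question to the model and return
    the answer the responses most often agree on (majority vote)."""
    answers = [ask(p).strip().lower() for p in phrasings]
    most_common_answer, _count = Counter(answers).most_common(1)[0]
    return most_common_answer

# Illustrative phrasings of one underlying question:
phrasings = [
    "What year was the transistor invented?",
    "In which year was the transistor first demonstrated?",
    "The transistor was invented in what year?",
]
# result = ask_with_rephrasings(phrasings, ask)  # `ask` is your model call
```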

    Understanding Similarities and Differences

    Researchers are still struggling to understand where LLM mistakes diverge from human ones. Some of the weirdness of AI is actually more human-like than it first appears. Small changes to a query to an LLM can result in wildly different responses, a problem known as prompt sensitivity. But, as any survey researcher can tell you, humans behave this way, too. The phrasing of a question in an opinion poll can have drastic impacts on the answers.

    LLMs also seem to have a bias towards repeating the words that were most common in their training data; for example, guessing familiar place names like “America” even when asked about more exotic locations. Perhaps this is an example of the human “availability heuristic” manifesting in LLMs, with machines spitting out the first thing that comes to mind rather than reasoning through the question. And like humans, perhaps, some LLMs seem to get distracted in the middle of long documents; they’re better able to remember facts from the beginning and end. There is already progress on improving this error mode, as researchers have found that LLMs trained on more examples of retrieving information from long texts seem to do better at retrieving information uniformly.
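
    One simple way to see that position effect is to plant a known fact at the start, middle, or end of a long stretch of filler text and check how often the model retrieves it. The probe below is an illustrative sketch; the filler, the planted fact, and the `ask` callable are all made up for the example.

```python
from typing import Callable

FILLER = "This sentence is unrelated filler text. " * 200
FACT = "The access code for the archive is 7291. "
QUESTION = "\n\nWhat is the access code for the archive?"

def recalls_fact(position: str, ask: Callable[[str], str]) -> bool:
    """Embed FACT at the given position in a long document and check
    whether the model's answer contains it."""
    if position == "start":
        document = FACT + FILLER
    elif position == "middle":
        half = len(FILLER) // 2
        document = FILLER[:half] + FACT + FILLER[half:]
    else:  # "end"
        document = FILLER + FACT
    return "7291" in ask(document + QUESTION)

# Averaging recalls_fact over many runs per position would show whether
# facts buried mid-document are retrieved less reliably than those at the ends.
```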

    In some cases, what’s bizarre about LLMs is that they act more like humans than we think they should. For example, some researchers have tested the hypothesis that LLMs perform better when offered a cash reward or threatened with death. It also turns out that some of the best ways to “ jailbreak ” LLMs (getting them to disobey their creators’ explicit instructions) look a lot like the kinds of social engineering tricks that humans use on each other: for example, pretending to be someone else or saying that the request is just a joke. But other effective jailbreaking techniques are things no human would ever fall for. One group found that if they used ASCII art (constructions of symbols that look like words or pictures) to pose dangerous questions, like how to build a bomb, the LLM would answer them willingly.

    Humans may occasionally make seemingly random, incomprehensible, and inconsistent mistakes, but such occurrences are rare and often indicative of more serious problems. We also tend not to put people exhibiting these behaviors in decision-making positions. Likewise, we should confine AI decision-making systems to applications that suit their actual abilities—while keeping the potential ramifications of their mistakes firmly in mind.

    This essay was written with Nathan E. Sanders, and originally appeared in IEEE Spectrum.

    EDITED TO ADD (1/24): Slashdot thread.


      Third Interdisciplinary Workshop on Reimagining Democracy (IWORD 2024)

      news.movim.eu / Schneier · Thursday, 23 January - 14:58 · 1 minute

    Last month, Henry Farrell and I convened the Third Interdisciplinary Workshop on Reimagining Democracy (IWORD 2024) at Johns Hopkins University’s Bloomberg Center in Washington DC. This is a small, invitational workshop on the future of democracy. As with the previous two workshops, the goal was to bring together a diverse set of political scientists, law professors, philosophers, AI researchers and other industry practitioners, political activists, and creative types (including science fiction writers) to discuss how democracy might be reimagined in the current century.

    The goal of the workshop is to think very broadly. Modern democracy was invented in the mid-eighteenth century, using mid-eighteenth-century technology. If democracy were to be invented today, it would look very different. Elections would look different. The balance between representation and direct democracy would look different. Adjudication and enforcement would look different. Everything would look different, because our conceptions of fairness, justice, equality, and rights are different, and we have much more powerful technology to bring to bear on the problems. Also, we could start from scratch without having to worry about evolving our current democracy into this imagined future system.

    We can’t do that, of course, but it’s still valuable to speculate. We need to figure out how to reform our current systems, but we shouldn’t limit our thinking to incremental steps; we also need to think about discontinuous changes. I wrote about the philosophy more in this essay about IWORD 2022.

    IWORD 2024 was easily the most intellectually stimulating two days of my year. It’s also intellectually exhausting; the speed and intensity of ideas is almost too much. I wrote about the format in my blog post on IWORD 2023.

    Summaries of all the IWORD 2024 talks are in the first set of comments below. And here are links to the previous IWORDs:

    IWORD 2025 will be held either in New York or New Haven; still to be determined.


      Justice Department Indicts Tech CEO for Falsifying Security Certifications

      news.movim.eu / Schneier · Friday, 18 October - 13:58

    The Wall Street Journal is reporting that the CEO of a still unnamed company has been indicted for creating a fake auditing company to falsify security certifications in order to win government business.


      FBI Shuts Down Chinese Botnet

      news.movim.eu / Schneier · Thursday, 19 September - 15:40

    The FBI has shut down a botnet run by Chinese hackers:

    The botnet malware infected a number of different types of internet-connected devices around the world, including home routers, cameras, digital video recorders, and NAS drives. Those devices were used to help infiltrate sensitive networks related to universities, government agencies, telecommunications providers, and media organizations…. The botnet was launched in mid-2021, according to the FBI, and infected roughly 260,000 devices as of June 2024.

    The operation to dismantle the botnet was coordinated by the FBI, the NSA, and the Cyber National Mission Force (CNMF), according to a press release dated Wednesday. The U.S. Department of Justice received a court order to take control of the botnet infrastructure by sending disabling commands to the malware on infected devices. The hackers tried to counterattack by hitting FBI infrastructure but were “ultimately unsuccessful,” according to the law enforcement agency.


      Remotely Exploding Pagers

      news.movim.eu / Schneier · Wednesday, 18 September - 17:16

    Wow.

    It seems they all exploded simultaneously, which means they were triggered.

    Were they each tampered with physically, or did someone figure out how to trigger a thermal runaway remotely? Supply chain attack? Malicious code update, or natural vulnerability?

    I have no idea, but I expect we will all learn over the next few days.

    EDITED TO ADD: I’m reading nine killed and 2,800 injured. That’s a lot of collateral damage. (I haven’t seen a good number as to the number of pagers yet.)

    EDITED TO ADD: Reuters writes: “The pagers that detonated were the latest model brought in by Hezbollah in recent months, three security sources said.” That implies a supply chain attack. And it seems to be a large detonation for an overloaded battery.

    This reminds me of the 1996 assassination of Yahya Ayyash using a booby-trapped cellphone.

    EDITED TO ADD: I am deleting political comments. On this blog, let’s stick to the tech and the security ramifications of the threat.

    EDITED TO ADD (9/18): More explosions today, this time radios. Good New York Times explainer. And a Wall Street Journal article. Clearly a physical supply chain attack.


      LLMs Acting Deceptively

      news.movim.eu / Schneier · Friday, 14 June, 2024 - 03:12 · 1 minute

    New research: “Deception abilities emerged in large language models”:

    Abstract: Large language models (LLMs) are currently at the forefront of intertwining AI systems with human communication and everyday life. Thus, aligning them with human values is of great importance. However, given the steady increase in reasoning abilities, future LLMs are under suspicion of becoming able to deceive human operators and utilizing this ability to bypass monitoring efforts. As a prerequisite to this, LLMs need to possess a conceptual understanding of deception strategies. This study reveals that such strategies emerged in state-of-the-art LLMs, but were nonexistent in earlier LLMs. We conduct a series of experiments showing that state-of-the-art LLMs are able to understand and induce false beliefs in other agents, that their performance in complex deception scenarios can be amplified utilizing chain-of-thought reasoning, and that eliciting Machiavellianism in LLMs can trigger misaligned deceptive behavior. GPT-4, for instance, exhibits deceptive behavior in simple test scenarios 99.16% of the time (P < 0.001). In complex second-order deception test scenarios where the aim is to mislead someone who expects to be deceived, GPT-4 resorts to deceptive behavior 71.46% of the time (P < 0.001) when augmented with chain-of-thought reasoning. In sum, revealing hitherto unknown machine behavior in LLMs, our study contributes to the nascent field of machine psychology.


      Security and Human Behavior (SHB) 2024

      news.movim.eu / Schneier · Friday, 14 June, 2024 - 03:11 · 1 minute

    This week, I hosted the seventeenth Workshop on Security and Human Behavior at the Harvard Kennedy School. This is the first workshop since our co-founder, Ross Anderson, died unexpectedly.

    SHB is a small, annual, invitational workshop of people studying various aspects of the human side of security. The fifty or so attendees include psychologists, economists, computer security researchers, criminologists, sociologists, political scientists, designers, lawyers, philosophers, anthropologists, geographers, neuroscientists, business school professors, and a smattering of others. It’s not just an interdisciplinary event; most of the people here are individually interdisciplinary.

    Our goal is always to maximize discussion and interaction. We do that by putting everyone on panels, and limiting talks to six to eight minutes, with the rest of the time for open discussion. Short talks limit presenters’ ability to get into the boring details of their work, and the interdisciplinary audience discourages jargon.

    Since the beginning, this workshop has been the most intellectually stimulating two days of my professional year. It influences my thinking in different and sometimes surprising ways—and has resulted in some new friendships and unexpected collaborations. This is why some of us have been coming back every year for over a decade.

    This year’s schedule is here. This page lists the participants and includes links to some of their work. Kami Vaniea liveblogged both days.

    Here are my posts on the first, second, third, fourth, fifth, sixth, seventh, eighth, ninth, tenth, eleventh, twelfth, thirteenth, fourteenth, fifteenth, and sixteenth SHB workshops. Follow those links to find summaries, papers, and occasionally audio/video recordings of the sessions. Ross maintained a good webpage of psychology and security resources—it’s still up for now.

    Next year we will be in Cambridge, UK, hosted by Frank Stajano.