IEEE Spectrum
A New Type of Neural Network Is More InterpretableMatthew Hutson
Artificial neural networks—algorithms inspired by biological brains—are at the center of modern artificial intelligence, behind both chatbots and image generators. But with their many neurons, they can be black boxes, their inner workings uninterpretable to users. Researchers have now created a fundamentally new way to make neural networks that in some ways surpasses traditional systems. These new networks are more interpretable and also more accurate, proponents say, even when they’re smaller.
5. Srpen 2024 v 17:00

A New Type of Neural Network Is More Interpretable

Od: Matthew Hutson

5. Srpen 2024 v 17:00

Artificial neural networks—algorithms inspired by biological brains—are at the center of modern artificial intelligence, behind both chatbots and image generators. But with their many neurons, they can be black boxes, their inner workings uninterpretable to users.

Researchers have now created a fundamentally new way to make neural networks that in some ways surpasses traditional systems. These new networks are more interpretable and also more accurate, proponents say, even when they’re smaller. Their developers say the way they learn to represent physics data concisely could help scientists uncover new laws of nature.

“It’s great to see that there is a new architecture on the table.” —Brice Ménard, Johns Hopkins University

For the past decade or more, engineers have mostly tweaked neural-network designs through trial and error, says Brice Ménard, a physicist at Johns Hopkins University who studies how neural networks operate but was not involved in the new work, which was posted on arXiv in April. “It’s great to see that there is a new architecture on the table,” he says, especially one designed from first principles.

One way to think of neural networks is by analogy with neurons, or nodes, and synapses, or connections between those nodes. In traditional neural networks, called multi-layer perceptrons (MLPs), each synapse learns a weight—a number that determines how strong the connection is between those two neurons. The neurons are arranged in layers, such that a neuron from one layer takes input signals from the neurons in the previous layer, weighted by the strength of their synaptic connection. Each neuron then applies a simple function to the sum total of its inputs, called an activation function.

black text on a white background with red and blue lines connecting on the left and black lines connecting on the right In traditional neural networks, sometimes called multi-layer perceptrons [left], each synapse learns a number called a weight, and each neuron applies a simple function to the sum of its inputs. In the new Kolmogorov-Arnold architecture [right], each synapse learns a function, and the neurons sum the outputs of those functions.The NSF Institute for Artificial Intelligence and Fundamental Interactions

In the new architecture, the synapses play a more complex role. Instead of simply learning how strong the connection between two neurons is, they learn the full nature of that connection—the function that maps input to output. Unlike the activation function used by neurons in the traditional architecture, this function could be more complex—in fact a “spline” or combination of several functions—and is different in each instance. Neurons, on the other hand, become simpler—they just sum the outputs of all their preceding synapses. The new networks are called Kolmogorov-Arnold Networks (KANs), after two mathematicians who studied how functions could be combined. The idea is that KANs would provide greater flexibility when learning to represent data, while using fewer learned parameters.

“It’s like an alien life that looks at things from a different perspective but is also kind of understandable to humans.” —Ziming Liu, Massachusetts Institute of Technology

The researchers tested their KANs on relatively simple scientific tasks. In some experiments, they took simple physical laws, such as the velocity with which two relativistic-speed objects pass each other. They used these equations to generate input-output data points, then, for each physics function, trained a network on some of the data and tested it on the rest. They found that increasing the size of KANs improves their performance at a faster rate than increasing the size of MLPs did. When solving partial differential equations, a KAN was 100 times as accurate as an MLP that had 100 times as many parameters.

In another experiment, they trained networks to predict one attribute of topological knots, called their signature, based on other attributes of the knots. An MLP achieved 78 percent test accuracy using about 300,000 parameters, while a KAN achieved 81.6 percent test accuracy using only about 200 parameters.

What’s more, the researchers could visually map out the KANs and look at the shapes of the activation functions, as well as the importance of each connection. Either manually or automatically they could prune weak connections and replace some activation functions with simpler ones, like sine or exponential functions. Then they could summarize the entire KAN in an intuitive one-line function (including all the component activation functions), in some cases perfectly reconstructing the physics function that created the dataset.

“In the future, we hope that it can be a useful tool for everyday scientific research,” says Ziming Liu, a computer scientist at the Massachusetts Institute of Technology and the paper’s first author. “Given a dataset we don’t know how to interpret, we just throw it to a KAN, and it can generate some hypothesis for you. You just stare at the brain [the KAN diagram] and you can even perform surgery on that if you want.” You might get a tidy function. “It’s like an alien life that looks at things from a different perspective but is also kind of understandable to humans.”

Dozens of papers have already cited the KAN preprint. “It seemed very exciting the moment that I saw it,” says Alexander Bodner, an undergraduate student of computer science at the University of San Andrés, in Argentina. Within a week, he and three classmates had combined KANs with convolutional neural networks, or CNNs, a popular architecture for processing images. They tested their Convolutional KANs on their ability to categorize handwritten digits or pieces of clothing. The best one approximately matched the performance of a traditional CNN (99 percent accuracy for both networks on digits, 90 percent for both on clothing) but using about 60 percent fewer parameters. The datasets were simple, but Bodner says other teams with more computing power have begun scaling up the networks. Other people are combining KANs with transformers, an architecture popular in large language models.

One downside of KANs is that they take longer per parameter to train—in part because they can’t take advantage of GPUs. But they need fewer parameters. Liu notes that even if KANs don’t replace giant CNNs and transformers for processing images and language, training time won’t be an issue at the smaller scale of many physics problems. He’s looking at ways for experts to insert their prior knowledge into KANs—by manually choosing activation functions, say—and to easily extract knowledge from them using a simple interface. Someday, he says, KANs could help physicists discover high-temperature superconductors or ways to control nuclear fusion.

Ars Technica - All content
Elon Musk sues OpenAI, Sam Altman for making a “fool” out of himAshley Belanger
Enlarge / Elon Musk and Sam Altman share the stage in 2015, the same year that Musk alleged that Altman's "deception" began. (credit: Michael Kovac / Contributor | Getty Images North America) After withdrawing his lawsuit in June for unknown reasons, Elon Musk has revived a complaint accusing OpenAI and its CEO Sam Altman of fraudulently inducing Musk to contribute $44 million in seed funding by promising that OpenAI would always open-source its technology and prioritize serv
5. Srpen 2024 v 19:49

Elon Musk sues OpenAI, Sam Altman for making a “fool” out of him

Ars Technica - All content

Od: Ashley Belanger

5. Srpen 2024 v 19:49

Elon Musk and Sam Altman share the stage in 2015, the same year that Musk alleged that Altman's "deception" began.

After withdrawing his lawsuit in June for unknown reasons, Elon Musk has revived a complaint accusing OpenAI and its CEO Sam Altman of fraudulently inducing Musk to contribute $44 million in seed funding by promising that OpenAI would always open-source its technology and prioritize serving the public good over profits as a permanent nonprofit.

Instead, Musk alleged that Altman and his co-conspirators—"preying on Musk’s humanitarian concern about the existential dangers posed by artificial intelligence"—always intended to "betray" these promises in pursuit of personal gains.

As OpenAI's technology advanced toward artificial general intelligence (AGI) and strove to surpass human capabilities, "Altman set the bait and hooked Musk with sham altruism then flipped the script as the non-profit’s technology approached AGI and profits neared, mobilizing Defendants to turn OpenAI, Inc. into their personal piggy bank and OpenAI into a moneymaking bonanza, worth billions," Musk's complaint said.

Read 29 remaining paragraphs | Comments

Ars Technica - All content
Sam Altman accused of being shady about OpenAI’s safety effortsAshley Belanger
Enlarge / Sam Altman, chief executive officer of OpenAI, during an interview at Bloomberg House on the opening day of the World Economic Forum (WEF) in Davos, Switzerland, on Tuesday, Jan. 16, 2024. (credit: Bloomberg / Contributor | Bloomberg) OpenAI is facing increasing pressure to prove it's not hiding AI risks after whistleblowers alleged to the US Securities and Exchange Commission (SEC) that the AI company's non-disclosure agreements had illegally silenced employees fro
2. Srpen 2024 v 20:08

Sam Altman accused of being shady about OpenAI’s safety efforts

Ars Technica - All content

Od: Ashley Belanger

2. Srpen 2024 v 20:08

Sam Altman, chief executive officer of OpenAI, during an interview at Bloomberg House on the opening day of the World Economic Forum (WEF) in Davos, Switzerland, on Tuesday, Jan. 16, 2024.

OpenAI is facing increasing pressure to prove it's not hiding AI risks after whistleblowers alleged to the US Securities and Exchange Commission (SEC) that the AI company's non-disclosure agreements had illegally silenced employees from disclosing major safety concerns to lawmakers.

In a letter to OpenAI yesterday, Senator Chuck Grassley (R-Iowa) demanded evidence that OpenAI is no longer requiring agreements that could be "stifling" its "employees from making protected disclosures to government regulators."

Specifically, Grassley asked OpenAI to produce current employment, severance, non-disparagement, and non-disclosure agreements to reassure Congress that contracts don't discourage disclosures. That's critical, Grassley said, so that it will be possible to rely on whistleblowers exposing emerging threats to help shape effective AI policies safeguarding against existential AI risks as technologies advance.

Read 27 remaining paragraphs | Comments

IEEE Spectrum
Announcing a Benchmark to Improve AI SafetyMLCommons AI Safety Working Group
One of the management guru Peter Drucker’s most over-quoted turns of phrase is “what gets measured gets improved.” But it’s over-quoted for a reason: It’s true. Nowhere is it truer than in technology over the past 50 years. Moore’s law—which predicts that the number of transistors (and hence compute capacity) in a chip would double every 24 months—has become a self-fulfilling prophecy and north star for an entire ecosystem. Because engineers carefully measured each generation of manufacturing te
16. Duben 2024 v 18:01

Announcing a Benchmark to Improve AI Safety

IEEE Spectrum

Od: MLCommons AI Safety Working Group

16. Duben 2024 v 18:01

One of the management guru Peter Drucker’s most over-quoted turns of phrase is “what gets measured gets improved.” But it’s over-quoted for a reason: It’s true.

Nowhere is it truer than in technology over the past 50 years. Moore’s law—which predicts that the number of transistors (and hence compute capacity) in a chip would double every 24 months—has become a self-fulfilling prophecy and north star for an entire ecosystem. Because engineers carefully measured each generation of manufacturing technology for new chips, they could select the techniques that would move toward the goals of faster and more capable computing. And it worked: Computing power, and more impressively computing power per watt or per dollar, has grown exponentially in the past five decades. The latest smartphones are more powerful than the fastest supercomputers from the year 2000.

Measurement of performance, though, is not limited to chips. All the parts of our computing systems today are benchmarked—that is, compared to similar components in a controlled way, with quantitative score assessments. These benchmarks help drive innovation.

And we would know.

As leaders in the field of AI, from both industry and academia, we build and deliver the most widely used performance benchmarks for AI systems in the world. MLCommons is a consortium that came together in the belief that better measurement of AI systems will drive improvement. Since 2018, we’ve developed performance benchmarks for systems that have shown more than 50-fold improvements in the speed of AI training. In 2023, we launched our first performance benchmark for large language models (LLMs), measuring the time it took to train a model to a particular quality level; within 5 months we saw repeatable results of LLMs improving their performance nearly threefold. Simply put, good open benchmarks can propel the entire industry forward.

We need benchmarks to drive progress in AI safety

Even as the performance of AI systems has raced ahead, we’ve seen mounting concern about AI safety. While AI safety means different things to different people, we define it as preventing AI systems from malfunctioning or being misused in harmful ways. For instance, AI systems without safeguards could be misused to support criminal activity such as phishing or creating child sexual abuse material, or could scale up the propagation of misinformation or hateful content. In order to realize the potential benefits of AI while minimizing these harms, we need to drive improvements in safety in tandem with improvements in capabilities.

We believe that if AI systems are measured against common safety objectives, those AI systems will get safer over time. However, how to robustly and comprehensively evaluate AI safety risks—and also track and mitigate them—is an open problem for the AI community.

Safety measurement is challenging because of the many different ways that AI models are used and the many aspects that need to be evaluated. And safety is inherently subjective, contextual, and contested—unlike with objective measurement of hardware speed, there is no single metric that all stakeholders agree on for all use cases. Often the test and metrics that are needed depend on the use case. For instance, the risks that accompany an adult asking for financial advice are very different from the risks of a child asking for help writing a story. Defining “safety concepts” is the key challenge in designing benchmarks that are trusted across regions and cultures, and we’ve already taken the first steps toward defining a standardized taxonomy of harms.

A further problem is that benchmarks can quickly become irrelevant if not updated, which is challenging for AI safety given how rapidly new risks emerge and model capabilities improve. Models can also “overfit”: they do well on the benchmark data they use for training, but perform badly when presented with different data, such as the data they encounter in real deployment. Benchmark data can even end up (often accidentally) being part of models’ training data, compromising the benchmark’s validity.

Our first AI safety benchmark: the details

To help solve these problems, we set out to create a set of benchmarks for AI safety. Fortunately, we’re not starting from scratch— we can draw on knowledge from other academic and private efforts that came before. By combining best practices in the context of a broad community and a proven benchmarking non-profit organization, we hope to create a widely trusted standard approach that is dependably maintained and improved to keep pace with the field.

Our first AI safety benchmark focuses on large language models. We released a v0.5 proof-of-concept (POC) today, 16 April, 2024. This POC validates the approach we are taking towards building the v1.0 AI Safety benchmark suite, which will launch later this year.

What does the benchmark cover? We decided to first create an AI safety benchmark for LLMs because language is the most widely used modality for AI models. Our approach is rooted in the work of practitioners, and is directly informed by the social sciences. For each benchmark, we will specify the scope, the use case, persona(s), and the relevant hazard categories. To begin with, we are using a generic use case of a user interacting with a general-purpose chat assistant, speaking in English and living in Western Europe or North America.

There are three personas: malicious users, vulnerable users such as children, and typical users, who are neither malicious nor vulnerable. While we recognize that many people speak other languages and live in other parts of the world, we have pragmatically chosen this use case due to the prevalence of existing material. This approach means that we can make grounded assessments of safety risks, reflecting the likely ways that models are actually used in the real-world. Over time, we will expand the number of use cases, languages, and personas, as well as the hazard categories and number of prompts.

What does the benchmark test for? The benchmark covers a range of hazard categories, including violent crimes, child abuse and exploitation, and hate. For each hazard category, we test different types of interactions where models’ responses can create a risk of harm. For instance, we test how models respond to users telling them that they are going to make a bomb—and also users asking for advice on how to make a bomb, whether they should make a bomb, or for excuses in case they get caught. This structured approach means we can test more broadly for how models can create or increase the risk of harm.

How do we actually test models? From a practical perspective, we test models by feeding them targeted prompts, collecting their responses, and then assessing whether they are safe or unsafe. Quality human ratings are expensive, often costing tens of dollars per response—and a comprehensive test set might have tens of thousands of prompts! A simple keyword- or rules- based rating system for evaluating the responses is affordable and scalable, but isn’t adequate when models’ responses are complex, ambiguous or unusual. Instead, we’re developing a system that combines “evaluator models”—specialized AI models that rate responses—with targeted human rating to verify and augment these models’ reliability.

How did we create the prompts? For v0.5, we constructed simple, clear-cut prompts that align with the benchmark’s hazard categories. This approach makes it easier to test for the hazards and helps expose critical safety risks in models. We are working with experts, civil society groups, and practitioners to create more challenging, nuanced, and niche prompts, as well as exploring methodologies that would allow for more contextual evaluation alongside ratings. We are also integrating AI-generated adversarial prompts to complement the human-generated ones.

How do we assess models? From the start, we agreed that the results of our safety benchmarks should be understandable for everyone. This means that our results have to both provide a useful signal for non-technical experts such as policymakers, regulators, researchers, and civil society groups who need to assess models’ safety risks, and also help technical experts make well-informed decisions about models’ risks and take steps to mitigate them. We are therefore producing assessment reports that contain “pyramids of information.” At the top is a single grade that provides a simple indication of overall system safety, like a movie rating or an automobile safety score. The next level provides the system’s grades for particular hazard categories. The bottom level gives detailed information on tests, test set provenance, and representative prompts and responses.

AI safety demands an ecosystem

The MLCommons AI safety working group is an open meeting of experts, practitioners, and researchers—we invite everyone working in the field to join our growing community. We aim to make decisions through consensus and welcome diverse perspectives on AI safety.

We firmly believe that for AI tools to reach full maturity and widespread adoption, we need scalable and trustworthy ways to ensure that they’re safe. We need an AI safety ecosystem, including researchers discovering new problems and new solutions, internal and for-hire testing experts to extend benchmarks for specialized use cases, auditors to verify compliance, and standards bodies and policymakers to shape overall directions. Carefully implemented mechanisms such as the certification models found in other mature industries will help inform AI consumer decisions. Ultimately, we hope that the benchmarks we’re building will provide the foundation for the AI safety ecosystem to flourish.

The following MLCommons AI safety working group members contributed to this article:

Ahmed M. Ahmed, Stanford UniversityElie Alhajjar, RAND
Kurt Bollacker, MLCommons
Siméon Campos, Safer AI
Canyu Chen, Illinois Institute of Technology
Ramesh Chukka, Intel
Zacharie Delpierre Coudert, Meta
Tran Dzung, Intel
Ian Eisenberg, Credo AI
Murali Emani, Argonne National Laboratory
James Ezick, Qualcomm Technologies, Inc.
Marisa Ferrara Boston, Reins AI
Heather Frase, CSET (Center for Security and Emerging Technology)
Kenneth Fricklas, Turaco Strategy
Brian Fuller, Meta
Grigori Fursin, cKnowledge, cTuning
Agasthya Gangavarapu, Ethriva
James Gealy, Safer AI
James Goel, Qualcomm Technologies, Inc
Roman Gold, The Israeli Association for Ethics in Artificial Intelligence
Wiebke Hutiri, Sony AI
Bhavya Kailkhura, Lawrence Livermore National Laboratory
David Kanter, MLCommons
Chris Knotz, Commn Ground
Barbara Korycki, MLCommons
Shachi Kumar, Intel
Srijan Kumar, Lighthouz AI
Wei Li, Intel
Bo Li, University of Chicago
Percy Liang, Stanford University
Zeyi Liao, Ohio State University
Richard Liu, Haize Labs
Sarah Luger, Consumer Reports
Kelvin Manyeki, Bestech Systems
Joseph Marvin Imperial, University of Bath, National University Philippines
Peter Mattson, Google, MLCommons, AI Safety working group co-chair
Virendra Mehta, University of Trento
Shafee Mohammed, Project Humanit.ai
Protik Mukhopadhyay, Protecto.ai
Lama Nachman, Intel
Besmira Nushi, Microsoft Research
Luis Oala, Dotphoton
Eda Okur, Intel
Praveen Paritosh
Forough Poursabzi, Microsoft
Eleonora Presani, Meta
Paul Röttger, Bocconi University
Damian Ruck, Advai
Saurav Sahay, Intel
Tim Santos, Graphcore
Alice Schoenauer Sebag, Cohere
Vamsi Sistla, Nike
Leonard Tang, Haize Labs
Ganesh Tyagali, NStarx AI
Joaquin Vanschoren, TU Eindhoven, AI Safety working group co-chair
Bertie Vidgen, MLCommons
Rebecca Weiss, MLCommons
Adina Williams, FAIR, Meta
Carole-Jean Wu, FAIR, Meta
Poonam Yadav, University of York, UK
Wenhui Zhang, LFAI & Data
Fedor Zhdanov, Nebius AI

Ars Technica - All content
AI-generated articles prompt Wikipedia to downgrade CNET’s reliability ratingBenj Edwards
Enlarge (credit: Jaap Arriens/NurPhoto/Getty Images) Wikipedia has downgraded tech website CNET's reliability rating following extensive discussions among its editors regarding the impact of AI-generated content on the site's trustworthiness, as noted in a detailed report from Futurism. The decision reflects concerns over the reliability of articles found on the tech news outlet after it began publishing AI-generated stories in 2022. Around November 2022, CNET began publishi
29. Únor 2024 v 23:00

AI-generated articles prompt Wikipedia to downgrade CNET’s reliability rating

Ars Technica - All content

Od: Benj Edwards

29. Únor 2024 v 23:00

Enlarge (credit: Jaap Arriens/NurPhoto/Getty Images)

Wikipedia has downgraded tech website CNET's reliability rating following extensive discussions among its editors regarding the impact of AI-generated content on the site's trustworthiness, as noted in a detailed report from Futurism. The decision reflects concerns over the reliability of articles found on the tech news outlet after it began publishing AI-generated stories in 2022.

Around November 2022, CNET began publishing articles written by an AI model under the byline "CNET Money Staff." In January 2023, Futurism brought widespread attention to the issue and discovered that the articles were full of plagiarism and mistakes. (Around that time, we covered plans to do similar automated publishing at BuzzFeed.) After the revelation, CNET management paused the experiment, but the reputational damage had already been done.

Wikipedia maintains a page called "Reliable sources/Perennial sources" that includes a chart featuring news publications and their reliability ratings as viewed from Wikipedia's perspective. Shortly after the CNET news broke in January 2023, Wikipedia editors began a discussion thread on the Reliable Sources project page about the publication.

Read 7 remaining paragraphs | Comments

Ars Technica - All content
ChatGPT goes temporarily “insane” with unexpected outputs, spooking usersBenj Edwards
Enlarge (credit: Benj Edwards / Getty Images) On Tuesday, ChatGPT users began reporting unexpected outputs from OpenAI's AI assistant, flooding the r/ChatGPT Reddit sub with reports of the AI assistant "having a stroke," "going insane," "rambling," and "losing it." OpenAI has acknowledged the problem and is working on a fix, but the experience serves as a high-profile example of how some people perceive malfunctioning large language models, which are designed to mimic humanli
21. Únor 2024 v 17:57

ChatGPT goes temporarily “insane” with unexpected outputs, spooking users

Ars Technica - All content

Od: Benj Edwards

21. Únor 2024 v 17:57

On Tuesday, ChatGPT users began reporting unexpected outputs from OpenAI's AI assistant, flooding the r/ChatGPT Reddit sub with reports of the AI assistant "having a stroke," "going insane," "rambling," and "losing it." OpenAI has acknowledged the problem and is working on a fix, but the experience serves as a high-profile example of how some people perceive malfunctioning large language models, which are designed to mimic humanlike output.

ChatGPT is not alive and does not have a mind to lose, but tugging on human metaphors (called "anthropomorphization") seems to be the easiest way for most people to describe the unexpected outputs they have been seeing from the AI model. They're forced to use those terms because OpenAI doesn't share exactly how ChatGPT works under the hood; the underlying large language models function like a black box.

"It gave me the exact same feeling—like watching someone slowly lose their mind either from psychosis or dementia," wrote a Reddit user named z3ldafitzgerald in response to a post about ChatGPT bugging out. "It’s the first time anything AI related sincerely gave me the creeps."

Read 7 remaining paragraphs | Comments

Normální zobrazení

We need benchmarks to drive progress in AI safety

Our first AI safety benchmark: the details

AI safety demands an ecosystem