IEEE Spectrum · Samuel K. Moore

Nvidia Conquers Latest AI Tests

12 June 2024, 17:00


For years, Nvidia has dominated many machine learning benchmarks, and now there are two more notches in its belt.

MLPerf, the AI benchmarking suite sometimes called “the Olympics of machine learning,” has released a new set of training tests to help make more and better apples-to-apples comparisons between competing computer systems. One of MLPerf’s new tests concerns fine-tuning of large language models, a process that takes an existing trained model and trains it a bit more with specialized knowledge to make it fit for a particular purpose. The other is for graph neural networks, a type of machine learning behind some literature databases, fraud detection in financial systems, and social networks.

Even with the additions and the participation of computers using Google’s and Intel’s AI accelerators, systems powered by Nvidia’s Hopper architecture dominated the results once again. One system that included 11,616 Nvidia H100 GPUs—the largest collection yet—topped each of the nine benchmarks, setting records in five of them (including the two new benchmarks).

“If you just throw hardware at the problem, it’s not a given that you’re going to improve.” —Dave Salvator, Nvidia

The 11,616-H100 system is “the biggest we’ve ever done,” says Dave Salvator, director of accelerated computing products at Nvidia. It smashed through the GPT-3 training trial in less than 3.5 minutes. A 512-GPU system, for comparison, took about 51 minutes. (Note that the GPT-3 task is not a full training, which could take weeks and cost millions of dollars. Instead, the computers train on a representative portion of the data, at an agreed-upon point well before completion.)

Compared to Nvidia’s largest entrant on GPT-3 last year, a 3,584-H100 computer, the 3.5-minute result represents a 3.2-fold improvement. You might expect that just from the difference in the size of these systems, but in AI computing that isn’t always the case, explains Salvator. “If you just throw hardware at the problem, it’s not a given that you’re going to improve,” he says.

“We are getting essentially linear scaling,” says Salvator. By that he means that twice as many GPUs lead to a halved training time. “[That] represents a great achievement from our engineering teams,” he adds.
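
As a back-of-envelope check of that claim, using only the figures quoted above (last year’s 3,584-GPU entry versus this round’s 11,616-GPU system and its reported 3.2-fold speedup), the scaling efficiency works out to roughly 99 percent:

```python
# Rough scaling-efficiency estimate from the numbers reported in the article.
# Perfectly linear scaling would mean the speedup equals the ratio of GPU counts.
gpus_last_year = 3_584      # Nvidia's largest GPT-3 entry in June 2023
gpus_this_round = 11_616    # the record-setting system in this round
observed_speedup = 3.2      # reported improvement on the GPT-3 benchmark

ideal_speedup = gpus_this_round / gpus_last_year   # ~3.24x if scaling were perfect
efficiency = observed_speedup / ideal_speedup      # ~0.99, i.e. "essentially linear"

print(f"ideal speedup: {ideal_speedup:.2f}x, efficiency: {efficiency:.0%}")
```

The two systems also differ in software, so the figure is indicative rather than a controlled measurement.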

Competitors are also getting closer to linear scaling. This round, Intel deployed a system using 1,024 GPUs that performed the GPT-3 task in 67 minutes, versus a computer one-fourth that size that took 224 minutes six months ago. Google’s largest GPT-3 entry used 12 times the number of TPU v5p accelerators as its smallest entry and performed its task nine times as fast.

Linear scaling is going to be particularly important for upcoming “AI factories” housing 100,000 GPUs or more, Salvator says. He says to expect one such data center to come online this year, and another, using Nvidia’s next architecture, Blackwell, to start up in 2025.

Nvidia’s streak continues

Nvidia continued to boost training times despite using the same architecture, Hopper, as it did in last year’s training results. That’s all down to software improvements, says Salvator. “Typically, we’ll get a 2-2.5x [boost] from software after a new architecture is released,” he says.

For GPT-3 training, Nvidia logged a 27 percent improvement over the June 2023 MLPerf benchmarks. Salvator says several software changes were behind the boost. For example, Nvidia engineers tuned up Hopper’s use of less accurate, 8-bit floating-point operations by trimming unnecessary conversions between 8-bit and 16-bit numbers and by better targeting which layers of a neural network could use the lower-precision number format. They also found a more intelligent way to adjust the power budget of each chip’s compute engines, and sped up communication among GPUs in a way that Salvator likened to “buttering your toast while it’s still in the toaster.”
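
Nvidia’s tuned MLPerf code isn’t shown here, but the general pattern of running selected layers in 8-bit floating point while keeping the rest in higher precision is exposed through its open-source Transformer Engine library. The sketch below is a minimal, hypothetical illustration of that pattern; it assumes the transformer_engine package and Hopper-class hardware and is not the actual submission code:

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# Scaling recipe that decides which FP8 formats (E4M3/E5M2) are used.
fp8_recipe = recipe.DelayedScaling(fp8_format=recipe.Format.HYBRID)

# A linear layer whose matrix multiplies can run in FP8 on Hopper GPUs.
layer = te.Linear(4096, 4096, bias=True).cuda()
x = torch.randn(8, 4096, device="cuda")

# Only work inside this context uses FP8 compute; keeping the region tight
# avoids needless conversions between 8-bit and 16/32-bit representations.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)

y.sum().backward()
```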

Additionally, the company implemented a scheme called flash attention. Invented in the Stanford University laboratory of SambaNova founder Chris Ré, flash attention is an algorithm that speeds up transformer networks by minimizing writes to memory. When it first showed up in MLPerf benchmarks, flash attention shaved as much as 10 percent from training times. (Intel, too, used a version of flash attention, but not for GPT-3. It instead used the algorithm for one of the new benchmarks, fine-tuning.)
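
Flash attention is now available in mainstream frameworks as well. As a small illustration (assuming PyTorch 2.x with a CUDA build, not any particular MLPerf submission), the fused kernel can be requested through the standard scaled-dot-product-attention call instead of computing attention by hand:

```python
import torch
import torch.nn.functional as F

# Toy tensors shaped (batch, heads, sequence length, head dimension).
q = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)

# PyTorch dispatches to a fused flash-attention kernel when the dtype,
# shapes, and hardware allow it. The fused kernel never materializes the
# full sequence-by-sequence attention matrix, which is what cuts memory writes.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 8, 1024, 64])
```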

Using other software and network tricks, Nvidia delivered an 80 percent speedup in the text-to-image test, Stable Diffusion, versus its submission in November 2023.

New benchmarks

MLPerf adds new benchmarks and upgrades old ones to stay relevant to what’s happening in the AI industry. This year saw the addition of fine-tuning and graph neural networks.

Fine-tuning takes an already trained LLM and specializes it for use in a particular field. Nvidia, for example, took a trained 43-billion-parameter model and trained it on the GPU-maker’s design files and documentation to create ChipNeMo, an AI intended to boost the productivity of its chip designers. At the time, the company’s chief technology officer, Bill Dally, said that training an LLM was like giving it a liberal arts education, and fine-tuning was like sending it to graduate school.

The MLPerf benchmark takes a pretrained Llama-2-70B model and asks the system to fine tune it using a dataset of government documents with the goal of generating more accurate document summaries.

There are several ways to do fine-tuning. MLPerf chose one called low-rank adaptation (LoRA). The method winds up training only a small portion of the LLM’s parameters, leading to a 3-fold lower burden on hardware and reduced use of memory and storage versus other methods, according to the organization.
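
The MLPerf reference implementation is far larger, but the core idea of LoRA, freezing the original weights and training only a small low-rank correction, fits in a short sketch. The PyTorch example below uses illustrative layer sizes rather than Llama-2-70B’s real dimensions:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update: W x + (B A) x."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():           # original weights stay frozen
            p.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus the low-rank correction learned during fine-tuning.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(4096, 4096), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable:,} of {total:,} parameters ({trainable / total:.2%})")
```

Only the two small low-rank matrices receive gradients, which is why the method needs far less optimizer state, memory, and storage than full fine-tuning.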

The other new benchmark involved a graph neural network (GNN). These are for problems that can be represented by a very large set of interconnected nodes, such as a social network or a recommender system. Compared to other AI tasks, GNNs require a lot of communication between nodes in a computer.

The benchmark trained a GNN on a database that captures relationships among academic authors, papers, and institutes: a graph with 547 million nodes and 5.8 billion edges. The neural network was then trained to predict the right label for each node in the graph.
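
The benchmark itself runs on that 547-million-node graph across distributed hardware; the toy sketch below, in plain PyTorch rather than a dedicated graph library, only illustrates the underlying task of propagating neighbor features over a graph and predicting a label for every node:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy graph: 6 nodes, directed edges given as (source, target) pairs.
edges = torch.tensor([[0, 1], [1, 2], [2, 0], [2, 3], [3, 4], [4, 5]]).T
num_nodes, feat_dim, num_classes = 6, 16, 3
x = torch.randn(num_nodes, feat_dim)            # one feature vector per node
labels = torch.randint(0, num_classes, (num_nodes,))

# Row-normalized adjacency with self-loops: each node averages its own
# features with its neighbors' -- the message-passing step of a simple GNN.
adj = torch.eye(num_nodes)
adj[edges[1], edges[0]] = 1.0
adj = adj / adj.sum(dim=1, keepdim=True)

class TinyGNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin1 = nn.Linear(feat_dim, 32)
        self.lin2 = nn.Linear(32, num_classes)

    def forward(self, x, adj):
        h = F.relu(self.lin1(adj @ x))   # aggregate neighbors, then transform
        return self.lin2(adj @ h)        # second hop -> per-node class logits

model = TinyGNN()
opt = torch.optim.Adam(model.parameters(), lr=0.01)
for _ in range(100):                     # learn to predict a label for each node
    opt.zero_grad()
    loss = F.cross_entropy(model(x, adj), labels)
    loss.backward()
    opt.step()
```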

Future fights

Training rounds in 2025 may see head-to-head contests comparing new accelerators from AMD, Intel, and Nvidia. AMD’s MI300 series was launched about six months ago, a memory-boosted upgrade, the MI325X, is planned for the end of 2024, and the next-generation MI350 is slated for 2025. Intel says its Gaudi 3, generally available to computer makers later this year, will appear in MLPerf’s upcoming inferencing benchmarks. Intel executives have said the new chip has the capacity to beat the H100 at training LLMs. But the victory may be short-lived, as Nvidia has unveiled a new architecture, Blackwell, which is planned for late this year.

Ars Technica · Ashley Belanger

Slack users horrified to discover messages used for AI training

17 May 2024, 20:10

(Image credit: Tim Robberts | DigitalVision)

After launching Slack AI in February, Slack appears to be digging its heels in, defending its vague policy that by default sucks up customers' data—including messages, content, and files—to train Slack's global AI models.

According to Slack engineer Aaron Maurer, Slack has explained in a blog that the Salesforce-owned chat service does not train its large language models (LLMs) on customer data. But Slack's policy may need updating "to explain more carefully how these privacy principles play with Slack AI," Maurer wrote on Threads, partly because the policy "was originally written about the search/recommendation work we've been doing for years prior to Slack AI."

Maurer was responding to a Threads post from engineer and writer Gergely Orosz, who called for companies to opt out of data sharing until the policy is clarified, not by a blog, but in the actual policy language.


Ars Technica · Financial Times

Sony Music opts out of AI training for its entire catalog

17 May 2024, 15:16

The Sony Music letter expressly prohibits artificial intelligence developers from using its music, which includes artists such as Beyoncé. (credit: Kevin Mazur/WireImage for Parkwood via Getty Images)

Sony Music is sending warning letters to more than 700 artificial intelligence developers and music streaming services globally in the latest salvo in the music industry’s battle against tech groups ripping off artists.

The Sony Music letter, which has been seen by the Financial Times, expressly prohibits AI developers from using its music—which includes artists such as Harry Styles, Adele and Beyoncé—and opts out of any text and data mining of any of its content for any purposes such as training, developing or commercializing any AI system.

Sony Music is sending the letter to companies developing AI systems including OpenAI, Microsoft, Google, Suno, and Udio, according to those close to the group.


Raspberry Pi Foundation

Our T Level resources to support vocational education in England

By: Jan Ander
8 February 2024, 15:11

You can now access classroom resources created by us for the T Level in Digital Production, Design and Development. T Levels are a type of vocational qualification young people in England can gain after leaving school, and we are pleased to be able to support T Level teachers and students.

[Image: A teenager learning computer science]

With our new resources, we aim to empower more young people to develop their digital skills and confidence while studying, meaning they can access more jobs and opportunities for further study once they finish their T Levels.

We worked collaboratively with the Gatsby Charitable Foundation on this pilot project as part of their Technical Education Networks Programme, the first time that we have created classroom resources for post-16 vocational education.

Post-16 vocational training and T Levels

T Levels (Technical Levels) are 2-year courses for 16- to 18-year-old school leavers. Launched in England in September 2020, T Levels cover a range of subjects and have been developed in collaboration with employers, education providers, and other organisations. The aim is for T Levels to specifically prepare young people for entry into skilled employment, an apprenticeship, or related technical study in further or higher education.

[Image: A group of young people in a lecture hall]

For us, this T Level pilot project follows on from work we did in 2022 to learn more about post-16 vocational training and identify gaps where we could make a difference. 

Something interesting we found was the relatively low number of school-age young people who started apprenticeships in the UK in 2019/20. For example, a 2021 WorldSkills UK report stated that only 18% of apprentices were young people aged 19 and under, while 39% were aged 19 to 24 and the remaining 43% were aged 25 and over.

To hear young people’s thoughts directly, we spoke to a group of year 10 students (ages 14 to 15) at Gladesmore School in Tottenham. Two thirds of these students said that digital skills were ‘very important’ to them and that they would consider applying for a digital apprenticeship or T Level. When we asked them why, one of the key reasons they gave was the opportunity to work and earn money, rather than moving into further study in higher education and paying tuition fees. One student, for example, answered: “It’s a good way to learn new skills while getting paid, and also gives effective work experience.”

T Level curriculum materials and project brief

To support teachers in delivering the Digital Production, Design and Development T Level qualification, we created a new set of resources: curriculum materials as well as a project brief with examples to support the Occupational Specialism component of the qualification.

[Image: A girl in a university computing classroom]

The curriculum materials on the topic ‘Digital environments’ cover content related to computer systems including hardware, software, networks, and cloud environments. They are designed for teachers to use in the classroom and consist of a complete unit of work: lesson plans, slide decks, activities, a progression chart, and assessment materials. The materials are designed in line with our computing content framework and pedagogy principles, on which the whole of our Computing Curriculum is based.

The project brief is a real-world scenario related to our work and gives students the opportunity to problem-solve as though they are working in an industry job.

Access the T Level resources

The T Level project brief materials are available for download now, with the T Level classroom materials coming in the next few weeks.

We hope T Level teachers and students find the resources useful and interesting — if you’re using them, please let us know your thoughts and feedback.

Our thanks to the Gatsby Foundation for collaborating with us on this work to empower more young people to fulfil their potential through the power of computing and digital technologies.

