AI-Musement Park and MONDO Vanilli’s Blockchain Busting Musical Experience “R.U. Cyber.. R.U. Against NFTs?”
Immediate release from: 03/03/2023
“AI-Musement Park comprises a cornucopia of performances / talks / happenings / documentary & discussion about AI, Intelligences, technocapitalism’s more than pressing-ongoing urgencies.” -Eleanor Dare, Cambridge University & AI-Musement Park
R.U. Cyber.. R.U. Against NFTs? is an original AI-Musement Park, PlayLa.bZ & MONDO 2000 History Project human and machine learning co-creation. It takes the perspective of an AI training itself on the R.U. Sirius & MONDO Vanilli ‘I’m Against NFT’s’ song lyrics, exploring a surreal, mind-melting and multi-dimensional 360 world of paradoxes and conflicting rules.
“Mondo Vanilli was originally intended to be a virtual reality band exploding all assumptions about property and propriety in the 1990s. Today fabrication becomes de rigueur as the connection to the real is intentionally confused by the banal political tricksters of power and profitability… while storms pound our all-too-human bodies and communities. I am thrilled to finally see MONDO Vanilli in its appropriate context. Immersive. Come play in the simulacra one more time” -R.U. Sirius, MONDO 2000
R.U. Cyber.. R.U. Against NFTs? is a satirical, irreverent blockchain-busting commentary on the propaganda-relations-fueled ‘Web 3’ hype around non-fungible tokens and the broader issues that underpin our algorithmically massaged, hyper-connected age of infinite scrolls and trolls. It challenges our assumptions about the nature of technology, creativity, and value, reminding us that the digital world is shaped by powerful forces that determine what is valued and what is not, and that a click is not always free.
Join us on the Spring Equinox 2023 for “R.U. Cyber? :// Mondo 2000 History Project Salon” at MozFest Virtual Plaza & Mozilla Hubs: AI-Musement Park, 20th March / 8.30pm EU / GMT
About R.U.Sirius & Mondo 2000 #Mondo2000 #RUSirius
R.U. Sirius is an American writer, editor, and media pioneer, known as one of the key figures of the psychedelic and cyberpunk movements. He is best known as editor-in-chief of Mondo 2000 and for being at the forefront of the 1990s underground cyberculture movement.
About Mozilla Festival #TrustworthyAI #AIMusementPark
Since 2010, MozFest has fueled the movement to ensure the internet benefits humanity, rather than harms it. This year, your part in the story is critical to our community’s mission: a better, healthier internet and more Trustworthy AI.
About PlayLa.bZ CIC #PlayLabZ #SpatialCadetZ
Co-founded by PsychFi, FreekMinds & Squire Studios, we’re a next-generation, multipotentiality, multi-award-winning, multi-dimensional motion arts experience design laboratory, developing DIY changemaking createch immersive experiences & software applications for social good storycraft. Supporters & Friends: Mozilla Festival, Jisc: Digifest, Beyond Games, Tate Modern, Furtherfield, Boomtown Festival, Sci-Fi-London, Ravensbourne University London, UAL, East London Dance, NESTA, Modern Panic, ArtFutura, Kimatica, National Gallery X, Kings College London, Looking Glass Factory, SubPac, Ecologi, The JUMP, BOM Labs, Mondo 2000
PR Contact: James E. Marks, Tel: 07921 523438 @: jem@playla.bz Twitter: @GoGenieMo
From cutting-edge robotics, design, and bioengineering to sustainable energy solutions, ocean engineering, nanotechnology, and innovative materials science, MechE students and their advisors are doing incredibly innovative work. The graduate students highlighted here represent a snapshot of the great work in progress this spring across the Department of Mechanical Engineering, and demonstrate the ways the future of this field is as limitless as the imaginations of its practitioners.
Democratizing design through AI
Lyle Regenwetter Hometown: Champaign, Illinois Advisor: Assistant Professor Faez Ahmed Interests: Food, climbing, skiing, soccer, tennis, cooking
Lyle Regenwetter finds excitement in the prospect of generative AI to "democratize" design and enable inexperienced designers to tackle complex design problems. His research explores new training methods through which generative AI models can be taught to implicitly obey design constraints and synthesize higher-performing designs. Knowing that prospective designers often have an intimate knowledge of the needs of users, but may otherwise lack the technical training to create solutions, Regenwetter also develops human-AI collaborative tools that allow AI models to interact and support designers in popular CAD software and real design problems.
Solving a whale of a problem
Loïcka Baille Hometown: L’Escale, France Advisor: Daniel Zitterbart Interests: Being outdoors — scuba diving, spelunking, or climbing. Sailing on the Charles River, martial arts classes, and playing volleyball
Loïcka Baille’s research focuses on developing remote sensing technologies to study and protect marine life. Her main project revolves around improving onboard whale detection technology to prevent vessel strikes, with a special focus on protecting North Atlantic right whales. Baille is also involved in an ongoing study of Emperor penguins. Her team visits Antarctica annually to tag penguins and gather data to enhance their understanding of penguin population dynamics and draw conclusions regarding the overall health of the ecosystem.
Water, water anywhere
Carlos Díaz-Marín Hometown: San José, Costa Rica Advisor: Professor Gang Chen | Former Advisor: Professor Evelyn Wang Interests: New England hiking, biking, and dancing
Carlos Díaz-Marín designs and synthesizes inexpensive salt-polymer materials that can capture large amounts of humidity from the air. He aims to change the way we generate potable water from the air, even in arid conditions. In addition to water generation, these salt-polymer materials can also be used as thermal batteries, capable of storing and reusing heat. Beyond the scientific applications, Díaz-Marín is excited to continue doing research that can have big social impacts, and that finds and explains new physical phenomena. As a LatinX person, Díaz-Marín is also driven to help increase diversity in STEM.
Scalable fabrication of nano-architected materials
Somayajulu Dhulipala Hometown: Hyderabad, India Advisor: Assistant Professor Carlos Portela Interests: Space exploration, taekwondo, meditation
Somayajulu Dhulipala works on developing lightweight materials with tunable mechanical properties. He is currently working on methods for the scalable fabrication of nano-architected materials and predicting their mechanical properties. The ability to fine-tune the mechanical properties of specific materials brings versatility and adaptability, making these materials suitable for a wide range of applications across multiple industries. While the research applications are quite diverse, Dhulipala is passionate about making space habitable for humanity, a crucial step toward becoming a spacefaring civilization.
Ingestible health-care devices
Jimmy McRae Hometown: Woburn, Massachusetts Advisor: Associate Professor Giovani Traverso Interests: Anything basketball-related: playing, watching, going to games, organizing hometown tournaments
Jimmy McRae aims to drastically improve diagnostic and therapeutic capabilities through noninvasive health-care technologies. His research focuses on leveraging materials, mechanics, embedded systems, and microfabrication to develop novel ingestible electronic and mechatronic devices. This ranges from ingestible electroceutical capsules that modulate hunger-regulating hormones to devices capable of continuous ultralong monitoring and remotely triggerable actuations from within the stomach. The principles that guide McRae’s work to develop devices that function in extreme environments can be applied far beyond the gastrointestinal tract, with applications for outer space, the ocean, and more.
Freestyle BMX meets machine learning
Eva Nates Hometown: Narberth, Pennsylvania Advisor: Professor Peko Hosoi Interests: Rowing, running, biking, hiking, baking
Eva Nates is working with the Australian Cycling Team to create a tool to classify Bicycle Motocross Freestyle (BMX FS) tricks. She uses a singular value decomposition method to conduct a principal component analysis of the time-dependent point-tracking data of an athlete and their bike during a run to classify each trick. The 2024 Olympic team hopes to incorporate this tool in their training workflow, and Nates worked alongside the team at their facilities on the Gold Coast of Australia during MIT’s Independent Activities Period in January.
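For readers curious how such a classifier could be put together, here is a minimal sketch under our own assumptions, not the team’s actual pipeline: PCA via SVD on one run’s point-tracking matrix, with each trick assigned to its nearest reference template. The data layout, feature choice, and trick names are all hypothetical.

```python
import numpy as np

def trick_features(run, n_components=3):
    """run: (n_frames, n_coords) tracked athlete/bike points over one BMX run."""
    centered = run - run.mean(axis=0)                  # remove the mean pose
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ Vt[:n_components].T            # project onto top principal components
    return scores.std(axis=0)                          # per-component variation as a run summary

def classify_trick(run, templates):
    """templates: dict mapping trick name -> reference feature vector."""
    f = trick_features(run)
    return min(templates, key=lambda name: np.linalg.norm(f - templates[name]))

# Hypothetical usage with synthetic data standing in for real point-tracking output.
rng = np.random.default_rng(0)
run = rng.normal(size=(240, 12))                       # 240 frames, 6 tracked points (x, y)
templates = {"tailwhip": np.array([3.0, 1.5, 0.8]), "barspin": np.array([1.2, 2.4, 0.5])}
print(classify_trick(run, templates))
```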
Augmenting astronauts with wearable limbs
Erik Ballesteros Hometown: Spring, Texas Advisor: Professor Harry Asada Interests: Cosplay, Star Wars, Lego bricks
Erik Ballesteros’s research seeks to support astronauts who are conducting planetary extravehicular activities through the use of supernumerary robotic limbs (SuperLimbs). His work is tailored toward design and control manifestation to assist astronauts with post-fall recovery, human-leader/robot-follower quadruped locomotion, and coordinated manipulation between the SuperLimbs and the astronaut to perform tasks like excavation and sample handling.
This article appeared in the Spring 2024 edition of the Department of Mechanical Engineering's magazine, MechE Connects.
Benjamin Warf, a renowned neurosurgeon at Boston Children’s Hospital, stands in the MIT.nano Immersion Lab. More than 3,000 miles away, his virtual avatar stands next to Matheus Vasconcelos in Brazil as the resident practices delicate surgery on a doll-like model of a baby’s brain.
With a pair of virtual-reality goggles, Vasconcelos can watch Warf’s avatar demonstrate a brain surgery procedure, ask questions of Warf’s digital twin, and then replicate the technique himself.
“It’s an almost out-of-body experience,” Warf says of watching his avatar interact with the residents. “Maybe it’s how it feels to have an identical twin?”
And that’s the goal: Warf’s digital twin bridged the distance, allowing him to be functionally in two places at once. “It was my first training using this model, and it had excellent performance,” says Vasconcelos, a neurosurgery resident at Santa Casa de São Paulo School of Medical Sciences in São Paulo, Brazil. “As a resident, I now feel more confident and comfortable applying the technique in a real patient under the guidance of a professor.”
Warf’s avatar arrived via a new project launched by medical simulator and augmented reality (AR) company EDUCSIM. The company is part of the 2023 cohort of START.nano, MIT.nano’s deep-tech accelerator that offers early-stage startups discounted access to MIT.nano’s laboratories.
In March 2023, Giselle Coelho, EDUCSIM’s scientific director and a pediatric neurosurgeon at Santa Casa de São Paulo and Sabará Children’s Hospital, began working with technical staff in the MIT.nano Immersion Lab to create Warf’s avatar. By November, the avatar was training future surgeons like Vasconcelos.
“I had this idea to create the avatar of Dr. Warf as a proof of concept, and asked, ‘What would be the place in the world where they are working on technologies like that?’” Coelho says. “Then I found MIT.nano.”
Capturing a surgeon
As a neurosurgery resident, Coelho was so frustrated by the lack of practical training options for complex surgeries that she built her own model of a baby brain. The physical model contains all the structures of the brain and can even bleed, “simulating all the steps of a surgery, from incision to skin closure,” she says.
She soon found that simulators and virtual reality (VR) demonstrations reduced the learning curve for her own residents. Coelho launched EDUCSIM in 2017 to expand the variety and reach of the training for residents and experts looking to learn new techniques.
Those techniques include a procedure to treat infant hydrocephalus that was pioneered by Warf, the director of neonatal and congenital neurosurgery at Boston Children’s Hospital. Coelho had learned the technique directly from Warf and thought his avatar might be the way for surgeons who couldn’t travel to Boston to benefit from his expertise.
To create the avatar, Coelho worked with Talis Reks, the AR/VR/gaming/big data IT technologist in the Immersion Lab.
“A lot of technology and hardware can be very expensive for startups to access as they start their company journey,” Reks explains. “START.nano is one way of enabling them to utilize and afford the tools and technologies we have at MIT.nano’s Immersion Lab.”
Coelho and her colleagues needed high-fidelity and high-resolution motion-capture technology, volumetric video capture, and a range of other VR/AR technologies to capture Warf’s dexterous finger motions and facial expressions. Warf visited MIT.nano on several occasions to be digitally “captured,” including performing an operation on the physical baby model while wearing special gloves and clothing embedded with sensors.
“These technologies have mostly been used for entertainment or VFX [visual effects] or CGI [computer-generated imagery],” says Reks, “But this is a unique project, because we’re applying it now for real medical practice and real learning.”
One of the biggest challenges, Reks says, was helping to develop what Coelho calls “holoportation”— transmitting the 3D, volumetric video capture of Warf in real-time over the internet so that his avatar can appear in transcontinental medical training.
The Warf avatar has synchronous and asynchronous modes. The training that Vasconcelos received was in the asynchronous mode, where residents can observe the avatar’s demonstrations and ask it questions. The answers, delivered in a variety of languages, come from AI algorithms that draw from previous research and an extensive bank of questions and answers provided by Warf.
In the synchronous mode, Warf operates his avatar from a distance in real time, Coelho says. “He could walk around the room, he could talk to me, he could orient me. It’s amazing.”
Coelho, Warf, Reks, and other team members demonstrated a combination of the modes in a second session in late December. This demo consisted of volumetric live video capture between the Immersion Lab and Brazil, spatialized and visible in real-time through AR headsets. It significantly expanded upon the previous demo, which had only streamed volumetric data in one direction through a two-dimensional display.
Powerful impacts
Warf has a long history of training desperately needed pediatric neurosurgeons around the world, most recently through his nonprofit Neurokids. Remote and simulated instruction has been an increasingly large part of that training since the pandemic, he says, although he doesn’t feel it will ever completely replace personal hands-on instruction and collaboration.
“But if in fact one day we could have avatars, like this one from Giselle, in remote places showing people how to do things and answering questions for them, without the cost of travel, without the time cost and so forth, I think it could be really powerful,” Warf says.
The avatar project is especially important for surgeons serving remote and underserved areas like the Amazon region of Brazil, Coelho says. “This is a way to give them the same level of education that they would get in other places, and the same opportunity to be in touch with Dr. Warf.”
One baby treated for hydrocephalus at a recent Amazon clinic had traveled by boat 30 hours for the surgery, according to Coelho.
Training surgeons with the avatar, she says, “can change reality for this baby and can change the future.”
The Ansys SimAI™ cloud-enabled generative artificial intelligence (AI) platform combines the predictive accuracy of Ansys simulation with the speed of generative AI. Because of the software’s versatile underlying neural networks, it can extend to many types of simulation, including structural applications.
This white paper shows how the SimAI cloud-based software applies to highly nonlinear, transient structural simulations, such as automobile crashes, and includes:
Vehicle kinematics and deformation
Forces acting upon the vehicle
How it interacts with its environment
How understanding the rapid sequence of events helps predict outcomes
These simulations can reduce the potential for occupant injuries and the severity of vehicle damage and help understand the crash’s overall dynamics. Ultimately, this leads to safer automotive design.
In the two years since Arati Prabhakar was appointed director of the White House Office of Science and Technology Policy, she has set the United States on a course toward regulating artificial intelligence. The IEEE Fellow advised U.S. President Joe Biden in writing the executive order he issued to accomplish the goal just six months after she began her new role in 2022.
Director of the White House Office of Science and Technology Policy
Member grade: Fellow
Alma maters: Texas Tech University; Caltech
Working in the public sector wasn’t initially on her radar. Not until she became a DARPA program manager in 1986, she says, did she really understand what she could accomplish as a government official.
“What I have come to love about [public service] is the opportunity to shape policies at a scale that is really unparalleled,” she says.
Prabhakar’s passion for tackling societal challenges by developing technology also led her to take leadership positions at companies including Raychem (now part of TE Connectivity), Interval Research Corp., and U.S. Venture Partners. In 2019 she helped found Actuate, a nonprofit in Palo Alto, Calif., that seeks to create technology to help address climate change, data privacy, health care access, and other pressing issues.
“I really treasure having seen science, technology, and innovation from all different perspectives,” she says. “But the part I have loved most is public service because of the impact and reach that it can have.”
Discovering her passion for electrical engineering
Prabhakar, who was born in India and raised in Texas, says she decided to pursue a STEM career because when she was growing up, her classmates said women weren’t supposed to work in science, technology, engineering or mathematics.
“Them saying that just made me want to pursue it more,” she says. Her parents, who had wanted her to become a doctor, supported her pursuit of engineering, she adds.
After earning a bachelor’s degree in electrical engineering in 1979 from Texas Tech University, in Lubbock, she moved to California to continue her education at Caltech. She graduated with a master’s degree in EE in 1980, then earned a doctorate in applied physics in 1984. Her doctoral thesis focused on understanding deep-level defects and impurities in semiconductors that affect device performance.
After acquiring her Ph.D., she says, she wanted to make a bigger impact with her research than academia would allow, so she applied for a policy fellowship from the American Association for the Advancement of Science to work at the congressional Office of Technology Assessment. The office examines issues involving new or expanding technologies, assesses their impact, and studies whether new policies are warranted.
“We have huge aspirations for the future—such as mitigating climate change—that science and technology have to be part of achieving.”
“I wanted to share my research in semiconductor manufacturing processes with others,” Prabhakar says. “That’s what felt exciting and valuable to me.”
While there, she worked with people who were passionate about public service and government, but she didn’t feel the same, she says, until she joined DARPA. As program manager, Prabhakar established and led several programs, including a microelectronics office that invests in developing new technologies in areas such as lithography, optoelectronics, infrared imaging, and neural networks.
In 1993 an opportunity arose that she couldn’t refuse, she says: President Bill Clinton nominated her to direct the National Institute of Standards and Technology. NIST develops technical guidelines and conducts research to create tools that improve citizens’ quality of life. At age 34, she became the first woman to lead the agency.
Believing in IEEE’s mission
Like many IEEE members, Prabhakar says, she joined IEEE as a student member while attending Texas Tech University because the organization’s mission aligned with her belief that engineering is about creating value in the world.
She continues to renew her membership, she says, because IEEE emphasizes that technology should benefit humanity.
“It really comes back to this idea of the purpose of engineering and the role that it plays in the world,” she says.
After leading NIST through the first Clinton administration, she left for the private sector, including stints as CTO at appliance-component maker Raychem in Menlo Park, Calif., and president of private R&D lab Interval Research of Palo Alto, Calif. In all, she spent the next 14 years in the private sector, mostly as a partner at U.S. Venture Partners, in Menlo Park, where she invested in semiconductor and clean-tech startups.
In 2012 she returned to DARPA and became its first female director.
“When I received the call offering me the job, I stopped breathing,” Prabhakar says. “It was a once-in-a-lifetime opportunity to make a difference at an agency that I had loved earlier in my career. And it proved to be just as meaningful an experience as I had hoped.”
For the next five years she led the agency, focusing on developing better military systems and the next generation of artificial intelligence, as well as creating solutions in social science, synthetic biology, and neurotechnology.
Under her leadership, in 2014 DARPA established the Biological Technologies Office to oversee basic and applied research in areas including gene editing, neurosciences, and synthetic biology. The office launched the Pandemic Prevention Platform, which helped fund the development of the mRNA technology that is used in the Moderna and Pfizer coronavirus vaccines.
She left the agency in 2017 to move back to California with her family.
“When I left the organization, what was very much on my mind was that the United States has the most powerful innovation engine the world has ever seen,” Prabhakar says. “At the same time, what kept tugging at me was that we have huge aspirations for the future—such as mitigating climate change—that science and technology have to be part of achieving.”
That’s why, in 2019, she helped found Actuate. She served as the nonprofit’s chief executive until 2022, when she took on the role of OSTP director.
Although she didn’t choose her career path because it was her passion, she says, she came to realize that she loves the role that engineering, science, and technology play in the world because of their “power to change how the future unfolds.”
Leading AI regulation worldwide
When Biden asked if Prabhakar would take the OSTP job, she didn’t think twice, she says. “When do you need me to move in?” she says she told him.
“I was so excited to work for the president because he sees science and technology as a necessary part of creating a bright future for the country,” Prabhakar says.
A month after she took office, the generative AI program ChatGPT launched and became a hot topic.
“AI was already being used in different areas, but all of a sudden it became visible to everyone in a way that it really hadn’t been before,” she says.
Regulating AI became a priority for the Biden administration because of the technology’s breadth and power, she says, as well as the rapid pace at which it’s being developed.
“The executive order is possibly the most important accomplishment in relation to AI,” Prabhakar says. “It’s a tool that mobilizes the [U.S. government’s] executive branch and recognizes that such systems have safety and security risks, but [it] also enables immense opportunity. The order has put the branches of government on a very constructive path toward regulation.”
Meanwhile, the United States spearheaded a U.N. resolution to make regulating AI an international priority. The United Nations adopted the measure this past March. In addition to defining regulations, it seeks to use AI to advance progress on the U.N.’s sustainable development goals.
“There’s much more to be done,” Prabhakar says, “but I’m really happy to see what the president has been able to accomplish, and really proud that I got to help with that.”
If you are into tech, keeping up with the latest updates can be tough, particularly when it comes to artificial intelligence (AI) and generative AI (GenAI). I admit to sometimes feeling this way myself; however, there was one update recently that really caught my attention. OpenAI launched their latest iteration of ChatGPT, this time adding a female-sounding voice. Their launch video demonstrated the model supporting the presenters with a maths problem and giving advice around presentation techniques, sounding friendly and jovial along the way.
Adding a voice to these AI models was perhaps inevitable as big tech companies try to compete for market share in this space, but it got me thinking, why would they add a voice? Why does the model have to flirt with the presenter?
Working in the field of AI, I’ve always seen AI as a really powerful problem-solving tool. But with GenAI, I often wonder what problems the creators are trying to solve and how we can help young people understand the tech.
What problem are we trying to solve with GenAI?
The fact is that I’m really not sure. That’s not to suggest that I think that GenAI hasn’t got its benefits — it does. I’ve seen so many great examples in education alone: teachers using large language models (LLMs) to generate ideas for lessons, to help differentiate work for students with additional needs, to create example answers to exam questions for their students to assess against the mark scheme. Educators are creative people and whilst it is cool to see so many good uses of these tools, I wonder if the developers had solving specific problems in mind while creating them, or did they simply hope that society would find a good use somewhere down the line?
Whilst there are good uses of GenAI, you don’t need to dig very deeply before you start unearthing some major problems.
Anthropomorphism
Anthropomorphism relates to assigning human characteristics to things that aren’t human. This is something that we all do, all of the time, usually without consequences. The problem with doing this with GenAI is that, unlike an inanimate object you’ve named (I call my vacuum cleaner Henry, for example), chatbots are designed to be human-like in their responses, so it’s easy for people to forget they’re not speaking to a human.
As feared, since my last blog post on the topic, evidence has started to emerge that some young people are showing a desire to befriend these chatbots, going to them for advice and emotional support. It’s easy to see why. Here is an extract from an exchange between the presenters at the ChatGPT-4o launch and the model:
ChatGPT (presented with a live image of the presenter): “It looks like you’re feeling pretty happy and cheerful with a big smile and even maybe a touch of excitement. Whatever is going on? It seems like you’re in a great mood. Care to share the source of those good vibes?”

Presenter: “The reason I’m in a good mood is we are doing a presentation showcasing how useful and amazing you are.”

ChatGPT: “Oh stop it, you’re making me blush.”
“Some people just want to talk to somebody. Just because it’s not a real person, doesn’t mean it can’t make a person feel — because words are powerful. At the end of the day, it can always help in an emotional and mental way.”
The prospect of teenagers seeking solace and emotional support from a generative AI tool is a concerning development. While these AI tools can mimic human-like conversations, their outputs are based on patterns and data, not genuine empathy or understanding. The ultimate concern is that this exposes vulnerable young people to manipulation in ways we can’t predict. Relying on AI for emotional support could lead to a sense of isolation and detachment, hindering the development of healthy coping mechanisms and interpersonal relationships.
Arguably worse is the recent news of the world’s first AI beauty pageant. The very thought of this probably elicits some kind of emotional response depending on your view of beauty pageants. There are valid concerns around misogyny and reinforcing misguided views on body norms, but it’s also important to note that the winner of “Miss AI” is being described as a lifestyle influencer. The questions we should be asking are, who are the creators trying to have influence over? What influence are they trying to gain that they couldn’t get before they created a virtual woman?
DeepFake tools
Another use of GenAI is the ability to create DeepFakes. If you’ve watched the most recent Indiana Jones movie, you’ll have seen the technology in play, making Harrison Ford appear as a younger version of himself. This is not in itself a bad use of GenAI technology, but the application of DeepFake technology can easily become problematic. For example, recently a teacher was arrested for creating a DeepFake audio clip of the school principal making racist remarks. The recording went viral before anyone realised that AI had been used to generate the audio clip.
Easy-to-use DeepFake tools are freely available and, as with many tools, they can be used inappropriately to cause damage or even break the law. One such instance is the rise in using the technology for pornography. This is particularly dangerous for young women, who are the more likely victims, and can cause severe and long-lasting emotional distress and harm to the individuals depicted, as well as reinforce harmful stereotypes and the objectification of women.
Why we should focus on using AI as a problem-solving tool
Technological developments causing unforeseen negative consequences are nothing new. A lot of our job as educators is about helping young people navigate a changing world and preparing them for their futures, and education has an essential role in helping people understand AI technologies so that they can avoid the dangers.
Our approach at the Raspberry Pi Foundation is not to focus purely on the threats and dangers, but to teach young people to be critical users of technologies and not passive consumers. Having an understanding of how these technologies work goes a long way towards achieving sufficient AI literacy skills to make informed choices, and this is where our Experience AI program comes in.
Experience AI is a set of lessons developed in collaboration with Google DeepMind and, before we wrote any lessons, our team thought long and hard about what we believe are the important principles that should underpin teaching and learning about artificial intelligence. One such principle is taking a problem-first approach and emphasising that computers are tools that help us solve problems. In the Experience AI fundamentals unit, we teach students to think about the problem they want to solve before thinking about whether or not AI is the appropriate tool to use to solve it.
Taking a problem-first approach doesn’t by default avoid an AI system causing harm — there’s still the chance it will increase bias and societal inequities — but it does focus the development on the end user and the data needed to train the models. I worry that focusing on market share and opportunity rather than the problem to be solved is more likely to lead to harm.
Another set of principles that underpins our resources is teaching about fairness, accountability, transparency, privacy, and security (Fairness, Accountability, Transparency, and Ethics (FATE) in Artificial Intelligence (AI) and higher education, Understanding Artificial Intelligence Ethics and Safety) in relation to the development of AI systems. These principles are aimed at making sure that creators of AI models develop them ethically and responsibly. The principles also apply to consumers: we need to get to a place in society where we expect these principles to be adhered to, and where consumer power means that any models that don’t comply simply won’t succeed.
Furthermore, once students have created their models in the Experience AI fundamentals unit, we teach them about model cards, an approach that promotes transparency about their models. Much like how nutritional information on food labels allows the consumer to make an informed choice about whether or not to buy the food, model cards give information about an AI model such as the purpose of the model, its accuracy, and known limitations such as what bias might be in the data. Students write their own model cards based on the AI solutions they have created.
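To make the analogy concrete, here is a minimal sketch of what a student’s model card might record, written as a small Python dictionary. The field names and values are illustrative assumptions, not a template taken from the Experience AI materials.

```python
# A student's model card captured as a simple Python dictionary (illustrative only).
model_card = {
    "model_name": "Playground sound classifier",
    "purpose": "Distinguish bird song from traffic noise in park recordings",
    "training_data": "120 clips recorded by the class; more weekday than weekend samples",
    "accuracy": "87% on a held-out test set of 30 clips",
    "known_limitations": [
        "Struggles in windy conditions",
        "Data collected at one park only, so results may not transfer elsewhere",
    ],
}

# Print the card in a readable, label-style format.
for field, value in model_card.items():
    print(f"{field}: {value}")
```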
What else can we do?
At the Raspberry Pi Foundation, we have set up an AI literacy team with the aim of embedding principles around AI safety, security, and responsibility into our resources and aligning them with the Foundation’s mission to help young people to:
Be critical consumers of AI technology
Understand the limitations of AI
Expect fairness, accountability, transparency, privacy, and security, and work toward reducing inequities caused by technology
See AI as a problem-solving tool that can augment human capabilities, but not replace or narrow their futures
Our call to action to educators, carers, and parents is to have conversations with your young people about GenAI. Get to know their opinions on GenAI and how they view its role in their lives, and help them to become critical thinkers when interacting with technology.
How good are your AI-sleuthing abilities? Do you think you can tell an AI-generated image from a real one? If so, the president of Microsoft has put together a little test to see if you can spot the difference. And while some images are dead giveaways, you may be surprised at how realistic an AI-generated image can get, or how weird a real-life image can be.
OpenAI built a text watermarking tool to detect whether a piece of content was written by ChatGPT. However, internal debates rage over whether it should be released.
Artificial neural networks—algorithms inspired by biological brains—are at the center of modern artificial intelligence, behind both chatbots and image generators. But with their many neurons, they can be black boxes, their inner workings uninterpretable to users.
Researchers have now created a fundamentally new way to make neural networks that in some ways surpasses traditional systems. These new networks are more interpretable and also more accurate, proponents say, even when they’re smaller. Their developers say the way they learn to represent physics data concisely could help scientists uncover new laws of nature.
“It’s great to see that there is a new architecture on the table.” —Brice Ménard, Johns Hopkins University
For the past decade or more, engineers have mostly tweaked neural-network designs through trial and error, says Brice Ménard, a physicist at Johns Hopkins University who studies how neural networks operate but was not involved in the new work, which was posted on arXiv in April. “It’s great to see that there is a new architecture on the table,” he says, especially one designed from first principles.
One way to think of neural networks is by analogy with neurons, or nodes, and synapses, or connections between those nodes. In traditional neural networks, called multi-layer perceptrons (MLPs), each synapse learns a weight—a number that determines how strong the connection is between those two neurons. The neurons are arranged in layers, such that a neuron from one layer takes input signals from the neurons in the previous layer, weighted by the strength of their synaptic connection. Each neuron then applies a simple function to the sum total of its inputs, called an activation function.
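To make the contrast concrete, here is a minimal sketch of an MLP forward pass in NumPy (our illustration, not code from the paper): every synapse contributes a single learned weight, and every neuron applies one fixed activation function to the weighted sum of its inputs.

```python
import numpy as np

def mlp_layer(x, W, b, activation=np.tanh):
    """x: (n_in,) inputs; W: (n_out, n_in) synaptic weights; b: (n_out,) biases."""
    return activation(W @ x + b)   # weighted sum per neuron, then a fixed activation

rng = np.random.default_rng(0)
x = rng.normal(size=3)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # 3 inputs -> 4 hidden neurons
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # 4 hidden -> 1 output neuron
hidden = mlp_layer(x, W1, b1)
y = mlp_layer(hidden, W2, b2, activation=lambda z: z)  # linear output layer
```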
In traditional neural networks, sometimes called multi-layer perceptrons [left], each synapse learns a number called a weight, and each neuron applies a simple function to the sum of its inputs. In the new Kolmogorov-Arnold architecture [right], each synapse learns a function, and the neurons sum the outputs of those functions. Credit: The NSF Institute for Artificial Intelligence and Fundamental Interactions
In the new architecture, the synapses play a more complex role. Instead of simply learning how strong the connection between two neurons is, they learn the full nature of that connection—the function that maps input to output. Unlike the activation function used by neurons in the traditional architecture, this function could be more complex—in fact a “spline” or combination of several functions—and is different in each instance. Neurons, on the other hand, become simpler—they just sum the outputs of all their preceding synapses. The new networks are called Kolmogorov-Arnold Networks (KANs), after two mathematicians who studied how functions could be combined. The idea is that KANs would provide greater flexibility when learning to represent data, while using fewer learned parameters.
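And here is a toy sketch of the Kolmogorov-Arnold idea for comparison, again our own illustration: the actual KANs use learnable B-splines on each edge, which we stand in for here with simple Gaussian bumps. Every edge carries its own learned one-dimensional function, and each neuron simply sums the outputs of its incoming edges.

```python
import numpy as np

def edge_fn(x, coeffs, centers, width=1.0):
    """A learned 1-D function on one synapse: a weighted sum of fixed Gaussian bumps."""
    return float(np.sum(coeffs * np.exp(-((x - centers) / width) ** 2)))

def kan_layer(x, coeffs, centers):
    """x: (n_in,); coeffs: (n_out, n_in, n_basis) learned per-edge coefficients."""
    n_out, n_in, _ = coeffs.shape
    out = np.zeros(n_out)
    for j in range(n_out):
        # Each neuron just sums the outputs of its incoming edge functions.
        out[j] = sum(edge_fn(x[i], coeffs[j, i], centers) for i in range(n_in))
    return out

rng = np.random.default_rng(0)
centers = np.linspace(-2.0, 2.0, 5)       # shared grid of basis-function centers
coeffs = rng.normal(size=(4, 3, 5))       # 3 inputs -> 4 outputs, 5 bumps per edge
y = kan_layer(rng.normal(size=3), coeffs, centers)
```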
“It’s like an alien life that looks at things from a different perspective but is also kind of understandable to humans.” —Ziming Liu, Massachusetts Institute of Technology
The researchers tested their KANs on relatively simple scientific tasks. In some experiments, they took simple physical laws, such as the velocity with which two relativistic-speed objects pass each other. They used these equations to generate input-output data points, then, for each physics function, trained a network on some of the data and tested it on the rest. They found that increasing the size of KANs improves their performance at a faster rate than increasing the size of MLPs did. When solving partial differential equations, a KAN was 100 times as accurate as an MLP that had 100 times as many parameters.
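For reference, the pairwise relative velocity alluded to here is presumably the special-relativity velocity-addition formula (our reading; the article does not write the equation out):

$$v_{\mathrm{rel}} = \frac{v_1 + v_2}{1 + v_1 v_2 / c^2}$$

In units where $c = 1$ this is exactly the kind of compact closed form that, in these experiments, a trained KAN could in principle be simplified back into.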
In another experiment, they trained networks to predict one attribute of topological knots, called their signature, based on other attributes of the knots. An MLP achieved 78 percent test accuracy using about 300,000 parameters, while a KAN achieved 81.6 percent test accuracy using only about 200 parameters.
What’s more, the researchers could visually map out the KANs and look at the shapes of the activation functions, as well as the importance of each connection. Either manually or automatically they could prune weak connections and replace some activation functions with simpler ones, like sine or exponential functions. Then they could summarize the entire KAN in an intuitive one-line function (including all the component activation functions), in some cases perfectly reconstructing the physics function that created the dataset.
“In the future, we hope that it can be a useful tool for everyday scientific research,” says Ziming Liu, a computer scientist at the Massachusetts Institute of Technology and the paper’s first author. “Given a dataset we don’t know how to interpret, we just throw it to a KAN, and it can generate some hypothesis for you. You just stare at the brain [the KAN diagram] and you can even perform surgery on that if you want.” You might get a tidy function. “It’s like an alien life that looks at things from a different perspective but is also kind of understandable to humans.”
Dozens of papers have already cited the KAN preprint. “It seemed very exciting the moment that I saw it,” says Alexander Bodner, an undergraduate student of computer science at the University of San Andrés, in Argentina. Within a week, he and three classmates had combined KANs with convolutional neural networks, or CNNs, a popular architecture for processing images. They tested their Convolutional KANs on their ability to categorize handwritten digits or pieces of clothing. The best one approximately matched the performance of a traditional CNN (99 percent accuracy for both networks on digits, 90 percent for both on clothing) but using about 60 percent fewer parameters. The datasets were simple, but Bodner says other teams with more computing power have begun scaling up the networks. Other people are combining KANs with transformers, an architecture popular in large language models.
One downside of KANs is that they take longer per parameter to train—in part because they can’t take advantage of GPUs. But they need fewer parameters. Liu notes that even if KANs don’t replace giant CNNs and transformers for processing images and language, training time won’t be an issue at the smaller scale of many physics problems. He’s looking at ways for experts to insert their prior knowledge into KANs—by manually choosing activation functions, say—and to easily extract knowledge from them using a simple interface. Someday, he says, KANs could help physicists discover high-temperature superconductors or ways to control nuclear fusion.
After withdrawing his lawsuit in June for unknown reasons, Elon Musk has revived a complaint accusing OpenAI and its CEO Sam Altman of fraudulently inducing Musk to contribute $44 million in seed funding by promising that OpenAI would always open-source its technology and prioritize serving the public good over profits as a permanent nonprofit.
Instead, Musk alleged that Altman and his co-conspirators—"preying on Musk’s humanitarian concern about the existential dangers posed by artificial intelligence"—always intended to "betray" these promises in pursuit of personal gains.
As OpenAI's technology advanced toward artificial general intelligence (AGI) and strove to surpass human capabilities, "Altman set the bait and hooked Musk with sham altruism then flipped the script as the non-profit’s technology approached AGI and profits neared, mobilizing Defendants to turn OpenAI, Inc. into their personal piggy bank and OpenAI into a moneymaking bonanza, worth billions," Musk's complaint said.
From cutting-edge robotics, design, and bioengineering to sustainable energy solutions, ocean engineering, nanotechnology, and innovative materials science, MechE students and their advisors are doing incredibly innovative work. The graduate students highlighted here represent a snapshot of the great work in progress this spring across the Department of Mechanical Engineering, and demonstrate the ways the future of this field is as limitless as the imaginations of its practitioners.
Democratizing design through AI
Lyle Regenwetter
Hometown: Champaign, Illinois
Advisor: Assistant Professor Faez Ahmed
Interests: Food, climbing, skiing, soccer, tennis, cooking
Lyle Regenwetter finds excitement in the prospect of generative AI to "democratize" design and enable inexperienced designers to tackle complex design problems. His research explores new training methods through which generative AI models can be taught to implicitly obey design constraints and synthesize higher-performing designs. Knowing that prospective designers often have an intimate knowledge of the needs of users, but may otherwise lack the technical training to create solutions, Regenwetter also develops human-AI collaborative tools that allow AI models to interact and support designers in popular CAD software and real design problems.
Solving a whale of a problem
Loïcka Baille
Hometown: L’Escale, France
Advisor: Daniel Zitterbart
Interests: Being outdoors — scuba diving, spelunking, or climbing. Sailing on the Charles River, martial arts classes, and playing volleyball
Loïcka Baille’s research focuses on developing remote sensing technologies to study and protect marine life. Her main project revolves around improving onboard whale detection technology to prevent vessel strikes, with a special focus on protecting North Atlantic right whales. Baille is also involved in an ongoing study of Emperor penguins. Her team visits Antarctica annually to tag penguins and gather data to enhance their understanding of penguin population dynamics and draw conclusions regarding the overall health of the ecosystem.
Water, water anywhere
Carlos Díaz-Marín
Hometown: San José, Costa Rica
Advisor: Professor Gang Chen | Former Advisor: Professor Evelyn Wang
Interests: New England hiking, biking, and dancing
Carlos Díaz-Marín designs and synthesizes inexpensive salt-polymer materials that can capture large amounts of humidity from the air. He aims to change the way we generate potable water from the air, even in arid conditions. In addition to water generation, these salt-polymer materials can also be used as thermal batteries, capable of storing and reusing heat. Beyond the scientific applications, Díaz-Marín is excited to continue doing research that can have big social impacts, and that finds and explains new physical phenomena. As a LatinX person, Díaz-Marín is also driven to help increase diversity in STEM.
Scalable fabrication of nano-architected materials
Somayajulu Dhulipala
Hometown: Hyderabad, India
Advisor: Assistant Professor Carlos Portela
Interests: Space exploration, taekwondo, meditation
Somayajulu Dhulipala works on developing lightweight materials with tunable mechanical properties. He is currently working on methods for the scalable fabrication of nano-architected materials and predicting their mechanical properties. The ability to fine-tune the mechanical properties of specific materials brings versatility and adaptability, making these materials suitable for a wide range of applications across multiple industries. While the research applications are quite diverse, Dhulipala is passionate about making space habitable for humanity, a crucial step toward becoming a spacefaring civilization.
Ingestible health-care devices
Jimmy McRae
Hometown: Woburn, Massachusetts
Advisor: Associate Professor Giovani Traverso
Interests: Anything basketball-related: playing, watching, going to games, organizing hometown tournaments
Jimmy McRae aims to drastically improve diagnostic and therapeutic capabilities through noninvasive health-care technologies. His research focuses on leveraging materials, mechanics, embedded systems, and microfabrication to develop novel ingestible electronic and mechatronic devices. This ranges from ingestible electroceutical capsules that modulate hunger-regulating hormones to devices capable of continuous ultralong monitoring and remotely triggerable actuations from within the stomach. The principles that guide McRae’s work to develop devices that function in extreme environments can be applied far beyond the gastrointestinal tract, with applications for outer space, the ocean, and more.
Freestyle BMX meets machine learning
Eva Nates
Hometown: Narberth, Pennsylvania
Advisor: Professor Peko Hosoi
Interests: Rowing, running, biking, hiking, baking
Eva Nates is working with the Australian Cycling Team to create a tool to classify Bicycle Motocross Freestyle (BMX FS) tricks. She uses a singular value decomposition method to conduct a principal component analysis of the time-dependent point-tracking data of an athlete and their bike during a run to classify each trick. The 2024 Olympic team hopes to incorporate this tool in their training workflow, and Nates worked alongside the team at their facilities on the Gold Coast of Australia during MIT’s Independent Activities Period in January.
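A minimal sketch of that kind of analysis, using NumPy and a toy stand-in for the point-tracking data (the team’s actual pipeline, feature choices, and classifier are not described here and are assumed):

import numpy as np

# Toy stand-in: 300 frames of flattened x/y coordinates for 20 tracked points in one run.
rng = np.random.default_rng(1)
run = rng.normal(size=(300, 40))

# Principal component analysis via singular value decomposition.
centered = run - run.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
explained = S**2 / np.sum(S**2)           # fraction of variance per component
features = centered @ Vt[:3].T            # project the run onto the top 3 components

# These low-dimensional features can then be fed to a simple classifier
# that assigns the run to a trick category.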
Augmenting astronauts with wearable limbs
Erik Ballesteros
Hometown: Spring, Texas
Advisor: Professor Harry Asada
Interests: Cosplay, Star Wars, Lego bricks
Erik Ballesteros’s research seeks to support astronauts who are conducting planetary extravehicular activities through the use of supernumerary robotic limbs (SuperLimbs). His work centers on the design and control of SuperLimbs to assist astronauts with post-fall recovery, human-leader/robot-follower quadruped locomotion, and coordinated manipulation between the SuperLimbs and the astronaut to perform tasks like excavation and sample handling.
This article appeared in the Spring 2024 edition of the Department of Mechanical Engineering's magazine, MechE Connects.
Benjamin Warf, a renowned neurosurgeon at Boston Children’s Hospital, stands in the MIT.nano Immersion Lab. More than 3,000 miles away, his virtual avatar stands next to Matheus Vasconcelos in Brazil as the resident practices delicate surgery on a doll-like model of a baby’s brain.
With a pair of virtual-reality goggles, Vasconcelos is able to watch Warf’s avatar demonstrate a brain surgery procedure before replicating the technique himself, asking questions of Warf’s digital twin along the way.
“It’s an almost out-of-body experience,” Warf says of watching his avatar interact with the residents. “Maybe it’s how it feels to have an identical twin?”
And that’s the goal: Warf’s digital twin bridged the distance, allowing him to be functionally in two places at once. “It was my first training using this model, and it had excellent performance,” says Vasconcelos, a neurosurgery resident at Santa Casa de São Paulo School of Medical Sciences in São Paulo, Brazil. “As a resident, I now feel more confident and comfortable applying the technique in a real patient under the guidance of a professor.”
Warf’s avatar arrived via a new project launched by medical simulator and augmented reality (AR) company EDUCSIM. The company is part of the 2023 cohort of START.nano, MIT.nano’s deep-tech accelerator that offers early-stage startups discounted access to MIT.nano’s laboratories.
In March 2023, Giselle Coelho, EDUCSIM’s scientific director and a pediatric neurosurgeon at Santa Casa de São Paulo and Sabará Children’s Hospital, began working with technical staff in the MIT.nano Immersion Lab to create Warf’s avatar. By November, the avatar was training future surgeons like Vasconcelos.
“I had this idea to create the avatar of Dr. Warf as a proof of concept, and asked, ‘What would be the place in the world where they are working on technologies like that?’” Coelho says. “Then I found MIT.nano.”
Capturing a Surgeon
As a neurosurgery resident, Coelho was so frustrated by the lack of practical training options for complex surgeries that she built her own model of a baby brain. The physical model contains all the structures of the brain and can even bleed, “simulating all the steps of a surgery, from incision to skin closure,” she says.
She soon found that simulators and virtual reality (VR) demonstrations reduced the learning curve for her own residents. Coelho launched EDUCSIM in 2017 to expand the variety and reach of the training for residents and experts looking to learn new techniques.
Those techniques include a procedure to treat infant hydrocephalus that was pioneered by Warf, the director of neonatal and congenital neurosurgery at Boston Children’s Hospital. Coelho had learned the technique directly from Warf and thought his avatar might be the way for surgeons who couldn’t travel to Boston to benefit from his expertise.
To create the avatar, Coelho worked with Talis Reks, the AR/VR/gaming/big data IT technologist in the Immersion Lab.
“A lot of technology and hardware can be very expensive for startups to access as they start their company journey,” Reks explains. “START.nano is one way of enabling them to utilize and afford the tools and technologies we have at MIT.nano’s Immersion Lab.”
Coelho and her colleagues needed high-fidelity and high-resolution motion-capture technology, volumetric video capture, and a range of other VR/AR technologies to capture Warf’s dexterous finger motions and facial expressions. Warf visited MIT.nano on several occasions to be digitally “captured,” including performing an operation on the physical baby model while wearing special gloves and clothing embedded with sensors.
“These technologies have mostly been used for entertainment or VFX [visual effects] or CGI [computer-generated imagery],” says Reks, “But this is a unique project, because we’re applying it now for real medical practice and real learning.”
One of the biggest challenges, Reks says, was helping to develop what Coelho calls “holoportation”— transmitting the 3D, volumetric video capture of Warf in real-time over the internet so that his avatar can appear in transcontinental medical training.
The Warf avatar has synchronous and asynchronous modes. The training that Vasconcelos received was in the asynchronous mode, where residents can observe the avatar’s demonstrations and ask it questions. The answers, delivered in a variety of languages, come from AI algorithms that draw from previous research and an extensive bank of questions and answers provided by Warf.
In the synchronous mode, Warf operates his avatar from a distance in real time, Coelho says. “He could walk around the room, he could talk to me, he could orient me. It’s amazing.”
Coelho, Warf, Reks, and other team members demonstrated a combination of the modes in a second session in late December. This demo consisted of volumetric live video capture between the Immersion Lab and Brazil, spatialized and visible in real-time through AR headsets. It significantly expanded upon the previous demo, which had only streamed volumetric data in one direction through a two-dimensional display.
Powerful impacts
Warf has a long history of training desperately needed pediatric neurosurgeons around the world, most recently through his nonprofit Neurokids. Remote and simulated training has been an increasingly large part of training since the pandemic, he says, although he doesn’t feel it will ever completely replace personal hands-on instruction and collaboration.
“But if in fact one day we could have avatars, like this one from Giselle, in remote places showing people how to do things and answering questions for them, without the cost of travel, without the time cost and so forth, I think it could be really powerful,” Warf says.
The avatar project is especially important for surgeons serving remote and underserved areas like the Amazon region of Brazil, Coelho says. “This is a way to give them the same level of education that they would get in other places, and the same opportunity to be in touch with Dr. Warf.”
One baby treated for hydrocephalus at a recent Amazon clinic had traveled by boat 30 hours for the surgery, according to Coelho.
Training surgeons with the avatar, she says, “can change reality for this baby and can change the future.”
AI’s impact on the gaming industry is undeniable and it’s starting to show its teeth. According to Wired, AI is not just helping developers, but it’s also taking over jobs. From character animation to QA testing, tasks that once required a human touch are now being handled by machines. This shift is causing a lot of buzz, and not all of it is positive. Some in the industry see it as a threat to job security, with AI systems performing tasks faster and often with greater precision than their human counterparts.
The Bright Side: Deeply Enhanced Gameplay
But let’s not get too bleak here. AI isn’t just about stealing jobs; it’s also about making our games more immersive and interactive. The Appinventiv blog highlights how AI is enhancing gameplay experiences. Think about the NPCs that react more realistically, adapting to your actions and decisions in real-time. AI-driven game design allows for more dynamic storylines, creating a more personalized gaming experience. It’s like having a game that learns and evolves with you, making each playthrough unique.
Crafting the Future of AI
Despite the controversy, there’s no denying that AI is a powerful tool for creativity. Developers can use AI to craft more intricate and engaging worlds. It’s not just about efficiency; it’s about pushing the boundaries of what’s possible in gaming. AI can analyze massive amounts of data to predict player preferences, helping developers create content that resonates more deeply with their audience. It’s about creating games that feel more alive, more responsive, and ultimately, more fun.
AI in gaming is a double-edged sword. It’s reshaping the industry, bringing both challenges and opportunities. As a gamer, I’m excited to see where this technological evolution takes us. But it’s crucial that we find a balance, ensuring that the integration of AI enhances our gaming experiences without compromising the human element that makes game development so unique.
Researchers have developed a way to tamperproof open source large language models to prevent them from being coaxed into, say, explaining how to make a bomb.
Thousands of video game actors went on strike on July 26 for the first time since 2017. The fight is over AI protections and other issues in contract negotiations with some of the biggest studios and publishers, and will halt work from SAG-AFTRA members on future projects, as well as possibly keep them from promotion…
Nvidia started as a humble graphics card maker. Now it’s riding the tech industry’s AI obsession to absurd new heights. The company added $329 billion to its market cap on Wall Street today after a record-breaking day of stock trading, Bloomberg reports.
This is a guest post. The views expressed here are solely those of the authors and do not represent positions of IEEE Spectrum, The Institute, or IEEE.
Many in the civilian artificial intelligence community don’t seem to realize that today’s AI innovations could have serious consequences for international peace and security. Yet AI practitioners—whether researchers, engineers, product developers, or industry managers—can play critical roles in mitigating risks through the decisions they make throughout the life cycle of AI technologies.
Other ways are more indirect. AI companies’ decisions about whether to make their software open source, and under which conditions, for example, have geopolitical implications. Such decisions determine how states or nonstate actors access critical technology, which they might use to develop military AI applications, potentially including autonomous weapons systems.
AI companies and researchers must become more aware of the challenges, and of their capacity to do something about them.
What Needs to Change in AI Education
Responsible AI requires a spectrum of capabilities that are typically not covered in AI education. AI should no longer be treated as a pure STEM discipline but rather a transdisciplinary one that requires technical knowledge, yes, but also insights from the social sciences and humanities. There should be mandatory courses on the societal impact of technology and responsible innovation, as well as specific training on AI ethics and governance.
Those subjects should be part of the core curriculum at both the undergraduate and graduate levels at all universities that offer AI degrees.
If education programs provide foundational knowledge about the societal impact of technology and the way technology governance works, AI practitioners will be empowered to innovate responsibly and be meaningful designers and implementers of AI regulations.
Changing the AI education curriculum is no small task. In some countries, modifications to university curricula require approval at the ministry level. Proposed changes can be met with internal resistance due to cultural, bureaucratic, or financial reasons. Meanwhile, the existing instructors’ expertise in the new topics might be limited.
There’s no need for a one-size-fits-all teaching model, but there’s certainly a need for funding to hire dedicated staff members and train them.
Adding Responsible AI to Lifelong Learning
The AI community must develop continuing education courses on the societal impact of AI research so that practitioners can keep learning about such topics throughout their career.
AI is bound to evolve in unexpected ways. Identifying and mitigating its risks will require ongoing discussions involving not only researchers and developers but also people who might directly or indirectly be impacted by its use. A well-rounded continuing education program would draw insights from all stakeholders.
Some universities and private companies already have ethical review boards and policy teams that assess the impact of AI tools. Although the teams’ mandate usually does not include training, their duties could be expanded to make courses available to everyone within the organization. Training on responsible AI research shouldn’t be a matter of individual interest; it should be encouraged.
Organizations such as IEEE and the Association for Computing Machinery could play important roles in establishing continuing education courses because they’re well placed to pool information and facilitate dialogue, which could result in the establishment of ethical norms.
Engaging With the Wider World
We also need AI practitioners to share knowledge and ignite discussions about potential risks beyond the bounds of the AI research community.
The communities already engaged in such discussions, however, are currently too small and not sufficiently diverse, as their most prominent members typically share similar backgrounds. Their lack of diversity could lead the groups to ignore risks that affect underrepresented populations.
What’s more, AI practitioners might need help and tutelage in how to engage with people outside the AI research community—especially with policymakers. Articulating problems or recommendations in ways that nontechnical individuals can understand is a necessary skill.
We must find ways to grow the existing communities, make them more diverse and inclusive, and make them better at engaging with the rest of society. Large professional organizations such as IEEE and ACM could help, perhaps by creating dedicated working groups of experts or setting up tracks at AI conferences.
Universities and the private sector also can help by creating or expanding positions and departments focused on AI’s societal impact and AI governance. Umeå University recently created an AI Policy Lab to address the issues. Companies including Anthropic, Google, Meta, and OpenAI have established divisions or units dedicated to such topics.
The central question before regulators is whether AI researchers and companies can be trusted to develop the technology responsibly.
In our view, one of the most effective and sustainable ways to ensure that AI developers take responsibility for the risks is to invest in education. Practitioners of today and tomorrow must have the basic knowledge and means to address the risk stemming from their work if they are to be effective designers and implementers of future AI regulations.
For years, Nvidia has dominated many machine learning benchmarks, and now there are two more notches in its belt.
MLPerf, the AI benchmarking suite sometimes called “the Olympics of machine learning,” has released a new set of training tests to help make more and better apples-to-apples comparisons between competing computer systems. One of MLPerf’s new tests concerns fine-tuning of large language models, a process that takes an existing trained model and trains it a bit more with specialized knowledge to make it fit for a particular purpose. The other is for graph neural networks, a type of machine learning behind some literature databases, fraud detection in financial systems, and social networks.
Even with the additions and the participation of computers using Google’s and Intel’s AI accelerators, systems powered by Nvidia’s Hopper architecture dominated the results once again. One system that included 11,616 Nvidia H100 GPUs—the largest collection yet—topped each of the nine benchmarks, setting records in five of them (including the two new benchmarks).
The 11,616-H100 system is “the biggest we’ve ever done,” says Dave Salvator, director of accelerated computing products at Nvidia. It smashed through the GPT-3 training trial in less than 3.5 minutes. A 512-GPU system, for comparison, took about 51 minutes. (Note that the GPT-3 task is not a full training, which could take weeks and cost millions of dollars. Instead, the computers train on a representative portion of the data, at an agreed-upon point well before completion.)
Compared to Nvidia’s largest entrant on GPT-3 last year, a 3,584 H100 computer, the 3.5-minute result represents a 3.2-fold improvement. You might expect that just from the difference in the size of these systems, but in AI computing that isn’t always the case, explains Salvator. “If you just throw hardware at the problem, it’s not a given that you’re going to improve,” he says.
“We are getting essentially linear scaling,” says Salvator. By that he means that twice as many GPUs lead to a halved training time. “[That] represents a great achievement from our engineering teams,” he adds.
Competitors are also getting closer to linear scaling. This round, Intel deployed a system using 1,024 GPUs that performed the GPT-3 task in 67 minutes, versus a computer one-fourth the size that took 224 minutes six months ago. Google’s largest GPT-3 entry used 12 times as many TPU v5p accelerators as its smallest entry and performed its task nine times as fast.
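One way to quantify how close a result is to linear scaling, using the Intel figures quoted above and assuming the smaller system used 256 accelerators (one-fourth of 1,024):

def scaling_efficiency(n_small, t_small, n_large, t_large):
    # Perfect linear scaling would cut training time in proportion to accelerator count.
    speedup = t_small / t_large
    ideal = n_large / n_small
    return speedup / ideal

# 1,024 accelerators in 67 minutes versus one-fourth the size in 224 minutes.
print(scaling_efficiency(256, 224, 1024, 67))   # ~0.84, i.e. about 84 percent of linear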
Linear scaling is going to be particularly important for upcoming “AI factories” housing 100,000 GPUs or more, Salvator says. He says to expect one such data center to come online this year, and another, using Nvidia’s next architecture, Blackwell, to start up in 2025.
Nvidia’s streak continues
Nvidia continued to boost training times despite using the same architecture, Hopper, as it did in last year’s training results. That’s all down to software improvements, says Salvator. “Typically, we’ll get a 2-2.5x [boost] from software after a new architecture is released,” he says.
For GPT-3 training, Nvidia logged a 27 percent improvement from the June 2023 MLPerf benchmarks. Salvator says there were several software changes behind the boost. For example, Nvidia engineers tuned up Hopper’s use of less accurate, 8-bit floating point operations by trimming unnecessary conversions between 8-bit and 16-bit numbers and by better targeting which layers of a neural network could use the lower-precision number format. They also found a more intelligent way to adjust the power budget of each chip’s compute engines, and sped up communication among GPUs in a way that Salvator likened to “buttering your toast while it’s still in the toaster.”
Additionally, the company implemented a scheme called flash attention. Invented in the Stanford University laboratory of SambaNova founder Chris Ré, flash attention is an algorithm that speeds transformer networks by minimizing writes to memory. When it first showed up in MLPerf benchmarks, flash attention shaved as much as 10 percent from training times. (Intel, too, used a version of flash attention but not for GPT-3. It instead used the algorithm for one of the new benchmarks, fine-tuning.)
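The core idea is to avoid materializing the full sequence-by-sequence attention matrix in memory. The sketch below is only illustrative, not Nvidia’s MLPerf code: it contrasts naive attention with PyTorch’s fused scaled_dot_product_attention, which can dispatch to a FlashAttention-style kernel on supported GPUs.

import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 1024, 64)   # (batch, heads, sequence length, head dimension)
k = torch.randn(1, 8, 1024, 64)
v = torch.randn(1, 8, 1024, 64)

# Naive attention writes a full 1024 x 1024 score matrix to memory per head...
scores = q @ k.transpose(-2, -1) / 64**0.5
naive_out = torch.softmax(scores, dim=-1) @ v

# ...while the fused kernel computes the same result in blocks, minimizing memory traffic.
fused_out = F.scaled_dot_product_attention(q, k, v)
print(torch.allclose(naive_out, fused_out, atol=1e-5))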
Using other software and network tricks, Nvidia delivered an 80 percent speedup in the text-to-image test, Stable Diffusion, versus its submission in November 2023.
New benchmarks
MLPerf adds new benchmarks and upgrades old ones to stay relevant to what’s happening in the AI industry. This year saw the addition of fine-tuning and graph neural networks.
Fine-tuning takes an already trained LLM and specializes it for use in a particular field. Nvidia, for example, took a trained 43-billion-parameter model and trained it on the GPU-maker’s design files and documentation to create ChipNeMo, an AI intended to boost the productivity of its chip designers. At the time, the company’s chief technology officer Bill Dally said that training an LLM was like giving it a liberal arts education, and fine-tuning was like sending it to graduate school.
The MLPerf benchmark takes a pretrained Llama-2-70B model and asks the system to fine-tune it using a dataset of government documents, with the goal of generating more accurate document summaries.
There are several ways to do fine-tuning. MLPerf chose one called low-rank adaptation (LoRA). The method winds up training only a small portion of the LLM’s parameters, leading to a 3-fold lower burden on hardware and reduced use of memory and storage versus other methods, according to the organization.
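A minimal sketch of the idea behind LoRA, not the MLPerf reference implementation: freeze the pretrained weights and learn only a small low-rank correction on top of them (layer sizes and hyperparameters below are illustrative).

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Wraps a frozen pretrained linear layer with a trainable low-rank update:
    # y = W x + (alpha / r) * B A x, where only A and B are trained.
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # starts as a no-op
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(4096, 4096), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 65,536 trainable parameters vs. roughly 16.8 million in the frozen base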
The other new benchmark involved a graph neural network (GNN). These are for problems that can be represented by a very large set of interconnected nodes, such as a social network or a recommender system. Compared to other AI tasks, GNNs require a lot of communication between nodes in a computer.
The benchmark trained a GNN on a database that captures relationships among academic authors, papers, and institutes—a graph with 547 million nodes and 5.8 billion edges. The neural network was then trained to predict the right label for each node in the graph.
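The benchmark’s actual model and dataset are far larger, but the basic mechanic of a GNN layer can be sketched in a few lines: each node gathers its neighbors’ features over the graph’s edges before a shared transformation is applied. The toy graph and layer below are illustrative assumptions only.

import torch
import torch.nn as nn

class MeanAggregationLayer(nn.Module):
    # One round of message passing: each node averages its neighbors' features,
    # concatenates them with its own, and applies a shared linear map.
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(2 * in_dim, out_dim)

    def forward(self, x, edge_index):
        src, dst = edge_index                              # edges point from src to dst
        agg = torch.zeros_like(x)
        agg.index_add_(0, dst, x[src])                     # sum neighbor features per node
        deg = torch.bincount(dst, minlength=x.size(0)).clamp(min=1).unsqueeze(1)
        return torch.relu(self.linear(torch.cat([x, agg / deg], dim=1)))

# Toy graph: 4 nodes with 16-dimensional features and 3 possible node labels.
x = torch.randn(4, 16)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 0, 3, 2]])
logits = nn.Linear(32, 3)(MeanAggregationLayer(16, 32)(x, edge_index))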
Future fights
Training rounds in 2025 may see head-to-head contests comparing new accelerators from AMD, Intel, and Nvidia. AMD’s MI300 series was launched about six months ago, and a memory-boosted upgrade, the MI325X, is planned for the end of 2024, with the next-generation MI350 slated for 2025. Intel says its Gaudi 3, generally available to computer makers later this year, will appear in MLPerf’s upcoming inferencing benchmarks. Intel executives have said the new chip has the capacity to beat H100 at training LLMs. But the victory may be short-lived, as Nvidia has unveiled a new architecture, Blackwell, which is planned for late this year.
As large supercomputers keep getting larger, Sunnyvale, California-based Cerebras has been taking a different approach. Instead of connecting more and more GPUs together, the company has been squeezing as many processors as it can onto one giant wafer. The main advantage is in the interconnects—by wiring processors together on-chip, the wafer-scale chip bypasses many of the computational speed losses that come from many GPUs talking to each other, as well as losses from loading data to and from memory.
Now, Cerebras has flaunted the advantages of its wafer-scale chips in two separate but related results. First, the company demonstrated that its second-generation wafer-scale engine, WSE-2, was significantly faster than the world’s fastest supercomputer, Frontier, in molecular dynamics calculations—the field that underlies protein folding, modeling radiation damage in nuclear reactors, and other problems in materials science. Second, in collaboration with machine learning model optimization company Neural Magic, Cerebras demonstrated that a sparse large language model could perform inference at one-third of the energy cost of a full model without losing any accuracy. Although the results are in vastly different fields, they were both possible because of the interconnects and fast memory access enabled by Cerebras’ hardware.
Speeding Through the Molecular World
“Imagine there’s a tailor and he can make a suit in a week,” says Cerebras CEO and co-founder Andrew Feldman. “He buys the neighboring tailor, and she can also make a suit in a week, but they can’t work together. Now they can make two suits in a week. But what they can’t do is make a suit in three and a half days.”
According to Feldman, GPUs are like tailors that can’t work together, at least when it comes to some problems in molecular dynamics. As you connect more and more GPUs, they can simulate more atoms at the same time, but they can’t simulate the same number of atoms more quickly.
Cerebras’ wafer-scale engine, however, scales in a fundamentally different way. Because the chips are not limited by interconnect bandwidth, they can communicate quickly, like two tailors collaborating perfectly to make a suit in three and a half days.
To demonstrate this advantage, the team simulated 800,000 atoms interacting with each other, calculating the interactions in increments of one femtosecond at a time. Each step took just microseconds to compute on their hardware. Although that’s still 9 orders of magnitude slower than the actual interactions, it was also 179 times as fast as the Frontier supercomputer. The achievement effectively reduced a year’s worth of computation to just two days.
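The rough arithmetic behind those figures, using the approximate values given above:

compute_per_step = 1e-6      # roughly a microsecond of wall-clock time per step
simulated_per_step = 1e-15   # one femtosecond of simulated physics per step
print(compute_per_step / simulated_per_step)  # 1e9, i.e. 9 orders of magnitude slower

speedup_vs_frontier = 179
print(365 / speedup_vs_frontier)              # ~2 days to cover a year's worth of stepping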
This work was done in collaboration with Sandia, Lawrence Livermore, and Los Alamos National Laboratories. Tomas Oppelstrup, staff scientist at Lawrence Livermore National Laboratory, says this advance makes it feasible to simulate molecular interactions that were previously inaccessible.
Oppelstrup says this will be particularly useful for understanding the longer-term stability of materials in extreme conditions. “When you build advanced machines that operate at high temperatures, like jet engines, nuclear reactors, or fusion reactors for energy production,” he says, “you need materials that can withstand these high temperatures and very harsh environments. It’s difficult to create materials that have the right properties, that have a long lifetime and sufficient strength and don’t break.” Being able to simulate the behavior of candidate materials for longer, Oppelstrup says, will be crucial to the material design and development process.
Ilya Sharapov, principal engineer at Cerebras, says the company is looking forward to extending applications of its wafer-scale engine to a larger class of problems, including molecular dynamics simulations of biological processes and simulations of airflow around cars or aircraft.
Downsizing Large Language Models
As large language models (LLMs) are becoming more popular, the energy costs of using them are starting to overshadow the training costs—potentially by as much as a factor of ten in some estimates. “Inference is the primary workload of AI today because everyone is using ChatGPT,” says James Wang, director of product marketing at Cerebras, “and it’s very expensive to run, especially at scale.”
One way to reduce the energy cost (and increase the speed) of inference is through sparsity—essentially, harnessing the power of zeros. LLMs are made up of huge numbers of parameters. The open-source Llama model used by Cerebras, for example, has 7 billion parameters. During inference, each of those parameters is used to crunch through the input data and spit out the output. If, however, a significant fraction of those parameters are zeros, they can be skipped during the calculation, saving both time and energy.
The problem is that skipping specific parameters is difficult to do on a GPU. Reading from memory on a GPU is relatively slow, because GPUs are designed to read memory in chunks, taking in groups of parameters at a time. This doesn’t allow GPUs to skip zeros that are randomly interspersed in the parameter set. Cerebras CEO Feldman offered another analogy: “It’s equivalent to a shipper, only wanting to move stuff on pallets because they don’t want to examine each box. Memory bandwidth is the ability to examine each box to make sure it’s not empty. If it’s empty, set it aside and then not move it.”
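In terms of raw arithmetic, the saving is easy to see: if 70 percent of the weights are zero, only the remaining 30 percent of multiply-adds need to happen. The sketch below illustrates only that arithmetic, not the memory-bandwidth mechanics that make it practical on Cerebras’ hardware.

import numpy as np

rng = np.random.default_rng(2)
weights = rng.normal(size=1_000_000)
weights[rng.random(weights.size) < 0.7] = 0.0   # 70 percent unstructured sparsity
x = rng.normal(size=weights.size)

dense = np.dot(weights, x)                      # touches every parameter, zeros included

nz = np.flatnonzero(weights)                    # visit only the non-zero parameters
sparse = np.dot(weights[nz], x[nz])             # ~30 percent of the multiply-adds

print(np.isclose(dense, sparse))                # same answer, less arithmetic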
Some GPUs are equipped for a particular kind of sparsity, called 2:4, where exactly two out of every four consecutively stored parameters are zeros. State-of-the-art GPUs have terabytes per second of memory bandwidth. The memory bandwidth of Cerebras’ WSE-2 is more than one thousand times as high, at 20 petabytes per second. This allows for harnessing unstructured sparsity, meaning the researchers can zero out parameters as needed, wherever in the model they happen to be, and check each one on the fly during a computation. “Our hardware is built right from day one to support unstructured sparsity,” Wang says.
Even with the appropriate hardware, zeroing out many of the model’s parameters results in a worse model. But the joint team from Neural Magic and Cerebras figured out a way to recover the full accuracy of the original model. After slashing 70 percent of the parameters to zero, the team performed two further phases of training to give the non-zero parameters a chance to compensate for the new zeros.
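Before those recovery-training phases, the sparsification step itself can be as simple as magnitude pruning, zeroing the smallest weights first. The sketch below shows only that first step, with illustrative tensor shapes; the actual Cerebras and Neural Magic procedure may use a more sophisticated pruning method.

import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float = 0.7) -> torch.Tensor:
    # Zero out the smallest-magnitude entries until the requested fraction is zero.
    k = int(weight.numel() * sparsity)
    threshold = weight.abs().flatten().kthvalue(k).values
    return weight * (weight.abs() > threshold)

w = torch.randn(4096, 4096)
w_sparse = magnitude_prune(w, 0.7)
print((w_sparse == 0).float().mean())   # ~0.70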
Those two extra training phases use about 7 percent of the original training energy, and the companies found that they recover the model’s full accuracy. The smaller model takes one-third of the time and energy during inference as the original, full model. “What makes these novel applications possible in our hardware,” Sharapov says, “is that there’s a million cores in a very tight package, meaning that the cores have very low latency, high bandwidth interactions between them.”
Three major features in iOS 18 and macOS Sequoia will not be available to European users this fall, Apple says. They include iPhone screen mirroring on the Mac, SharePlay screen sharing, and the entire Apple Intelligence suite of generative AI features.
In a statement sent to the Financial Times, The Verge, and others, Apple says this decision is related to the European Union's Digital Markets Act (DMA). Here's the full statement, which was attributed to Apple spokesperson Fred Sainz:
Two weeks ago, Apple unveiled hundreds of new features that we are excited to bring to our users around the world. We are highly motivated to make these technologies accessible to all users. However, due to the regulatory uncertainties brought about by the Digital Markets Act (DMA), we do not believe that we will be able to roll out three of these features — iPhone Mirroring, SharePlay Screen Sharing enhancements, and Apple Intelligence — to our EU users this year.
Specifically, we are concerned that the interoperability requirements of the DMA could force us to compromise the integrity of our products in ways that risk user privacy and data security. We are committed to collaborating with the European Commission in an attempt to find a solution that would enable us to deliver these features to our EU customers without compromising their safety.
It is unclear from Apple's statement precisely which aspects of the DMA may have led to this decision. It could be that Apple is concerned that it would be required to give competitors like Microsoft or Google access to user data collected for Apple Intelligence features and beyond, but we're not sure.
From cutting-edge robotics, design, and bioengineering to sustainable energy solutions, ocean engineering, nanotechnology, and innovative materials science, MechE students and their advisors are doing incredibly innovative work. The graduate students highlighted here represent a snapshot of the great work in progress this spring across the Department of Mechanical Engineering, and demonstrate the ways the future of this field is as limitless as the imaginations of its practitioners.
Democratizing design through AI
Lyle Regenwetter Hometown: Champaign, Illinois Advisor: Assistant Professor Faez Ahmed Interests: Food, climbing, skiing, soccer, tennis, cooking
Lyle Regenwetter finds excitement in the prospect of generative AI to "democratize" design and enable inexperienced designers to tackle complex design problems. His research explores new training methods through which generative AI models can be taught to implicitly obey design constraints and synthesize higher-performing designs. Knowing that prospective designers often have an intimate knowledge of the needs of users, but may otherwise lack the technical training to create solutions, Regenwetter also develops human-AI collaborative tools that allow AI models to interact and support designers in popular CAD software and real design problems.
Solving a whale of a problem
Loïcka Baille Hometown: L’Escale, France Advisor: Daniel Zitterbart Interests: Being outdoors — scuba diving, spelunking, or climbing. Sailing on the Charles River, martial arts classes, and playing volleyball
Loïcka Baille’s research focuses on developing remote sensing technologies to study and protect marine life. Her main project revolves around improving onboard whale detection technology to prevent vessel strikes, with a special focus on protecting North Atlantic right whales. Baille is also involved in an ongoing study of Emperor penguins. Her team visits Antarctica annually to tag penguins and gather data to enhance their understanding of penguin population dynamics and draw conclusions regarding the overall health of the ecosystem.
Water, water anywhere
Carlos Díaz-Marín Hometown: San José, Costa Rica Advisor: Professor Gang Chen | Former Advisor: Professor Evelyn Wang Interests: New England hiking, biking, and dancing
Carlos Díaz-Marín designs and synthesizes inexpensive salt-polymer materials that can capture large amounts of humidity from the air. He aims to change the way we generate potable water from the air, even in arid conditions. In addition to water generation, these salt-polymer materials can also be used as thermal batteries, capable of storing and reusing heat. Beyond the scientific applications, Díaz-Marín is excited to continue doing research that can have big social impacts, and that finds and explains new physical phenomena. As a LatinX person, Díaz-Marín is also driven to help increase diversity in STEM.
Scalable fabrication of nano-architected materials
Somayajulu Dhulipala Hometown: Hyderabad, India Advisor: Assistant Professor Carlos Portela Interests: Space exploration, taekwondo, meditation.
Somayajulu Dhulipala works on developing lightweight materials with tunable mechanical properties. He is currently working on methods for the scalable fabrication of nano-architected materials and predicting their mechanical properties. The ability to fine-tune the mechanical properties of specific materials brings versatility and adaptability, making these materials suitable for a wide range of applications across multiple industries. While the research applications are quite diverse, Dhulipala is passionate about making space habitable for humanity, a crucial step toward becoming a spacefaring civilization.
Ingestible health-care devices
Jimmy McRae Hometown: Woburn, Massachusetts Advisor: Associate Professor Giovani Traverso Interests: Anything basketball-related: playing, watching, going to games, organizing hometown tournaments
Jimmy McRae aims to drastically improve diagnostic and therapeutic capabilities through noninvasive health-care technologies. His research focuses on leveraging materials, mechanics, embedded systems, and microfabrication to develop novel ingestible electronic and mechatronic devices. This ranges from ingestible electroceutical capsules that modulate hunger-regulating hormones to devices capable of continuous ultralong monitoring and remotely triggerable actuations from within the stomach. The principles that guide McRae’s work to develop devices that function in extreme environments can be applied far beyond the gastrointestinal tract, with applications for outer space, the ocean, and more.
Freestyle BMX meets machine learning
Eva Nates Hometown: Narberth, Pennsylvania Advisor: Professor Peko Hosoi Interests: Rowing, running, biking, hiking, baking
Eva Nates is working with the Australian Cycling Team to create a tool to classify Bicycle Motocross Freestyle (BMX FS) tricks. She uses a singular value decomposition method to conduct a principal component analysis of the time-dependent point-tracking data of an athlete and their bike during a run to classify each trick. The 2024 Olympic team hopes to incorporate this tool in their training workflow, and Nates worked alongside the team at their facilities on the Gold Coast of Australia during MIT’s Independent Activities Period in January.
Augmenting Astronauts with Wearable Limbs
Erik Ballesteros Hometown: Spring, Texas Advisor: Professor Harry Asada Interests: Cosplay, Star Wars, Lego bricks
Erik Ballesteros’s research seeks to support astronauts who are conducting planetary extravehicular activities through the use of supernumerary robotic limbs (SuperLimbs). His work is tailored toward design and control manifestation to assist astronauts with post-fall recovery, human-leader/robot-follower quadruped locomotion, and coordinated manipulation between the SuperLimbs and the astronaut to perform tasks like excavation and sample handling.
This article appeared in the Spring 2024 edition of the Department of Mechanical Engineering's magazine, MechE Connects.
Benjamin Warf, a renowned neurosurgeon at Boston Children’s Hospital, stands in the MIT.nano Immersion Lab. More than 3,000 miles away, his virtual avatar stands next to Matheus Vasconcelos in Brazil as the resident practices delicate surgery on a doll-like model of a baby’s brain.
With a pair of virtual-reality goggles, Vasconcelos is able to watch Warf’s avatar demonstrate a brain surgery procedure before replicating the technique himself and while asking questions of Warf’s digital twin.
“It’s an almost out-of-body experience,” Warf says of watching his avatar interact with the residents. “Maybe it’s how it feels to have an identical twin?”
And that’s the goal: Warf’s digital twin bridged the distance, allowing him to be functionally in two places at once. “It was my first training using this model, and it had excellent performance,” says Vasconcelos, a neurosurgery resident at Santa Casa de São Paulo School of Medical Sciences in São Paulo, Brazil. “As a resident, I now feel more confident and comfortable applying the technique in a real patient under the guidance of a professor.”
Warf’s avatar arrived via a new project launched by medical simulator and augmented reality (AR) company EDUCSIM. The company is part of the 2023 cohort of START.nano, MIT.nano’s deep-tech accelerator that offers early-stage startups discounted access to MIT.nano’s laboratories.
In March 2023, Giselle Coelho, EDUCSIM’s scientific director and a pediatric neurosurgeon at Santa Casa de São Paulo and Sabará Children’s Hospital, began working with technical staff in the MIT.nano Immersion Lab to create Warf’s avatar. By November, the avatar was training future surgeons like Vasconcelos.
“I had this idea to create the avatar of Dr. Warf as a proof of concept, and asked, ‘What would be the place in the world where they are working on technologies like that?’” Coelho says. “Then I found MIT.nano.”
Capturing a surgeon
As a neurosurgery resident, Coelho was so frustrated by the lack of practical training options for complex surgeries that she built her own model of a baby brain. The physical model contains all the structures of the brain and can even bleed, “simulating all the steps of a surgery, from incision to skin closure,” she says.
She soon found that simulators and virtual reality (VR) demonstrations reduced the learning curve for her own residents. Coelho launched EDUCSIM in 2017 to expand the variety and reach of the training for residents and experts looking to learn new techniques.
Those techniques include a procedure to treat infant hydrocephalus that was pioneered by Warf, the director of neonatal and congenital neurosurgery at Boston Children’s Hospital. Coelho had learned the technique directly from Warf and thought his avatar might be the way for surgeons who couldn’t travel to Boston to benefit from his expertise.
To create the avatar, Coelho worked with Talis Reks, the AR/VR/gaming/big data IT technologist in the Immersion Lab.
“A lot of technology and hardware can be very expensive for startups to access as they start their company journey,” Reks explains. “START.nano is one way of enabling them to utilize and afford the tools and technologies we have at MIT.nano’s Immersion Lab.”
Coelho and her colleagues needed high-fidelity and high-resolution motion-capture technology, volumetric video capture, and a range of other VR/AR technologies to capture Warf’s dexterous finger motions and facial expressions. Warf visited MIT.nano on several occasions to be digitally “captured,” including performing an operation on the physical baby model while wearing special gloves and clothing embedded with sensors.
“These technologies have mostly been used for entertainment or VFX [visual effects] or CGI [computer-generated imagery],” says Reks, “But this is a unique project, because we’re applying it now for real medical practice and real learning.”
One of the biggest challenges, Reks says, was helping to develop what Coelho calls “holoportation”: transmitting the 3D, volumetric video capture of Warf in real time over the internet so that his avatar can appear in transcontinental medical training.
The Warf avatar has synchronous and asynchronous modes. The training that Vasconcelos received was in the asynchronous mode, where residents can observe the avatar’s demonstrations and ask it questions. The answers, delivered in a variety of languages, come from AI algorithms that draw from previous research and an extensive bank of questions and answers provided by Warf.
In the synchronous mode, Warf operates his avatar from a distance in real time, Coelho says. “He could walk around the room, he could talk to me, he could orient me. It’s amazing.”
Coelho, Warf, Reks, and other team members demonstrated a combination of the modes in a second session in late December. This demo consisted of volumetric live video capture between the Immersion Lab and Brazil, spatialized and visible in real time through AR headsets. It significantly expanded upon the previous demo, which had only streamed volumetric data in one direction through a two-dimensional display.
Powerful impacts
Warf has a long history of training desperately needed pediatric neurosurgeons around the world, most recently through his nonprofit Neurokids. Remote and simulated training has been an increasingly large part of training since the pandemic, he says, although he doesn’t feel it will ever completely replace personal hands-on instruction and collaboration.
“But if in fact one day we could have avatars, like this one from Giselle, in remote places showing people how to do things and answering questions for them, without the cost of travel, without the time cost and so forth, I think it could be really powerful,” Warf says.
The avatar project is especially important for surgeons serving remote and underserved areas like the Amazon region of Brazil, Coelho says. “This is a way to give them the same level of education that they would get in other places, and the same opportunity to be in touch with Dr. Warf.”
One baby treated for hydrocephalus at a recent Amazon clinic had traveled by boat 30 hours for the surgery, according to Coelho.
Training surgeons with the avatar, she says, “can change reality for this baby and can change the future.”
Apple announced that Siri is getting an LLM brain transplant, ChatGPT integration, and Genmojis. So-called "Apple Intelligence" will also get a helping hand from OpenAI's GPT-4o.
Apple Intelligence will make apps and services smarter. But Apple’s most notable innovations focus on ensuring the technology doesn’t disappoint, annoy, or offend.
At its Worldwide Developers Conference, Apple introduced its first serious foray into generative AI, with a focus on app integrations and data privacy—and a ChatGPT integration.
A popular AI training dataset is “stealing and weaponizing” the faces of Brazilian children without their knowledge or consent, human rights activists claim.
Today's technology companies are increasingly sandwiched between the regulatory requirements of the European Union (E.U.) and those of California. While the U.S. federal government may adopt a light-touch, pro-innovation approach, California's state legislation can undermine this with a regulatory approach whose impacts reach far beyond its borders.
A new California bill imposes a rigorous regulatory regime on Artificial Intelligence (AI), making it the latest technology caught in this potentially innovation-stifling squeeze between Brussels and Sacramento. The term "Brussels Effect" often refers to the outsize influence of E.U. policy—particularly in technology—as a de facto global standard. But now, companies are also experiencing the "Sacramento Effect," where California's stringent regulations effectively set de facto federal policy for the rest of the country.
California is not the only state diving into significant tech policy legislation. Colorado recently enacted notable AI regulations, Montana attempted to ban TikTok, and many states are pursuing data privacy or youth online safety regulations.
For better or worse, states can move faster than Congress, acting as laboratories of democracy. However, this agility also risks creating a fragmented tech policy landscape, with one state's regulations imposing heavy burdens on the entire nation. This is particularly pronounced with California.
The impact is profound not just because many leading tech companies are based in California, but because of the nature of the technologies California seeks to regulate. In some cases, for example, the only feasible way to implement regulations is at a national level. In data privacy, the laws apply to California residents even when their actions do not occur within the state's borders, pushing companies toward broader compliance to avoid legal pitfalls.
While some of these laws could be challenged under the dormant commerce clause, without judicial intervention, they become de facto federal policy. Many companies find it easier to comply with California's stringent regulations rather than juggling different standards across states and risking non-compliance.
This dynamic was evident in 2018 when California enacted its regulatory approach to data privacy. Now, we could soon see California—either by regulation or legislation—disrupting the crucial AI innovations currently taking place. Unlike some technologies, such as autonomous vehicles, the development of large language models and other foundational AI models cannot, in most cases, simply be removed from a state due to regulations.
Perhaps the "best-case scenario" from the actions of states like California and Colorado might be a problematic patchwork of AI regulations, but more realistically, California's proposal (if it becomes law) would deter innovation by creating a costly compliance regime. This would limit AI development to only the largest companies capable of bearing these costs and would come at the expense of investments in product improvements.
Moreover, beneficial AI applications could be thwarted by other proposals California's legislature is currently considering. As R Street's Adam Thierer notes in an analysis of state laws surrounding the AI revolution, the California legislature has considered a variety of anti-AI bills that could "ban self-checkout at grocery and retail stores and ban the use of AI in call centers that provide government services, making things even less efficient."
It is not only legislation that could result in California derailing a pro-innovation approach to AI. The California Privacy Protection Agency (CPPA), established under California's data privacy laws, has proposed a regulatory framework for "automated decision-making." The E.U.'s General Data Protection Regulation shows how data privacy regulation can inadvertently stifle AI development by imposing compliance requirements designed for older technologies. Regulating "automated decision-making" could give the CPPA an unintended yet significant role in obstructing AI and other beneficial algorithmic uses.
America's tech innovators and entrepreneurs are already facing challenges from the E.U.'s heavy-handed AI regulations. In the absence of federal preemption or an alternative framework, they may also be hindered by the heavy hand of Sacramento. Such a sandwiching of significant regulation could harm not only the tech sector's economy but also all Americans who stand to benefit from AI advancements, as a single state or region's policy preferences dictate the national landscape.
Photos of Brazilian kids—sometimes spanning their entire childhood—have been used without their consent to power AI tools, including popular image generators like Stable Diffusion, Human Rights Watch (HRW) warned on Monday.
This practice poses urgent privacy risks to kids and appears to increase the risk of non-consensual AI-generated images bearing their likenesses, HRW's report said.
An HRW researcher, Hye Jung Han, helped expose the problem. She analyzed "less than 0.0001 percent" of LAION-5B, a dataset built from Common Crawl snapshots of the public web. The dataset does not contain the actual photos but includes image-text pairs derived from 5.85 billion images and captions posted online since 2008.
At the Worldwide Developers Conference (WWDC) 2024, Apple made a significant shift by announcing “Apple Intelligence,” a suite of AI features designed for iPhones, Macs, ...
Video Friday is your weekly selection of awesome robotics videos, collected by your friends at IEEE Spectrum robotics. We also post a weekly calendar of upcoming robotics events for the next few months. Please send us your events for inclusion.
RoboCup 2024: 17–22 July 2024, EINDHOVEN, NETHERLANDS
ICRA@40: 23–26 September 2024, ROTTERDAM, NETHERLANDS
IROS 2024: 14–18 October 2024, ABU DHABI, UNITED ARAB EMIRATES
In this video, you see the start of 1X’s development of an advanced AI system that chains simple tasks into complex actions using voice commands, allowing seamless multi-robot control and remote operation. By starting with single-task models, we ensure smooth transitions to more powerful unified models, ultimately aiming to automate high-level actions using AI.
This video does not contain teleoperation, computer graphics, cuts, video speedups, or scripted trajectory playback. It’s all controlled via neural networks.
As the old adage goes, one cannot claim to be a true man without a visit to the Great Wall of China. XBot-L, a full-sized humanoid robot developed by Robot Era, recently acquitted itself well in a walk along sections of the Great Wall.
The paper presents a novel rotary-wing platform that is capable of folding and expanding its wings during flight. Our source of inspiration came from birds’ ability to fold their wings to navigate through small spaces and dive. The design of the rotorcraft is based on the monocopter platform, which is inspired by the flight of samara seeds.
We present a variable stiffness robotic skin (VSRS), a concept that integrates stiffness-changing capabilities, sensing, and actuation into a single, thin modular robot design. Reconfiguring, reconnecting, and reshaping VSRSs allows them to achieve new functions both on and in the absence of a host body.
Heimdall is a new rover design for the 2024 University Rover Challenge (URC). This video shows highlights of Heimdall’s trip during the four missions at URC 2024.
Heimdall features a split body design with whegs (wheel legs), and a drill for sub-surface sample collection. It also has the ability to manipulate a variety of objects, collect surface samples, and perform onboard spectrometry and chemical tests.
The AI system identifies and separates red apples from green apples, after which a robotic arm picks up the identified red apples with a qb SoftHand Industry and gently places them in a basket.
My favorite part is the magnetic apple stem system.
DexNex (v0, June 2024) is an anthropomorphic teleoperation testbed for dexterous manipulation at the Center for Robotics and Biosystems at Northwestern University. DexNex recreates human upper-limb functionality through a near 1-to-1 mapping between Operator movements and Avatar actions.
Motion of the Operator’s arms, hands, fingers, and head are fed forward to the Avatar, while fingertip pressures, finger forces, and camera images are fed back to the Operator. DexNex aims to minimize the latency of each subsystem to provide a seamless, immersive, and responsive user experience. Future research includes gaining a better understanding of the criticality of haptic and vision feedback for different manipulation tasks; providing arm-level grounded force feedback; and using machine learning to transfer dexterous skills from the human to the robot.
Fulfilling a school requirement by working in a Romanian locomotive factory one week each month, Daniela Rus learned to operate “machines that help us make things.” Appreciation for the practical side of math and science stuck with Daniela, who is now Director of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL).
For AI to achieve its full potential, non-experts need to be let into the development process, says Rumman Chowdhury, CEO and cofounder of Humane Intelligence. She tells the story of farmers fighting for the right to repair their own AI-powered tractors (which some manufacturers actually made illegal), proposing everyone should have the ability to report issues, patch updates or even retrain AI technologies for their specific uses.
But such conveniences barely hint at the massive, sweeping changes to employment predicted by some analysts. And already, in ways large and small, striking and subtle, the tech world’s notables are grappling with changes, both real and envisioned, wrought by the onset of generative AI. To get a better idea of how some of them view the future of generative AI, IEEE Spectrum asked three luminaries—an academic leader, a regulator, and a semiconductor industry executive—about how generative AI has begun affecting their work. The three, Andrea Goldsmith, Juraj Čorba, and Samuel Naffziger, agreed to speak with Spectrum at the 2024 IEEE VIC Summit & Honors Ceremony Gala, held in May in Boston.
Juraj Čorba, senior expert on digital regulation and governance, Slovak Ministry of Investments, Regional Development, and Informatization
Samuel Naffziger, senior vice president and a corporate fellow at Advanced Micro Devices
Andrea Goldsmith
Andrea Goldsmith is dean of engineering at Princeton University.
There must be tremendous pressure now to throw a lot of resources into large language models. How do you deal with that pressure? How do you navigate this transition to this new phase of AI?
Andrea Goldsmith: Universities generally are going to be very challenged, especially universities that don’t have the resources of a place like Princeton or MIT or Stanford or the other Ivy League schools. In order to do research on large language models, you need brilliant people, which all universities have. But you also need compute power and you need data. And the compute power is expensive, and the data generally sits in these large companies, not within universities.
So I think universities need to be more creative. We at Princeton have invested a lot of money in the computational resources for our researchers to be able to do—well, not large language models, because you can’t afford it. To do a large language model… look at OpenAI or Google or Meta. They’re spending hundreds of millions of dollars on compute power, if not more. Universities can’t do that.
But we can be more nimble and creative. What can we do with language models, maybe not large language models but with smaller language models, to advance the state of the art in different domains? Maybe it’s vertical domains of using, for example, large language models for better prognosis of disease, or for prediction of cellular channel changes, or in materials science to decide what’s the best path to pursue a particular new material that you want to innovate on. So universities need to figure out how to take the resources that we have to innovate using AI technology.
We also need to think about new models. And the government can also play a role here. The [U.S.] government has this new initiative, NAIRR, or National Artificial Intelligence Research Resource, where they’re going to put up compute power and data and experts for educators to use—researchers and educators.
That could be a game-changer because it’s not just each university investing their own resources or faculty having to write grants, which are never going to pay for the compute power they need. It’s the government pulling together resources and making them available to academic researchers. So it’s an exciting time, where we need to think differently about research—meaning universities need to think differently. Companies need to think differently about how to bring in academic researchers, how to open up their compute resources and their data for us to innovate on.
As a dean, you are in a unique position to see which technical areas are really hot, attracting a lot of funding and attention. But how much ability do you have to steer a department and its researchers into specific areas? Of course, I’m thinking about large language models and generative AI. Is deciding on a new area of emphasis or a new initiative a collaborative process?
Goldsmith: Absolutely. I think any academic leader who thinks that their role is to steer their faculty in a particular direction does not have the right perspective on leadership. I describe academic leadership as really about the success of the faculty and students that you’re leading. And when I did my strategic planning for Princeton Engineering in the fall of 2020, everything was shut down. It was the middle of COVID, but I’m an optimist. So I said, “Okay, this isn’t how I expected to start as dean of engineering at Princeton.” But the opportunity to lead engineering in a great liberal arts university that has aspirations to increase the impact of engineering hasn’t changed. So I met with every single faculty member in the School of Engineering, all 150 of them, one-on-one over Zoom.
And the question I asked was, “What do you aspire to? What should we collectively aspire to?” And I took those 150 responses, and I asked all the leaders and the departments and the centers and the institutes, because there already were some initiatives in robotics and bioengineering and in smart cities. And I said, “I want all of you to come up with your own strategic plans. What do you aspire to in these areas? And then let’s get together and create a strategic plan for the School of Engineering.” So that’s what we did. And everything that we’ve accomplished in the last four years that I’ve been dean came out of those discussions, and what it was the faculty and the faculty leaders in the school aspired to.
So we launched a bioengineering institute last summer. We just launched Princeton Robotics. We’ve launched some things that weren’t in the strategic plan that bubbled up. We launched a center on blockchain technology and its societal implications. We have a quantum initiative. We have an AI initiative using this powerful tool of AI for engineering innovation, not just around large language models, but it’s a tool—how do we use it to advance innovation and engineering? All of these things came from the faculty because, to be a successful academic leader, you have to realize that everything comes from the faculty and the students. You have to harness their enthusiasm, their aspirations, their vision to create a collective vision.
What are the most important organizations and governing bodies when it comes to policy and governance on artificial intelligence in Europe?
Juraj Čorba: Well, there are many. And it also creates a bit of a confusion around the globe—who are the actors in Europe? So it’s always good to clarify. First of all we have the European Union, which is a supranational organization composed of many member states, including my own Slovakia. And it was the European Union that proposed adoption of a horizontal legislation for AI in 2021. It was the initiative of the European Commission, the E.U. institution, which has a legislative initiative in the E.U. And the E.U. AI Act is now finally being adopted. It was already adopted by the European Parliament.
So this started, you said 2021. That’s before ChatGPT and the whole large language model phenomenon really took hold.
Čorba: That was the case. Well, the expert community already knew that something was being cooked in the labs. But, yes, the whole agenda of large models, including large language models, came up only later on, after 2021. So the European Union tried to reflect that. Basically, the initial proposal to regulate AI was based on a blueprint of so-called product safety, which somehow presupposes a certain intended purpose. In other words, the checks and assessments of products are based more or less on the logic of the mass production of the 20th century, on an industrial scale, right? Like when you have products that you can somehow define easily and all of them have a clearly intended purpose. Whereas with these large models, a new paradigm was arguably opened, where they have a general purpose.
So the whole proposal was then rewritten in negotiations between the Council of Ministers, which is one of the legislative bodies, and the European Parliament. And so what we have today is a combination of this old product-safety approach and some novel aspects of regulation specifically designed for what we call general-purpose artificial intelligence systems or models. So that’s the E.U.
By product safety, you mean, if AI-based software is controlling a machine, you need to have physical safety.
Čorba: Exactly. That’s one of the aspects. So that touches upon the tangible products such as vehicles, toys, medical devices, robotic arms, et cetera. So yes. But from the very beginning, the proposal contained a regulation of what the European Commission called stand-alone systems—in other words, software systems that do not necessarily command physical objects. So it was already there from the very beginning, but all of it was based on the assumption that all software has its easily identifiable intended purpose—which is not the case for general-purpose AI.
Also, large language models and generative AI in general bring in this whole other dimension of propaganda, false information, deepfakes, and so on, which is different from traditional notions of safety in real-time software.
Čorba: Well, this is exactly the aspect that is handled by another European organization, different from the E.U., and that is the Council of Europe. It’s an international organization established after the Second World War for the protection of human rights, for protection of the rule of law, and protection of democracy. So that’s where the Europeans, but also many other states and countries, started to negotiate a first international treaty on AI. For example, the United States have participated in the negotiations, and also Canada, Japan, Australia, and many other countries. And then these particular aspects, which are related to the protection of integrity of elections, rule-of-law principles, protection of fundamental rights or human rights under international law—all these aspects have been dealt with in the context of these negotiations on the first international treaty, which is to be now adopted by the Committee of Ministers of the Council of Europe on the 16th and 17th of May. So, pretty soon. And then the first international treaty on AI will be submitted for ratifications.
So prompted largely by the activity in large language models, AI regulation and governance now is a hot topic in the United States, in Europe, and in Asia. But of the three regions, I get the sense that Europe is proceeding most aggressively on this topic of regulating and governing artificial intelligence. Do you agree that Europe is taking a more proactive stance in general than the United States and Asia?
Čorba: I’m not so sure. If you look at the Chinese approach and the way they regulate what we call generative AI, it would appear to me that they also take it very seriously. They take a different approach from the regulatory point of view. But it seems to me that, for instance, China is taking a very focused and careful approach. For the United States, I wouldn’t say that the United States is not taking a careful approach because last year you saw many of the executive orders, or even this year, some of the executive orders issued by President Biden. Of course, this was not a legislative measure, this was a presidential order. But it seems to me that the United States is also trying to address the issue very actively. The United States has also initiated the first resolution of the General Assembly at the U.N. on AI, which was passed just recently. So I wouldn’t say that the E.U. is more aggressive in comparison with Asia or North America, but maybe I would say that the E.U. is the most comprehensive. It looks horizontally across different agendas and it uses binding legislation as a tool, which is not always the case around the world. Many countries simply feel that it’s too early to legislate in a binding way, so they opt for soft measures or guidance, collaboration with private companies, et cetera. Those are the differences that I see.
Do you think you perceive a difference in focus among the three regions? Are there certain aspects that are being more aggressively pursued in the United States than in Europe or vice versa?
Čorba: Certainly the E.U. is very focused on the protection of human rights, the full catalog of human rights, but also, of course, on safety and human health. These are the core goals or values to be protected under the E.U. legislation. As for the United States and for China, I would say that the primary focus in those countries—but this is only my personal impression—is on national and economic security.
Samuel Naffziger
Samuel Naffziger is senior vice president and a corporate fellow at Advanced Micro Devices, where he is responsible for technology strategy and product architectures. Naffziger was instrumental in AMD’s embrace and development of chiplets, which are semiconductor dies that are packaged together into high-performance modules.
To what extent is large language model training starting to influence what you and your colleagues do at AMD?
Samuel Naffziger: Well, there are a couple levels of that. LLMs are impacting the way a lot of us live and work. And we certainly are deploying that very broadly internally for productivity enhancements, for using LLMs to provide starting points for code—simple verbal requests, such as “Give me a Python script to parse this dataset.” And you get a really nice starting point for that code. Saves a ton of time. Writing verification test benches, helping with the physical design layout optimizations. So there’s a lot of productivity aspects.
The other aspect to LLMs is, of course, we are actively involved in designing GPUs [graphics processing units] for LLM training and for LLM inference. And so that’s driving a tremendous amount of workload analysis on the requirements, hardware requirements, and hardware-software codesign, to explore.
So that brings us to your current flagship, the Instinct MI300X, which is actually billed as an AI accelerator. How did the particular demands influence that design? I don’t know when that design started, but the ChatGPT era started about two years ago or so. To what extent did you read the writing on the wall?
Naffziger: So we were just into the MI300—in 2019, we were starting the development. A long time ago. And at that time, our revenue stream from the Zen [an AMD architecture used in a family of processors] renaissance had really just started coming in. So the company was starting to get healthier, but we didn’t have a lot of extra revenue to spend on R&D at the time. So we had to be very prudent with our resources. And we had strategic engagements with the [U.S.] Department of Energy for supercomputer deployments. That was the genesis for our MI line—we were developing it for the supercomputing market. Now, there was a recognition that munching through FP64 COBOL code, or Fortran, isn’t the future, right? [laughs] This machine-learning [ML] thing is really getting some legs.
So we put some of the lower-precision math formats in, like Brain Floating Point 16 at the time, that were going to be important for inference. And the DOE knew that machine learning was going to be an important dimension of supercomputers, not just legacy code. So that’s the way, but we were focused on HPC [high-performance computing]. We had the foresight to understand that ML had real potential. Although certainly no one predicted, I think, the explosion we’ve seen today.
So that’s how it came about. And, just another piece of it: We leveraged our modular chiplet expertise to architect the 300 to support a number of variants from the same silicon components. So the variant targeted to the supercomputer market had CPUs integrated in as chiplets, directly on the silicon module. And then it had six of the GPU chiplets we call XCDs around them. So we had three CPU chiplets and six GPU chiplets. And that provided an amazingly efficient, highly integrated, CPU-plus-GPU design we call MI300A. It’s very compelling for the El Capitan supercomputer that’s being brought up as we speak.
But we also recognize that for the maximum computation for these AI workloads, the CPUs weren’t that beneficial. We wanted more GPUs. For these workloads, it’s all about the math and matrix multiplies. So we were able to just swap out those three CPU chiplets for a couple more XCD GPUs. And so we got eight XCDs in the module, and that’s what we call the MI300X. So we kind of got lucky having the right product at the right time, but there was also a lot of skill involved in that we saw the writing on the wall for where these workloads were going and we provisioned the design to support it.
Earlier you mentioned 3D chiplets. What do you feel is the next natural step in that evolution?
Naffziger: AI has created this bottomless thirst for more compute [power]. And so we are always going to be wanting to cram as many transistors as possible into a module. And the reason that’s beneficial is, these systems deliver AI performance at scale with thousands, tens of thousands, or more, compute devices. They all have to be tightly connected together, with very high bandwidths, and all of that bandwidth requires power, requires very expensive infrastructure. So if a certain level of performance is required—a certain number of petaflops, or exaflops—the strongest lever on the cost and the power consumption is the number of GPUs required to achieve a zettaflop, for instance. And if the GPU is a lot more capable, then all of that system infrastructure collapses down—if you only need half as many GPUs, everything else goes down by half. So there’s a strong economic motivation to achieve very high levels of integration and performance at the device level. And the only way to do that is with chiplets and with 3D stacking. So we’ve already embarked down that path. A lot of tough engineering problems to solve to get there, but that’s going to continue.
And so what’s going to happen? Well, obviously we can add layers, right? We can pack more in. The thermal challenges that come along with that are going to be fun engineering problems that our industry is good at solving.
Photolithography involves manipulating light to precisely etch features onto a surface, and is commonly used to fabricate computer chips and optical devices like lenses. But tiny deviations during the manufacturing process often cause these devices to fall short of their designers’ intentions.
To help close this design-to-manufacturing gap, researchers from MIT and the Chinese University of Hong Kong used machine learning to build a digital simulator that mimics a specific photolithography manufacturing process. Their technique utilizes real data gathered from the photolithography system, so it can more accurately model how the system would fabricate a design.
The researchers integrate this simulator into a design framework, along with another digital simulator that emulates the performance of the fabricated device in downstream tasks, such as producing images with computational cameras. These connected simulators enable a user to produce an optical device that better matches its design and reaches the best task performance.
This technique could help scientists and engineers create more accurate and efficient optical devices for applications like mobile cameras, augmented reality, medical imaging, entertainment, and telecommunications. And because the pipeline of learning the digital simulator utilizes real-world data, it can be applied to a wide range of photolithography systems.
“This idea sounds simple, but the reasons people haven’t tried this before are that real data can be expensive and there are no precedents for how to effectively coordinate the software and hardware to build a high-fidelity dataset,” says Cheng Zheng, a mechanical engineering graduate student who is co-lead author of an open-access paper describing the work. “We have taken risks and done extensive exploration, for example, developing and trying characterization tools and data-exploration strategies, to determine a working scheme. The result is surprisingly good, showing that real data work much more efficiently and precisely than data generated by simulators composed of analytical equations. Even though it can be expensive and one can feel clueless at the beginning, it is worth doing.”
Zheng wrote the paper with co-lead author Guangyuan Zhao, a graduate student at the Chinese University of Hong Kong; and her advisor, Peter T. So, a professor of mechanical engineering and biological engineering at MIT. The research will be presented at the SIGGRAPH Asia Conference.
Printing with light
Photolithography involves projecting a pattern of light onto a surface, which causes a chemical reaction that etches features into the substrate. However, the fabricated device ends up with a slightly different pattern because of minuscule deviations in the light’s diffraction and tiny variations in the chemical reaction.
Because photolithography is complex and hard to model, many existing design approaches rely on equations derived from physics. These general equations give some sense of the fabrication process but can’t capture all deviations specific to a photolithography system. This can cause devices to underperform in the real world.
For their technique, which they call neural lithography, the MIT researchers build their photolithography simulator using physics-based equations as a base, and then incorporate a neural network trained on real, experimental data from a user’s photolithography system. This neural network, a type of machine-learning model loosely based on the human brain, learns to compensate for many of the system’s specific deviations.
The researchers gather data for their method by generating many designs that cover a wide range of feature sizes and shapes, which they fabricate using the photolithography system. They measure the final structures and compare them with design specifications, pairing those data and using them to train a neural network for their digital simulator.
“The performance of learned simulators depends on the data fed in, and data artificially generated from equations can’t cover real-world deviations, which is why it is important to have real-world data,” Zheng says.
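In outline, the learned simulator is a physics-based forward model plus a neural correction fitted to those measured design-and-structure pairs. The sketch below is illustrative only: the blur-style physics placeholder, the tiny convolutional correction network, and the synthetic data stand in for the authors' actual models and measurements.

    import torch
    from torch import nn

    def physics_model(design: torch.Tensor) -> torch.Tensor:
        # Placeholder for an analytical lithography model (idealized diffraction
        # and resist response); here, a simple blur over the design pattern.
        kernel = torch.ones(1, 1, 5, 5) / 25.0
        return nn.functional.conv2d(design, kernel, padding=2)

    class LearnedLitho(nn.Module):
        """Physics prediction plus a learned residual for system-specific deviations."""
        def __init__(self):
            super().__init__()
            self.correction = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 1, 3, padding=1),
            )
        def forward(self, design: torch.Tensor) -> torch.Tensor:
            base = physics_model(design)
            return base + self.correction(base)

    # Placeholder pairs; in practice each pair is a test design and the measured
    # structure the real photolithography tool actually produced.
    pairs = [(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)) for _ in range(8)]

    litho = LearnedLitho()
    opt = torch.optim.Adam(litho.parameters(), lr=1e-3)
    for design, measured in pairs:
        opt.zero_grad()
        loss = nn.functional.mse_loss(litho(design), measured)
        loss.backward()
        opt.step()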
Dual simulators
The digital lithography simulator consists of two separate components: an optics model that captures how light is projected on the surface of the device, and a resist model that shows how the photochemical reaction occurs to produce features on the surface.
In a downstream task, they connect this learned photolithography simulator to a physics-based simulator that predicts how the fabricated device will perform on this task, such as how a diffractive lens will diffract the light that strikes it.
The user specifies the outcomes they want a device to achieve. Then these two simulators work together within a larger framework that shows the user how to make a design that will reach those performance goals.
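One way to picture that framework is to treat the design itself as the optimization variable and backpropagate a task loss through both simulators, so a design is judged by how its predicted fabricated form performs. Continuing the earlier sketch, with a toy task simulator and target standing in for the real downstream model:

    import torch
    from torch import nn

    def task_simulator(structure: torch.Tensor) -> torch.Tensor:
        # Toy stand-in for a differentiable model of device performance,
        # e.g. how a diffractive element redistributes light.
        return torch.fft.fft2(structure).abs()

    litho = LearnedLitho()                          # lithography simulator from the sketch above
    target = torch.rand(1, 1, 64, 64)               # desired downstream output (placeholder)
    design = torch.rand(1, 1, 64, 64, requires_grad=True)
    opt = torch.optim.Adam([design], lr=1e-2)

    for step in range(200):
        opt.zero_grad()
        fabricated = litho(design)                  # what the tool would actually produce
        loss = nn.functional.mse_loss(task_simulator(fabricated), target)
        loss.backward()
        opt.step()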
“With our simulator, the fabricated object can get the best possible performance on a downstream task, like the computational cameras, a promising technology to make future cameras miniaturized and more powerful. We show that, even if you use post-calibration to try and get a better result, it will still not be as good as having our photolithography model in the loop,” Zhao adds.
They tested this technique by fabricating a holographic element that generates a butterfly image when light shines on it. When compared to devices designed using other techniques, their holographic element produced a near-perfect butterfly that more closely matched the design. They also produced a multilevel diffraction lens, which had better image quality than other devices.
In the future, the researchers want to enhance their algorithms to model more complicated devices, and also test the system using consumer cameras. In addition, they want to expand their approach so it can be used with different types of photolithography systems, such as systems that use deep or extreme ultraviolet light.
This research is supported, in part, by the U.S. National Institutes of Health, Fujikura Limited, and the Hong Kong Innovation and Technology Fund.
The work was carried out, in part, using MIT.nano’s facilities.
A jailbreak of OpenAI's GPT-4o used leetspeak to get ChatGPT to bypass its usual safety measures, allowing users to obtain instructions for hotwiring cars, synthesizing LSD, and other illicit activities.
Google rushed out fixes after its AI search feature made errors that went viral. Fundamental limitations of generative AI mean that it will still screw up sometimes.
Liz Reid, Google’s head of search, said in a blog post that the company had made adjustments to its new AI search feature after screenshots of its errors went viral.
Large language models, the AI systems that power chatbots like ChatGPT, are getting better and better—but they’re also getting bigger and bigger, demanding more energy and computational power. For LLMs that are cheap, fast, and environmentally friendly, they’ll need to shrink, ideally small enough to run directly on devices like cellphones. Researchers are finding ways to do just that by drastically rounding off the many high-precision numbers that store their memories to equal just 1 or -1.
LLMs, like all neural networks, are trained by altering the strengths of connections between their artificial neurons. These strengths are stored as mathematical parameters. Researchers have long compressed networks by reducing the precision of these parameters—a process called quantization—so that instead of taking up 16 bits each, they might take up 8 or 4. Now researchers are pushing the envelope to a single bit.
How to Make a 1-bit LLM
There are two general approaches. One approach, called post-training quantization (PTQ) is to quantize the parameters of a full-precision network. The other approach, quantization-aware training (QAT), is to train a network from scratch to have low-precision parameters. So far, PTQ has been more popular with researchers.
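In its simplest form, post-training binarization keeps only the sign of each weight plus one scale factor per matrix; the mean absolute value is the least-squares-optimal scale once the signs are fixed. Real methods such as BiLLM are considerably more sophisticated (per-group scales, extra bits for salient weights), but a minimal sketch of the basic idea looks like this:

    import numpy as np

    def binarize(weights: np.ndarray) -> tuple[np.ndarray, float]:
        """Post-training binarization: 1 bit per weight plus one float scale."""
        scale = float(np.abs(weights).mean())        # least-squares scale for fixed signs
        signs = np.where(weights >= 0, 1, -1).astype(np.int8)
        return signs, scale

    w = np.random.randn(4096, 4096).astype(np.float32)
    signs, scale = binarize(w)
    reconstruction = scale * signs                   # all the 1-bit model retains of w
    err = np.linalg.norm(w - reconstruction) / np.linalg.norm(w)
    print(f"relative reconstruction error at 1 bit per weight: {err:.2f}")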
In February, a team including Haotong Qin at ETH Zurich, Xianglong Liu at Beihang University, and Wei Huang at the University of Hong Kong introduced a PTQ method called BiLLM. It approximates most parameters in a network using 1 bit, but represents a few salient weights—those most influential to performance—using 2 bits. In one test, the team binarized a version of Meta’s LLaMa LLM that has 13 billion parameters.
To score performance, the researchers used a metric called perplexity, which is basically a measure of how surprised the trained model was by each ensuing piece of text. For one dataset, the original model had a perplexity of around 5, and the BiLLM version scored around 15, much better than the closest binarization competitor, which scored around 37 (for perplexity, lower numbers are better). Meanwhile, the BiLLM model required only about a tenth of the memory capacity of the original.
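Concretely, perplexity is the exponential of the model's average negative log-likelihood per token on held-out text, so a perplexity of 5 means the model is, on average, about as uncertain as if it were choosing uniformly among five possible next tokens. A small illustration:

    import math

    def perplexity(token_log_probs: list[float]) -> float:
        """token_log_probs: natural-log probabilities the model assigned to the
        tokens it actually saw. Less surprise means higher probabilities and
        therefore lower perplexity."""
        avg_nll = -sum(token_log_probs) / len(token_log_probs)
        return math.exp(avg_nll)

    # A model that gives every observed token probability 0.2 has perplexity 5.
    print(perplexity([math.log(0.2)] * 100))         # -> approximately 5.0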
PTQ has several advantages over QAT, says Wanxiang Che, a computer scientist at Harbin Institute of Technology, in China. It doesn’t require collecting training data, it doesn’t require training a model from scratch, and the training process is more stable. QAT, on the other hand, has the potential to make models more accurate, since quantization is built into the model from the beginning.
1-bit LLMs Find Success Against Their Larger Cousins
Last year, a team led by Furu Wei and Shuming Ma, at Microsoft Research Asia, in Beijing, created BitNet, the first 1-bit QAT method for LLMs. After fiddling with the rate at which the network adjusts its parameters, in order to stabilize training, they created LLMs that performed better than those created using PTQ methods. They were still not as good as full-precision networks, but roughly 10 times as energy efficient.
In February, Wei’s team announced BitNet b1.58, in which parameters can equal -1, 0, or 1, which means they take up roughly 1.58 bits of memory per parameter. A BitNet model with 3 billion parameters performed just as well on various language tasks as a full-precision LLaMA model with the same number of parameters and amount of training, but it was 2.71 times as fast, used 72 percent less GPU memory, and used 94 percent less GPU energy. Wei called this an “aha moment.” Further, the researchers found that as they trained larger models, efficiency advantages improved.
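The 1.58 figure is simply log2(3), roughly 1.58, the information content of a weight that can take three values. A minimal sketch of ternary, "absmean"-style quantization in the spirit of BitNet b1.58 (not the authors' implementation):

    import numpy as np

    def ternarize(weights: np.ndarray) -> tuple[np.ndarray, float]:
        """Map each weight to -1, 0, or +1 with one shared scale per matrix."""
        scale = float(np.abs(weights).mean()) + 1e-8   # average magnitude as the scale
        ternary = np.clip(np.round(weights / scale), -1, 1).astype(np.int8)
        return ternary, scale

    w = np.random.randn(1024, 1024).astype(np.float32)
    t, s = ternarize(w)
    print("fraction of weights rounded to zero:", float((t == 0).mean()))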
This year, a team led by Che, of Harbin Institute of Technology, released a preprint on another LLM binarization method, called OneBit. OneBit combines elements of both PTQ and QAT. It uses a full-precision pretrained LLM to generate data for training a quantized version. The team’s 13-billion-parameter model achieved a perplexity score of around 9 on one dataset, versus 5 for a LLaMA model with 13 billion parameters. Meanwhile, OneBit occupied only 10 percent as much memory. On customized chips, it could presumably run much faster.
Wei, of Microsoft, says quantized models have multiple advantages. They can fit on smaller chips, they require less data transfer between memory and processors, and they allow for faster processing. Current hardware can’t take full advantage of these models, though. LLMs often run on GPUs like those made by Nvidia, which represent weights using higher precision and spend most of their energy multiplying them. New hardware could natively represent each parameter as a -1 or 1 (or 0), and then simply add and subtract values and avoid multiplication. “One-bit LLMs open new doors for designing custom hardware and systems specifically optimized for 1-bit LLMs,” Wei says.
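The hardware point can be seen in a few lines: with weights restricted to +1 and -1, a dot product needs no multiplier at all, only an accumulator that adds or subtracts each activation. A plain-Python illustration:

    def dot_with_binary_weights(activations: list[float], signs: list[int]) -> float:
        """Dot product with weights in {+1, -1}: each activation is simply
        added to or subtracted from the running total; no multiplications."""
        total = 0.0
        for a, sign in zip(activations, signs):
            total += a if sign > 0 else -a
        return total

    print(dot_with_binary_weights([0.5, -2.0, 3.0], [1, -1, 1]))   # 0.5 + 2.0 + 3.0 = 5.5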
“They should grow up together,” Huang, of the University of Hong Kong, says of 1-bit models and processors. “But it’s a long way to develop new hardware.”