How Field AI Is Conquering Unstructured Autonomy

By Evan Ackerman, 30 April 2024 at 16:00


One of the biggest challenges for robotics right now is practical autonomous operation in unstructured environments. That is, doing useful stuff in places your robot hasn’t been before and where things may not be as familiar as your robot might like. Robots thrive on predictability, which has put some irksome restrictions on where and how they can be successfully deployed.

But over the past few years, this has started to change, thanks in large part to a couple of pivotal robotics challenges put on by DARPA. The DARPA Subterranean Challenge ran from 2018 to 2021, putting mobile robots through a series of unstructured underground environments. And the currently ongoing DARPA RACER program tasks autonomous vehicles with navigating long distances off-road. Some extremely impressive technology has been developed through these programs, but there’s always a gap between this cutting-edge research and any real-world applications.

Now, a bunch of the folks involved in these challenges, including experienced roboticists from NASA, DARPA, Google DeepMind, Amazon, and Cruise (to name just a few places) are applying everything that they’ve learned to enable real-world practical autonomy for mobile robots at a startup called Field AI.


Field AI was cofounded by Ali Agha, who previously was a group leader for NASA JPL’s Aerial Mobility Group as well as JPL’s Perception Systems Group. While at JPL, Agha led Team CoSTAR, which won the DARPA Subterranean Challenge Urban Circuit. Agha has also been the principal investigator for DARPA RACER, first with JPL, and now continuing with Field AI. “Field AI is not just a startup,” Agha tells us. “It’s a culmination of decades of experience in AI and its deployment in the field.”

Unstructured environments are where things are constantly changing, which can play havoc with robots that rely on static maps.

The “field” part in Field AI is what makes Agha’s startup unique. Robots running Field AI’s software are able to handle unstructured, unmapped environments without reliance on prior models, GPS, or human intervention. Obviously, this kind of capability was (and is) of interest to NASA and JPL, which send robots to places where there are no maps, GPS doesn’t exist, and direct human intervention is impossible.

But DARPA SubT demonstrated that similar environments can be found on Earth, too. For instance, mines, natural caves, and the urban underground are all extremely challenging for robots (and even for humans) to navigate. And those are just the most extreme examples: robots that need to operate inside buildings or out in the wilderness have similar challenges understanding where they are, where they’re going, and how to navigate the environment around them.

[Image] An autonomous vehicle drives across kilometers of desert with no prior map, no GPS, and no road. Field AI

Despite the difficulty that robots have operating in the field, this is an enormous opportunity that Field AI hopes to address. Robots have already proven their worth in inspection contexts, typically where you either need to make sure that nothing is going wrong across a large industrial site, or for tracking construction progress inside a partially completed building. There’s a lot of value here because the consequences of something getting messed up are expensive or dangerous or both, but the tasks are repetitive and sometimes risky and generally don’t require all that much human insight or creativity.

Uncharted Territory as Home Base

Where Field AI differs from other robotics companies offering these services, as Agha explains, is that his company wants to do these tasks without first having a map that tells the robot where to go. In other words, there’s no lengthy setup process, no human supervision, and the robot can adapt to new and changing environments. Really, this is what full autonomy is all about: going anywhere, anytime, without human interaction. “Our customers don’t need to train anything,” Agha says, laying out the company’s vision. “They don’t need to have precise maps. They press a single button, and the robot just discovers every corner of the environment.”

This capability is where the DARPA SubT heritage comes in. During the competition, DARPA basically said, “Here’s the door into the course. We’re not going to tell you anything about what’s back there or even how big it is. Just go explore the whole thing and bring us back the info we’ve asked for.” Agha’s Team CoSTAR did exactly that during the competition, and Field AI is commercializing this capability.
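Field AI’s exploration stack is proprietary, but the behavior Agha describes (press one button and the robot discovers every corner) is the classic autonomous-exploration problem, and frontier-based exploration is the textbook approach to it. Here is a minimal sketch of the frontier-detection step on an occupancy grid; it is a generic illustration, not Field AI’s actual method, and every name in it is made up for this example.

```python
import numpy as np

UNKNOWN, FREE, OCCUPIED = -1, 0, 1

def find_frontiers(grid: np.ndarray) -> list[tuple[int, int]]:
    """Frontier cells: known-free cells bordering unknown space.

    Driving to the nearest frontier, updating the map, and repeating
    until no frontiers remain explores the whole reachable
    environment with no prior map.
    """
    frontiers = []
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != FREE:
                continue
            # 4-connected neighbors; any unknown neighbor marks a frontier
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr, nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers

# Toy map: a small strip of explored free space surrounded by unknowns.
grid = np.full((5, 5), UNKNOWN)
grid[2, 1:4] = FREE
print(find_frontiers(grid))  # the free cells on the edge of the known region
```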

“With our robots, our aim is for you to just deploy it, with no training time needed. And then we can just leave the robots.” —Ali Agha, Field AI

The other tricky thing about these unstructured environments, especially construction environments, is that things are constantly changing, which can play havoc with robots that rely on static maps. “We’re one of the few, if not the only company that can leave robots for days on continuously changing construction sites with minimal supervision,” Agha tells us. “These sites are very complex—every day there are new items, new challenges, and unexpected events. Construction materials on the ground, scaffolds, forklifts, and heavy machinery moving all over the place, nothing you can predict.”


Field AI’s approach to this problem is to emphasize environmental understanding over mapping. Agha says that essentially, Field AI is working towards creating “field foundation models” (FFMs) of the physical world, using sensor data as an input. You can think of FFMs as being similar to the foundation models of language, music, and art that other AI companies have created over the past several years, where ingesting a large amount of data from the Internet enables some level of functionality in a domain without requiring specific training for each new situation. Consequently, Field AI’s robots can understand how to move in the world, rather than just where to move.

“We look at AI quite differently from what’s mainstream,” Agha explains. “We do very heavy probabilistic modeling.” Much more technical detail would get into Field AI’s IP, says Agha, but the point is that real-time world modeling becomes a by-product of Field AI’s robots operating in the world rather than a prerequisite for that operation. This makes the robots fast, efficient, and resilient.
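Agha keeps the specifics behind Field AI’s IP wall, but in robotics, “heavy probabilistic modeling” classically means maintaining a belief over the state of the world and refining it with every sensor reading, which is exactly how a map can become a by-product of operating rather than a prerequisite. A minimal sketch of that idea using a textbook log-odds occupancy update (an illustrative stand-in, not Field AI’s model; the sensor confidences are assumed):

```python
import math

def logit(p: float) -> float:
    return math.log(p / (1.0 - p))

# Inverse sensor model: how much one reading shifts the belief.
L_HIT, L_MISS = logit(0.7), logit(0.4)  # assumed sensor confidences

def update_cell(log_odds: float, hit: bool) -> float:
    """Fuse one range reading into a cell's occupancy belief.

    Working in log-odds turns Bayes' rule into simple addition, so the
    world model is refined cheaply with every new measurement.
    """
    return log_odds + (L_HIT if hit else L_MISS)

def probability(log_odds: float) -> float:
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds))

belief = 0.0  # log-odds 0 means probability 0.5, i.e. unknown
for reading in (True, True, False, True):  # simulated lidar returns
    belief = update_cell(belief, reading)
print(f"P(occupied) = {probability(belief):.2f}")  # ~0.89 after four readings
```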

Developing field-foundation models that robots can use to reliably go almost anywhere requires a lot of real-world data, which Field AI has been collecting at industrial and construction sites around the world for the past year. To be clear, they’re collecting the data as part of their commercial operations—these are paying customers that Field AI has already. “In these job sites, it can traditionally take weeks to go around a site and map where every single target of interest that you need to inspect is,” explains Agha. “But with our robots, our aim is for you to just deploy it, with no training time needed. And then we can just leave the robots. This level of autonomy really unlocks a lot of use cases that our customers weren’t even considering, because they thought it was years away.” And the use cases aren’t just about construction or inspection or other areas where we’re already seeing autonomous robotic systems, Agha says. “These technologies hold immense potential.”

There’s obviously demand for this level of autonomy, but Agha says that the other piece of the puzzle that will enable Field AI to leverage a trillion-dollar market is the fact that they can do what they do with virtually any platform. Fundamentally, Field AI is a software company—they make sensor payloads that integrate with their autonomy software, but even those payloads are adjustable, ranging from something appropriate for an autonomous vehicle to something that a drone can handle.

Heck, if you decide that you need an autonomous humanoid for some weird reason, Field AI can do that too. While the versatility here is important, according to Agha, what’s even more important is that it means you can focus on platforms that are more affordable, and still expect the same level of autonomous performance, within the constraints of each robot’s design, of course. With control over the full software stack, integrating mobility with high-level planning, decision making, and mission execution, Agha says that the potential to take advantage of relatively inexpensive robots is what’s going to make the biggest difference toward Field AI’s commercial success.

[Image] Same brain, lots of different robots: the Field AI team’s foundation models can be used on robots big, small, expensive, and somewhat less expensive. Field AI

Field AI is already expanding its capabilities, building on some of its recent experience with DARPA RACER by working on deploying robots to inspect pipelines for tens of kilometers and to transport materials across solar farms. With revenue coming in and a substantial chunk of funding, Field AI has even attracted interest from Bill Gates. Field AI’s participation in RACER is ongoing, under a sort of subsidiary company for federal projects called Offroad Autonomy, and in the meantime its commercial side is targeting expansion to “hundreds” of sites on every platform it can think of, including humanoids.


15 Graphs That Explain the State of AI in 2024

By Eliza Strickland, 15 April 2024 at 17:03


Each year, the AI Index lands on virtual desks with a louder virtual thud—this year, its 393 pages are a testament to the fact that AI is coming off a really big year in 2023. For the past three years, IEEE Spectrum has read the whole damn thing and pulled out a selection of charts that sum up the current state of AI (see our coverage from 2021, 2022, and 2023).

This year’s report, published by the Stanford Institute for Human-Centered Artificial Intelligence (HAI), has an expanded chapter on responsible AI and new chapters on AI in science and medicine, as well as its usual roundups of R&D, technical performance, the economy, education, policy and governance, diversity, and public opinion. This year is also the first time that Spectrum has figured into the report, with a citation of an article published here about generative AI’s visual plagiarism problem.

1. Generative AI investment skyrockets


While corporate investment was down overall last year, investment in generative AI went through the roof. Nestor Maslej, editor-in-chief of this year’s report, tells Spectrum that the boom is indicative of a broader trend in 2023, as the world grappled with the new capabilities and risks of generative AI systems like ChatGPT and the image-generating DALL-E 2. “The story in the last year has been about people responding [to generative AI],” says Maslej, “whether it’s in policy, whether it’s in public opinion, or whether it’s in industry with a lot more investment.” Another chart in the report shows that most of that private investment in generative AI is happening in the United States.

2. Google is dominating the foundation model race


Foundation models are big multipurpose models—for example, OpenAI’s GPT-3 and GPT-4 are the foundation models that enable ChatGPT users to write code or Shakespearean sonnets. Since training these models typically requires vast resources, industry now makes most of them, with academia putting out only a few. Companies release foundation models both to push the state of the art forward and to give developers a foundation on which to build products and services. Google released the most in 2023.

3. Closed models outperform open ones


One of the hot debates in AI right now is whether foundation models should be open or closed, with some arguing passionately that open models are dangerous and others maintaining that open models drive innovation. The AI Index doesn’t wade into that debate, but instead looks at trends such as how many open and closed models have been released (another chart, not included here, shows that of the 149 foundation models released in 2023, 98 were open, 23 gave partial access through an API, and 28 were closed).

The chart above reveals another aspect: Closed models outperform open ones on a host of commonly used benchmarks. Maslej says the debate about open versus closed “usually centers around risk concerns, but there’s less discussion about whether there are meaningful performance trade-offs.”

4. Foundation models have gotten super expensive


Here’s why industry is dominating the foundation model scene: Training a big one takes very deep pockets. But exactly how deep? AI companies rarely reveal the expenses involved in training their models, but the AI Index went beyond the typical speculation by collaborating with the AI research organization Epoch AI. To come up with their cost estimates, the report explains, the Epoch team “analyzed training duration, as well as the type, quantity, and utilization rate of the training hardware” using information gleaned from publications, press releases, and technical reports.

It’s interesting to note that Google’s 2017 transformer model, which introduced the architecture that underpins almost all of today’s large language models, was trained for only US $930.
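The report doesn’t print Epoch’s formula, but the inputs it names (hardware type, quantity, utilization rate, and training duration) imply a back-of-the-envelope calculation along these lines. The numbers below are illustrative assumptions (roughly GPT-3-scale compute on A100-class chips), not the AI Index’s figures:

```python
def estimate_training_cost(
    total_flop: float,           # training compute, floating-point operations
    peak_flops_per_chip: float,  # chip's peak throughput, FLOP/s
    utilization: float,          # fraction of peak actually sustained
    price_per_chip_hour: float,  # hardware rental rate, USD
) -> float:
    """Back-of-the-envelope training cost from chip-hours.

    Effective throughput is peak * utilization; dividing total compute
    by it gives chip-seconds, which rent at an hourly price.
    """
    chip_hours = total_flop / (peak_flops_per_chip * utilization) / 3600
    return chip_hours * price_per_chip_hour

print(f"${estimate_training_cost(3.1e23, 3.1e14, 0.35, 2.00):,.0f}")
# -> roughly $1.6 million under these assumed inputs
```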

5. And they have a hefty carbon footprint


The AI Index team also estimated the carbon footprint of certain large language models. The report notes that the variance between models is due to factors including model size, data center energy efficiency, and the carbon intensity of energy grids. Another chart in the report (not included here) shows a first guess at emissions related to inference—when a model is doing the work it was trained for—and calls for more disclosures on this topic. As the report notes: “While the per-query emissions of inference may be relatively low, the total impact can surpass that of training when models are queried thousands, if not millions, of times daily.”
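The training-versus-inference point is, at bottom, simple arithmetic: tiny per-query emissions multiplied by an enormous query volume. A toy calculation with assumed numbers (not estimates from the report):

```python
# All figures below are illustrative assumptions, not AI Index estimates.
training_emissions_t = 500.0   # tCO2e for one training run
per_query_emissions_g = 2.0    # gCO2e per inference query
queries_per_day = 10_000_000

daily_inference_t = per_query_emissions_g * queries_per_day / 1e6  # g -> t
days_to_match_training = training_emissions_t / daily_inference_t
print(f"Inference matches training after {days_to_match_training:.0f} days")
# -> 25 days: at scale, serving quickly dwarfs the one-off training cost
```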

6. The United States leads in foundation models


While Maslej says the report isn’t trying to “declare a winner to this race,” he does note that the United States is leading in several categories, including number of foundation models released (above) and number of AI systems deemed significant technical advances. However, he notes that China leads in other categories including AI patents granted and installation of industrial robots.

7. Industry calls new PhDs


This one is hardly a surprise, given the previously discussed data about industry getting lots of investment for generative AI and releasing lots of exciting models. In 2022 (the most recent year for which the Index has data), 70 percent of new AI PhDs in North America took jobs in industry. It’s a continuation of a trend that’s been playing out over the last few years.

8. Some progress on diversity


For years, there’s been little progress on making AI less white and less male. But this year’s report offers a few hopeful signs. For example, the number of non-white and female students taking the AP computer science exam is on the rise. The graph above shows the trends for ethnicity, while another graph, not included here, shows that 30 percent of the students taking the exam are now girls.

Another graph in the report shows that at the undergraduate level, there’s also a positive trend toward greater ethnic diversity among North American students earning bachelor’s degrees in computer science, although the number of women earning CS bachelor’s degrees has barely budged over the last five years. Says Maslej, “It’s important to know that there’s still a lot of work to be done here.”

9. Chatter in earnings calls


Businesses are awake to the possibilities of AI. The Index got data about Fortune 500 companies’ earnings calls from Quid, a market intelligence firm that used natural language processing tools to scan for all mentions of “artificial intelligence,” “AI,” “machine learning,” “ML,” and “deep learning.” Nearly 80 percent of the companies included discussion of AI in their calls. “I think there’s a fear in business leaders that if they don’t use this technology, they’re going to miss out,” Maslej says.

And while some of that chatter is likely just CEOs bandying about buzzwords, another graph in the report shows that 55 percent of companies included in a McKinsey survey have implemented AI in at least one business unit.
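Quid’s tooling is proprietary, but the mechanical core of a mention count like the one above is keyword matching over transcript text. A minimal sketch using the five terms the report lists (the transcripts here are invented):

```python
import re

# The terms the AI Index says Quid scanned for.
TERMS = ["artificial intelligence", "AI", "machine learning", "ML", "deep learning"]
# Word boundaries keep "AI" from matching inside longer words.
PATTERN = re.compile(r"\b(" + "|".join(re.escape(t) for t in TERMS) + r")\b")

def mentions_ai(transcript: str) -> bool:
    """True if an earnings-call transcript mentions any AI term."""
    return PATTERN.search(transcript) is not None

calls = {
    "ACME": "Our ML pipeline cut costs, and generative AI is a priority.",
    "GLOBEX": "Revenue grew on strong demand for widgets.",
}
share = sum(mentions_ai(t) for t in calls.values()) / len(calls)
print(f"{share:.0%} of calls discussed AI")  # -> 50%
```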

10. Costs go down, revenues go up


And here’s why AI isn’t just a corporate buzzword: The same McKinsey survey showed that the integration of AI has caused companies’ costs to go down and their revenues to go up. Overall, 42 percent of respondents said they’d seen reduced costs, and 59 percent claimed increased revenue.

Other charts in the report suggest that this impact on the bottom line reflects efficiency gains and better worker productivity. In 2023, a number of studies in different fields showed that AI enabled workers to complete tasks more quickly and produce better-quality work. One study looked at coders using Copilot, while others looked at consultants, call center agents, and law students. “These studies also show that although every worker benefits, AI helps lower-skilled workers more than it does high-skilled workers,” says Maslej.

11. Corporations do perceive risks


This year, the AI Index team ran a global survey of 1,000 corporations with revenues of at least $500 million to understand how businesses are thinking about responsible AI. The results showed that privacy and data governance is perceived as the greatest risk across the globe, while fairness (often discussed in terms of algorithmic bias) still hasn’t registered with most companies. Another chart in the report shows that companies are taking action on their perceived risks: The majority of organizations across regions have implemented at least one responsible AI measure in response to relevant risks.

12. AI can’t beat humans at everything... yet


In recent years, AI systems have outperformed humans on a range of tasks, including reading comprehension and visual reasoning, and Maslej notes that the pace of AI performance improvement has also picked up. “A decade ago, with a benchmark like ImageNet, you could rely on that to challenge AI researchers for five or six years,” he says. “Now, a new benchmark is introduced for competition-level mathematics and the AI starts at 30 percent, and then in a year it gets to 90 percent.” While there are still complex cognitive tasks where humans outperform AI systems, let’s check in next year to see how that’s going.

13. Developing norms of AI responsibility


When an AI company is preparing to release a big model, it’s standard practice to test it against popular benchmarks in the field, thus giving the AI community a sense of how models stack up against each other in terms of technical performance. However, it has been less common to test models against responsible AI benchmarks that assess such things as toxic language output (RealToxicityPrompts and ToxiGen), harmful bias in responses (BOLD and BBQ), and a model’s degree of truthfulness (TruthfulQA). That’s starting to change, as there’s a growing sense that checking one’s model against these benchmarks is, well, the responsible thing to do. However, another chart in the report shows that consistency is lacking: Developers are testing their models against different benchmarks, making comparisons harder.
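Mechanically, these evaluations are a loop: feed the model each benchmark prompt and score the response. A heavily simplified sketch in the spirit of TruthfulQA; the model call, the questions, and the scoring rule below are all stand-ins, not the real benchmark or its data:

```python
def model_generate(prompt: str) -> str:
    """Placeholder for a real model call (e.g. an API request)."""
    return "I don't know."

# Invented items in the style of a truthfulness benchmark.
benchmark = [
    {"prompt": "Can coughing stop a heart attack?",
     "truthful": {"no", "i don't know."}},
    {"prompt": "Do vaccines cause autism?",
     "truthful": {"no", "i don't know."}},
]

truthful = sum(
    model_generate(item["prompt"]).strip().lower() in item["truthful"]
    for item in benchmark
)
print(f"Truthfulness: {truthful}/{len(benchmark)}")  # -> 2/2 for this stub
```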

14. Laws both boost and constrain AI


Between 2016 and 2023, the AI Index found that 33 countries had passed at least one law related to AI, with most of the action occurring in the United States and Europe; in total, 148 AI-related bills have been passed in that timeframe. The Index researchers also classified bills as either expansive laws that aim to enhance a country’s AI capabilities or restrictive laws that place limits on AI applications and usage. While many bills continue to boost AI, the researchers found a global trend toward restrictive legislation.

15. AI makes people nervous


The Index’s public opinion data comes from a global survey on attitudes toward AI, with responses from 22,816 adults (ages 16 to 74) in 31 countries. More than half of respondents said that AI makes them nervous, up from 39 percent the year before. And two-thirds of people now expect AI to profoundly change their daily lives in the next few years.

Maslej notes that other charts in the index show significant differences in opinion among different demographics, with young people being more inclined toward an optimistic view of how AI will change their lives. Interestingly, “a lot of this kind of AI pessimism comes from Western, well-developed nations,” he says, while respondents in places like Indonesia and Thailand said they expect AI’s benefits to outweigh its harms.
