Published on November 4, 2023, 8:27 am
Google’s DeepMind team recently introduced Open X-Embodiment, an open dataset of robot demonstrations created in collaboration with 33 research labs. Much as ImageNet did for computer vision, Open X-Embodiment aims to advance robotics research and training by providing a diverse pool of robot demonstrations. The initial release covered more than 500 skills and 150,000 tasks drawn from 22 robot embodiments.
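To make the idea of a pooled, cross-embodiment dataset concrete, here is a minimal sketch of what a demonstration record and a per-embodiment tally might look like. The field names and values are illustrative only, not the actual Open X-Embodiment schema.

```python
from dataclasses import dataclass, field

# Hypothetical demonstration record; fields are illustrative,
# not the real Open X-Embodiment data format.
@dataclass
class Episode:
    embodiment: str  # which robot performed the demonstration
    skill: str       # e.g. "pick", "place", "open-drawer"
    steps: list = field(default_factory=list)  # (observation, action) pairs

def count_by_embodiment(episodes):
    """Tally demonstrations per robot, as a pooled dataset report might."""
    counts = {}
    for ep in episodes:
        counts[ep.embodiment] = counts.get(ep.embodiment, 0) + 1
    return counts

episodes = [
    Episode("franka", "pick"),
    Episode("franka", "place"),
    Episode("xarm", "pick"),
]
print(count_by_embodiment(episodes))  # {'franka': 2, 'xarm': 1}
```

A model like RT-1-X is trained on the union of such episodes across embodiments, rather than on one robot's data alone.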
DeepMind then trained its RT-1-X model on this data and deployed it on robots in partner labs, where it achieved a 50% higher success rate than the methods each lab had developed in-house. This result highlights the rapid progress under way in robotic learning.
Vincent Vanhoucke, Google DeepMind’s head of robotics, discussed how the robotics team came about within DeepMind. While robotics research has a longer history at Google, the recent merger of Google Brain and DeepMind sparked renewed interest in real-world robotics, driven by significant improvements in perception technology. Vanhoucke noted that his own involvement with DeepMind is relatively recent and grew out of his background in general AI and computer vision.
Vanhoucke acknowledged that a portion of his team originated from Google’s acquisitions in the robotics field, such as Boston Dynamics. He emphasized how these past acquisitions contributed to the understanding that the challenges faced by robotics are closely tied to those of general AI. By leveraging strong AI capabilities, they aim to develop general-purpose methods applicable across various robot embodiments and form factors.
Regarding collaboration with Everyday Robots, Vanhoucke revealed that the two teams have been working together for approximately seven years. One early project used machine learning to teach a set of now-discontinued robot arms to grasp objects – a breakthrough moment that prompted further investigation into using machine learning and perception to control robots.
When asked about the role of generative AI in robotics, Vanhoucke said he believes it will be central to progress in the field. Language models bring common-sense reasoning and an understanding of the everyday world, making them valuable tools for robot planning, manipulation, and human-robot interaction. Generative models also let robots reason about simulated environments – a crucial aspect of the robotics problem.
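The idea of using a language model's common-sense knowledge for robot planning can be sketched with a toy example in the spirit of approaches like SayCan. Here a simple word-overlap score stands in for the scores a real language model would assign to each skill in a robot's library; the skill names and scoring are purely illustrative.

```python
# Toy stand-in for LLM-based planning: rank each skill in the robot's
# library against an instruction by word overlap. A real system would
# query a language model for these relevance scores instead.
SKILLS = ["find the sponge", "pick up the sponge", "wipe the table", "go to the table"]

def score(skill, instruction):
    """Crude relevance proxy: count of shared words."""
    return len(set(skill.split()) & set(instruction.lower().split()))

def plan(instruction, n_steps=2):
    """Return the n_steps most relevant skills, best first."""
    ranked = sorted(SKILLS, key=lambda s: score(s, instruction), reverse=True)
    return ranked[:n_steps]

print(plan("please wipe the table"))  # ['wipe the table', 'go to the table']
```

In a real pipeline, the language model's scores would typically be combined with the robot's own estimate of whether each skill is feasible in the current scene.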
Simulation itself plays a significant role in robotics research, though Vanhoucke acknowledges the difficulty of accurately reproducing real-world physics and visuals – the well-known gap between simulation and reality. He notes that generative AI is making strides in this area by providing alternatives to running complex physics simulators, such as generating images directly or using other generative models.
The potential of simulation extends beyond generating assets; it can also be used to predict the future outcomes of different robot actions. Vanhoucke cites Amazon's use of simulation to generate packages, and foresees broader opportunities for generating possible futures. With generative models, robots can plan and verify actions without relying solely on real-world testing.
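"Generating futures" can be sketched as model-based planning: roll candidate action sequences through a predictive model of the world and keep the sequence whose imagined outcome looks best. In this toy sketch, a trivial one-dimensional dynamics function stands in for a learned generative world model; all names and dynamics are illustrative.

```python
import itertools

def predict(state, action):
    """Trivial stand-in for a learned world model: position += action."""
    return state + action

def rollout(state, actions):
    """Imagine the final state after applying a sequence of actions."""
    for a in actions:
        state = predict(state, a)
    return state

def best_plan(state, goal, horizon=3, choices=(-1, 0, 1)):
    """Exhaustively generate futures; pick the plan ending nearest the goal."""
    return min(itertools.product(choices, repeat=horizon),
               key=lambda plan: abs(rollout(state, plan) - goal))

print(best_plan(state=0, goal=2))
```

Real systems replace both the exhaustive search and the hand-written dynamics with learned components, but the loop is the same: imagine, evaluate, then act only on the verified plan.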
As Google DeepMind’s robotics team continues its research and collaboration efforts with various partners, the field of generative AI holds promise in pushing the boundaries of robotics capabilities. With ongoing advancements in perception technology and the application of general-purpose methods, we may eventually see a world where general-purpose robots are a reality.