
SystemX Alliance Newsletter - March 2024


SystemX Alliance 2024 Spring Workshop

Tuesday, April 16, 2024 | McCaw Hall, Arrillaga Alumni Center | Stanford University

The Intersection of AI, Edge, and Sensing Systems

We are experiencing an explosion of sensors and data, accompanied by an acceleration in machine-to-machine (M2M) communication to support ever more sophisticated AI. This is creating growing challenges for cloud-only processing due to constraints on bandwidth, latency, and energy consumption, compounded by concerns over privacy and security. Addressing these issues demands breakthroughs in architecture and applications, coupled with advances in edge sensing technologies. Such innovations are crucial for augmenting cloud capabilities and ensuring a seamless, secure, and efficient AI processing framework that can keep pace with the ever-increasing demands of our digital world.

This one-day event will feature three consecutive sessions:

  1. Computing & Sensing at the Edge
  2. Edge Applications
  3. Edge Architectures

To view the event schedule, please visit the SystemX Members site: https://members.systemx.stanford.edu/node/2040/

Video credit to Huy Ha, EE Graduate Student (REALab)

Lab Spotlight: Robotics and Embodied Artificial Intelligence Lab (REALab)

Professor Shuran Song

How can we create versatile robots with the ability to learn and execute new skills, enabling them to perform everyday tasks like bed-making and dishwashing? These seemingly straightforward activities present considerable challenges for robots, given the intricate, varied, and unstructured nature of real-world physical environments. In the Robotics and Embodied AI Lab (REALab@Stanford), we aim to develop machine learning algorithms that help robots acquire these useful manipulation skills from diverse data sources. This data can take various forms, including demonstrations by human instructors, experiences in physics simulators, real-world interactions, and information gathered from the internet.

1. Learning from human demonstrations

Humans are the masters of manipulation, with even young children exhibiting remarkable dexterity across a wide array of objects. Teaching robots new manipulation skills through human demonstrations therefore seems intuitive. Yet directly extracting reusable skills from unstructured human videos proves challenging due to significant embodiment differences and unobserved action parameters. To address these challenges, lab members Cheng Chi, Zhenjia Xu, and Chuer Pan focus on designing intuitive yet capable interfaces for robots to learn complex manipulation tasks. Recently we introduced the universal manipulation interface: sensorized handheld grippers that allow non-experts to demonstrate diverse, intricate manipulation skills in any environment. Beyond the data collection interface, we also study machine learning algorithms that allow the robot to make the best use of the collected data. For example, Cheng Chi and Zhenjia Xu developed a new visuomotor policy representation (Diffusion Policy) that can capture the complex, multimodal action distributions present in human demonstration data. Meanwhile, Mengda Xu studies cross-embodiment learning algorithms that automatically discover reusable skill representations from unlabeled human manipulation videos.
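To make the diffusion-style idea concrete, here is a minimal, hypothetical Python (PyTorch) sketch: a network is trained to predict the noise added to demonstrated action sequences, conditioned on the current observation, and actions are generated at test time by iterative denoising. The network architecture, noise schedule, and dimensions below are illustrative assumptions, not the published Diffusion Policy implementation.

# Minimal, hypothetical sketch of a diffusion-style action policy (not the published
# Diffusion Policy code): a network learns to predict the noise added to demonstrated
# action sequences, conditioned on an observation, which lets the learned policy
# represent multimodal action distributions. Dimensions and schedule are assumptions.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, HORIZON, T = 32, 7, 16, 50             # assumed sizes / diffusion steps
betas = torch.linspace(1e-4, 0.02, T)                    # simple linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

class DenoiseNet(nn.Module):
    """Predicts the noise in a noised action sequence, given obs and diffusion step."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + HORIZON * ACT_DIM + 1, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, HORIZON * ACT_DIM),
        )

    def forward(self, obs, noisy_actions, t):
        x = torch.cat([obs, noisy_actions.flatten(1), t.float().unsqueeze(1) / T], dim=1)
        return self.net(x).view(-1, HORIZON, ACT_DIM)

def training_step(model, obs, actions):
    """One denoising step on a batch of demonstrated action sequences."""
    t = torch.randint(0, T, (obs.shape[0],))
    noise = torch.randn_like(actions)
    ab = alphas_bar[t].view(-1, 1, 1)
    noisy = ab.sqrt() * actions + (1 - ab).sqrt() * noise
    return nn.functional.mse_loss(model(obs, noisy, t), noise)

@torch.no_grad()
def sample_actions(model, obs):
    """Generate an action sequence by iteratively denoising from Gaussian noise."""
    a = torch.randn(obs.shape[0], HORIZON, ACT_DIM)
    for t in reversed(range(T)):
        eps = model(obs, a, torch.full((obs.shape[0],), t))
        a = (a - (betas[t] / (1 - alphas_bar[t]).sqrt()) * eps) / (1.0 - betas[t]).sqrt()
        if t > 0:
            a = a + betas[t].sqrt() * torch.randn_like(a)
    return a

Because sampling starts from random noise, repeated calls can recover different but equally valid ways of performing the same task, which is one way a learned policy can represent the multimodality present in human demonstrations.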

2. Learning from physics simulators 

When we go beyond daily manipulation tasks with robot arms, we realize that human demonstration is often impractical for many robot hardware configurations and tasks. For example, how would we teach a four-legged robot dog to walk? In these cases, physics simulators can generate valuable training data. Yihuai Gao and Huy Ha are currently using simulators to teach a simulated robot dog to walk, run, and fetch objects. The simulator automatically tries different action parameters, allowing the robot to learn effective policies through numerous simulated experiences. Since simulation is purely computational and does not require real hardware, we can easily parallelize training across environments. Once the robot learns an effective policy, it can be deployed on physical hardware.
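To illustrate why purely computational training parallelizes so well, the hypothetical Python sketch below steps many toy environments in a single batched NumPy call and scores one candidate policy per environment at once. The toy dynamics, reward, and simple random-search update are illustrative stand-ins, not the lab's actual simulators or learning algorithms.

# Hypothetical sketch of parallelized training in simulation: all environments are
# stepped in one batched NumPy operation, and one candidate policy per environment is
# evaluated at a time. The toy dynamics, reward, and random-search update are
# illustrative stand-ins, not a real locomotion simulator or the lab's algorithms.
import numpy as np

N_ENVS, OBS_DIM, ACT_DIM, EPISODE_LEN = 256, 8, 4, 100
MIX = np.random.RandomState(0).randn(ACT_DIM, OBS_DIM)    # fixed toy dynamics matrix

def batched_step(states, actions):
    """Advance every simulated environment in one vectorized call."""
    next_states = 0.9 * states + 0.1 * actions @ MIX
    rewards = -np.linalg.norm(next_states, axis=1)         # toy reward: stay near origin
    return next_states, rewards

def evaluate(policies):
    """Roll out one linear policy per environment, all in parallel."""
    states = np.random.randn(N_ENVS, OBS_DIM)
    returns = np.zeros(N_ENVS)
    for _ in range(EPISODE_LEN):
        actions = np.einsum("nd,nda->na", states, policies)    # per-env linear policy
        states, rewards = batched_step(states, actions)
        returns += rewards
    return returns

# Simple random-search loop: perturb the current policy N_ENVS ways, keep the best.
policy = np.zeros((OBS_DIM, ACT_DIM))
for iteration in range(20):
    candidates = policy[None] + 0.1 * np.random.randn(N_ENVS, OBS_DIM, ACT_DIM)
    returns = evaluate(candidates)
    policy = candidates[np.argmax(returns)]

Because every array carries an environment dimension, scaling up training is largely a matter of increasing N_ENVS, with no additional hardware required until the learned policy is finally deployed on a physical robot.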

In addition to training robot policies, Xiaomeng Xu is using physics simulation to optimize robot hardware designs. For example, the simulator can automatically generate robot hand designs with varied finger geometries optimized for new tasks and objects. Meanwhile, Austin Patel is developing a unified manipulation policy that can make use of drastically different robot hand designs, such as hands with two, three, or five fingers of different lengths. Training in simulation allows us to generate unlimited data across diverse hardware without manufacturing real prototypes. We hope these policies can transfer to new hardware without extensive real-world data collection.

3. Learning from self-supervised physical interactions 

Beyond learning from human teachers or physics simulators, robots can directly learn manipulation skills through their own physical interactions with the world. This “self-supervised learning” approach enables robots to acquire knowledge about real-world dynamics that are challenging to model and simulate accurately. For example, while high-quality simulations exist for rigid and articulated bodies, our ability to accurately simulate deformable objects like cloth, fabrics, and fluids remains limited. In these cases, robots may be able to learn the dynamics of those challenging materials directly by actively manipulating and observing them. Huy Ha explores this idea for deformable objects. In FlingBot, the robot learns how to efficiently unfold a piece of fabric using high-speed fling actions through a self-supervised learning process. The robot automatically computes task rewards from visual input. When combined with an automatic reset mechanism, the system can continue training for days with minimal human intervention. By leveraging their own experiences in the physical world, future robots may acquire robust manipulation skills beyond what human teachers can demonstrate.
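As a concrete illustration of such a self-supervised signal, the hypothetical Python sketch below computes a coverage reward for fabric unfolding directly from top-down images, so no human labeling is needed. The crude brightness-threshold segmentation and the synthetic test images are assumptions for illustration, not FlingBot's actual perception pipeline.

# Hypothetical sketch of a self-supervised coverage reward for fabric unfolding: the
# reward is computed directly from top-down images, so training needs no human labels.
# The brightness-threshold segmentation and synthetic images are illustrative
# assumptions, not FlingBot's actual perception pipeline.
import numpy as np

def fabric_mask(rgb_image, brightness_thresh=200):
    """Crudely segment bright fabric pixels against a darker workspace."""
    return rgb_image.mean(axis=2) > brightness_thresh

def coverage_reward(image_before, image_after):
    """Reward = increase in the fraction of the workspace covered by fabric."""
    before, after = fabric_mask(image_before), fabric_mask(image_after)
    return (after.sum() - before.sum()) / before.size

# Synthetic example: a crumpled patch of fabric vs. the same fabric after a fling.
crumpled = np.zeros((128, 128, 3), np.uint8); crumpled[40:60, 40:60] = 255
unfolded = np.zeros((128, 128, 3), np.uint8); unfolded[20:100, 20:100] = 255
print(coverage_reward(crumpled, unfolded))   # positive: the fling increased coverage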

4. Learning with commonsense knowledge 

With the rapid expansion of internet data, large language models (LLMs) like GPT-4 contain a vast repository of commonsense knowledge about object dynamics and properties. This knowledge, gleaned from the textual data used to train LLMs, complements what robotic models can learn through self-supervised interactions. Integrating commonsense knowledge into robotic systems can thus inform planning and coordination for collaborative tasks. For example, Mandi Zhao is exploring the use of commonsense knowledge from LLMs to aid multi-agent coordination and collaboration. These commonsense insights help agents anticipate needs and actions in shared tasks. Meanwhile, Zeyi Liu is investigating how to leverage commonsense to help robots detect, analyze, and explain failures. We propose REFLECT, a framework that converts multi-sensory data into a summary of the robot's past experiences and queries an LLM with a progressive failure explanation algorithm. These explanations can either help humans efficiently debug the robot or guide the robot to self-correct based on the identified failure cause.
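To illustrate the summarize-then-query pattern, the hypothetical Python sketch below condenses multi-sensory observations into a time-stamped textual log and builds a prompt asking an LLM to localize and explain a failure. The event schema, prompt wording, and the query_llm placeholder are assumptions for illustration, not the actual REFLECT framework.

# Hypothetical sketch of the summarize-then-query pattern: multi-sensory observations
# are condensed into a time-stamped textual log, which is placed in a prompt asking an
# LLM to explain a task failure. The event schema, prompt wording, and query_llm stub
# are illustrative assumptions, not the actual REFLECT framework.
from dataclasses import dataclass
from typing import List

@dataclass
class Event:
    time_s: float
    source: str          # e.g. "vision", "audio", "proprioception"
    description: str     # short caption produced by a perception model

def summarize(events: List[Event]) -> str:
    """Convert multi-sensory events into a compact, chronologically ordered log."""
    lines = [f"[{e.time_s:6.1f}s] ({e.source}) {e.description}"
             for e in sorted(events, key=lambda e: e.time_s)]
    return "\n".join(lines)

def failure_prompt(task: str, events: List[Event]) -> str:
    """Build a prompt asking the LLM to localize and explain the failure."""
    return (
        f"The robot was asked to: {task}\n"
        f"Here is a summary of its sensory experience:\n{summarize(events)}\n"
        "The task failed. Step by step, identify when the failure occurred, "
        "explain its most likely cause, and suggest a correction."
    )

def query_llm(prompt: str) -> str:
    """Placeholder for whatever LLM interface is available (an assumption, not an API)."""
    raise NotImplementedError("plug in an LLM client here")

events = [
    Event(3.2, "vision", "gripper closes next to the mug handle, not around it"),
    Event(4.0, "proprioception", "gripper reports zero grasp force"),
    Event(6.5, "vision", "mug remains on the table after the lift motion"),
]
print(failure_prompt("put the mug into the sink", events))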

Video and article credit to REALab

Students & Faculty Awards/Recognitions