
We all know that robots are getting smarter by the day, but artificial intelligence company Figure AI has taken things to the next level with its Vision-Language-Action (VLA) model, Helix, which can understand speech, reason through problems, and interact with the world like a human.
Helix—The Thinking Robot
Helix is a first-of-its-kind VLA model that, according to Figure AI’s founder, Brett Adcock, can not only follow a human’s commands but also process what it’s been told and respond intelligently. For example, it can pick up a household item and know what to do with it.
Adcock noted that in order for robots to be integrated into homes, a step change in capabilities was in order. And it looks like that’s just what the company managed to do. “Helix understands speech, reasons through problems, and can grasp any object—all without needing training or code,” he said.
Figure AI’s Goal
After Figure AI parted ways with OpenAI earlier this month, Adcock announced that the company had “achieved a significant breakthrough in fully end-to-end robot AI, developed entirely in-house.”
He added that something “no one had ever seen before in a humanoid” would be revealed within 30 days. The company set out to develop something revolutionary and, after working on it for over a year, Helix is the result.
What it Can Do
Helix has been designed to seamlessly integrate into our daily lives and interact naturally. It can fluidly and continuously control an entire humanoid upper body: wrists, fingers, torso, all with extreme precision. It’s able to do mundane tasks, like open a drawer or fridge, and handle thousands of objects as easily as a human can.
It’s also able to operate two robots at once so that they can work together in real time on tasks they’ve never done before. No programming is needed; you just give Helix a command, and it figures out the rest. These impressive feats are all early results that the company says “only scratch the surface of what’s possible.”
How it Works
Helix runs on embedded low-power GPUs, making it practical for real-world use. Trained on a dataset of teleoperated behaviors (500 hours’ worth), Helix is powered by two systems: S1 and S2. S2 is a pre-trained system that processes what’s happening in a scene and interprets language, while S1, a faster decision-maker, turns that information into real-time robot actions.
By separating these functions, Figure AI has created a robot that’s both fast and “thoughtful”. The division also makes upgrades and improvements much easier. To learn different behaviors, Helix uses a single set of neural network weights—no need for task-specific training.
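To make the two-system split concrete, here is a minimal illustrative sketch of the general pattern described above: a slow, high-level planner ("S2") re-interprets the scene and instruction at a low rate and emits a compact latent, while a fast low-level controller ("S1") consumes the latest latent on every tick to step the robot's joints. All names, rates, and the toy math are assumptions for illustration; this is not Figure AI's actual implementation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Latent:
    """Compressed scene/language understanding emitted by the slow system."""
    values: List[float]

def s2_plan(instruction: str, scene: List[float]) -> Latent:
    # Stand-in for a vision-language model: hash the instruction into a
    # number and mix it with scene features to form target joint values.
    seed = (sum(ord(c) for c in instruction) % 100) / 100.0
    return Latent([seed + f for f in scene])

def s1_act(latent: Latent, joints: List[float]) -> List[float]:
    # Stand-in for a fast visuomotor policy: a proportional step of each
    # joint toward the target encoded in the latent.
    return [j + 0.1 * (t - j) for j, t in zip(joints, latent.values)]

def control_loop(instruction: str, scene: List[float], joints: List[float],
                 s2_hz: int = 8, s1_hz: int = 200, seconds: int = 1) -> List[float]:
    """Run S2 at a low rate and S1 at a high rate, sharing the latent."""
    s1_steps_per_s2 = s1_hz // s2_hz
    for _ in range(seconds * s2_hz):
        latent = s2_plan(instruction, scene)   # slow re-plan
        for _ in range(s1_steps_per_s2):       # fast control ticks
            joints = s1_act(latent, joints)
    return joints
```

The design benefit the article mentions falls out of this structure: because S1 only sees a latent, either system can be retrained or upgraded independently as long as the latent interface stays fixed.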
You can see Helix in action in Figure AI’s video; prepare to be blown away.
Image Credit: Figure AI