Today's AI/ML headlines are brought to you by ThreatPerspective

Digital Event Horizon

NVIDIA Unveils Cosmos Policy for Advanced Robot Control: A Breakthrough in Robot Learning


NVIDIA's latest innovation in robotics aims to improve performance and efficiency in advanced robotic manipulation tasks. Learn more about Cosmos Policy, a state-of-the-art robot control and planning policy that uses video generation capabilities to enable robots to understand how scenes evolve over time.

  • NVIDIA introduces Cosmos Policy, a state-of-the-art robot control and planning policy.
  • Cosmos Policy represents data in a unique way by treating actions, states, and success scores as additional latent frames learned using video generation techniques.
  • The model inherits the world foundation model's pre-learned understanding of physics, gravity, and scene dynamics, enabling it to predict action chunks for robot movement, future observations for world modeling, and expected returns for planning.
  • Cosmos Policy can be deployed as a direct policy or planning policy, with real-world demonstrations on bimanual manipulation tasks using the ALOHA robot platform.
  • NVIDIA is hosting the Cosmos Cookoff hackathon and releasing the Cosmos Cookbook for developers to explore and build upon.


  • NVIDIA has taken a significant step forward in the field of robot learning and control by introducing its latest research, known as Cosmos Policy. This new policy is designed to improve the performance and efficiency of robots in various environments and applications, with a focus on advanced robotic manipulation tasks.

    Cosmos Policy is a state-of-the-art robot control and planning policy built on the NVIDIA Cosmos Predict-2 world foundation model. Because the foundation model is trained to predict future frames, it learns how scenes evolve over time and how to model temporal dynamics in video. That capability is directly relevant to robot control, where actions must account for how the environment and the robot's own state change over time.
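
    To make the idea concrete, the sketch below (not NVIDIA's code; all module names, shapes, and the update rule are illustrative assumptions) shows the general pattern of a video world model: condition on the latent frames observed so far and iteratively denoise a block of future latent frames.

import torch
import torch.nn as nn

class ToyDenoiser(nn.Module):
    """Stand-in for a real video diffusion backbone (illustrative only)."""
    def __init__(self, latent_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 1, 256), nn.SiLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, noisy_future, context, t):
        # Condition on a summary of the observed context frames and the noise level t.
        cond = context.mean(dim=1, keepdim=True).expand_as(noisy_future)
        t_feat = t.view(1, 1, 1).expand(*noisy_future.shape[:2], 1)
        return self.net(torch.cat([noisy_future + cond, t_feat], dim=-1))

@torch.no_grad()
def predict_future_frames(model, context, num_future=8, steps=10):
    """Start from noise and iteratively denoise toward predicted future latents."""
    b, _, d = context.shape
    x = torch.randn(b, num_future, d)            # future frames start as pure noise
    for i in reversed(range(steps)):
        t = torch.tensor(i / steps)
        pred = model(x, context, t)              # model's guess at the clean latents
        x = x + (pred - x) / (i + 1)             # crude step toward that guess
    return x

model = ToyDenoiser()
past_frames = torch.randn(1, 4, 64)              # 4 observed latent frames
future_frames = predict_future_frames(model, past_frames)
print(future_frames.shape)                       # torch.Size([1, 8, 64])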

    The breakthrough of Cosmos Policy lies in its ability to represent data in a unique way. Instead of building separate neural networks for the robot's perception and control, it treats robot actions, physical states, and success scores just like frames in a video. All of these are encoded as additional latent frames that are learned using the same diffusion process as video generation.

    This allows the model to inherit its pre-learned understanding of physics, gravity, and how scenes evolve over time. As a result, a single model can predict action chunks to guide the robot's movement through hand-eye coordination (i.e., visuomotor control), predict future robot observations for world modeling, and predict expected returns (i.e., a value function) for planning.
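
    A rough way to picture this representation, under the assumption that every quantity is embedded as a fixed-width latent "frame" (the sizes and helper functions below are purely illustrative, not NVIDIA's implementation), is to pack video latents, an action chunk, the robot state, and a value token into one sequence that the diffusion model denoises jointly, then split the result back apart:

import torch

LATENT_DIM = 64      # width of every latent "frame" (assumed)
NUM_VIDEO = 8        # predicted future video latent frames
NUM_ACTION = 4       # action-chunk "frames" (e.g. 4 future control steps)
NUM_STATE = 1        # proprioceptive-state "frame"
NUM_VALUE = 1        # expected-return (value) "frame"

def pack_sequence(video, actions, state, value):
    """Concatenate heterogeneous quantities into one latent-frame sequence."""
    return torch.cat([video, actions, state, value], dim=1)

def unpack_sequence(seq):
    """Split a jointly denoised sequence back into its components."""
    return torch.split(seq, [NUM_VIDEO, NUM_ACTION, NUM_STATE, NUM_VALUE], dim=1)

# Pretend this came out of a shared diffusion sampling loop over all frames.
denoised = torch.randn(1, NUM_VIDEO + NUM_ACTION + NUM_STATE + NUM_VALUE, LATENT_DIM)
video, action_chunk, state, value = unpack_sequence(denoised)

print(action_chunk.shape)   # torch.Size([1, 4, 64]) -> decode into robot commands
print(video.shape)          # torch.Size([1, 8, 64]) -> decode into predicted frames
print(value.shape)          # torch.Size([1, 1, 64]) -> decode into an expected return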

    Cosmos Policy can be deployed either as a direct policy, where only actions are generated at inference time, or as a planning policy, where multiple candidate actions are evaluated by predicting their resulting future states and values.
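
    In planning mode, this amounts to a sample-and-score loop. The sketch below is a hedged illustration of that loop; sample_candidates and predict_outcome are hypothetical stand-ins for whatever interface the real model exposes, and the tensors are placeholders.

import torch

def sample_candidates(model, observation, num_candidates=8):
    """Hypothetical: draw several candidate action chunks from the policy."""
    return [torch.randn(4, 7) for _ in range(num_candidates)]   # 4 steps x 7-DoF arm

def predict_outcome(model, observation, action_chunk):
    """Hypothetical: predict future observations and an expected return (value)."""
    predicted_future = torch.randn(8, 64)     # placeholder future latent frames
    predicted_value = torch.rand(()).item()   # placeholder scalar return
    return predicted_future, predicted_value

def plan(model, observation):
    """Planning-policy mode: score candidates by predicted value, keep the best."""
    candidates = sample_candidates(model, observation)
    scored = [(predict_outcome(model, observation, a)[1], a) for a in candidates]
    best_value, best_action = max(scored, key=lambda pair: pair[0])
    return best_action, best_value

best_action, best_value = plan(model=None, observation=torch.randn(64))
print(best_action.shape, round(best_value, 3))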

    The researchers have also demonstrated the effectiveness of Cosmos Policy on real-world bimanual manipulation tasks using the ALOHA robot platform. The policy successfully executes long-horizon manipulation tasks directly from visual observations.

    In addition to its technical capabilities, Cosmos Policy represents an early step toward adapting world foundation models for robot control and planning. The researchers are actively working with early adopters to evolve this research for the robotics community.

    To support hands-on experimentation with Cosmos world foundation models (WFMs), NVIDIA is announcing the Cosmos Cookoff, an open hackathon where developers can work directly with the models and push the boundaries of physical AI.

    The latest Cosmos Cookoff is live, inviting physical AI developers across robotics, autonomous vehicles, and video analytics to explore, prototype fast, and learn with experts. The competition runs from January 29th to February 26th, and teams can include up to four members. Prizes include a $5,000 cash award, an NVIDIA DGX Spark, an NVIDIA GeForce RTX 5090 GPU, and more.

    NVIDIA has also released the Cosmos Cookbook, which provides practical recipes for adopting the Cosmos models and building on them for your own use cases. Developers can explore the new open Cosmos models and datasets on Hugging Face and GitHub, or try the models on build.nvidia.com.
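
    For example, pulling one of the open checkpoints locally can be as simple as a snapshot download with the standard huggingface_hub client; the repository id used here is an assumption, so check the NVIDIA organization on Hugging Face for the exact model names referenced by the Cookbook.

from huggingface_hub import snapshot_download

# The repo id below is a placeholder assumption, not a confirmed model name.
local_dir = snapshot_download(
    repo_id="nvidia/Cosmos-Predict2-2B-Video2World",   # placeholder repo id
    allow_patterns=["*.json", "*.safetensors"],        # skip large extras
)
print("Checkpoint downloaded to:", local_dir)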

    Throughout February, NVIDIA will be hosting live tutorials, partner talks, and AMAs featuring industry leaders including Intbot, Milestone Systems, and Nebius. Developers can also join the Cosmos Discord channel to become part of the community and learn how to contribute to the Cosmos Cookbook.

    In summary, NVIDIA's Cosmos Policy represents a significant breakthrough in robot learning and control. By folding actions, states, and values into a video generation model, Cosmos Policy lets a single model understand how scenes evolve over time while predicting robot actions, future observations, and expected returns. The new policy has been demonstrated on real-world bimanual manipulation tasks using the ALOHA robot platform, and it is available for developers to explore and build upon.



    Related Information:
  • https://www.digitaleventhorizon.com/articles/NVIDIA-Unveils-Cosmos-Policy-for-Advanced-Robot-Control-A-Breakthrough-in-Robot-Learning-deh.shtml

  • https://huggingface.co/blog/nvidia/cosmos-policy-for-robot-control

  • https://research.nvidia.com/labs/dir/cosmos-policy/cosmos_policy_index.html


  • Published: Thu Jan 29 11:32:57 2026 by llama3.2 3B Q4_K_M











    © Digital Event Horizon. All rights reserved.
