Projects
Projects and publications I have worked on. Most of my work is open-source — see my GitHub and Google Scholar for the full list.
Research Publications
NVIDIA Research — GEAR (2024–Present)
- DreamZero: World Action Models are Zero-shot Policies (2026)
  - A world action model focused on diverse task generalization, achieving top results on the MolmoSpace and RoboArena benchmarks.
- EgoScale: Scaling Human Video to Unlock Dexterous Robot Intelligence (2026)
  - Studies how scaling egocentric human video data improves robot dexterous manipulation performance.
- DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos (2026)
  - An action-conditioned world model trained on large-scale human and robot data for diverse manipulation tasks.
- FLARE: Robot Learning with Implicit World Modeling (2025)
  - Integrates implicit world modeling into robot policy learning to improve sample efficiency and cross-task generalization.
- DreamGen: Unlocking Generalization in Robot Learning through Neural Trajectories (2025)
  - Uses a video world model to synthesize neural trajectories, augmenting robot policy training without additional real-world data collection.

- GR00T N1: An Open Foundation Model for Generalist Humanoid Robots (2025)
  - NVIDIA’s Vision-Language-Action (VLA) foundation model for generalist humanoid robots, trained on large-scale synthetic, human, and robot data. Core lead of the open-source effort for Isaac GR00T N1.x. [paper] [code]
Berkeley AI Research — RAIL Lab (2023–2024)

- AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World (2025)
  - A self-resetting real-world evaluation system that autonomously benchmarks robot manipulation policies, eliminating the need for human resets. [paper]
- SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning (ICRA 2024)
  - Full-stack software suite for off-policy and offline RL directly on real robots, achieving high sample efficiency for contact-rich manipulation tasks. [code]
- Octo: An Open-Source Generalist Robot Policy (RSS 2024)
  - Transformer-based generalist robot policy pretrained on 800k+ episodes from the Open X-Embodiment dataset, supporting flexible cross-embodiment fine-tuning. [code]
Georgia Tech — Healthcare Robotics Lab (2022–2023)

- ForceSight: Multi-Task Text-Guided Mobile Manipulation with Visual-Force Goals (ICRA 2024 / CoRL 2023)
  - Predicts 3D visual-force affordance goals from RGBD images and language, enabling dexterous mobile manipulation across diverse tasks.
- Stretch with Stretch: Physical Therapy Exercise Games Led by a Mobile Manipulator (ICRA 2024)
  - Mobile manipulator system that guides Parkinson’s patients through physical therapy exercises, combining assistive robotics with human-robot interaction.
Robotics 🤖
Robotics Middleware Framework (Open-RMF)

Core contributor to Open-RMF, a distributed open-source framework for managing heterogeneous robot fleets — handling task allocation, scheduling, and multi-robot path conflict resolution at scale.
- RMF core libraries (C++ / Python): [repo]
- RMF simulation demos: [repo]
- RMF Web App (full-stack): [repo]
- Cloud deployment template (Docker, Kubernetes, ArgoCD): [repo]
Robots and Simulation

- Fiducial Marker Based Auto Docking — vision-based autonomous docking using ArUco markers for mobile robots.
- Dual Arm Manipulation Workcell — multi-robot manipulator workcell in Gazebo with coordinated motion planning.
Miscellaneous

- 3D SLAM to 2D Occupancy Map — generates 2D navigation maps from 3D SLAM point clouds for heterogeneous AGV fleets.
- Agnostic Camera Driver (ROS/ROS2) — CMake-based hardware-agnostic depth camera driver.
- Cargo Volume Detection — multi-camera 3D point cloud detection system for aviation cargo handling (closed source).
Machine Learning 🦾

- ManipulatorGym — gym-like environment and utilities for robot manipulators; includes data collection pipelines for Octo and OpenVLA fine-tuning.
- Agentlace — lightweight distributed async agent framework for imitation learning and RL across multiple processes and machines.
- TennisBot RL — tennis gameplay agent trained with reinforcement learning and evolutionary search.
- UAV Landing NLP — drone landing site selection from aerial images guided by natural language instructions.
- Distributed MapReduce (C++) — built from scratch and deployed to Azure with Kubernetes.
- NewsGPT — news headline summarization using OpenAI + LangChain, hosted on Azure.