Projects

Projects and publications I have worked on. Most of my work is open source; see my GitHub and Google Scholar for the full list.

Research Publications

NVIDIA Research — GEAR (2024–Present)

DreamZero: World Action Models are Zero-shot Policies (2026)
- A world action model focused on diverse task generalization, achieving top results on the MolmoSpace and RoboArena benchmarks.
EgoScale: Scaling Human Video to Unlock Dexterous Robot Intelligence (2026)
- Studies how scaling egocentric human video data improves robot dexterous manipulation performance.
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos (2026)
- An action-conditioned world model trained on large-scale human and robot data for diverse manipulation tasks.
FLARE: Robot Learning with Implicit World Modeling (2025)
- Integrates implicit world modeling into robot policy learning to improve sample efficiency and cross-task generalization.
DreamGen: Unlocking Generalization in Robot Learning through Neural Trajectories (2025)
- Uses a video world model to synthesize neural trajectories, augmenting robot policy training without additional real-world data collection.

GR00T N1: An Open Foundation Model for Generalist Humanoid Robots (2025)
- NVIDIA’s Vision-Language-Action (VLA) foundation model for generalist humanoid robots, trained on large-scale synthetic, human, and robot data. Core lead of the open-source effort for Isaac GR00T N1.x. [paper] [code]

Berkeley AI Research — RAIL Lab (2023–2024)

AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World (2025)
- A self-resetting real-world evaluation system that autonomously benchmarks robot manipulation policies, eliminating the need for human resets. [paper]
SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning (ICRA 2024)
- Full-stack software suite for off-policy and offline RL directly on real robots, achieving high sample efficiency for contact-rich manipulation tasks. [code]
Octo: An Open-Source Generalist Robot Policy (RSS 2024)
- Transformer-based generalist robot policy pretrained on 800k+ episodes from the Open X-Embodiment dataset, supporting flexible cross-embodiment fine-tuning. [code]

Georgia Tech — Healthcare Robotics Lab (2022–2023)

ForceSight: Multi-Task Text-Guided Mobile Manipulation with Visual-Force Goals (ICRA 2024 / CoRL 2023)
- Predicts 3D visual-force affordance goals from RGB-D images and language, enabling dexterous mobile manipulation across diverse tasks.
Stretch with Stretch: Physical Therapy Exercise Games Led by a Mobile Manipulator (ICRA 2024)
- Mobile manipulator system that guides Parkinson’s patients through physical therapy exercises, combining assistive robotics with human-robot interaction.

Robotics 🤖

Robotics Middleware Framework (Open-RMF)

Core contributor to Open-RMF, a distributed open-source framework for managing heterogeneous robot fleets — handling task allocation, scheduling, and multi-robot path conflict resolution at scale.

RMF core libraries (C++ / Python): [repo]
RMF simulation demos: [repo]
RMF Web App (full-stack): [repo]
Cloud deployment template (Docker, Kubernetes, ArgoCD): [repo]

Robots and Simulation

Fiducial Marker Based Auto Docking — vision-based autonomous docking using ArUco markers for mobile robots.
Dual Arm Manipulation Workcell — multi-robot manipulator workcell in Gazebo with coordinated motion planning.

Miscellaneous

3D SLAM to 2D Occupancy Map — generates 2D navigation maps from 3D SLAM point clouds for heterogeneous AGV fleets.
Agnostic Camera Driver (ROS/ROS2) — CMake-based hardware-agnostic depth camera driver.
Cargo Volume Detection — multi-camera 3D point cloud detection system for aviation cargo handling (closed source).

Machine Learning 🦾

ManipulatorGym — gym-like environment and utilities for robot manipulators; includes data collection pipelines for Octo and OpenVLA fine-tuning.
Agentlace — lightweight distributed async agent framework for imitation learning and RL across multiple processes and machines.
TennisBot RL — tennis gameplay agent trained with reinforcement learning and evolutionary search.
UAV Landing NLP — drone landing site selection from aerial images guided by natural language instructions.
Distributed MapReduce (C++) — built from scratch and deployed to Azure with Kubernetes.
NewsGPT — news headline summarization using OpenAI and LangChain, hosted on Azure.

You Liang, Tan