Xuanlin (Simon) Li

I am a first-year PhD student at UCSD CSE, advised by Prof. Hao Su. Previously, I was an undergraduate majoring in Mathematics and Computer Science at UC Berkeley (2017-2021). I was also an undergraduate research assistant at Berkeley Artificial Intelligence Research, where I was advised by Prof. Trevor Darrell.

Resume  /  GitHub  /  Google Scholar  /  LinkedIn  /  Twitter



I am interested in embodied AI, which combines perspectives from computer vision, deep reinforcement learning, and natural language processing to enable robots to acquire concepts and generalizable skills. I am also interested in self-supervised and unsupervised representation learning, neural network architecture learning, and optimization.
(* = equal first-author contribution)


ManiSkill: Generalizable Manipulation Skill Benchmark with Large-Scale Demonstrations

Tongzhou Mu*, Zhan Ling*, Fanbo Xiang*, Derek Yang*, Xuanlin Li*, Stone Tao, Zhiao Huang, Zhiwei Jia, Hao Su
Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2021
arxiv / website / video / code / implementation

Object manipulation from 3D visual inputs poses many challenges for building generalizable perception and policy models. However, 3D assets in existing benchmarks mostly lack the diversity of 3D shapes needed to match real-world intra-class complexity in topology and geometry. Here we propose the SAPIEN Manipulation Skill Benchmark (ManiSkill) to benchmark manipulation skills over diverse objects in a full-physics simulator. 3D assets in ManiSkill include large intra-class topological and geometric variations. Tasks are carefully chosen to cover distinct types of manipulation challenges. Recent progress in 3D vision also leads us to believe that the benchmark should be customized so that the challenge is inviting to researchers working on 3D deep learning. To this end, we simulate a moving panoramic camera that returns ego-centric point clouds or RGB-D images. In addition, we would like ManiSkill to serve a broad set of researchers interested in manipulation research. Besides supporting the learning of policies from interactions, we also support learning-from-demonstrations (LfD) methods by providing a large number of high-quality demonstrations (~36,000 successful trajectories, ~1.5M point cloud/RGB-D frames in total). We provide baselines using 3D deep learning and LfD algorithms. All code of our benchmark (simulator, environment, SDK, and baselines) is open-sourced, and a challenge aimed at interdisciplinary researchers will be held based on the benchmark.


Discovering Non-Monotonic Autoregressive Orderings with Variational Inference

Xuanlin Li*, Brandon Trabucco*, Dong Huk Park, Yang Gao, Michael Luo, Sheng Shen, Trevor Darrell
International Conference on Learning Representations (ICLR) 2021
paper / video_transcripts / code / poster / slides

We propose the first domain-independent unsupervised / self-supervised learner that discovers high-quality autoregressive orders through fully-parallelizable end-to-end training in a data-driven manner, with no domain knowledge required. The learner contains an encoder network and a decoder language model that perform variational inference with autoregressive orders (represented as permutation matrices) as latent variables. The corresponding ELBO is not differentiable, so we develop a practical algorithm for end-to-end optimization using policy gradients. We implement the encoder as a Transformer with non-causal attention that outputs permutations in one forward pass. These permutations then serve as target generation orders for training an insertion-based Transformer language model. Empirical results on language modeling tasks demonstrate that our method is context-aware and discovers orderings that are competitive with or even better than fixed orders.
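
As a rough sketch of the objective described above, the ELBO and a score-function (policy-gradient) estimator for the encoder can be written as follows. The notation is assumed here for illustration and is not taken from the paper: y is the target sequence, z is a generation order (a permutation), q_phi(z | y) is the encoder, p_theta(y, z) is the decoder language model, and b is an optional baseline for variance reduction.

```latex
% Assumed notation (illustrative only): y = target sequence, z = permutation,
% q_\phi = encoder, p_\theta = decoder, b = baseline independent of z.
\begin{align}
\log p_\theta(y)
  &\ge \mathbb{E}_{z \sim q_\phi(z \mid y)}
       \bigl[ \log p_\theta(y, z) - \log q_\phi(z \mid y) \bigr]
   =: \mathrm{ELBO}(\theta, \phi), \\
\nabla_\phi \, \mathrm{ELBO}(\theta, \phi)
  &= \mathbb{E}_{z \sim q_\phi(z \mid y)}
     \bigl[ \bigl( \log p_\theta(y, z) - \log q_\phi(z \mid y) - b \bigr)
            \, \nabla_\phi \log q_\phi(z \mid y) \bigr].
\end{align}
```

Because z is discrete, the expectation cannot be differentiated through sampled permutations directly. The second line instead treats the bracketed term as a reward and weights the gradient of the encoder's log-probability by it, which is the standard score-function identity. The baseline b leaves the expectation unchanged because the expected gradient of log q_phi is zero.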


Regularization Matters in Policy Optimization - An Empirical Study on Continuous Control

Zhuang Liu*, Xuanlin Li*, Bingyi Kang, Trevor Darrell
International Conference on Learning Representations (ICLR) 2021 (Spotlight)
arxiv / video / code / poster / slides

We present the first comprehensive study of regularization techniques with multiple policy optimization algorithms on continuous control tasks. We show that conventional regularization methods from supervised learning, which have been largely ignored in RL methods, can be very effective in policy optimization on continuous control tasks, and that this finding is robust to variations in training hyperparameters. We also analyze why they can help policy generalization from the perspectives of sample complexity, return distribution, weight norm, and noise robustness.

Other Projects

These include coursework, side projects, and unpublished research.


Inferring the Optimal Policy using Markov Chain Monte Carlo

Brandon Trabucco, Albert Qu, Xuanlin Li, Ganeshkumar Ashokavardhanan
Berkeley EECS 126 (Probability and Random Processes)
arxiv

Final course project for EECS 126 (Probability and Random Processes) in Fall 2018.

Honors and Awards

Jacobs School of Engineering PhD Fellowship, UC San Diego CSE, 2021
Arthur M. Hopkin Award, UC Berkeley EECS, 2021
EECS Honors Program & Mathematics Honors Program, UC Berkeley

Design and source code from Jon Barron's website