Long (Tony) Lian

I am an EECS PhD student at UC Berkeley and BAIR, advised by Prof. Adam Yala and Prof. Trevor Darrell. My research focuses on developing LLMs/VLMs with reasoning capabilities through RL. I have interned with the Deep Imagination Research team at NVIDIA Research and with Baidu’s Distributed Deep Learning team. I hold a B.A. in Computer Science from UC Berkeley, where I conducted undergraduate research under the supervision of Prof. Stella Yu.

Publications (*: equal contribution)

Describe Anything: Detailed Localized Image and Video Captioning

Long Lian, Yifan Ding, Yunhao Ge, Sifei Liu, Hanzi Mao, Boyi Li, Marco Pavone, Ming-Yu Liu, Trevor Darrell, Adam Yala, Yin Cui

Paper Project Demo Code BibTex TL;DR

Learning Adaptive Parallel Reasoning with Language Models

Jiayi Pan*, Xiuyu Li*, Long Lian*, Charlie Victor Snell, Yifei Zhou, Adam Yala, Trevor Darrell, Kurt Keutzer, Alane Suhr

Paper Code BibTex TL;DR

Atlas: Multi-Scale Attention Improves Long Context Image Modeling

Kumar Krishna Agrawal*, Long Lian*, Longchao Liu, Natalia Harguindeguy, Boyi Li, Alexander Bick, Maggie Chung, Trevor Darrell, Adam Yala

Paper Code BibTex TL;DR

CrossMAE: Rethinking Patch Dependence for Masked Autoencoders

Letian Fu*, Long Lian*, Renhao Wang, Baifeng Shi, Xudong Wang, Adam Yala†, Trevor Darrell†, Alexei A. Efros†, Ken Goldberg†

Transactions on Machine Learning Research (TMLR), 2025

Paper Project Code BibTex TL;DR

Unsupervised Universal Image Segmentation

Dantong Niu*, Xudong Wang*, Xinyang Han*, Long Lian, Roei Herzig, Trevor Darrell

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

Paper Code BibTex TL;DR

Self-correcting LLM-controlled Diffusion Models

Tsung-Han Wu*, Long Lian*, Joseph E Gonzalez, Boyi Li, Trevor Darrell

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

Paper Project Code BibTex TL;DR

LLM-grounded Video Diffusion Models

Long Lian*, Baifeng Shi*, Adam Yala, Trevor Darrell, Boyi Li

International Conference on Learning Representations (ICLR), 2024

Paper Project Code BibTex TL;DR

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models

Long Lian, Boyi Li, Adam Yala, Trevor Darrell

Transactions on Machine Learning Research (TMLR), 2024 (Featured Certification)

Paper Blog Project Demo Code Poster BibTex TL;DR

Q-Diffusion: Quantizing Diffusion Models

Xiuyu Li, Yijiang Liu, Long Lian, Huanrui Yang, Zhen Dong, Daniel Kang, Shanghang Zhang, Kurt Keutzer

International Conference on Computer Vision (ICCV), 2023

Paper Project Code BibTex TL;DR

Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping

Long Lian, Zhirong Wu, Stella X. Yu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023

Paper Project Video Demo Video Code Poster BibTex TL;DR

Unsupervised Selective Labeling for More Effective Semi-Supervised Learning

Xudong Wang*, Long Lian*, Stella X. Yu

European Conference on Computer Vision (ECCV), 2022

Paper Video Code Poster BibTex TL;DR

Debiased Learning from Naturally Imbalanced Pseudo-Labels

Xudong Wang, Zhirong Wu, Long Lian, Stella X. Yu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022

Paper Code BibTex TL;DR

Unsupervised Visual Attention and Invariance for Reinforcement Learning

Xudong Wang*, Long Lian*, Stella X. Yu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021

Paper Code Poster BibTex TL;DR

Long-tailed Recognition by Routing Diverse Distribution-aware Experts

Xudong Wang, Long Lian, Zhongqi Miao, Ziwei Liu, Stella X. Yu

International Conference on Learning Representations (ICLR), 2021 (Spotlight)

Paper Video Code Poster BibTex TL;DR

Academic Services

Reviewer for CVPR/ECCV/ICCV/ICLR/ICML/NeurIPS/AAAI

Side Projects

Stable Diffusion XL Demo WebUI

A Gradio-based WebUI for trying out SDXL for free, either locally or on Colab.

Code Demo

AnimeGAN.js

An implementation of AnimeGAN in tf.js that converts photos to anime style online.

Code Demo

Rainbow

An implementation of the Rainbow algorithm with the PARL reinforcement learning framework.

Code