Long (Tony) Lian

I am a Member of Technical Staff at Thinking Machines Lab, where I work on post-training LLMs/MLLMs, including interaction models. I obtained my PhD in EECS at UC Berkeley and BAIR, advised by Prof. Trevor Darrell and Prof. Adam Yala. My research primarily focuses on developing multi-agent systems with real-time, multi-stream inputs through end-to-end RL. In 2025, I was a research scientist intern at Meta GenAI with Victoria Lin and Yuandong Tian, working on RL for reasoning LLMs. In 2024, I interned with the Deep Imagination Research team at NVIDIA Research. I hold a B.A. in Computer Science from UC Berkeley, where I conducted research under the supervision of Prof. Stella Yu. As an undergraduate, I also interned with Baidu’s Distributed Deep Learning team.

Publications (*: equal contribution)

ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models

Long Lian, Sida Wang, Felix Juefei-Xu, Tsu-Jui Fu, Xiuyu Li, Adam Yala, Trevor Darrell, Alane Suhr, Yuandong Tian, Xi Victoria Lin

International Conference on Machine Learning (ICML), 2026 (Spotlight)

Learning Adaptive Parallel Reasoning with Language Models

Jiayi Pan*, Xiuyu Li*, Long Lian*, Charlie Victor Snell, Yifei Zhou, Adam Yala, Trevor Darrell, Kurt Keutzer, Alane Suhr

Conference on Language Modeling (COLM), 2026

Describe Anything: Detailed Localized Image and Video Captioning

Long Lian, Yifan Ding, Yunhao Ge, Sifei Liu, Hanzi Mao, Boyi Li, Marco Pavone, Ming-Yu Liu, Trevor Darrell, Adam Yala, Yin Cui

International Conference on Computer Vision (ICCV), 2025

CrossMAE: Rethinking Patch Dependence for Masked Autoencoders

Letian Fu*, Long Lian*, Renhao Wang, Baifeng Shi, Xudong Wang, Adam Yala†, Trevor Darrell†, Alexei A. Efros†, Ken Goldberg†

Transactions on Machine Learning Research (TMLR), 2025

Self-correcting LLM-controlled Diffusion Models

Tsung-Han Wu*, Long Lian*, Joseph E. Gonzalez, Boyi Li, Trevor Darrell

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

LLM-grounded Video Diffusion Models

Long Lian*, Baifeng Shi*, Adam Yala, Trevor Darrell, Boyi Li

International Conference on Learning Representations (ICLR), 2024

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models

Long Lian, Boyi Li, Adam Yala, Trevor Darrell

Transactions on Machine Learning Research (TMLR), 2024 (Featured Certification)

Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing

Baifeng Shi*, Stephanie Fu*, Long Lian, Hanrong Ye, David Eigen, Aaron Reite, Boyi Li, Jan Kautz, Song Han, David M. Chan, Pavlo Molchanov, Trevor Darrell, Hongxu Yin

V1: Unifying Generation and Self-Verification for Parallel Reasoners

Harman Singh*, Xiuyu Li*, Kusha Sareen, Monishwaran Maheswaran, Sijun Tan, Xiaoxia Wu, Junxiong Wang, Alpay Ariyak, Qingyang Wu, Samir Khaki, Rishabh Tiwari, Long Lian, Yucheng Lu, Boyi Li, Alane Suhr, Ben Athiwaratkun, Kurt Keutzer

International Conference on Machine Learning (ICML), 2026

VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents

Zirui Wang*, Junyi Zhang*, Jiaxin Ge*, Long Lian, Letian Fu, Lisa Dunlap, Ken Goldberg, Xudong Wang, Ion Stoica, David M. Chan, Sewon Min, Joseph E. Gonzalez

ICLR Workshop on Multimodal Intelligence, 2026 (Best Paper Award)

Visually Prompted Benchmarks Are Surprisingly Fragile

Haiwen Feng*, Long Lian*, Lisa Dunlap*, Jiahao Shu, XuDong Wang, Renhao Wang, Trevor Darrell, Alane Suhr, Angjoo Kanazawa

Pillar-0: A New Frontier for Radiology Foundation Models

Kumar Krishna Agrawal, Longchao Liu, Long Lian, Michael Nercessian, Natalia Harguindeguy, Yufu Wu, Peter Mikhael, Gigin Lin, Lecia V. Sequist, Florian Fintelmann, Trevor Darrell, Yutong Bai, Maggie Chung, Adam Yala

Constantly Improving Image Models Need Constantly Improving Benchmarks

Jiaxin Ge*, Grace Luo*, Heekyung Lee, Nishant Malpani, Long Lian, XuDong Wang, Aleksander Holynski, Trevor Darrell, Sewon Min, David M. Chan

International Conference on Learning Representations (ICLR), 2026

TULIP: Towards Unified Language-Image Pretraining

Zineng Tang, Long Lian, Seun Eisape, XuDong Wang, Roei Herzig, Adam Yala, Alane Suhr, Trevor Darrell, David M. Chan

International Conference on Computer Vision MMFM4 Workshop, 2025

Atlas: Multi-Scale Attention Improves Long Context Image Modeling

Kumar Krishna Agrawal*, Long Lian*, Longchao Liu, Natalia Harguindeguy, Boyi Li, Alexander Bick, Maggie Chung, Trevor Darrell, Adam Yala

Unsupervised Universal Image Segmentation

Dantong Niu*, Xudong Wang*, Xinyang Han*, Long Lian, Roei Herzig, Trevor Darrell

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

Q-Diffusion: Quantizing Diffusion Models

Xiuyu Li, Yijiang Liu, Long Lian, Huanrui Yang, Zhen Dong, Daniel Kang, Shanghang Zhang, Kurt Keutzer

International Conference on Computer Vision (ICCV), 2023

Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping

Long Lian, Zhirong Wu, Stella X. Yu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023

Unsupervised Selective Labeling for More Effective Semi-Supervised Learning

Xudong Wang*, Long Lian*, Stella X. Yu

European Conference on Computer Vision (ECCV), 2022

Debiased Learning from Naturally Imbalanced Pseudo-Labels

Xudong Wang, Zhirong Wu, Long Lian, Stella X. Yu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022

Unsupervised Visual Attention and Invariance for Reinforcement Learning

Xudong Wang*, Long Lian*, Stella X. Yu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021

Long-tailed Recognition by Routing Diverse Distribution-aware Experts

Xudong Wang, Long Lian, Zhongqi Miao, Ziwei Liu, Stella X. Yu

International Conference on Learning Representations (ICLR), 2021 (Spotlight)

Side Projects

Stable Diffusion XL Demo WebUI

A Gradio-based WebUI for experimenting with SDXL locally or on Colab for free.

AnimeGAN.js

A tf.js implementation of AnimeGAN that converts photos to anime style online.

Rainbow

An implementation of the Rainbow algorithm with the PARL reinforcement learning framework.