Long (Tony) Lian
About
I am an EECS PhD student at UC Berkeley and BAIR, advised by Prof. Adam Yala and Prof. Trevor Darrell. I am also an intern with the Deep Imagination Research team at NVIDIA Research. I hold a B.A. in Computer Science from UC Berkeley, where I conducted research under the supervision of Prof. Stella Yu during my undergraduate studies. My research primarily focuses on developing multi-modal vision-language models aimed at understanding and generation across text, image, and video data. I also interned with Baidu’s distributed deep learning team during my undergraduate years.
Email / Google Scholar / Twitter / LinkedIn / GitHub
Publications (*: equal contribution)
CrossMAE: Rethinking Patch Dependence for Masked Autoencoders
Paper / Project Page / Code / BibTex / TL;DR
LLM-grounded Video Diffusion Models
International Conference on Learning Representations (ICLR), 2024
Paper / Project Page / Code / BibTex / TL;DR
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
Transactions on Machine Learning Research (TMLR)
Paper / Blog Post / Project Page / Demo / Code / Poster / BibTex / TL;DR
Q-Diffusion: Quantizing Diffusion Models
International Conference on Computer Vision (ICCV), 2023
Paper / Project Page / Code / BibTex / TL;DR
Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Paper / Project Page / Video / Demo Video / Code / Poster / BibTex / TL;DR
Unsupervised Selective Labeling for More Effective Semi-Supervised Learning
European Conference on Computer Vision (ECCV), 2022
Debiased Learning from Naturally Imbalanced Pseudo-Labels
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
Unsupervised Visual Attention and Invariance for Reinforcement Learning
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021
Academic Services
Reviewer for CVPR/ECCV/ICCV/ICLR/ICML/NeurIPS
Side Projects
Stable Diffusion XL Demo WebUI: A Gradio-based WebUI for experimenting with SDXL for free, either locally or on Colab.
AnimeGAN.js: An implementation of AnimeGAN with tf.js that converts photos to anime style online.
Rainbow: An implementation of the Rainbow algorithm with the PARL reinforcement learning framework.