Jiawei Yang
CS PhD student at USC
I build simple and scalable generative models—spanning images, video, actions, and unified multimodal systems—that can perceive, understand, and generate the world.
Current focus: visual generation and unified multimodal models.
I'm a third-year CS PhD student at USC, advised by Yue Wang (he is super nice!). My research studies generative models as the interface between perception and reasoning. I aim for models that unify visual and textual modalities under a shared representation, reducing complexity while scaling efficiently and robustly. The goal is a single model that generalizes across domains, remains stable at scale, and is useful in the wild.
Earlier, I worked on representation learning and 3D/4D scene understanding—self-supervised learning in medical and natural images, few-shot NeRF, self-supervised dynamic scene decomposition, and improving ViT feature quality by removing grid artifacts.
Principles I care about:
- Simplicity that scales: fewer modules, stronger scaling laws.
- Unification: one model and representation for many modalities.
- Controllability: precise conditioning and editing beyond sampling.
- Efficiency: stable training, reasonable compute, minimal tuning.
"Make things as simple as possible, but not simpler."
At the heart of my research philosophy is the pursuit of simplicity and scalability: ideas that are minimal yet powerful enough to scale and to shape the future of intelligent systems.