Ph.D. Candidate, Yale University
chen.liu.cl2482 at yale.edu
New Haven, CT & Mountain View, CA.
Google Scholar
LinkedIn
X
GitHub
Resume
Acknowledgements: Many thanks to Zhuang Liu for kindly providing this website template, which was adapted from Zhe Cao's website.
Chen Liu
I am a Ph.D. candidate in CS at Yale University, advised by Smita Krishnaswamy.
Research
My research focuses on the geometry of learned
latent representations, investigating how neural networks organize
information internally, when that organization breaks down, and how to fix it.
For LLMs, I identified and systematically
characterized embedding condensation, a geometric phenomenon
where token embeddings collapse into a narrow cone, and showed how to mitigate it to
improve generalization in pre- and mid-training
(ICML 2026).
For MLLMs, I co-developed VIGIL,
an RL post-training framework that counters
visual laziness by penalizing overreliance on language priors when
visual evidence is present (ECCV 2026).
I apply the same perspective to on-manifold generative modeling (RNAGenScape),
multimodal representation learning
(ImmunoStruct @
Nature Machine Intelligence),
unsupervised image segmentation
(CUTS @ MICCAI 2024 and
DiffKillR @ ICASSP 2025 Oral),
and latent dynamics modeling (ImageFlowNet @ ICASSP 2025 Oral).
This summer I am interning at ByteDance Seed, working
on LLM self-evolution and RL post-training.
Mentorship
[2025-2026] Xiangyu Zhang, Tsinghua University junior. Now UC Berkeley PhD (Fall 2026).
[2025-2025] Ethan Zhang, high school student.
[2024-2025] Aryaman Mishra, high school student. Now JHU (Fall 2025).
[2024-2025] Jason Shaye, high school student. Now Stanford (Fall 2025).
Education
Yale University
2022 - present. Ph.D. in Computer Science.
Columbia University
2018 - 2019. M.S. in Electrical Engineering.
Bucknell University
2014 - 2018. B.S. in Electrical Engineering.
Shanghai Foreign Language School
2007 - 2014. Middle and high school.
Experience
Research Scientist Intern @ ByteDance Seed
Summer 2026. LLM self-evolution and RL post-training.
Senior Research Scientist @ GE Healthcare
2021 - 2022. Deep learning for medical imaging.
Research Software Engineer @ Matic
2021. Developed SLAM algorithms for housekeeping robots.
Research Assistant @ Columbia University Medical Center
2019 - 2020. Deep learning for medical imaging.
News
[04/2026] Dispersion loss to counteract embedding condensation in LLMs is accepted to ICML 2026.
[11/2025] ImmunoStruct is accepted to Nature Machine Intelligence.
[09/2025] Brainteaser, a benchmark for LLM reasoning, is accepted to NeurIPS 2025.
[06/2025] RNAGenScape is accepted to the ICML 2025 GenBio Workshop as a Spotlight & Oral.
[01/2025] Geometry-Aware Generative Autoencoder (GAGA) is accepted to AISTATS 2025.
[12/2024] 3/3 papers (2 Oral) are accepted to ICASSP 2025.
[11/2024] I am recognized as a Top Reviewer at NeurIPS 2024.
[06/2024] CUTS, my first Ph.D. project, is accepted to MICCAI 2024.
[08/2022] I quit my industry job to start my Ph.D. journey at the Krishnaswamy Lab, Yale University.
[06/2022] I am recognized as an Outstanding Reviewer at ICML 2022.
Selected Recent Publications (Asterisk denotes co-first authorship)
Dispersion loss counteracts embedding condensation and improves generalization in small language models
I identified and characterized a previously underexplored phenomenon in transformer-based language models that we call embedding condensation: token representations progressively collapse into a narrow cone across layers. Through systematic experiments, we found that larger models are naturally more resistant to this effect, that condensation emerges even at initialization, and that such resistance cannot simply be inherited through knowledge distillation. Motivated by these observations, we explored a simple dispersion regularizer that mitigates representation collapse and improves downstream generalization during both pre- and mid-training.
ICML 2026
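As a rough illustration of what embedding condensation means geometrically, the sketch below measures how tightly a set of vectors clusters into a cone using the mean pairwise cosine similarity, and adds that quantity to a task loss as a penalty. This is a toy stand-in under stated assumptions, not the measurement or the dispersion loss from the paper: the function names, the choice of mean cosine similarity, and the `weight` parameter are all hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_cosine(embeddings):
    """Average cosine similarity over all distinct pairs of embeddings.

    Values near 1 mean the vectors sit in a narrow cone (condensed);
    values near 0 or below mean they are spread out.
    """
    pairs = list(combinations(embeddings, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def regularized_loss(task_loss, embeddings, weight=0.1):
    """Hypothetical objective: task loss plus a penalty on condensation.

    Illustrative only; the paper's regularizer may differ in form.
    """
    return task_loss + weight * mean_pairwise_cosine(embeddings)


# Toy 2-D "token embeddings": one tightly clustered set, one spread-out set.
condensed = [[1.0, 0.0], [0.99, 0.1], [0.98, -0.1]]
dispersed = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

print(mean_pairwise_cosine(condensed))
print(mean_pairwise_cosine(dispersed))
```

On these toy inputs the condensed set scores much higher than the dispersed one, so the penalty term pushes an optimizer toward more spread-out representations.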
Staying VIGILant: mitigating visual laziness via counterfactual visual alignment in MLLMs
I co-developed VIGIL, an RL post-training framework that addresses visual laziness, a common failure mode in multimodal language models (MLLMs) where hallucinations arise from overreliance on language priors rather than visual evidence. VIGIL contrasts the model's behavior when it can see the image with a counterfactual "blind" state induced by blocking visual-textual attention. This signal encourages the model to extract information from the image and improves visual grounding.
ECCV 2026
ImmunoStruct enables multimodal deep learning for immunogenicity prediction
ImmunoStruct predicts immunogenicity of peptide-MHC complexes by fusing information from multiple biological modalities: sequence, structure and biochemical properties. I designed a novel mutant–wild-type contrastive learning objective that encourages immunogenicity-aware pairwise similarity and suppresses feature space collapse, establishing a new state of the art in the field.
Nature Machine Intelligence 2026, Impact Factor: 23.9 (2026)
RNAGenScape: property-guided, optimized generation of mRNA sequences with manifold Langevin Dynamics
Instead of generating mRNA sequences from scratch, we optimize existing sequences while explicitly constraining every intermediate step to remain biologically realistic. This on-manifold approach yields more reliable property optimization and is substantially more efficient than diffusion-based generation, producing larger improvements while preserving sequence viability across diverse real-world datasets.
ICML 2025 GenBio Workshop, Spotlight and Oral
ImageFlowNet: forecasting multiscale image-level trajectories of disease progression with irregularly-sampled longitudinal medical images
I designed a continuous-time image forecasting framework that predicts how a patient's disease anatomy evolves over time. Instead of forecasting only regularly sampled timepoints or tracking a few predefined features, the model learns an image-level dynamical system that can generate future images at arbitrary horizons. For example, it can answer questions such as: "What would this patient's eye look like after two years without treatment?"
ICASSP 2025 Oral Presentation
Conference Papers (Asterisk denotes co-first authorship)