Ph.D. Candidate, Yale University
chen.liu.cl2482 at yale.edu
New Haven, CT & Mountain View, CA.
Google Scholar
LinkedIn
X
GitHub
Resume
Acknowledgements: Many thanks to Zhuang Liu for kindly providing this website template, which was adapted from Zhe Cao's website.
Chen Liu
I am a Ph.D. candidate in CS at Yale University, advised by Smita Krishnaswamy.
Research
My research focuses on the geometry of learned
latent representations, investigating how neural networks organize
information internally, when it breaks down, and how to fix it.
For LLMs, I identified and systematically
characterized embedding condensation, a geometric phenomenon
where token embeddings collapse into a narrow cone, which affects smaller models
more than larger models, and presented a simple solution to improve LLM
generalization in pre- and mid-training (ICML 2026).
For MLLMs, I co-developed VIGIL, an RL post-training framework that mitigates
visual laziness by penalizing over-reliance on language priors when
visual evidence is absent.
I apply the same lens to on-manifold generative modeling (RNAGenScape),
multimodal representation learning (ImmunoStruct,
Nature Machine Intelligence),
and latent dynamics modeling (ImageFlowNet, ICASSP 2025 Oral).
This summer I am interning at ByteDance Seed.
Mentorship
[2025-2026] Xiangyu Zhang, Tsinghua University junior. Now UC Berkeley PhD (Fall 2026).
[2025] Ethan Zhang, high school student.
[2024-2025] Aryaman Mishra, high school student. Now JHU (Fall 2025).
[2024-2025] Jason Shaye, high school student. Now Stanford (Fall 2025).
Education
Yale University
2022 - present. Ph.D. in Computer Science.
Columbia University
2018 - 2019. M.S. in Electrical Engineering.
Bucknell University
2014 - 2018. B.S. in Electrical Engineering.
Shanghai Foreign Language School
2007 - 2014. Middle and high school.
Experience
Research Scientist Intern @ ByteDance Seed
Summer 2026. LLM self-evolution and RL post-training.
Senior Research Scientist @ GE Healthcare
2021 - 2022. Deep learning for medical imaging.
Research Software Engineer @ Matic
2021. Developed SLAM algorithms for housekeeping robots.
Research Assistant @ Columbia University Medical Center
2019 - 2020. Deep learning for medical imaging.
News
[04/2026] Dispersion loss to counteract embedding condensation in LLMs is accepted to ICML 2026.
[11/2025] ImmunoStruct is accepted to Nature Machine Intelligence.
[09/2025] Brainteaser, a benchmark for LLM reasoning, is accepted to NeurIPS 2025.
[06/2025] RNAGenScape is accepted to the ICML 2025 GenBio Workshop as a Spotlight & Oral.
[01/2025] Geometry-Aware Generative Autoencoder (GAGA) is accepted to AISTATS 2025.
[12/2024] 3/3 papers (2 Oral) are accepted to ICASSP 2025.
[11/2024] I am recognized as a Top Reviewer at NeurIPS 2024.
[06/2024] CUTS, my first Ph.D. project, is accepted to MICCAI 2024.
[08/2022] I quit my industry job to start my Ph.D. journey at the Krishnaswamy Lab, Yale University.
[06/2022] I am recognized as an Outstanding Reviewer at ICML 2022.
Selected Recent Publications (Asterisk denotes co-first authorship)
Dispersion loss counteracts embedding condensation and improves generalization in small language models
I identified and systematically characterized embedding condensation in transformer-based language models: token embeddings collapse into a narrow cone as they pass through layers. Key findings include (1) larger models exhibit less condensation, (2) the effect is reproducible under controlled settings, (3) condensation emerges at initialization and is gradually alleviated by pre-training, and (4) knowledge distillation from a larger model does not transfer resistance to condensation. As an exploratory follow-up, we also evaluate a simple regularizer to mitigate the effect during pre- and mid-training.
ICML 2026
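As a rough illustration of what "collapsing into a narrow cone" means, the sketch below scores a set of embeddings by their mean pairwise cosine similarity. This metric, the function name `mean_pairwise_cosine`, and the synthetic data are assumptions made for illustration; they are not taken from the paper and may differ from how it quantifies condensation.

```python
import numpy as np

def mean_pairwise_cosine(embeddings: np.ndarray) -> float:
    """Average cosine similarity over all distinct pairs of rows.

    Values near 0 suggest roughly isotropic directions; values near 1
    suggest the vectors sit inside a narrow cone.
    """
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)  # avoid division by zero
    sim = unit @ unit.T                               # all pairwise cosines
    n = sim.shape[0]
    off_diag_sum = sim.sum() - np.trace(sim)          # drop self-similarity
    return float(off_diag_sum / (n * (n - 1)))

# Synthetic stand-ins for token embeddings (hypothetical data).
rng = np.random.default_rng(0)
isotropic = rng.standard_normal((256, 64))
condensed = isotropic + 5.0 * np.ones(64)  # shared offset pushes vectors into a cone

score_isotropic = mean_pairwise_cosine(isotropic)
score_condensed = mean_pairwise_cosine(condensed)
```

Under this toy setup, the vectors with a shared offset should score much closer to 1 than the isotropic ones.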
ImmunoStruct enables multimodal deep learning for immunogenicity prediction
ImmunoStruct predicts immunogenicity of peptide-MHC complexes by fusing information from multiple biological modalities: sequence, structure and biochemical properties. I designed a novel mutant–wild-type contrastive learning objective to establish a new state of the art in the field, by encouraging immunogenicity-aware pairwise similarity and suppressing feature space collapse.
Nature Machine Intelligence 2026, Impact Factor: 23.9 (2026)
RNAGenScape: property-guided, optimized generation of mRNA sequences with manifold Langevin Dynamics
Instead of generating mRNA sequences from scratch, we proposed property-guided on-manifold Langevin dynamics that optimize from an existing sequence, with every intermediate step kept plausible via a learned denoising manifold projector. This keeps optimization on the data manifold, yields reliable property gradients, and runs more efficiently than diffusion models. Across three real-world mRNA datasets spanning two orders of magnitude in size, we increased median property gain by up to 148% and success rate by up to 30% while ensuring the biological viability of generated sequences, and achieved a 68% increase in inference efficiency.
ICML 2025 GenBio Workshop, Spotlight and Oral
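The general shape of a projected, property-guided Langevin update can be sketched on a toy problem. Everything here is a hypothetical stand-in: the "manifold" is the unit circle, `project` replaces the learned denoising projector with simple normalization, and the "property" is just the first coordinate. It is not RNAGenScape's implementation.

```python
import numpy as np

def project(z: np.ndarray) -> np.ndarray:
    # Toy stand-in for a learned manifold projector: snap back to the unit circle.
    return z / np.linalg.norm(z)

def property_value(z: np.ndarray) -> float:
    # Toy property to maximize: the first coordinate.
    return float(z[0])

def property_grad(z: np.ndarray) -> np.ndarray:
    # Gradient of the toy property with respect to z.
    return np.array([1.0, 0.0])

def langevin_optimize(z0, n_steps=200, step=0.05, temperature=0.05, seed=0):
    rng = np.random.default_rng(seed)
    z = project(np.asarray(z0, dtype=float))
    for _ in range(n_steps):
        noise = rng.standard_normal(z.shape)
        # Gradient ascent on the property plus scaled Gaussian noise...
        z = z + step * property_grad(z) + np.sqrt(2.0 * step) * temperature * noise
        # ...followed by projection, so every intermediate point stays on the manifold.
        z = project(z)
    return z

z_start = np.array([0.0, 1.0])       # existing "sequence" to optimize from
z_final = langevin_optimize(z_start)
gain = property_value(z_final) - property_value(z_start)
```

The design point this illustrates is that the projection happens after every step, not only at the end, so the property gradient is always evaluated at a point on the manifold.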
ImageFlowNet: forecasting multiscale image-level trajectories of disease progression with irregularly-sampled longitudinal medical images
I designed a position-parameterized neural ODE that flows the multiscale latent representations, so that we can predict a future image given an earlier image and the change in time. For example: "Predict what this patient's eye will look like if we leave the disease untreated for 2 years."
ICASSP 2025 Oral Presentation
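The idea of flowing a latent representation forward by a time gap can be sketched with a fixed vector field and Euler integration. The field `vector_field`, the two-dimensional latent, and the step count are illustrative assumptions; in ImageFlowNet the field is learned and acts on multiscale image features.

```python
import numpy as np

def vector_field(z: np.ndarray) -> np.ndarray:
    # Hypothetical stand-in for a learned network f(z).
    # It depends only on the position z, not on absolute time.
    return -z

def flow(z0: np.ndarray, delta_t: float, n_steps: int = 1000) -> np.ndarray:
    """Integrate dz/dt = f(z) from z0 over a time gap delta_t with Euler steps."""
    z = np.asarray(z0, dtype=float).copy()
    h = delta_t / n_steps
    for _ in range(n_steps):
        z = z + h * vector_field(z)
    return z

z_now = np.array([1.0, -2.0])           # latent of the earlier image (toy values)
z_future = flow(z_now, delta_t=2.0)     # latent predicted after a gap of 2 time units
```

Because the time gap is an explicit input to `flow`, the same field can be queried at arbitrary, irregular intervals.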
Conference Papers (Asterisk denotes co-first authorship)