Ph.D. Candidate, Yale University
chen.liu.cl2482 at yale.edu
New Haven, CT & Mountain View, CA.
Google Scholar
LinkedIn
X
GitHub
Resume
Acknowledgements Many thanks to Zhuang Liu for kindly providing this website template, which was adapted from Zhe Cao's website.
Chen Liu
I am a Ph.D. candidate in CS at Yale University, advised by Smita Krishnaswamy.
Research Areas (TL;DR) I work on manifold learning and generative AI for bioscience.
Research Areas
Since 2022, I have been working broadly on manifold learning, studying the geometry of learned spaces. This area has gained renewed attention in recent advances such as DeepSeek's manifold-constrained hyper-connections and Kaiming's manifold-inspired generative models (Just Image Transformer, Drifting Model).
Specifically, I design methods to understand and control how neural networks
organize information in their internal representations, covering both general AI and
AI for bioscience.
In general AI, I provide new insights into LLM pre-training (my
proudest work so far, Dispersion) and
generative modeling (GAGA).
On the application side, I design methods for disease progression modeling (ImageFlowNet),
image segmentation (CUTS, DiffKillR),
biological sequence optimization (RNAGenScape),
and protein function prediction (ImmunoStruct).
BTW, I am good at scientific figure making, and I can do a big part of the job
in Python.
Mentorship
[2025-2026] Xiangyu Zhang, Tsinghua University junior. Now UC Berkeley Ph.D. (Fall 2026).
[2025] Ethan Zhang, high school student.
[2024-2025] Aryaman Mishra, high school student. Now JHU (Fall 2025).
[2024-2025] Jason Shaye, high school student. Now Stanford (Fall 2025).
Education
Yale University
2022 - present. Ph.D. in Computer Science.
Columbia University
2018 - 2019. M.S. in Electrical Engineering.
Bucknell University
2014 - 2018. B.S. in Electrical Engineering.
Shanghai Foreign Language School
2007 - 2014. Middle and high school.
Experience
Research Scientist Intern @ ByteDance Seed
Summer 2026. LLM self-evolution and RL post-training.
Senior Research Scientist @ GE Healthcare
2021 - 2022. Deep learning for medical imaging.
Research Software Engineer @ Matic
2021. Developed SLAM algorithms for housekeeping robots.
Research Assistant @ Columbia University Medical Center
2019 - 2020. Deep learning for medical imaging.
News
[04/2026] Dispersion loss to counteract embedding condensation in LLMs is accepted to ICML 2026.
[11/2025] ImmunoStruct is accepted to Nature Machine Intelligence.
[09/2025] Brainteaser, a benchmark for LLM reasoning, is accepted to NeurIPS 2025.
[06/2025] RNAGenScape is accepted to ICML 2025 GenBio workshop as a Spotlight & Oral.
[01/2025] Geometry-Aware Generative Autoencoder (GAGA) is accepted to AISTATS 2025.
[12/2024] 3/3 papers (2 Oral) are accepted to ICASSP 2025.
[11/2024] I am recognized as a Top Reviewer at NeurIPS 2024.
[06/2024] CUTS, my first Ph.D. project, is accepted to MICCAI 2024.
[08/2022] I quit my industry job to start my Ph.D. journey at Krishnaswamy Lab, Yale University.
[06/2022] I am recognized as an Outstanding Reviewer at ICML 2022.
Selected Recent Publications (Asterisk denotes co-first authorship)
Dispersion loss counteracts embedding condensation and improves generalization in small language models
I identified an interesting phenomenon in transformer-based language models, which I termed "embedding condensation": token embeddings collapse into a narrow cone. The three key observations are (1) larger models exhibit less condensation, (2) condensation occurs early in pre-training, and (3) knowledge distillation from a large model does not help. To counteract this undesirable phenomenon, I designed a dispersion loss that explicitly encourages embedding diversity during training. Without extra parameters, it improves the generalization of language models in both mid- and pre-training.
ICML 2026
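As a rough illustration of the idea of penalizing condensed embeddings, the sketch below measures how tightly a set of vectors clusters in direction and turns that into an auxiliary penalty. It is not the formulation from the paper: the mean pairwise cosine similarity, the function names, and the `weight` parameter are all assumptions chosen for this toy example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as sequences."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_cosine(embeddings):
    """Average cosine similarity over all distinct pairs of embeddings.

    Values near 1 mean the vectors point in almost the same direction
    (a narrow cone); lower values mean they are more spread out.
    """
    total, pairs = 0.0, 0
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            total += cosine(embeddings[i], embeddings[j])
            pairs += 1
    return total / pairs


def loss_with_dispersion(task_loss, embeddings, weight=0.1):
    """Hypothetical combined objective: task loss plus a weighted penalty.

    The penalty used here (mean pairwise cosine) is an assumption for
    illustration, not the loss defined in the paper.
    """
    return task_loss + weight * mean_pairwise_cosine(embeddings)


# Toy 2D "embeddings": one set squeezed into a narrow cone, one spread out.
condensed = [[1.0, 0.0], [0.99, 0.1], [0.98, -0.1]]
dispersed = [[1.0, 0.0], [-0.5, 0.866], [-0.5, -0.866]]

print("condensed penalty:", mean_pairwise_cosine(condensed))
print("dispersed penalty:", mean_pairwise_cosine(dispersed))
```

In this toy setup the condensed set receives a larger penalty than the dispersed set, so minimizing the combined objective would push embeddings apart. A real training loop would compute such a term on model activations with an autodiff framework.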
ImmunoStruct enables multimodal deep learning for immunogenicity prediction
ImmunoStruct predicts immunogenicity of peptide-MHC complexes by fusing information from multiple biological modalities: sequence, structure and biochemical properties. I designed a novel mutant–wild-type contrastive learning objective to establish a new state of the art in the field, by encouraging immunogenicity-aware pairwise similarity and suppressing feature space collapse.
Nature Machine Intelligence 2026, Impact Factor: 23.9 (2026)
RNAGenScape: property-guided, optimized generation of mRNA sequences with manifold Langevin Dynamics
Instead of generating mRNA sequences from scratch, we proposed property-guided on-manifold Langevin dynamics that optimize from an existing sequence, with every intermediate step kept plausible via a learned denoising manifold projector. This keeps optimization on the data manifold, yields reliable property gradients, and runs more efficiently than diffusion models. Across three real-world mRNA datasets spanning two orders of magnitude in size, we increased median property gain by up to 148% and success rate by up to 30% while ensuring the biological viability of generated sequences, and achieved a 68% increase in inference efficiency.
ICML 2025 GenBio Workshop, Spotlight and Oral
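The general pattern of "take a property-guided Langevin step, then project back onto the manifold" can be sketched on a toy problem. Everything below is an assumption made for illustration and is not the RNAGenScape implementation: the "manifold" is the unit circle in 2D, the projector simply renormalizes a point (standing in for a learned denoising projector), and the property being optimized is the x-coordinate.

```python
import math
import random


def project_to_circle(point):
    """Toy stand-in for a learned manifold projector: snap to the unit circle."""
    norm = math.hypot(point[0], point[1])
    if norm == 0.0:
        return (1.0, 0.0)  # arbitrary fallback for the degenerate origin case
    return (point[0] / norm, point[1] / norm)


def property_gradient(point):
    """Gradient of the toy property f(x, y) = x, which is constant."""
    return (1.0, 0.0)


def guided_langevin(start, steps=200, step_size=0.05, noise_scale=0.1, seed=0):
    """Run projected Langevin dynamics and return every intermediate point.

    Each iteration adds a gradient step and Gaussian noise, then projects
    the result back onto the circle so all iterates stay on the manifold.
    """
    rng = random.Random(seed)
    point = project_to_circle(start)
    trajectory = [point]
    sigma = math.sqrt(2.0 * step_size) * noise_scale
    for _ in range(steps):
        grad = property_gradient(point)
        moved = (
            point[0] + step_size * grad[0] + sigma * rng.gauss(0.0, 1.0),
            point[1] + step_size * grad[1] + sigma * rng.gauss(0.0, 1.0),
        )
        point = project_to_circle(moved)
        trajectory.append(point)
    return trajectory


trajectory = guided_langevin(start=(0.0, 1.0))
print("start:", trajectory[0])
print("end:  ", trajectory[-1])
```

Because the projection runs after every step, each intermediate point lies on the circle, which mirrors the idea of keeping every optimization step plausible. In the toy case the property (the x-coordinate) increases from its starting value while the iterates never leave the manifold.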
ImageFlowNet: forecasting multiscale image-level trajectories of disease progression with irregularly-sampled longitudinal medical images
I designed a position-parameterized neural ODE that flows the multiscale latent representations, so that we can predict a future image given an earlier image and the change in time. For example: "Predict what this patient's eye will look like if we leave the disease untreated for 2 years."
ICASSP 2025 Oral Presentation
Conference Papers (Asterisk denotes co-first authorship)