Curriculum Vitae
Education
Ph.D. in Statistics, University of California, Irvine, 2027 (expected)
M.A. in Statistics, Columbia University, 2020
B.S. in Statistics, Zhejiang University, 2018
- Chu Kochen Honors College
- Dual Degree: B.A. in English Language and Literature
Work Experience
Applied Scientist Intern @ Amazon, 06/2025 - Present
GenAI for Computer Vision
- Visual Scene Control via Stable Diffusion
- Text-to-Image Generation
Multimodal Learning & AI4Science
- AI for Cell Biology
- In-context Learning
Senior Data Analyst @ Varo Bank, 11/2021 - 08/2022
Machine Learning & Experimental Design
- Modeling & Data Visualization
- Experimental Design & A/B Testing
Data Analyst @ YipitData, 09/2020 - 10/2021
Data Science & Business Analytics
- Database Management
- Web Scraping & ETL Pipeline Design
Technical Skills
Coding
- Languages: Python
- AI Development: PyTorch, pandas, NumPy, Pyro, Optuna, scikit-learn, WandB, Transformers (Hugging Face)
- Data Infrastructure: Databases (PostgreSQL, MySQL), AWS (Redshift, S3, Athena)
- Software: Tableau, Airflow, R (dplyr, ggplot2, shiny)
Algorithms
- Generative Models: Flow Matching, Diffusion Models, VAE, GAN, Normalizing Flows
- Artificial Intelligence: Multimodal Learning, Representation Learning, RNN, CNN, Contrastive Learning
- AI for Science: Neuroscience, Climate Analysis, Public Health
- Statistical Learning: Random Forest, Decision Trees, Regression, Boosting, PCA, SVM, Clustering, MCMC
Publications and Preprints
Sutter T. M., Meng Y., Agostini A., Chopard D., Fortin N., Vogt J. E., Shahbaba B., Mandt S. (2024). "Unity by Diversity: Improved Representation Learning in Multimodal VAEs." arXiv preprint arXiv:2403.05300.
Agostini A., Chopard D., Meng Y., Fortin N., Shahbaba B., Mandt S., Sutter T. M., Vogt J. E. (2024). "Weakly-Supervised Multimodal Learning on MIMIC-CXR." arXiv preprint arXiv:2411.10356.
Ding X., Meng Y., Xiang L., et al. (2024). "Stroke recurrence prediction using machine learning and segmented neural network risk factor aggregation." Discov Public Health 21, 119. https://doi.org/10.1186/s12982-024-00199-6
Moslemi Z., Meng Y., Lan S., Shahbaba B. (2023). "Scaling Up Bayesian Neural Networks with Neural Networks." arXiv preprint arXiv:2312.11799.
Liu L., Meng Y., Wu X., Ying Z., Zheng T. (2022). "Log-rank-type tests for equality of distributions in high-dimensional spaces." Journal of Computational and Graphical Statistics 31 (4), 1384-1396.
Teaching
University of California, Irvine
- STATS 212 – Statistical Methods III: Methods for Correlated Data, TA
- STATS 210C – Statistical Methods III: Longitudinal Data, TA
- STATS 67 – Introduction to Probability and Statistics for Computer Science, TA
- STATS 7 – Basic Statistics, TA
- STATS 210P – Statistical Methods I, TA
- STATS 120A – Introduction to Probability and Statistics I, Grader
Columbia University
- GU4234/GR5234 – Sample Survey, Grader
- GU4222/GR5222 – Nonparametric Statistics, Grader