Curriculum Vitae
Education
Ph.D. in Statistics, University of California, Irvine, 2026 (expected)
M.A. in Statistics, Columbia University, 2020
B.S. in Statistics, Zhejiang University, 2018
- Chu Kochen Honors College
- Dual Degree: B.A. in English Language and Literature
Work Experience
AI Research
- AI Research: Augmented multimodal representation learning to balance interpretability and generation performance
- Scientific Application: Optimized learning of robust cell representations from diverse biological data across scales
Senior Data Analyst @ Varo Bank, 11/2021 - 08/2022
Machine Learning & Experimental Design
- Machine Learning: Enhanced customer retention by developing models of customer behavior using Markov chains
- A/B Testing: Boosted customer acquisition by more than 14% by designing and running targeted ad experiments
- Database: Streamlined the SQL-to-Tableau pipeline and the ETL process for broader company-wide adoption
Data Analyst @ YipitData, 09/2020 - 10/2021
Data Science Modeling & Business Analytics
- Data Science: Drove a 37% revenue increase by building an automated tree-based model for housing price estimation
- Data Mining: Launched web scraping and monthly reporting systems by designing DAGs in Airflow
Technical Skills
Coding
- Languages: Python (PyTorch, Pandas, NumPy, scikit-learn)
- Databases: PostgreSQL, MySQL, Redshift, Athena
- Software: Tableau, Airflow, R (dplyr, ggplot2, shiny)
Algorithms
- Generative Models: Flow Matching, Diffusion Models, VAE, GAN, Normalizing Flows
- Artificial Intelligence: Multimodal Learning, Representation Learning, RNN, CNN, Contrastive Learning
- AI for Science: Neuroscience, Climate Analysis, Public Health
- Statistical Learning: Random Forest, Decision Trees, Regression, Boosting, PCA, SVM, Clustering, MCMC
Publications and Preprints
Sutter T. M., Meng Y., Agostini A., Chopard D., Fortin N., Vogt J. E., Shahbaba B., Mandt S. (2024). "Unity by Diversity: Improved Representation Learning in Multimodal VAEs." arXiv preprint arXiv:2403.05300.
Agostini A., Chopard D., Meng Y., Fortin N., Shahbaba B., Mandt S., Sutter T. M., Vogt J. E. (2024). "Weakly-Supervised Multimodal Learning on MIMIC-CXR." arXiv preprint arXiv:2411.10356.
Ding X., Meng Y., Xiang L., et al. (2024). "Stroke recurrence prediction using machine learning and segmented neural network risk factor aggregation." Discover Public Health 21, 119. https://doi.org/10.1186/s12982-024-00199-6
Moslemi Z., Meng Y., Lan S., Shahbaba B. (2023). "Scaling Up Bayesian Neural Networks with Neural Networks." arXiv preprint arXiv:2312.11799.
Liu L., Meng Y., Wu X., Ying Z., Zheng T. (2022). "Log-rank-type tests for equality of distributions in high-dimensional spaces." Journal of Computational and Graphical Statistics 31 (4), 1384-1396.
Teaching
University of California, Irvine
- STATS 212 – Statistical Methods III: Methods for Correlated Data, TA
- STATS 210C – Statistical Methods III: Longitudinal Data, TA
- STATS 67 – Introduction to Probability and Statistics for Computer Science, TA
- STATS 7 – Basic Statistics, TA
- STATS 210P – Statistical Methods I, TA
- STATS 120A – Introduction to Probability and Statistics I, Grader
Columbia University
- GU4234/GR5234 – Sample Survey, Grader
- GU4222/GR5222 – Nonparametric Statistics, Grader