University of Virginia

Greetings! I'm Xu Ouyang, a final-year Computer Science Ph.D. student at the University of Virginia, advised by Prof. Thomas Hartvigsen. My research centers on the science of large language models: scaling laws, training dynamics, data-centric pretraining, and emerging architectures such as continuous diffusion LLMs.
Recent first-author work includes ICML 2026 (the “Shannon Scaling Law”, now used internally at ByteDance Seed for large-scale training-dynamics analysis), ACL 2025 Main (low-bit quantization scaling laws, with 1500+ open quantized LLM checkpoints released on HuggingFace), and TMLR 2025 (ADMIRE-BayesOpt — multi-fidelity Bayesian optimization for LLM data mixture reweighting).
Earlier I spent two great years with Prof. Felix Xiaozhu Lin and Prof. Yangfeng Ji at UVA CS, and have done research and internships at ByteDance Applied ML (MLSys), ByteDance Seed-LLM-Model, Tencent AI Lab Seattle, Rice University (with Prof. Yingyan (Celine) Lin), and UT Austin (with Prof. Atlas (Zhangyang) Wang).
My research interests include:

I am on the 2026–2027 job market for full-time industry research opportunities. Please feel free to reach out!

Xu Ouyang, Deyi Liu, Yuhang Cai, Jing Liu, Yuan Yang, Chen Zheng, Thomas Hartvigsen, Yiyuan Ma
The 43rd International Conference on Machine Learning (ICML) 2026
A unified scaling law modeling LLM pretraining as information transmission over a noisy channel; reconciles monotonic pretraining scaling with U-shaped phenomena such as catastrophic overtraining and quantization-induced degradation. Adopted internally at ByteDance Seed for large-scale training-dynamics analysis.

Xu Ouyang*, Shengzhuang Chen*, Michael Arthur Leopold Pearce, Thomas Hartvigsen, Jonathan Richard Schwarz
Transactions on Machine Learning Research 2025
A multi-fidelity Bayesian-optimization framework for LLM data-mixture re-weighting in both pretraining and instruction fine-tuning; achieves 5×+ speedups in identifying optimal mixtures, validated from 1M to 7B parameters. Released a public dataset of 460 full training/evaluation runs (13,000+ GPU hours).

Xu Ouyang, Tao Ge, Thomas Hartvigsen, Zhisong Zhang, Haitao Mi, Dong Yu
The 63rd Annual Meeting of the Association for Computational Linguistics (ACL Main Conference) 2025
Low-bit quantization favors undertrained LLMs but induces significant degradation on fully-trained models. Released 1500+ quantized LLM checkpoints on HuggingFace spanning multiple model sizes, training-token budgets, and bit widths; derived scaling laws relating quantization-induced degradation to model size, training tokens, and bit width.

Xu Ouyang, Felix Xiaozhu Lin, Yangfeng Ji
The Thirteenth International Conference on Learning Representations (ICLR) 2025
Privacy-preserving, efficient data selection for transformers via multi-party computation, enabling fine-grained data valuation in data markets without exposing raw samples.

Xu Ouyang, Shahina Mohd Azam Ansari, Felix Xiaozhu Lin, Yangfeng Ji
International Joint Conference on Artificial Intelligence (IJCAI) 2023