Yuan Zhou

Associate Professor

Affiliation: YMSC, BIMSA

Research Field: Machine learning theory, operations research & management, theoretical computer science

Office: A6-106

Email: zhouyuan@bimsa.cn

Biography

Group: Artificial Intelligence and Machine Learning

Education Experience

2009 - 2009 | Tsinghua University | Computer Science | Bachelor
2009 - 2013 | Carnegie Mellon University | Computer Science | M.Sc.
2009 - 2014 | Carnegie Mellon University | Computer Science | Ph.D.

Work Experience

2021 - Present | Yau Mathematical Sciences Center, Tsinghua University | Associate Professor
2019 - 2021 | Department of ISE, University of Illinois Urbana-Champaign | Assistant Professor
2016 - 2019 | Computer Science Department, Indiana University at Bloomington | Assistant Professor
2014 - 2016 | Department of Mathematics, Massachusetts Institute of Technology | Instructor in Applied Mathematics

Publication

[1] J Ying, H Lin, C Yue, Y Chen, C Xiao, Q Shi, Y Liang, ST Yau, Y Zhou, J Ma, A Neural Symbolic Model for Space Physics, Nature Machine Intelligence (2025)
[2] Xi Chen, Jiameng Lyu, Xuan Zhang, Yuan Zhou, Fairness-aware Online Price Discrimination with Nonparametric Demand Models, Operations Research (2025)
[3] Zihan Zhang, Xiangyang Ji, Yuan Zhou, Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits, The Thirteenth International Conference on Learning Representations (ICLR) (2025)
[4] Xi Chen, David Simchi-Levi, Zishuo Zhao, Yuan Zhou, Bayesian Mechanism Design for Blockchain Transaction Fee Allocation, Operations Research (2025)
[5] Xuefeng Zhang, Haowei Lin, Muhan Zhang, Yuan Zhou, Jianzhu Ma, A data-driven group retrosynthesis planning model inspired by neurosymbolic programming, Nature Communications, 16 (2025)
[6] K Fan, Z Ren, R Guo, J Zhang, Z Huang, Y Zhou, Z Zhang, ME-PATS: Mutually Enhancing Search-Based Planner and Learning-Based Agent for Tractor-Trailer Systems, 2025 IEEE International Conference on Robotics and Automation (ICRA), 11940 (2025)
[7] X Chen, Y Chen, Y Zhou, AdaSwitch: An Adaptive Switching Meta-Algorithm for Learning-Augmented Bounded-Influence Problems, arXiv, 2509.02302 (2025)
[8] X Chen, J Lyu, S Yuan, Y Zhou, Learning When to Restart: Nonstationary Newsvendor from Uncensored to Censored Demand, arXiv, 2509.18709 (2025)
[9] K Fan, J Zhang, X Zhang, Y Wu, J Cao, Y Zhou, J Ma, Safety-Polarized and Prioritized Reinforcement Learning, Forty-second International Conference on Machine Learning (2025)
[10] X Chen, Y Chen, Y Zhou, A Minimax-MDP Framework with Future-imposed Conditions for Learning-augmented Problems, arXiv (2025)
[11] J Tang, B Chen, C Shi, Y Zhou, Fairness-Constrained Inventory Control with Demand Learning, SSRN (2025)
[12] Jiameng Lyu, Jinxing Xie, Shilin Yuan, Yuan Zhou, A Minibatch-SGD-based Learning Meta-Policy for Inventory Systems with Myopic Optimal Policy, Management Science (2024)
[13] Boxiao Chen, Yining Wang, Yuan Zhou, Optimal Policies for Dynamic Pricing and Inventory Control with Nonparametric Censored Demands, Management Science, 70(5), 3362-3380 (2024)
[14] Yingkai Li, Yining Wang, and Yuan Zhou, Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits, IEEE Transactions on Information Theory, 70(2024), 1, 372-388
[15] Xi Chen, Jiameng Lyu, Yining Wang, and Yuan Zhou, Network Revenue Management With Demand Learning and Fair Resource-Consumption Balancing, Production and Operations Management, 33(2024), 2, 494-511
[16] Z Zhao, X Chen, Y Zhou, It Takes Two: A Peer-Prediction Solution for Blockchain Verifier's Dilemma, arXiv (2024)
[17] J Lyu, J Xie, S Yuan, Y Zhou, A Minibatch Stochastic Gradient Descent-Based Learning Metapolicy for Inventory Systems with Myopic Optimal Policy, Management Science (2024)
[18] Z Zhao, Z Fang, X Wang, X Chen, H Su, H Xiao, Y Zhou, Proof-of-learning with incentive security, arXiv (2024)
[19] S Yuan, J Lyu, J Xie, Y Zhou, Asymptotic optimality of base-stock policies for lost-sales inventory systems with stochastic lead times, Operations Research Letters, 57, 107196 (2024)
[20] X Chen, M Liu, Y Wang, Y Zhou, A Re-solving Heuristic for Dynamic Assortment Optimization with Knapsack Constraints, arXiv (2024)
[21] J Lyu, S Yuan, B Zhou, Y Zhou, Closing the Gaps: Optimality of Sample Average Approximation for Data-Driven Newsvendor Problems, arXiv (2024)
[22] Jinpeng Zhang, Yufeng Zheng, Chuheng Zhang, Li Zhao, Lei Song, Yuan Zhou, and Jiang Bian, Robust Situational Reinforcement Learning in Face of Context Disturbances, Proceedings of the International Conference on Machine Learning, PMLR, 202, 41973-41989 (2023)
[23] Yijie Wang, Yuan Zhou, Xiaoqing Huang, Kun Huang, Jie Zhang, and Jianzhu Ma, Learning Sparse Group Models through Boolean Relaxation, The Eleventh International Conference on Learning Representations (ICLR), Virtual Event(2023)
[24] X Chen, Z Xu, Z Zhao, Y Zhou, Personalized Pricing with Group Fairness Constraint, Proceedings of the 2023 ACM Conference on Fairness, Accountability, and … (2023)
[25] X Chen, J Lyu, S Yuan, Y Zhou, Learning in Lost-Sales Inventory Systems with Stochastic Lead Times and Random Supplies, SSRN (2023)
[26] Boxiao Chen, David Simchi-Levi, Yining Wang, and Yuan Zhou, Dynamic Pricing and Inventory Control with Fixed Ordering Cost and Incomplete Demand Information, Management Science, 68(2022), 8, 5684-5703
[27] Zihan Zhang, Yuhang Jiang, Yuan Zhou, and Xiangyang Ji, Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning, Advances in Neural Information Processing Systems, 35(2022), 24586-24596
[28] Beining Han, Zhizhou Ren, Zuofan Wu, Yuan Zhou, and Jian Peng, Off-Policy Reinforcement Learning with Delayed Rewards, Proceedings of the International Conference on Machine Learning, PMLR, 162, 8280-8303 (2022)
[29] Zhizhou Ren, Jiahan Li, Fan Ding, Yuan Zhou, Jianzhu Ma, and Jian Peng, Proximal Exploration for Model-guided Protein Sequence Design, Proceedings of the International Conference on Machine Learning, PMLR, 162, 18520-18536 (2022)
[30] Tanmay Gangwani, Yuan Zhou, and Jian Peng, Imitation Learning from Observations under Transition Model Disparity, The Tenth International Conference on Learning Representations (ICLR), Virtual Event(2022)
[31] Zhizhou Ren, Ruihan Guo, Yuan Zhou, and Jian Peng, Learning Long-term Reward Redistribution via Randomized Return Decomposition, The Tenth International Conference on Learning Representations (ICLR), Virtual Event(2022)
[32] X Chen, J Li, M Li, T Zhao, Y Zhou, Assortment Optimization Under the Multivariate MNL Model, arXiv (2022)
[33] Z Zhao, X Chen, X Zhang, Y Zhou, Dynamic Car Dispatching and Pricing: Revenue and Fairness for Ridesharing Platforms, Proceedings of the Thirty-First International Joint Conference on Artificial (2022)
[34] Xi Chen, Yining Wang, and Yuan Zhou, Optimal Policy for Dynamic Assortment Planning Under Multinomial Logit Models, Mathematics of Operations Research, 46(2021), 4, 1639-1657
[35] Yufei Ruan, Jiaqi Yang, and Yuan Zhou, Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design, Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing (STOC), Rome, Italy(2021), 74-87
[36] Guangyu Xi, Chao Tao, and Yuan Zhou, Near-Optimal MNL Bandits Under Risk Criteria, Proceedings of the AAAI Conference on Artificial Intelligence, 35(2021), 12, 10397-10404
[37] Xi Chen, Chao Shi, Yining Wang, and Yuan Zhou, Dynamic Assortment Planning Under Nested Logit Models, Production and Operations Management, 30(2021), 1, 85-102
[38] Tanmay Gangwani, Jian Peng, and Yuan Zhou, Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity, Proceedings of the Conference on Robot Learning, PMLR, 155, 2206-2215 (2021)
[39] Zihan Zhang, Yuan Zhou, and Xiangyang Ji, Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity, Proceedings of the International Conference on Machine Learning, PMLR, 139, 12653-12662 (2021)
[40] Y Li, Y Wang, X Chen, Y Zhou, Tight regret bounds for infinite-armed linear contextual bandits, International Conference on Artificial Intelligence and Statistics, 370-378 (2021)
[41] Zihan Zhang, Yuan Zhou, and Xiangyang Ji, Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition, Advances in Neural Information Processing Systems, 33(2020), 15198-15207
[42] Xi Chen, Yining Wang, and Yuan Zhou, Dynamic Assortment Optimization with Changing Contextual Information, Journal of Machine Learning Research, 21(2020), 1-44
[43] Kefan Dong, Jian Peng, Yining Wang, and Yuan Zhou, Root-n-Regret for Learning in Markov Decision Processes with Function Approximation and Low Bellman Rank, Proceedings of the Conference on Learning Theory, PMLR, 125, 1554-1557 (2020)
[44] Kefan Dong, Yingkai Li, Qin Zhang, and Yuan Zhou, Multinomial Logit Bandit with Low Switching Cost, Proceedings of the International Conference on Machine Learning, PMLR, 119, 2607-2615 (2020)
[45] Nikolai Karpov, Qin Zhang, and Yuan Zhou, Collaborative Top Distribution Identifications with Limited Interaction, 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), Durham, NC, USA(2020), 160-171
[46] Tanmay Gangwani, Yuan Zhou, and Jian Peng, Learning Guidance Rewards with Trajectory-space Smoothing, Advances in Neural Information Processing Systems, 33(2020), 822-832
[47] Xiaojin Zhang, Honglei Zhuang, Shengyu Zhang, and Yuan Zhou, Adaptive Double-Exploration Tradeoff for Outlier Detection, Proceedings of the AAAI Conference on Artificial Intelligence, 34(2020), 04, 6837-6844
[48] J Peng, Y Qin, Y Wei, Y Zhou, A PTAS for the Bayesian Thresholding Bandit Problem, International Conference on Artificial Intelligence and Statistics, 2455-2464 (2020)
[49] Y Xie, Y Pei, Y Lu, H Tang, Y Zhou, Learning Structural Genetic Information via Graph Neural Embedding, International Symposium on Bioinformatics Research and Applications, 250-261 (2020)
[50] Y Zhong, Y Zhou, J Peng, Efficient Competitive Self-Play Policy Optimization, arXiv (2020)
[51] Chao Tao, Qin Zhang, and Yuan Zhou, Collaborative Learning with Limited Interaction: Tight Bounds for Distributed Exploration in Multi-Armed Bandits, IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), Baltimore, Maryland(2019)
[52] Yuan Xie, Boyi Liu, Qiang Liu, Zhaoran Wang, Yuan Zhou, and Jian Peng, Off-policy evaluation and learning from logged bandit feedback: Error reduction via surrogate policy, The Seventh International Conference on Learning Representations (ICLR), Virtual Event(2019)
[53] Chao Tao, Saúl A. Blanco, Jian Peng, and Yuan Zhou, Thresholding Bandit with Optimal Aggregate Regret, Advances in Neural Information Processing Systems, 32(2019), 11664-11673
[54] Zhizhou Ren, Kefan Dong, Yuan Zhou, Qiang Liu, and Jian Peng, Exploration via Hindsight Goal Generation, Advances in Neural Information Processing Systems(2019)
[55] Xi Chen, Tengyu Ma, Jiawei Zhang, and Yuan Zhou, Optimal Design of Process Flexibility for General Production Systems, Operations Research, 67(2019), 2, 516-531
[56] Y Jin, Y Li, Y Wang, Y Zhou, On Asymptotically Tight Tail Bounds for Sums of Geometric and Exponential Random Variables, arXiv (2019)
[57] Yining Wang, Xi Chen, and Yuan Zhou, Near-Optimal Policies for Dynamic Multinomial Logit Assortment Selection Models, NeurIPS 2018(2018)
[58] Jiecao Chen, Qin Zhang, and Yuan Zhou, Tight Bounds for Collaborative PAC Learning via Multiplicative Weights, NeurIPS 2018(2018)
[59] Chao Tao, Saúl A. Blanco, and Yuan Zhou, Best Arm Identification in Linear Bandits with Linear Dimension Dependency, Proceedings of the 35th International Conference on Machine Learning, PMLR(2018)
[60] X Chen, Y Wang, Y Zhou, An Optimal Policy for Dynamic Assortment Planning Under Uncapacitated Multinomial Logit Models, arXiv (2018)
[61] Xue Chen, and Yuan Zhou, Parameterized Algorithms for Constraint Satisfaction Problems Above Average with Global Cardinality Constraints, Proceedings of the 28th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)(2017)
[62] Jiecao Chen, Xi Chen, Qin Zhang, and Yuan Zhou, Adaptive Multiple-Arm Identification, Proceedings of the 34th International Conference on Machine Learning, PMLR(2017)
[63] X Chen, Y Zhou, Parameterized Algorithms for Constraint Satisfaction Problems Above Average with Global Cardinality Constraints, Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete (2017)
[64] Konstantin Makarychev, Yury Makarychev, and Yuan Zhou, Satisfiability of Ordering CSPs Above Average Is Fixed-Parameter Tractable, Proceedings of the IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS)(2015)
[65] Xi Chen, Jiawei Zhang, Yuan Zhou, Optimal Sparse Designs for Process Flexibility via Probabilistic Expanders, Operations Research, 63(2015), 5, 1159-1176
[66] R O’Donnell, Y Wu, Y Zhou, Hardness of Max-2Lin and Max-3Lin over integers, reals, and large cyclic groups, ACM Transactions on Computation Theory (TOCT), 7(2), 1-16 (2015)
[67] J Chuzhoy, Y Makarychev, A Vijayaraghavan, Y Zhou, Approximation Algorithms and Hardness of the k-Route Cut Problem, ACM Transactions on Algorithms (TALG), 12(1), 1-40 (2015)
[68] Manuel Kauers, Ryan O’Donnell, Li-Yang Tan, and Yuan Zhou, Hypercontractive inequalities via SOS, with an application to Vertex-Cover, Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)(2014)
[69] Ryan O’Donnell, John Wright, Chenggang Wu, and Yuan Zhou, Hardness of Robust Graph Isomorphism, Lasserre Gaps, and Asymmetry of Random Graphs, Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)(2014)
[70] Venkatesan Guruswami, Ali Kemal Sinop, and Yuan Zhou, Constant Factor Lasserre Gaps for Graph Partitioning Problems, SIAM Journal on Optimization, 24(2014), 4, 1698-1717
[71] Yuan Zhou, Xi Chen, and Jian Li, Optimal PAC Multiple Arm Identification with Applications to Crowdsourcing, Proceedings of the 31st International Conference on Machine Learning(2014)
[72] M Tulsiani, J Wright, Y Zhou, Optimal strong parallel repetition for projection games on low threshold rank graphs, International Colloquium on Automata, Languages, and Programming, 1003-1014 (2014)
[73] V Guruswami, AK Sinop, Y Zhou, Constant factor lasserre integrality gaps for graph partitioning problems, SIAM Journal on Optimization, 24(4), 1698-1717 (2014)
[74] R Meka, O Reingold, Y Zhou, Deterministic Coupon Collection and Better Strong Dispersers, Approximation, Randomization, and Combinatorial Optimization. Algorithms and (2014)
[75] Y Yoshida, Y Zhou, Approximation schemes via Sherali-Adams hierarchy for dense constraint satisfaction problems and assignment problems, Proceedings of the 5th conference on Innovations in theoretical computer (2014)
[76] M Kauers, R O'Donnell, LY Tan, Y Zhou, Hypercontractive inequalities via SOS, and the Frankl-Rödl graph, Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete (2014)
[77] R O’Donnell, Y Wu, Y Zhou, Optimal lower bounds for locality-sensitive hashing (except when q is tiny), ACM Transactions on Computation Theory (TOCT), 6(1), 1-13 (2014)
[78] P Gopalan, S Vadhan, Y Zhou, Locally testable codes and cayley graphs, Proceedings of the 5th conference on Innovations in theoretical computer (2014)
[79] Y Zhou, New Directions in Approximation Algorithms and Hardness of Approximation, Carnegie Mellon University (2014)
[80] Ryan O’Donnell, and Yuan Zhou, Approximability and proof complexity, Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)(2013)
[81] Boaz Barak, Fernando Brandao, Aram Harrow, Jonathan Kelner, David Steurer, and Yuan Zhou, Hypercontractivity, Sum-of-Squares Proofs, and their Applications, Proceedings of the 44th Annual ACM Symposium on Theory of computing (STOC)(2012)
[82] Aditya Bhaskara, Moses Charikar, Venkatesan Guruswami, Aravindan Vijayaraghavan, and Yuan Zhou, Polynomial integrality gaps for strong SDP relaxations of Densest k-Subgraph, Proceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)(2012)
[83] Julia Chuzhoy, Yury Makarychev, Aravindan Vijayaraghavan, and Yuan Zhou, Approximation Algorithms and Hardness of the k-Route Cut Problem, Proceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)(2012)
[84] V Guruswami, Y Zhou, Approximating bounded occurrence ordering CSPs, Approximation, Randomization, and Combinatorial Optimization. Algorithms and (2012)
[85] Y Wang, Y Zhou, Dynamic Assortment Optimization: Beyond MNL Model, The Elements of Joint Learning and Optimization in Operations Management (2012)
[86] V Guruswami, Y Zhou, Tight Bounds on the Approximability of Almost-satisfiable Horn SAT and Exact Hitting Set, Theory of Computing, 8, 239-267 (2012)
[87] G Kun, R O’Donnell, S Tamaki, Y Yoshida, Y Zhou, Linear programming, width-1 CSPs, and robust satisfaction (2012)
[88] A Bhaskara, M Charikar, V Guruswami, A Vijayaraghavan, Y Zhou, Polynomial integrality gaps for strong SDP relaxations of Densest k-Subgraph, Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete (2012)
[89] Venkatesan Guruswami, and Yuan Zhou, Tight Inapproximability Bounds for Almost-satisfiable Horn SAT and Exact Hitting Set, Proceedings of the 22nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)(2011)
[90] V Guruswami, Y Makarychev, P Raghavendra, D Steurer, Y Zhou, Finding almost-perfect graph bisections, Innovations in Computer Science, 321-337 (2011)
[91] Z Huang, L Wang, Y Zhou, Black-box reductions in mechanism design, Approximation, Randomization, and Combinatorial Optimization. Algorithms and (2011)
[92] R O’Donnell, J Wright, Y Zhou, The Fourier Entropy–Influence Conjecture for certain classes of Boolean functions, Automata, Languages and Programming, 330-341 (2011)
[93] L Cai, Y Cheng, E Verbin, Y Zhou, Surviving Rates of Graphs with Bounded Treewidth for the Firefighter Problem, SIAM Journal on Discrete Mathematics, 24(4), 1322-1335 (2010)
[94] W Chen, SH Teng, Y Wang, Y Zhou, On the α-Sensitivity of Nash Equilibria in PageRank-Based Network Reputation Games, Frontiers in Algorithmics, 63-73 (2009)
[95] P Lu, Y Wang, Y Zhou, Tighter bounds for facility games, Internet and Network Economics, 137-148 (2009)
[96] X Chen, M Liu, Y Wang, Y Zhou, EXPRESS: A Re-solving Heuristic for Dynamic Assortment Optimization with Knapsack Constraints, Production and Operations Management, 10591478251399005

Update Time: 2026-08-02 19:18:13