
Hello! I am a PhD student in Reinforcement Learning, pursuing a joint PhD at CMAP, École Polytechnique, and LMO, Université Paris-Saclay, under the supervision of Éric Moulines and Gilles Stoltz.
Recently, I was a student researcher at Google DeepMind Paris, where I studied distillation of language models; check out this paper with my results!
Before that, I did research at the HDI Lab at HSE University, where I also received my Master’s degree in Applied Mathematics and Computer Science in the “Math of Machine Learning” program.
My research interests include:
- Reinforcement learning theory;
- Foundation model post-training: RLHF, distillation;
- Connections between RL and sampling problems.
Email: daniil.tiapkin@polytechnique.edu | Google Scholar | ORCID
News
- 🔥New🔥 February 2025. I can finally present the results of my internship: “On Teacher Hacking in Language Model Distillation”.
- January 2025. My first paper as a supervisor, “Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization”, was accepted at ICLR-2025! Additionally, two of my papers were accepted at AISTATS-2025.
- September 2024. I am starting my student-researcher internship at Google DeepMind!
- January 2024. The paper on RL/RLHF learning from demonstrations, “Demonstration-Regularized RL”, was accepted at ICLR-2024, and the GFlowNet-RL paper “Generative Flow Networks as Entropy-Regularized RL” was honored with an oral presentation at AISTATS-2024!
- September 2023. I moved to École Polytechnique, France, to pursue my PhD.
- September 2023. The paper “Model-free Posterior Sampling via Learning Rate Randomization” was accepted at NeurIPS-2023!
- July 2023. The paper “Orthogonal Directions Constrained Gradient Method: from non-linear equality constraints to Stiefel manifold” was presented at COLT-2023!
- April 2023. The paper “Fast Rates for Maximum Entropy Exploration” was accepted at ICML-2023!