Hello! I am a PhD student in Reinforcement Learning, pursuing a joint PhD at CMAP, École Polytechnique, and LMO, Université Paris-Saclay, under the supervision of Éric Moulines and Gilles Stoltz.
Recently, I was a student researcher at Google DeepMind Paris, where I studied the distillation of language models; check out this paper for the results!
Before that, I did research at the HDI Lab at HSE University, where I also received my Master's degree in Applied Mathematics and Computer Science through the "Math of Machine Learning" program.
My research interests include:
- Reinforcement Learning theory;
- LLM post-training: RLHF and distillation;
- connections between RL and sampling problems.
Email: daniil.tiapkin@polytechnique.edu | Google Scholar | ORCID |
News
- 🔥New🔥 February 2025. I can finally present the results of my internship: "On Teacher Hacking in Language Model Distillation".
- January 2025. My first paper as a supervisor, "Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization", got accepted to ICLR-2025! Additionally, I have two papers accepted at AISTATS-2025.
- September 2024. I am starting my student-researcher internship at Google DeepMind!
- January 2024. The paper on RL/RLHF learning from demonstrations, "Demonstration-Regularized RL", was accepted at ICLR-2024, and the GFlowNet-RL paper "Generative Flow Networks as Entropy-Regularized RL" was honored with an oral presentation at AISTATS-2024!
- September 2023. I moved to École Polytechnique, France, to pursue my PhD.
- September 2023. The paper “Model-free Posterior Sampling via Learning Rate Randomization” was accepted at NeurIPS-2023!
- July 2023. The paper "Orthogonal Directions Constrained Gradient Method: from non-linear equality constraints to Stiefel manifold" was presented at COLT-2023!
- April 2023. The paper “Fast Rates for Maximum Entropy Exploration” was accepted at ICML-2023!