Hello! I am a PhD student in Reinforcement Learning, pursuing a joint PhD at CMAP, École Polytechnique, and LMO, Université Paris-Saclay, under the supervision of Éric Moulines and Gilles Stoltz.
Recently, I was a student researcher at Google DeepMind Paris, where I studied the distillation of language models; check out this paper for the results!
Before that, I did research at the HDI Lab at HSE University, where I also received my Master's degree in Applied Mathematics and Computer Science through the "Math of Machine Learning" program.
My research interests include:
- Reinforcement Learning theory;
- LLM post-training: RLHF and distillation;
- connections between RL and sampling problems.
Email: daniil.tiapkin@polytechnique.edu | Google Scholar | ORCID |
News
- 🔥New🔥 February 2025. I can finally present the results of my internship: "On Teacher Hacking in Language Model Distillation".
- January 2025. My first paper as a supervisor, "Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization", got accepted to ICLR-2025! Additionally, I have two papers accepted at AISTATS-2025.
- September 2024. I am starting my student-researcher internship at Google DeepMind!
- January 2024. The paper on RL/RLHF learning from demonstrations, "Demonstration-Regularized RL", was accepted at ICLR-2024, and the GFlowNet-RL paper "Generative Flow Networks as Entropy-Regularized RL" was honored with an oral presentation at AISTATS-2024!
- September 2023. I moved to École Polytechnique, France, to pursue my PhD.
- September 2023. The paper “Model-free Posterior Sampling via Learning Rate Randomization” was accepted at NeurIPS-2023!
- July 2023. The paper "Orthogonal Directions Constrained Gradient Method: from non-linear equality constraints to Stiefel manifold" was presented at COLT-2023!
- April 2023. The paper “Fast Rates for Maximum Entropy Exploration” was accepted at ICML-2023!