The original was posted on /r/machinelearning by /u/nihaomundo123 on 2025-02-20 21:59:39+00:00.
Hi all,
21M deciding whether to specialize in theoretical ML for my math PhD. Specifically,
i) I am interested in trying to understand curious phenomena in neural networks and transformers, such as the neural tangent kernel and the impact of pre-training & multimodal training in generative AI;
ii) but I am NOT interested in papers focused on improving empirical performance, like the original dropout and batch normalization papers.
I want to work on something theoretical during my PhD that still has the potential for deep impact. When I tried to find out whether the understanding-based questions in category i) fit this description, however, I could not find much on the web…
If anyone has specific examples of papers whose main focus was to understand some phenomenon, and that ended up revolutionizing things for practitioners, I would appreciate it :)
Sincerely,
nihaomundo123