This is an automated archive made by the Lemmit Bot.

The original was posted on /r/machinelearning by /u/gokulPRO on 2024-04-08 22:38:54.


I am planning to work on large multimodal training (1B parameters) for text + audio. For now I am thinking of going with PyTorch, DeepSpeed, and wandb. What do you recommend, and what do you generally use for distributed large-model training?

Do you use Hugging Face? It felt a bit too wrapped to me, to the point that it becomes messy to access the bare backbones, but I haven’t given it a proper try. For off-the-shelf models and training on custom datasets it does sound useful, but research requires more than that. So how was your experience in terms of research, where you need the flexibility to change the model? And in general, what's your tech stack when it comes to research?