This is an automated archive made by the Lemmit Bot.

The original was posted on /r/machinelearning by /u/Seankala on 2024-04-01 05:12:58.


I recently pre-trained custom BERT and ELECTRA models for the fashion domain that could also handle English alongside my native language (I'm not in the US). Performance wasn't as good as I anticipated, and I felt the effort wasn't worth it.

Are there any papers or resources on when it's worth creating your own pre-trained LM from scratch? I recall reading a biomedical-domain paper a while ago, "Pretrained Language Models for Biomedical and Clinical Tasks: Understanding and Extending the State-of-the-Art" (Lewis et al., 2020), which seems to show that pre-training from scratch can help with biomedical and clinical tasks, but I'm not sure if there are other papers out there.

Also, are there any tips or good-to-know things when assessing a newly pre-trained LM? For example, checking the tokenizer's OOV rate on in-domain text, along the lines of the sketch below.
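
Roughly what I have in mind for that check (a minimal sketch assuming a Hugging Face tokenizer; the model name and the sample sentences are just placeholders, not from my actual setup):

```python
# Rough sketch: measure how well a tokenizer's vocabulary covers domain text.
# "bert-base-multilingual-cased" and the sample sentences are placeholders.
from transformers import AutoTokenizer


def vocab_coverage(tokenizer_name: str, sentences: list[str]) -> None:
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
    total_tokens = 0
    unk_tokens = 0
    total_words = 0
    for text in sentences:
        words = text.split()
        tokens = tokenizer.tokenize(text)
        total_words += len(words)
        total_tokens += len(tokens)
        unk_tokens += sum(t == tokenizer.unk_token for t in tokens)
    # A high UNK rate or heavy subword fragmentation suggests the vocabulary
    # doesn't fit the domain well.
    print(tokenizer_name)
    print(f"  UNK rate:          {unk_tokens / total_tokens:.2%}")
    print(f"  subwords per word: {total_tokens / total_words:.2f}")


domain_sentences = [
    "Oversized bouclé cardigan with dropped shoulders and patch pockets.",
    "Water-repellent anorak, relaxed fit, two-way zipper, taped seams.",
]
vocab_coverage("bert-base-multilingual-cased", domain_sentences)
```

Is comparing numbers like these between a general-purpose checkpoint and the new domain model a reasonable sanity check, or are there better standard diagnostics?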

Thanks in advance.