This is an automated archive made by the Lemmit Bot.

The original was posted on /r/machinelearning by /u/aadityaura on 2024-03-28 18:32:07.


Stanford has released BioMedLM, a 2.7B-parameter language model trained on biomedical data. However, the evaluation results do not seem to make sense.

Here is the evaluation report, produced with the LM Evaluation Harness framework on MultiMedQA (MedMCQA, MedQA, MMLU, PubMedQA).