This is an automated archive made by the Lemmit Bot.

The original was posted on /r/machinelearning by /u/skeltzyboiii on 2025-01-13 16:11:00+00:00.


Netflix and Cornell University researchers have exposed significant flaws in cosine similarity. Their study shows that regularization in linear matrix-factorization models leaves the learned embeddings free to be rescaled arbitrarily: any invertible diagonal rescaling of the factors yields the same model fit, so the resulting cosine similarities are not uniquely determined and can be unreliable or even meaningless. This affects downstream tasks such as recommendation systems. The authors point to alternatives such as Euclidean distance, dot products, or normalization during training, and recommend task-specific evaluations to ensure robustness.
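The rescaling freedom described above can be demonstrated in a few lines of NumPy. This is a minimal sketch (not from the paper): two random factor matrices stand in for the user/item embeddings of a linear matrix-factorization model, and a diagonal rescaling leaves the predicted ratings unchanged while the cosine similarities between embeddings change.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical factor matrices from a linear MF model (illustrative only).
A = rng.normal(size=(4, 3))   # user embeddings
B = rng.normal(size=(5, 3))   # item embeddings

preds = A @ B.T               # predicted ratings

# Any invertible diagonal rescaling D gives an equivalent solution:
# (A D)(D^{-1} B)^T == A B^T, so the model fit is identical.
D = np.diag([10.0, 0.1, 1.0])
A2 = A @ D
B2 = B @ np.linalg.inv(D)

assert np.allclose(A2 @ B2.T, preds)  # same predictions

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# ...yet the cosine similarity between the first two user embeddings differs:
print(cosine(A[0], A[1]), cosine(A2[0], A2[1]))
```

Since both factorizations reconstruct the data equally well, nothing in the training objective pins down which set of cosine similarities is the "right" one.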

Read the full paper review of ‘Is Cosine-Similarity of Embeddings Really About Similarity?’ here: