This is an automated archive made by the Lemmit Bot.
The original was posted on /r/machinelearning by /u/Snoo_65491 on 2024-11-13 12:07:45+00:00.
Hello everyone,
I was reading the paper “Neural Discrete Representation Learning” and was puzzled by the first term of the VQ-VAE loss equation.
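For reference, here is the loss as written in the paper (their Eq. 3), where z_e(x) is the encoder output, e is the nearest codebook embedding, z_q(x) is the quantized latent passed to the decoder, and sg[·] is the stop-gradient operator:

```latex
L = \log p\big(x \mid z_q(x)\big)
  + \big\| \mathrm{sg}[z_e(x)] - e \big\|_2^2
  + \beta \, \big\| z_e(x) - \mathrm{sg}[e] \big\|_2^2
```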
I understand the role of the second and third terms. However, I am not able to derive the first term as the MSE between the original and reconstructed image. I assumed it would be analogous to the ELBO loss in a standard VAE. The paper explains why the KL divergence term can be omitted, but even so, I don't understand how the expectation in the ELBO turns into the first term.
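For context, this is the standard VAE ELBO I had in mind; my confusion is how its expectation term becomes the reconstruction term above:

```latex
\log p(x) \;\ge\;
\mathbb{E}_{z \sim q(z \mid x)}\big[ \log p(x \mid z) \big]
\;-\; D_{\mathrm{KL}}\big( q(z \mid x) \,\big\|\, p(z) \big)
```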
Note: I am not coming from a stats background, so if the question is something fundamental, it would be helpful if you could tell me what it is. Also, if the question isn't clearly explained, I can elaborate in the discussion.
[Discussion]