The original was posted on /r/stablediffusion by /u/Competitive-War-8645 on 2024-11-05 13:06:49+00:00.


As part of my master's thesis on Stable Diffusion and artificial imagery, I rendered almost all tokens from vocab.json. I filtered out duplicates and empty entries and rendered 4 images per token with an SDXL Lightning model. It took a while on my shitty hardware, and since this experiment is still from the pre-Flux era it reflects CLIP's understanding of the tokens, so the biases of T5-XXL could be different.
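
If anyone wants to reproduce something like this, here is a minimal sketch of the kind of loop involved. It assumes the diffusers library and the public ByteDance/SDXL-Lightning 4-step UNet checkpoint; my exact model, filtering rules, and paths aren't spelled out above, so treat those as placeholders.

```python
import json
import os
import re
import torch
from diffusers import (
    StableDiffusionXLPipeline,
    UNet2DConditionModel,
    EulerDiscreteScheduler,
)
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# Load the CLIP tokenizer vocabulary; the local "vocab.json" path is a placeholder.
with open("vocab.json", encoding="utf-8") as f:
    vocab = json.load(f)

# Strip the BPE end-of-word marker, then drop duplicates and
# empty/whitespace-only entries ("doubles and empty spaces").
tokens = sorted({t.replace("</w>", "").strip() for t in vocab})
tokens = [t for t in tokens if t]

# Load an SDXL Lightning UNet into the SDXL base pipeline
# (checkpoint choice is an assumption; any Lightning merge works similarly).
base = "stabilityai/stable-diffusion-xl-base-1.0"
ckpt = hf_hub_download("ByteDance/SDXL-Lightning", "sdxl_lightning_4step_unet.safetensors")
unet = UNet2DConditionModel.from_config(base, subfolder="unet").to("cuda", torch.float16)
unet.load_state_dict(load_file(ckpt, device="cuda"))
pipe = StableDiffusionXLPipeline.from_pretrained(
    base, unet=unet, torch_dtype=torch.float16, variant="fp16"
).to("cuda")
# Lightning checkpoints expect "trailing" timestep spacing.
pipe.scheduler = EulerDiscreteScheduler.from_config(
    pipe.scheduler.config, timestep_spacing="trailing"
)

os.makedirs("renders", exist_ok=True)
for token in tokens:
    # 4 images per token; few steps and zero CFG, as Lightning models expect.
    images = pipe(
        prompt=token,
        num_images_per_prompt=4,
        num_inference_steps=4,
        guidance_scale=0.0,
    ).images
    # Sanitize the token so it is safe to use in a filename.
    safe = re.sub(r"[^\w-]", "_", token)
    for i, img in enumerate(images):
        img.save(f"renders/{safe}_{i}.png")
```

With ~49k vocab entries at 4 images each, this is a long-running job, so expect it to take a while on modest hardware.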

But it might help with prompting a bit, since it works as a visual dictionary instead of you just guessing what a token means.

The website is also a bit educational, so if you have additions, I can add them on the fly.

Thanks to lostinspaz for sparking the urge for a deeper understanding of the token space.