This is an automated archive made by the Lemmit Bot.
The original was posted on /r/learnjapanese by /u/SpanishAhora on 2025-10-03 10:31:41+00:00.
If you’ve ever googled “How many kanji do I need to know?” you’ve probably run into the same kind of answers:
- “With 1,000 kanji you’ll understand 90% of texts.”
- “The top 2,000 words cover 80% of daily conversation.”
If you’ve tried reading Japanese, you already know the reality feels very different.
That “remaining 10%” is usually the one word that makes or breaks the sentence.
So instead of taking frequency stats at face value, I decided to test comprehension at the sentence level. To do that, I built a database of over 120 million unique Japanese sentences drawn from every corner of the language: anime, movies, manga, Wikipedia, news articles, education, and books. That scale is large enough that the results aren’t just anecdotal. They reflect real, everyday Japanese across domains.
The Problem with Frequency
Frequency coverage is calculated across all the words in a corpus, not sentence by sentence.
Just because you know 90% of the words in a text doesn’t mean you can actually read it.
Imagine this sentence:
If you don’t know the word 合格 (to pass an exam), the sentence collapses. You understood 90%, but it wasn’t enough.
This is why sentence-level comprehension is the true test.
Not just how many kanji you’ve “seen before,” but whether you can follow entire sentences without stumbling.
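To put a rough number on the gap between word-level coverage and sentence-level comprehension, here is a back-of-the-envelope sketch. It assumes each word in a sentence is known independently with the same probability. That is a simplification introduced here for illustration, not something the post claims or measures, and the function name is hypothetical.

```python
def chance_sentence_fully_known(word_coverage: float, sentence_length: int) -> float:
    """Chance that every word in a sentence is known.

    ASSUMPTION (illustrative only): each word is known independently
    with probability `word_coverage`. Real text does not behave this
    neatly, so treat the output as a rough intuition, not a measurement.
    """
    return word_coverage ** sentence_length


# 90% word coverage applied to sentences of different lengths
for length in (5, 10, 20):
    chance = chance_sentence_fully_known(0.90, length)
    print(f"{length} words: {chance:.1%} chance of knowing every word")
```

Under that assumption, 90% word coverage leaves only about a one-in-three chance of knowing every word in a 10-word sentence, which fits the feeling that the missing 10% keeps getting in the way.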
A Stricter Test: Sentence-Level Comprehension
Here’s the method I used:
- A sentence counts as readable if every word in it is made of known kanji and vocabulary.
- A sentence counts as guessable if it contains only one unknown word, but that word is fully composed of known kanji, making it reasonable to infer the meaning.
- Everything else counts as not understood.
This is much closer to what learners experience: you either get the full meaning of a sentence, or you don’t.
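The three rules above can be sketched roughly as follows. This is not the code behind the post. It assumes sentences are already split into words (real Japanese text would need a tokenizer first), it reads "fully composed of known kanji" literally (every character of the unknown word must be a known kanji), and it only recognizes kanji in the basic CJK Unified Ideographs block. All names and the sample data are made up for illustration.

```python
from enum import Enum


class Verdict(Enum):
    READABLE = "readable"
    GUESSABLE = "guessable"
    NOT_UNDERSTOOD = "not understood"


def is_kanji(ch: str) -> bool:
    # Simplification: basic CJK Unified Ideographs block only (U+4E00..U+9FFF).
    return "\u4e00" <= ch <= "\u9fff"


def word_is_known(word, known_words, known_kanji):
    # A word counts as known if it is in the vocabulary AND all its kanji are known.
    return word in known_words and all(
        ch in known_kanji for ch in word if is_kanji(ch)
    )


def classify(tokens, known_words, known_kanji):
    unknown = [w for w in tokens if not word_is_known(w, known_words, known_kanji)]
    if not unknown:
        return Verdict.READABLE
    if len(unknown) == 1:
        word = unknown[0]
        # Literal reading of the rule: every character is a known kanji.
        if word and all(is_kanji(ch) and ch in known_kanji for ch in word):
            return Verdict.GUESSABLE
    return Verdict.NOT_UNDERSTOOD


# Toy data invented for this example, not taken from the post's database.
tokens = ["図書館", "で", "本", "を", "読む"]
known_words = {"で", "本", "を", "読む"}
known_kanji = {"図", "書", "館", "本", "読"}

print(classify(tokens, known_words, known_kanji).value)
```

Here 図書館 is not in the vocabulary, but all three of its kanji are known, so the sentence counts as guessable. Remove 館 from `known_kanji` and it becomes not understood; add 図書館 to `known_words` and it becomes readable.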
The Results from 120 Million Sentences
After crunching through the database, here’s what the numbers show:
- 75% comprehension → 1,568 kanji, 3,986 words
- 85% comprehension → 1,926 kanji, 6,255 words
- 95% comprehension → 2,570 kanji, 13,157 words
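A quick way to see how steep the curve gets is to take the differences between those three rows. The snippet below uses only the figures listed above; the helper name is made up for illustration.

```python
# (comprehension %, kanji, words) exactly as listed in the results above
results = [(75, 1568, 3986), (85, 1926, 6255), (95, 2570, 13157)]


def marginal_costs(rows):
    """Extra kanji and words needed to move from one level to the next."""
    steps = []
    for (p0, k0, w0), (p1, k1, w1) in zip(rows, rows[1:]):
        steps.append((p0, p1, k1 - k0, w1 - w0))
    return steps


for p0, p1, extra_kanji, extra_words in marginal_costs(results):
    print(f"{p0}% -> {p1}%: +{extra_kanji} kanji, +{extra_words} words")
```

Going from 75% to 85% takes 358 more kanji and 2,269 more words. Going from 85% to 95% takes 644 more kanji and 6,902 more words, so the last ten points cost about three times as many new words as the ten before them.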
You can read the full article and methodology at the attached link.