Japanese Pronunciation and Pitch Accent — The Secret to Sounding Natural
Author: Youngju Kim (@fjvbn20031)
- Introduction
- The Basic Units of Japanese Sound
- The Sense of Mora Timing
- The Principle of Pitch Accent
- The Four Pitch Accent Types
- Homophones and Pitch
- Sounds That Are Especially Hard
- Intonation and Rhythm
- Practical Drills
- Dialects — Tokyo-Type and Keihan-Type
- Common Misconceptions
- Conclusion
- References
Introduction
Even people who have studied Japanese for years often hit a wall when talking with native speakers: "My pronunciation is accurate, so why does it sound off?" The individual sounds (vowels and consonants) are nearly perfect, yet whole sentences come out unnatural.
Much of the cause lies not in individual sounds but in pitch accent and mora-timed rhythm. Unlike English, Japanese does not distinguish words by stress; it distinguishes them by pitch (the highness or lowness of the voice). And unlike a syllable-timed language, Japanese builds its rhythm from moras (拍), rhythmic units of roughly equal length.
This article starts from the basics of the Japanese sound system, moves through the four pitch accent types, covers the sounds learners struggle with most, and ends with practical drills — all backed by a generous set of example tables. By the end, you should have a clear answer to why your Japanese has sounded unnatural.
The Basic Units of Japanese Sound
The Five Vowels
Japanese has only five vowels — far simpler than English or Korean.
| Kana | Romaji | IPA (approx.) | Notes |
|---|---|---|---|
| あ | a | a | Open mouth |
| い | i | i | Lips spread wide |
| う | u | ɯ | Unrounded, flat lips |
| え | e | e | |
| お | o | o | |
The tricky one is う. Unlike the English "oo," it is pronounced with the lips flat and unrounded rather than pushed forward. This is especially true in standard Tokyo speech.
The Consonant System
Japanese consonants are easiest to organize by kana row (行).
| Row | Consonant | Examples | Notes for learners |
|---|---|---|---|
| か row | k | か き く け こ | Keep it a plain k word-initially; do not tense it |
| が row | g | が ぎ ぐ げ ご | Nasalized (鼻濁音) word-medially |
| さ row | s | さ し す せ そ | し is an sh sound |
| ざ row | z | ざ じ ず ぜ ぞ | Voiced fricatives, hard for many |
| た row | t | た ち つ て と | つ is the biggest hurdle |
| だ row | d | だ ぢ づ で ど | |
| な row | n | な に ぬ ね の | |
| は row | h | は ひ ふ へ ほ | ふ is a bilabial fricative |
| ば row | b | ば び ぶ べ ぼ | |
| ぱ row | p | ぱ ぴ ぷ ぺ ぽ | |
| ま row | m | ま み む め も | |
| や row | y | や ゆ よ | |
| ら row | r | ら り る れ ろ | A flap, not English r or l |
| わ row | w | わ を | |
The ざ row, つ, ふ, and the ら row get detailed treatment later.
Special Beats: Long Vowels, Geminates, and the Moraic Nasal
To understand Japanese rhythm you must know three special beats. Each occupies one full mora (beat).
| Name | Example | Nature | Meaning |
|---|---|---|---|
| Long vowel (長音) | おばあさん | Extends a vowel by one beat | Grandmother |
| Geminate (促音) | きって | っ, a one-beat consonant hold | Postage stamp |
| Moraic nasal (撥音) | ほん | ん, a one-beat nasal | Book |
The fact that each takes a full beat is central to Japanese rhythm. Learners often rush through these beats, which is a major source of unnaturalness.
The Sense of Mora Timing
Syllable Timing vs. Mora Timing
English is largely stress-timed and Korean is roughly syllable-timed. Japanese, by contrast, is mora-timed. A mora is a minimal rhythmic unit of nearly equal length.
The basic rules:
| Element | Mora count | Explanation |
|---|---|---|
| One kana (plain or voiced) | 1 mora | か, し, ぐ, etc. |
| Contracted sound (拗音) きゃ | 1 mora | Kana plus small ゃゅょ counts as one mora together |
| Long vowel | 1 mora | The lengthened vowel is a separate beat |
| Geminate っ | 1 mora | The pause is a beat |
| Moraic nasal ん | 1 mora | The nasal is a beat |
A Mora-Counting Table
Let us count moras word by word — cases where learners often drop a beat.
| Word | Meaning | Kana | Mora breakdown | Mora count |
|---|---|---|---|---|
| 東京 | Tokyo | とうきょう | と/う/きょ/う | 4 |
| 学校 | School | がっこう | が/っ/こ/う | 4 |
| 切手 | Stamp | きって | き/っ/て | 3 |
| 新聞 | Newspaper | しんぶん | し/ん/ぶ/ん | 4 |
| 病院 | Hospital | びょういん | びょ/う/い/ん | 4 |
| 先生 | Teacher | せんせい | せ/ん/せ/い | 4 |
| 空港 | Airport | くうこう | く/う/こ/う | 4 |
The key point is that とうきょう is only two syllables (tō-kyō), and English speakers tend to say it as a quick "Tokyo," but it is four moras in Japanese. Each of と, う, きょ, う must be given equal weight. In particular, do not drop the long vowel う.
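The counting rules above can be sketched in a few lines of Python. This is a minimal illustration, not a full tokenizer: the names `SMALL_KANA`, `split_moras`, and `count_moras` are made up for this example, and the input is assumed to be written entirely in kana.

```python
# Small kana that merge with the preceding kana into one mora (拗音 and
# similar combinations). っ and ん are deliberately not listed here:
# each of them is a full mora of its own.
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def split_moras(kana: str) -> list:
    """Split a kana string into moras (hypothetical helper for this sketch)."""
    moras = []
    for ch in kana:
        if ch in SMALL_KANA and moras:
            moras[-1] += ch   # きょ, びょ: the small kana joins the previous mora
        else:
            moras.append(ch)  # plain kana, っ, ん, ー, and the vowel kana of a long vowel
    return moras


def count_moras(kana: str) -> int:
    """Number of moras in a kana string."""
    return len(split_moras(kana))
```

For example, `split_moras("とうきょう")` gives `['と', 'う', 'きょ', 'う']`, and `count_moras("きって")` gives `3`, matching the table.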
The Principle of Pitch Accent
Why Pitch Matters
Japanese distinguishes words not by stress but by pitch patterns. Each mora carries one of two pitches: High (H) or Low (L).
Consider the most famous minimal pair (Tokyo dialect):
| Word | Kana | Pitch pattern | Meaning |
|---|---|---|---|
| 橋 | はし | LH | Bridge |
| 箸 | はし | HL | Chopsticks |
| 端 | はし | LH (distinguished by particle) | Edge |
The same はし becomes entirely different words depending on pitch. Chopsticks (箸) are high then low (HL); bridge (橋) is low then high (LH).
Two Cardinal Rules of Pitch
Tokyo-dialect pitch has two absolute rules.
| Rule | Content |
|---|---|
| Rule 1 | The first and second mora must differ in pitch |
| Rule 2 | Once pitch falls within a word, it never rises again |
If the first mora is high, the second must be low, and vice versa. After a fall, pitch stays low for the rest of the word. The last high mora before the fall is called the accent kernel (核).
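As a rough sketch, the two rules can be checked mechanically on a pattern written as a string of `H` and `L`. The function name `follows_tokyo_rules` is an assumption of this example, and the check covers only the two rules as stated here, not every detail of real Tokyo speech.

```python
def follows_tokyo_rules(pattern: str) -> bool:
    """Check an H/L string against the two rules stated above."""
    if not pattern or set(pattern) - {"H", "L"}:
        raise ValueError("pattern must be a non-empty string of H and L")
    # Rule 1: the first and second mora differ (only testable with 2+ moras).
    if len(pattern) >= 2 and pattern[0] == pattern[1]:
        return False
    # Rule 2: after the first fall (an H followed by an L),
    # no later mora may be high again.
    fall = pattern.find("HL")
    if fall != -1 and "H" in pattern[fall + 1:]:
        return False
    return True
```

`follows_tokyo_rules("LHL")` returns `True`, while `"HLH"` fails Rule 2 and `"LLH"` fails Rule 1.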
The Four Pitch Accent Types
Tokyo-dialect word accents fall into four broad types, defined by where the kernel sits.
Overview Table
| Type | Japanese | Kernel position | Pattern (3-mora) | Feature |
|---|---|---|---|---|
| Flat | 平板型 | None | LHH | Stays high through the particle |
| Head-high | 頭高型 | First mora | HLL | Only the first beat is high |
| Middle-high | 中高型 | Middle | LHL | The middle is high |
| Tail-high | 尾高型 | Last mora | LHH (falls on particle) | Word is high, particle is low |
A particle here means a following word such as が, を, or は. Flat and tail-high look identical (LHH) on the word alone; the difference emerges the moment a particle attaches.
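The overview table can also be read as a recipe for building a pattern from the kernel position. The sketch below is illustrative only: the function name `pitch_pattern` and the convention of writing `0` for "no kernel" are assumptions made for this example.

```python
def pitch_pattern(mora_count: int, kernel: int, particle: bool = False) -> str:
    """Build an H/L string for a Tokyo-type word.

    kernel is the 1-based position of the accent kernel, or 0 for a
    kernel-less (flat) word. With particle=True, one extra mora is
    appended for a following particle such as が.
    """
    if mora_count < 1 or not 0 <= kernel <= mora_count:
        raise ValueError("kernel must be between 0 and mora_count")
    total = mora_count + (1 if particle else 0)
    pitches = []
    for position in range(1, total + 1):
        if kernel == 0:
            high = position > 1             # flat: only the first mora is low
        elif kernel == 1:
            high = position == 1            # head-high: only the first mora is high
        else:
            high = 1 < position <= kernel   # high from mora 2 up to the kernel
        pitches.append("H" if high else "L")
    return "".join(pitches)
```

With three moras, `pitch_pattern(3, 0)` and `pitch_pattern(3, 3)` both give `LHH`, but adding `particle=True` yields `LHHH` for the flat word and `LHHL` for the tail-high one.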
Flat (平板型)
A kernel-less type. Only the first mora is low; the rest stays high all the way, even through the particle. It is the most common type in Tokyo speech.
| Word | Kana | Word pattern | With particle |
|---|---|---|---|
| 桜 | さくら | LHH | さくらが LHHH |
| 学生 | がくせい | LHHH | がくせいが LHHHH |
| 日本語 | にほんご | LHHH | にほんごが LHHHH |
| 友達 | ともだち | LHHH | ともだちが LHHHH |
The particle が staying high is the decisive marker of the flat type.
Head-High (頭高型)
Only the first mora is high, dropping sharply from the second — a high-then-low type. The kernel is on the first mora.
| Word | Kana | Pattern | Meaning |
|---|---|---|---|
| 雨 | あめ | HL | Rain |
| 箸 | はし | HL | Chopsticks |
| 猫 | ねこ | HL | Cat |
| 電気 | でんき | HLL | Electricity |
| 天気 | てんき | HLL | Weather |
Hit the first beat clearly high and drop immediately.
Middle-High (中高型)
The kernel sits in the middle, so the contour rises then falls. The first mora is low, the pitch rises on the second, and it falls right after the kernel.
| Word | Kana | Pattern | Meaning |
|---|---|---|---|
| お菓子 | おかし | LHL | Sweets |
| 卵 | たまご | LHL | Egg |
| あなた | あなた | LHL | You |
| 心 | こころ | LHL | Heart, mind |
| 湖 | みずうみ | LHHL | Lake |
Aim for a peak in a middle mora, then come down.
Tail-High (尾高型)
Only the first mora is low and the word then stays high to its end, but the moment a particle attaches, that particle drops. The kernel is on the last mora of the word.
| Word | Kana | Word pattern | With particle |
|---|---|---|---|
| 花 | はな | LH | はなが LHL |
| 男 | おとこ | LHH | おとこが LHHL |
| 山 | やま | LH | やまが LHL |
| 弟 | おとうと | LHHH | おとうとが LHHHL |
Flat vs. Tail-High
This distinction confuses learners most, because the words sound identical in isolation. You must attach a particle to tell them apart.
| Word | Kana | Type | Word alone | With particle |
|---|---|---|---|---|
| 端 | はし | Flat | LH | はしが LHH |
| 橋 | はし | Tail-high | LH | はしが LHL |
| 花 | はな | Tail-high | LH | はなが LHL |
| 鼻 | はな | Flat | LH | はなが LHH |
If はな means nose (鼻) it is flat, so the particle stays high (LHH); if it means flower (花) it is tail-high, so the particle drops (LHL).
Accent Types by Kernel Position
Organizing by which mora holds the kernel in an n-mora word yields a clean rule.
| Kernel position | Type name | 3-mora pattern |
|---|---|---|
| No kernel | Flat | LHH (particle H) |
| 1st | Head-high | HLL |
| 2nd | Middle-high | LHL |
| 3rd (last) | Tail-high | LHH (L on particle) |
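Going the other way, the type can be read off a pattern as long as the particle is included. The following is a minimal sketch under that assumption; the names `kernel_from_pattern` and `accent_type` are hypothetical, and the input is assumed to be the H/L pattern of a word plus exactly one particle mora.

```python
def kernel_from_pattern(with_particle: str) -> int:
    """Return the 1-based kernel position, or 0 if the pitch never falls."""
    fall = with_particle.find("HL")   # index of the last high mora before the fall
    return 0 if fall == -1 else fall + 1


def accent_type(with_particle: str) -> str:
    """Name the accent type of a word given its pattern including one particle."""
    if len(with_particle) < 2 or set(with_particle) - {"H", "L"}:
        raise ValueError("expected an H/L pattern for a word plus one particle")
    word_length = len(with_particle) - 1   # the last symbol belongs to the particle
    kernel = kernel_from_pattern(with_particle)
    if kernel == 0:
        return "flat"
    if kernel == 1:
        return "head-high"
    if kernel == word_length:
        return "tail-high"
    return "middle-high"
```

For instance, `accent_type("LHH")` returns `"flat"` and `accent_type("LHL")` returns `"tail-high"`, which is the はしが contrast between edge and bridge.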
Homophones and Pitch
Here are more pairs where pitch alone carries the meaning, so native speakers can often tell them apart even without context.
| Kana | Pattern A | Meaning A | Pattern B | Meaning B |
|---|---|---|---|---|
| あめ | HL | Rain | LH | Candy |
| はし | HL | Chopsticks | LH | Bridge |
| かき | HL | Oyster | LH | Persimmon |
| いま | HL | Now | LH | Living room |
| かみ | HL | God | LH (tail-high) | Paper, hair |
| せき | HL | Seat | LH | Cough |
Native speakers distinguish these perfectly and unconsciously. Learners pick them up much faster by memorizing minimal pairs in sets.
Sounds That Are Especially Hard
つ (tsu)
An affricate that Korean lacks and that English uses only at the ends of words (as in "cats"). Do not substitute an "s" or "ch." Touch the tongue tip behind the upper teeth and release the t straight into s as a single burst.
| Wrong substitute | Correct direction | Example word |
|---|---|---|
| Replace with "s" | ts as one burst | つき (moon) |
| Replace with "ch" | ts as one burst | つくえ (desk) |
| Replace with "tu" | ts as one burst | みつ (honey) |
The ざ Row (Voiced Fricatives and Affricates)
ざ, じ, ず, ぜ, ぞ are voiced sounds that learners often devoice or slide into a "j" sound.
| Kana | Common mistake | Target sound |
|---|---|---|
| ざ | Slides to "ja" | Voiced z fricative |
| じ | Devoiced "chi" | Voiced j |
| ず | Devoiced | Voiced z |
| ぜ | Devoiced | Voiced z |
| ぞ | Devoiced | Voiced z |
They are harder word-medially and word-finally than word-initially. Practice with words like かぜ (wind) and みず (water).
Voicing Contrasts (清濁)
Many languages auto-voice consonants medially, but in Japanese the voiced-versus-voiceless contrast carries meaning.
| Voiceless (清音) | Meaning | Voiced (濁音) | Meaning |
|---|---|---|---|
| かい | Shellfish | がい | Harm |
| てん | Point | でん | Electric (bound form) |
| こま | Spinning top | ごま | Sesame |
| たい | Sea bream | だい | Stand, base |
| きん | Gold | ぎん | Silver |
Train yourself to clearly separate か and が, た and だ word-initially in particular.
Long Vowels (長音)
Dropping or adding a long vowel produces a completely different word — one of the most common learner errors.
| Short | Meaning | Long | Meaning |
|---|---|---|---|
| おばさん | Aunt | おばあさん | Grandmother |
| おじさん | Uncle | おじいさん | Grandfather |
| ゆき | Snow | ゆうき | Courage |
| とる | To take | とおる | To pass through |
| ここ | Here | こうこう | High school |
| びよういん | Beauty salon | びょういん | Hospital |
The last pair is a famous example, but it does not hinge on vowel length alone: びよういん (salon) has a full-size よ and five moras (び/よ/う/い/ん), while びょういん (hospital) has the contracted びょ and four moras (びょ/う/い/ん).
Geminates っ
Rushing through a geminate yields a different word.
| No geminate | Meaning | With geminate | Meaning |
|---|---|---|---|
| いか | Squid | いっか | A family |
| かこ | Past | かっこ | Parentheses |
| さか | Slope | さっか | Author |
| ぶか | Subordinate | ぶっか | Prices (物価) |
The ら Row (Flap)
The Japanese ら row is neither English r nor l but a flap (a light tap of the tongue). It resembles the initial ㄹ in Korean but is softer.
| Kana | Common mistake | Target |
|---|---|---|
| ら | English-style r (tongue curled back) | Light flap |
| り | Pronounced as l | Light flap |
| る | Trilled or over-rolled | Light flap |
ふ (Bilabial Fricative)
ふ is not made with the upper teeth and lower lip like English f, but by narrowing both lips and pushing air through.
| Wrong direction | Correct direction |
|---|---|
| English f (teeth + lip) | Air between both lips |
| Plain "hu" | Narrow the lips |
Nasalized が Row (鼻濁音)
In traditional Tokyo speech, the が row nasalizes word-medially, adding a nasal quality. It is fading nowadays but survives in announcer speech.
| Word | Medial が row | Nasalized? |
|---|---|---|
| 学校 | None (initial) | Not nasalized |
| 鏡 かがみ | が | Traditionally nasalized |
| りんご | ご | Traditionally nasalized |
Intonation and Rhythm
Sentence-Level Intonation
When word accents combine into a sentence, the overall pitch drifts gently downward toward the end of the sentence (declination), and each accent kernel lowers the pitch range of what follows (downstep).
| Element | Explanation |
|---|---|
| Declination and downstep | The overall register lowers toward the end, stepping down after each accented word |
| Phrase boundary | A slight reset at each meaning unit |
| Sentence-final | Questions rise at the end; statements fall |
Even Beats Keep the Rhythm Alive
One of the most reliable ways to sound natural in Japanese is to give each mora equal length. Speakers of stress-timed languages tend to lengthen important beats and shorten the rest, which breaks the even rhythm of Japanese.
| Word | Wrong rhythm | Correct rhythm |
|---|---|---|
| ありがとう | "arigato" (like 4 beats) | あ/り/が/と/う, 5 even moras |
| がっこう | "gakko" (rushed) | が/っ/こ/う, 4 even moras |
| おはよう | "ohayo" (3 beats) | お/は/よ/う, 4 even moras |
Practical Drills
Shadowing (シャドーイング)
Shadowing is one of the most effective pronunciation drills. While listening to native audio, follow along like a shadow about half a second behind.
| Stage | Content | Goal |
|---|---|---|
| Stage 1 | Confirm sounds while reading the script | Grasp pitch and beats |
| Stage 2 | Speak in sync while reading the script | Match the rhythm |
| Stage 3 | Shadow without the script | Internalize intonation |
| Stage 4 | Record and compare | Self-correct |
Minimal-Pair Training
Repeat pitch-distinguished pairs as sets. Pairing あめ (rain/candy) and はし (chopsticks/bridge) builds a fast sense of pitch.
Using OJAD
OJAD (Online Japanese Accent Dictionary), built at the University of Tokyo, is a free accent dictionary. It shows the pitch curve of words and sentences visually and provides audio. Strongly recommended for learners.
Record and Self-Compare
Recording your own voice and comparing waveforms and pitch with a native speaker reveals objectively where you differ. It is especially good for checking the length of long vowels, geminates, and the moraic nasal.
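When comparing pitch, raw frequencies are hard to line up because voices differ in overall height. One common way around this, sketched below as an assumption rather than a prescribed method, is to convert each measured value to semitones relative to the speaker's own average. The function name `relative_contour` is hypothetical, and the per-mora F0 values in Hz are assumed to have been measured beforehand with a pitch-analysis tool.

```python
import math


def relative_contour(f0_hz):
    """Express per-mora F0 values (Hz) in semitones around the speaker's own mean.

    The mean is taken on the log scale, so the returned values average to zero
    and two speakers with the same melody but different voice height
    produce the same contour.
    """
    if not f0_hz or any(f <= 0 for f in f0_hz):
        raise ValueError("f0_hz must be a non-empty list of positive frequencies")
    log_mean = sum(math.log2(f) for f in f0_hz) / len(f0_hz)
    # 12 semitones per octave; one octave is a doubling of frequency.
    return [12 * (math.log2(f) - log_mean) for f in f0_hz]
```

Running the function on your own recording and on the native recording gives two contours that can be compared mora by mora, regardless of how high or low each voice is.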
A Sample Practice Routine
| Day | Activity | Time |
|---|---|---|
| Mon | 30 minimal pairs | 15 min |
| Tue | Shadow a 1-minute script | 20 min |
| Wed | Check new-word accent on OJAD | 15 min |
| Thu | Record and compare | 20 min |
| Fri | Shadow news audio | 20 min |
Dialects — Tokyo-Type and Keihan-Type
Japanese pitch accent varies greatly by region, splitting into two broad systems.
| Category | Representative area | Feature |
|---|---|---|
| Tokyo-type (東京式) | Tokyo, most of eastern Japan | The standard baseline; first two moras always differ |
| Keihan-type (京阪式) | Osaka, Kyoto, Kansai | Can start high on the first mora; more complex patterns |
The same word can have opposite pitch depending on the region.
| Word | Tokyo-type | Keihan-type (Kansai) |
|---|---|---|
| 橋 はし | LH (bridge) | Differs, HL-family |
| 雨 あめ | HL | Differs, LH-family |
Early on, focus on the standard Tokyo-type alone. Just knowing that Kansai content (manzai, dramas) uses different pitch will reduce confusion when you hear it.
Common Misconceptions
"You do not need to learn pitch"?
Meaning still gets across with accurate sounds. But mismatched pitch is immediately heard as a "foreign accent," and minimal pairs can cause real misunderstanding. If you aim for advanced fluency, pitch is essential.
"Knowing the kanji reading is enough"?
The same kanji can take different accent types depending on the word. Memorize accent together with each word.
"Speaking slowly makes it natural"?
Often the opposite. You should maintain even mora timing at a natural speed to keep the rhythm alive. Too slow, and the sense of beat collapses.
Conclusion
The real key to Japanese pronunciation is not individual sounds but pitch accent and mora rhythm. The five vowels and the consonant system are relatively easy to acquire, but the four accent types (flat, head-high, middle-high, tail-high) and the sense of even timing require conscious training.
To summarize: first, build the habit of counting moras evenly. Second, internalize pitch differences through minimal pairs. Third, compare yourself with real audio using shadowing and OJAD. Fourth, never rush the beats of long vowels, geminates, and the moraic nasal.
Practice these four consistently and you can move past the "accurate but off" wall to genuinely natural Japanese.
References
- OJAD (Online Japanese Accent Dictionary): https://www.gavo.t.u-tokyo.ac.jp/ojad/
- Japanese pitch accent (Wikipedia): https://en.wikipedia.org/wiki/Japanese_pitch_accent
- Mora (linguistics) (Wikipedia): https://en.wikipedia.org/wiki/Mora_(linguistics)
- NHK Broadcasting Culture Research Institute: https://www.nhk.or.jp/bunken/
- Japanese phonology (Wikipedia): https://en.wikipedia.org/wiki/Japanese_phonology
- Weblio Japanese dictionary: https://www.weblio.jp/
- kotobank encyclopedia and dictionaries: https://kotobank.jp/
- Tokyo dialect (Wikipedia): https://en.wikipedia.org/wiki/Tokyo_dialect
- Kansai dialect (Wikipedia): https://en.wikipedia.org/wiki/Kansai_dialect