Welcome

Welcome to MiniCore Japanese — a small, polite, spoken-only slice of real Japanese for travelers. The whole system is about 220 drilled items slotted into 20 fixed sentence patterns. Before any of that, this lesson tunes your ear and mouth to how Japanese sounds.

There is nothing to memorize in this lesson. Every Japanese word below is a demonstration, not vocabulary — most will come back in later lessons, where you’ll actually learn them. For now: listen, repeat, and don’t worry about meaning.

Everything is written in romaji (Latin letters). You will never be asked to read Japanese script in this course — though you’ll see the native writing alongside each phrase, just so you know what it looks like in the wild.

One promise before we start: Japanese pronunciation is easy. There are only five vowels, the consonants are mostly familiar, and there is no stress accent to get wrong. If you can count beats, you can be understood.

The five vowels

Japanese has exactly five vowel sounds. They never change, never reduce, and never glide into each other. Say them short, clean, and pure:

a i u e o

The two to watch: u is said with relaxed, unrounded lips (not the tight English “oo”), and o stays pure — English speakers glide “go” into “gow”; Japanese o just stops.

ue

No diphthongs. When two vowels sit together, each keeps its full identity. This word is u then e — two clean beats, no blending.

hai iie

hai is not English “high” — it’s ha + i, two beats. And iie is i, i again, e: three beats. Count them in the audio.

Everything runs on beats

English rhythm is built on stress: we squash some syllables and stretch others (ba-NA-na). Japanese has no stress at all. Instead, every word is a string of short, even beats — like a metronome. These beats are called morae, but we’ll just say beats.

Each beat gets the same length and the same weight. Try clapping along:

tamago

Three beats, perfectly even: ta-ma-go. Resist the English urge to stress the middle (ta-MA-go) — keep it flat.

konnichiwa sumimasen

Five beats each. sumimasen will become the single most useful word you own (Lesson 1) — for now, just get its rhythm: su-mi-ma-se-n, flat and even.

Why this matters: getting the number of beats right does more for being understood than getting any individual consonant perfect. The rest of this lesson is mostly about the things that count as a beat when English ears don’t expect one.

Long vowels — hold it for two beats

A long vowel is the same vowel held for two beats instead of one. In romaji we write it doubled: oo, ii, aa. This is not decoration — vowel length changes the word.

Tookyoo Oosaka koohii

“Tokyo” is not to-kyo (2 beats) — it’s to-o-kyo-o, four beats. Hold each long vowel like a musical note that lasts twice as long.

biru biiru

One held beat is the entire difference between a building and a beer. Order carefully.

kado kaado

Same again: kaado (card — you’ll use this one at every register) has a held first vowel that kado (corner) doesn’t.

obasan obaasan

The classic warning pair: hold the a one beat too long and you’ve aged someone thirty years.

Spelling note: in the romaji used throughout this course, a long o is sometimes written oo (Tookyoo, koohii) and sometimes ou (arigatou, ginkou) — both are pronounced exactly the same: a two-beat o. Similarly ei (keitai — phone) is usually said as a two-beat e. And one loanword quirk: ramen is written the way you know it, but said with a long first vowel — ra-a-me-n.

The small pause — doubled consonants

A doubled consonant (kk, pp, tt, ss) means: stop, hold for one beat, then release. It feels like a tiny catch or hiccup in the middle of the word, and that held beat counts in full. With kk, pp, and tt the hold is silent; with ss (as in massugu below) you stretch the hiss through the beat instead.

kippu chotto massugu

kippu: say ki, close your lips for the p and hold one beat of silence, then release into pu. Three beats, one of them silent.

oto otto

Another meaning-changing pair: the held beat is the only difference between a sound and a husband.

The n beat

Japanese has one consonant that stands alone as its own full beat: n. When n closes a syllable (before another consonant or at the end of a word), it doesn’t fold into the vowel before it; it takes a beat of its own.

konbini ginkou konbanwa

konbini is not “kon-bi-ni” said fast — it’s ko, then a full beat on n, then bi-ni. Four beats. Give the n its space and you’ll instantly sound more Japanese.
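
If you like to check things mechanically, the beat rules so far can be sketched in a few lines of Python: one beat per vowel or consonant-plus-vowel, an extra beat for each held vowel, one held beat for a doubled consonant, and a full beat for a standalone n. This is only an illustration, not part of the course materials. The name split_beats and the "(p)" notation for a held beat are made up here, and the sketch assumes the plain romaji spellings used in this lesson.

```python
VOWELS = set("aiueo")


def split_beats(word: str) -> list[str]:
    """Split a romaji word into beats (morae), following this lesson's rules."""
    w = word.lower()
    beats = []
    i = 0
    while i < len(w):
        ch = w[i]
        nxt = w[i + 1] if i + 1 < len(w) else ""
        if not ch.isalpha():
            # Spaces, hyphens, apostrophes: separators only, no beat.
            i += 1
        elif ch in VOWELS:
            # A bare vowel, or the second half of a long vowel: one beat.
            beats.append(ch)
            i += 1
        elif ch == "n" and nxt not in VOWELS and nxt != "y":
            # n before a consonant or at the end of the word: its own beat.
            beats.append("n")
            i += 1
        elif ch == nxt:
            # Doubled consonant: the first letter is the held beat.
            beats.append(f"({ch})")
            i += 1
        else:
            # Consonant(s) plus the next vowel: ta, kyo, shi, tsu ...
            j = i
            while j < len(w) and w[j] not in VOWELS:
                j += 1
            if j == len(w):
                raise ValueError(f"no vowel after {w[i:]!r} in {word!r}")
            beats.append(w[i:j + 1])
            i = j + 1
    return beats


for word in ["tamago", "Tookyoo", "kippu", "konbini", "biru", "biiru", "oto", "otto"]:
    beats = split_beats(word)
    print(f"{word}: {'-'.join(beats)} ({len(beats)} beats)")
```

Run on the words from this lesson, it gives ta-ma-go (3 beats), to-o-kyo-o (4), ki-(p)-pu (3), and ko-n-bi-ni (4), and each meaning-changing pair differs by exactly one beat. It only reads spelling, so it cannot know about quirks like ramen, which is written with a short a but said with a long one.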

The Japanese r

The Japanese r is not the English r (no lip-rounding, no growl) and not a rolled Spanish rr. It’s a single, light tap of the tongue-tip just behind your teeth — almost exactly the American English dd/tt sound in “ladder” or “butter” said quickly. To English ears it lands somewhere between r, l, and d.

ramen hidari resutoran arerugii

Say “ladder” a few times, isolate that middle tap, then use it for every r above. Don’t overthink it — even a plain English l is closer to the target than an English r.

Vanishing vowels

Between voiceless consonants (k, s, t, p, h), or at the end of a word after one, the vowels u and i often get whispered to almost nothing. The beat is still there; the vowel just loses its voice. You’ll hear this constantly, because it affects the polite endings you’ll say in every single sentence:

desu onegaishimasu suki

desu sounds like “dess,” every verb ending in -masu sounds like “mahss,” and suki sounds like “ski.” Copy what you hear, not the spelling — but keep the beat count.
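
For the curious, here is a rough Python sketch of that whispering rule. It is a guess at where a whisper is likely, not a rule native speakers follow without exception, so your ears still win. The name mark_whispered is made up for this example, and two things in it are assumptions added here rather than taken from the lesson: the letters c and f are treated as voiceless (to cover the spellings ch and fu), and two vowels in a row are never both whispered.

```python
# Letters the lesson lists as voiceless, plus c and f, which are an
# assumption added here to cover the romaji spellings ch and fu.
VOICELESS = set("kstph") | set("cf")


def mark_whispered(word: str) -> str:
    """Wrap each i or u that is likely to be whispered in parentheses."""
    w = word.lower()
    out = []
    previous_vowel_whispered = False
    for i, ch in enumerate(w):
        if ch not in "aiueo":
            out.append(ch)
            continue
        prev = w[i - 1] if i > 0 else ""
        nxt = w[i + 1] if i + 1 < len(w) else ""
        whispered = (
            ch in "iu"
            and prev in VOICELESS
            and (nxt == "" or nxt in VOICELESS)
            # Assumption: never whisper two vowels in a row.
            and not previous_vowel_whispered
        )
        out.append(f"({ch})" if whispered else ch)
        previous_vowel_whispered = whispered
    return "".join(out)


for word in ["desu", "onegaishimasu", "suki"]:
    print(mark_whispered(word))
```

This prints des(u), onegaishimas(u), and s(u)ki. The letters stay in place, in parentheses, as a reminder that the beat is kept even when the vowel is whispered.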

What we’re deliberately ignoring: pitch accent

Japanese words carry a subtle high–low pitch pattern, and yes, a few word pairs differ only by pitch (hashi can mean “bridge” or “chopsticks”). This course ignores pitch accent entirely, on purpose. Context resolves the rare ambiguous pair — nobody at a restaurant thinks you’re asking for a bridge — and the study time pitch would cost buys far more elsewhere. Speak with flat, even beats and you will be understood everywhere in Japan. Don’t fret about it; it’s not part of this system.

How the romaji works

Everything in this course uses Hepburn-style romaji, with long vowels spelled out (oo, ou, ii) rather than marked with macrons (ō). It’s almost entirely “say what you see,” with these reminders:

  • Every letter is pronounced. No silent letters (except the whispered u/i above — and even those keep their beat).
  • Vowels are always the same five sounds (a, i, u, e, o) as taught above, no exceptions.
  • Doubled vowels (oo, ii, aa) and ou/ei = one vowel held for two beats.
  • Doubled consonants (kk, pp, tt, ss) = one silent held beat.
  • g is always hard, as in “get” — ginkou is not “jin-ko.”
  • e at the end of a word is a full “eh” — sake is sa-ke, never “sah-kee.”
  • fu is softer than English f — blow the sound between your lips without biting them (fukuro — bag).
  • tsu is the ts of “cats”, just at the start of a beat (otsuri — change).

Sound check

That’s the whole sound system. Here’s the success test for this lesson, straight from the course design: hear any word twice, then repeat it with the right number of beats and the right vowel lengths. Meaning doesn’t matter yet.

Try it cold on these — play each one, repeat it aloud, then check yourself against the beat count:

takushii arigatou gozaimasu kuukou kitte chikatetsu

If you can echo these with even rhythm, correct beat counts, and held long vowels, you’re calibrated. Everything from Lesson 1 onward builds on exactly this skill — the words will be new, but the sounds never will be.

Review

An Anki deck is available for this lesson — pure listen-and-repeat cards for every demonstration word above. A few minutes of echo practice before Lesson 1 is plenty.

Next up: Lesson 1 — Survival glue & repair, where you learn the dozen-and-a-half phrases that carry an entire trip.