How to pronounce Tibetan writing in the modern Lhasa dialect

If you've ever attempted to read anything about Tibet, you might have come across words like this:

thug pa (thukpa)
rnying ma (Nyingma)
ta' la'i bla ma (Dalai Lama)
srong btsan sgam po (Songtsen Gampo)

And so on. Initial consonant clusters like <bts> can be really intimidating for an English speaker. However, pronouncing these words doesn't need to be as difficult as it seems. The most common Tibetan transliteration system, the Wylie system, as well as Tibetan as actually written in the Tibetan script, reflects the pronunciation of Classical Tibetan. This is not too different from how English or French spelling reflects an older form of the language (think of words like bought). In the most commonly spoken Tibetan language/dialect, the Lhasa dialect, the pronunciation is very different from the spelling, and most of the consonant clusters are no longer pronounced; the actual sounds are not that difficult for an English speaker at all. The following is a basic layout of how to pronounce words written in the Wylie transliteration in the Lhasa dialect (note that other Tibetan languages/dialects do preserve the pronunciation of some of the initial clusters). The information is based on multiple Lhasa Tibetan textbooks as well as Wikipedia, but streamlined for someone who just wants to know roughly how to pronounce words based on English transliterations. This page was born from frustration that there was no straightforward explanation of this widely available on the internet, and it will probably be most useful for someone who wants to pronounce a Tibetan name with the respect it deserves. There are probably mistakes, and this has not been verified by an actual Tibetan speaker, although I do plan to attend classes on Tibetan in the future.

Overview

Tibetan words are typically transliterated with spaces between every syllable. This mimics the Tibetan Uchen script, which marks off every syllable but does not mark spaces between words. Syllables can range from just 1 vowel to (as structured by the Uchen writing system) 1 vowel preceded by 1 prefix consonant, 1 superscript consonant, 1 root consonant, and 1 subscript consonant, and followed by 2 suffix consonants, for a total of 7 potential elements. In the Lhasa dialect, at most 2 of those pre-vowel consonants and 1 post-vowel consonant are actually pronounced. However, the Lhasa dialect also has a tonal system, and the tone can largely be predicted from the written consonants. In short, while the extra consonants may no longer be pronounced, their presence still lingers in the tones.
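
To make the seven slots a bit more concrete, here is a rough Python sketch. The class name Syllable and its field names are my own labels for illustration (they are not from any library), and the example syllables are segmented by hand rather than parsed automatically.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    """One written syllable, split into the seven slots described above."""
    prefix: str = ""
    superscript: str = ""
    root: str = ""
    subscript: str = ""
    vowel: str = "a"
    suffix1: str = ""
    suffix2: str = ""

    def wylie(self) -> str:
        """Join the slots back together in writing order."""
        return (self.prefix + self.superscript + self.root + self.subscript
                + self.vowel + self.suffix1 + self.suffix2)

    def filled_slots(self) -> int:
        """Count how many of the seven slots are actually used."""
        slots = (self.prefix, self.superscript, self.root, self.subscript,
                 self.vowel, self.suffix1, self.suffix2)
        return sum(1 for slot in slots if slot)


# Transliterated phrases separate syllables with spaces.
syllables = "srong btsan sgam po".split()

# Hand-segmented examples (the segmentation is done by eye, not by the code).
bsgrubs = Syllable(prefix="b", superscript="s", root="g", subscript="r",
                   vowel="u", suffix1="b", suffix2="s")
btsan = Syllable(prefix="b", root="ts", vowel="a", suffix1="n")

print(syllables)
print(bsgrubs.wylie(), bsgrubs.filled_slots())
print(btsan.wylie(), btsan.filled_slots())
```

The point is only that every written syllable fits this one template; working out which letter sits in which slot is what the rest of this page is about.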

Root consonants

Wylie writes out all 30 of the Tibetan consonants. The following is their base pronunciation, although it can change depending on the consonants that come before them. In general (but not always, see subscript section), the consonant before the vowel is the root consonant. Note that the Tibetan script is an abugida, where consonants are written on the main line and vowels are written as diacritics that modify the base consonant, which is assumed by default to have the vowel 'a'. For the intuition of the English speaker, I am only writing the consonants. Note that because Tibetan's structure and pre-Western analysis is based on the Indian linguistic tradition, the order of consonants is supposed to be neatly arranged by place of articulation first and then manner of articulation. However, because pronunciation changes are mainly based on manner of articulation, I list them by manner and then place. In addition, note that the root consonant also determines the tone of the syllable, although the tone and its pronunciation may change depending on other factors covered in the tone section (if none of these terms make sense, just ignore them).

Wylie	Pronunciation	Tone	Notes
k	[k], 'c' as in cat but without aspiration (Chinese pinyin 'g'), so that it may sound more like 'g' as in gap.	High
c	[tɕ], Like 'ch' as in chat but without aspiration and more palatalized (Chinese pinyin 'j'), so that it may sound more like 'j' as in jeep.	High
t	[t], Like 't' as in tat but without aspiration (Chinese pinyin 'd'), so that it may sound more like 'd' as in dark.	High
p	[p], Like 'p' as in pat but without aspiration (Chinese pinyin 'b'), so that it may sound more like 'b' as in bark.	High
ts	[ts], Like 'ts' as in cats but without aspiration (Chinese pinyin 'z'), so that it may sound more like 'gs' as in dogs.	High
zh	[ʑ], Like 'g' as in mirage (unspirated Chinese pinyin 'x').	Low
r	[ʐ], Kind of like 'g' as in mirage (Chinese pinyin 'r' in some dialects). Can also be pronounced like the Spanish or Russian rolled [r].	Low
h	[h], Like 'h' as in hat.	High
kh	[kʰ], 'c' as in cat.	High
ch	[tɕʰ], Like 'ch' as in chat but more palatalized (Chinese pinyin 'q').	High
th	[tʰ], Like 't' as in tap.	High
ph	[pʰ], Like 'p' as in pat.	High
tsh	[tsʰ], Like 'ts' as in cats.	High
z	[z], Like 'z' as in zebra.	Low
l	[l], Like 'l' as in lack.	Low
g	[k], 'c' as in cat but without aspiration (Chinese pinyin 'g'), so that it may sound more like 'g' as in gap.	Low	Note that the following consonants are pronounced the same as the first sequence, but mostly with a different tone. They will also have different interactions with other consonants.
j	[tɕ], Like 'ch' as in chat but without aspiration and more palatalized (Chinese pinyin 'j'), so that it may sound more like 'j' as in jeep.	Low
d	[t], Like 't' as in tat but without aspiration (Chinese pinyin 'd'), so that it may sound more like 'd' as in dark.	Low
b	[p], Like 'p' as in pat but without aspiration (Chinese pinyin 'b'), so that it may sound more like 'b' as in bark.	Low
dz	[ts], Like 'ts' as in cats but without aspiration (Chinese pinyin 'z'), so that it may sound more like 'gs' as in dogs.	Low
'	[ɦ], Like a more forceful 'h' as in hat, or like the ''' in a stereotypical British pronunciation of bottle of water.	Low
sh	[ɕ], Like 'sh' as in shirt (Chinese pinyin 'x').	High
ng	[ŋ], 'ng' as in bong.	Low	Like some of the other sounds here that usually only appear at the end of an English word, this sound can come before a vowel in Tibetan.
ny	[ɲ], Like 'ni' as in Spaniard.	Low
n	[n], Like 'n' as in nice.	Low
m	[m], Like 'm' as in mat.	Low
w	[w], Like 'wh' as in what.	Low
y	[j], Like 'y' as in yak.	Low
s	[s], Like 's' as in sack.	High
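
If you want to play with the table above programmatically, a lookup along these lines might do. This is only a sketch: the names ROOTS and root_info are made up for this page, the entries are regrouped into the traditional rows, and the function only handles a bare root consonant followed by a vowel (no prefix, superscript, or subscript).

```python
# Wylie root consonant -> (IPA, default tone), following the table above.
ROOTS = {
    "k": ("k", "high"), "kh": ("kʰ", "high"), "g": ("k", "low"), "ng": ("ŋ", "low"),
    "c": ("tɕ", "high"), "ch": ("tɕʰ", "high"), "j": ("tɕ", "low"), "ny": ("ɲ", "low"),
    "t": ("t", "high"), "th": ("tʰ", "high"), "d": ("t", "low"), "n": ("n", "low"),
    "p": ("p", "high"), "ph": ("pʰ", "high"), "b": ("p", "low"), "m": ("m", "low"),
    "ts": ("ts", "high"), "tsh": ("tsʰ", "high"), "dz": ("ts", "low"), "w": ("w", "low"),
    "zh": ("ʑ", "low"), "z": ("z", "low"), "'": ("ɦ", "low"),
    "y": ("j", "low"),  # IPA [j], the 'y' of yak
    "r": ("ʐ", "low"), "l": ("l", "low"), "sh": ("ɕ", "high"), "s": ("s", "high"),
    "h": ("h", "high"),
}


def root_info(syllable: str):
    """Return (root, IPA, tone) for a syllable that starts with a bare root.

    Tries the longest spelling first so that 'tsh' wins over 'ts' and 't'.
    """
    for length in (3, 2, 1):
        head = syllable[:length]
        if head in ROOTS:
            ipa, tone = ROOTS[head]
            return head, ipa, tone
    raise ValueError(f"no root consonant found at the start of {syllable!r}")


print(root_info("tsha"))
print(root_info("nga"))
print(root_info("pa"))
```

Matching the longest spelling first matters because several Wylie consonants are written with two or three Latin letters.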

Vowels

Wylie writes out the four orthodox vowels, but Lhasa Tibetan really has a couple more that appear when the vowel appears next to certain other sounds. Other writing systems may use different characters for the changed sounds. Assume pronunciations of English words are in a General American accent.

Wylie	Pronunciation	Notes
a	[a], 'a' as in hat.	Changes to an [ʌ] sound ('u' as in but) when followed by <b> or when in a closed syllable. Changes to [ɛ] ('e' as in bet) and is slightly lengthened when followed by <d>, <n>, <l>, or <s>. Also nasalized (think French honhonhon) when followed by <n>.
i	[i], 'ee' as in beet.
e	[e], 'ai' as in bait (without the English extra vowel at the end).	Changes to an [ɛ] sound ('e' as in bet) when in a closed syllable. Nasalized when followed by <n>.
o	[o], 'o' as in hoe (without the English extra vowel at the end).	Changes to an [ɔ] sound ('o' as in bought) when in a closed syllable (that is, when the ending consonant, or the prefix of the next syllable, is actually pronounced; will get into this later). Changes to [ø] (you're just going to have to look up this one if you don't know IPA) and is slightly lengthened when followed by <d>, <n>, <l>, or <s>. Nasalized when followed by <n>.
u	[u], 'oo' as in boot (without the English extra vowel at the end).	Changes to an [ʌ] sound ('u' as in but) when in a closed syllable. Changes to [y] (look it up chief) and is slightly lengthened when followed by <d>, <n>, <l>, or <s>. Nasalized when followed by <n>.

When followed by the consonants <r>, <l>, or <'>, the vowel is lengthened slightly. <r> and <l> themselves might be pronounced in more formal contexts but are otherwise dropped.
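
Part of the vowel rules above can be written down as a small function. This sketch only covers the changes triggered by the suffixes <d>, <n>, <l>, <s>, <r>, and <'>; the closed-syllable changes are left out on purpose, since they depend on whether the final consonant is actually pronounced. The function name vowel_ipa and the choice to mark length with [ː] and nasalization with a combining tilde are my own conventions.

```python
BASE = {"a": "a", "i": "i", "e": "e", "o": "o", "u": "u"}
FRONTED = {"a": "ɛ", "o": "ø", "u": "y"}        # only a, o, u change quality
FRONTING_SUFFIXES = {"d", "n", "l", "s"}
LENGTHENING_SUFFIXES = FRONTING_SUFFIXES | {"r", "'"}


def vowel_ipa(vowel: str, suffix: str = "") -> str:
    """Rough IPA for a written vowel followed by one written suffix."""
    ipa = BASE[vowel]
    if suffix in FRONTING_SUFFIXES:
        ipa = FRONTED.get(vowel, ipa)   # i and e keep their quality
    if suffix == "n":
        ipa += "\u0303"                 # combining tilde = nasalized
    if suffix in LENGTHENING_SUFFIXES:
        ipa += "ː"                      # slightly lengthened
    return ipa


for vowel, suffix in [("a", "d"), ("u", "s"), ("o", "n"), ("a", "r"), ("i", "")]:
    print(vowel + suffix, "->", vowel_ipa(vowel, suffix))
```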

Subscript consonants

Four consonants can come after the root consonant and before the vowel, <y>, <r>, <w>, and <l>. Each subscript can only appear after certain root consonants, and they can modify the root consonant.
The subscript <w> is never pronounced and never affects the pronunciation of other sounds; treat it as a silent letter.
When <y> follows <k>, <kh>, or <g>, the two sounds are pronounced together, like 'gya' and 'kya'.
However, when <y> follows <p>, <ph>, <b>, or <m>, they are instead pronounced like the following root consonants (also written in Wylie).

Written Wylie	Wylie Pronunciation	Notes
py	c	pya > ca
phy	ch	phya > cha
by	j	bya > ja
my	ny	mya > nya

When <r> follows <n>, <m>, or <s>, the <r> is not pronounced and the root consonant is unchanged. When it follows <h>, combine the two sounds together like you are breathing out the 'r' (ɹ̥).
However, when <r> follows certain consonants, the consonant cluster's pronunciation and tone are completely transformed to high-toned [tr], high-toned [tʰr], or low-toned [tʰr].

Wylie	Pronunciation	Tone	Notes
kr, tr, pr	[tr], like Spanish trabajo. Unaspirated <t> as in Wylie and followed by a trilled <r>.	High	kra / tra / pra > tra, with the tone becoming high-tone regardless of the previous tone of the root consonant.
khr, thr, phr	[tʰr], like Spanish trabajo but aspirated. Aspirated <t> as in Wylie and followed by a trilled <r>.	High	khra / thra / phra > thra (sometimes written trha), with the tone becoming high-tone regardless of the previous tone of the root consonant.
gr, dr, br	[tʰr], like Spanish trabajo but aspirated. Aspirated <t> as in Wylie and followed by a trilled <r>.	Low	gra / dra / bra > thra (sometimes written trha), with the tone becoming low-tone regardless of the previous tone of the root consonant.

When <l> follows <k>, <g>, <b>, <r>, or <s>, the root consonant is not pronounced and is replaced by the [l] sound. The tone of the syllable also becomes high. When <l> follows <z>, the [z] is also subsumed, but instead of an [l], it is pronounced only as [t], with a high tone as well.
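
The subscript rules in this section can be collected into one lookup. Treat this as a sketch under assumptions: the function name resolve_subscript is hypothetical, the result is a Wylie-style respelling (like 'kra > tra' in the table) rather than IPA, and I am assuming the <y> merger applies to the labial roots <p>, <ph>, <b>, and <m> and that the <l> rule includes <b>, which is how the rule is usually described. A tone of None means "whatever the respelled root would normally have".

```python
from typing import Optional, Tuple

# Assumption: the merger applies to the labial roots p, ph, b, m.
Y_MERGERS = {"py": "c", "phy": "ch", "by": "j", "my": "ny"}

R_MERGERS = {}
for roots, result in (
    (("k", "t", "p"), ("tr", "high")),
    (("kh", "th", "ph"), ("thr", "high")),
    (("g", "d", "b"), ("thr", "low")),
):
    for root in roots:
        R_MERGERS[root + "r"] = result


def resolve_subscript(root: str, subscript: str) -> Tuple[str, Optional[str]]:
    """Return (respelling, tone override) for a root plus subscript."""
    cluster = root + subscript
    if subscript == "w":
        return root, None                      # silent letter
    if subscript == "y":
        if root in ("k", "kh", "g"):
            return cluster, None               # pronounced together
        if cluster in Y_MERGERS:
            return Y_MERGERS[cluster], None
    elif subscript == "r":
        if cluster in R_MERGERS:
            return R_MERGERS[cluster]
        if root in ("n", "m", "s"):
            return root, None                  # the r is dropped
        if root == "h":
            return cluster, None               # breathed-out r
    elif subscript == "l":
        if root in ("k", "g", "b", "r", "s"):
            return "l", "high"
        if root == "z":
            return "t", "high"
    raise ValueError(f"no rule on this page for {cluster!r}")


for root, sub in [("p", "y"), ("g", "y"), ("g", "r"), ("s", "r"), ("k", "l"), ("z", "l")]:
    print(root + sub, "->", resolve_subscript(root, sub))
```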

Superscript consonants

There are three superscript consonants, <r>, <l>, and <s>. These come before the root consonant and after a prefix if there is one. The superscript consonants mainly change the tone of the syllable. Otherwise they usually do not affect pronunciation, although, like prefix consonants, they may be pronounced as the coda of the syllable before them (covered more in the coda section).
The main pronunciation change is when the root consonant <h> is preceded by the superscript <l>. The two are pronounced together like an 'l' while breathing out at the same time to make [l̥], as in the name of the Tibetan capital, Lhasa. Otherwise, unless noted below, the pronunciation of a syllable on its own does not change. For example, treat <rka> as just <ka>, <lta> as <ta>, and so on.
When the three superscripts come before <g>, <j>, <d>, <b>, or <dz>, the root consonants become fully voiced and the low tone becomes even lower (see tone section). The superscript sounds themselves are not pronounced. If you have a knowledge of linguistics, you might recognize these letters as the ones that used to be voiced but in modern Lhasa are usually pronounced the same as their unaspirated voiceless counterparts; in this condition, their original voicing returns.
When the three superscripts come before <ng>, <ny>, <n>, or <m> (as in <rnga>), the root consonant takes on a high tone and the entire syllable is nasalized.

Prefix consonants

The prefix consonants can come before the superscript consonant (or just before the root if there is no superscript consonant present). The five possible prefix consonants are <g>, <d>, <b>, <m>, and <'>. Prefixes are never pronounced except potentially as the coda of a previous syllable, but they can change the tone and pronunciation of the root consonant; they do not change the sound unless noted here. As with the superscript consonants, if <g>, <d>, <b>, <m>, or <'> comes before <g>, <j>, <d>, <b>, or <dz>, the root consonant becomes fully voiced and the low tone becomes even lower (see tone section). When a prefix comes before the nasals <ng>, <ny>, <n>, or <m>, the root consonant takes on a high tone and the entire syllable is nasalized.
The only exception is when the prefix <d> comes before the root <b>: the root consonant changes completely to a <w> and takes on a high tone.
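
Since superscripts and prefixes follow nearly the same rules, both can be sketched in one function. Everything named here (Effect, leading_effect, the label "lower-low") is made up for illustration. Two assumptions to flag: the voiced IPA values are simply the voiced counterparts of the base sounds in the root table, and the [w] exception is taken to be prefix <d> plus root <b> (written 'db', as in dbang, usually romanized 'wang').

```python
from typing import NamedTuple, Optional


class Effect(NamedTuple):
    sound: Optional[str]   # new IPA for the root, or None if unchanged
    tone: Optional[str]    # new tone, or None if unchanged
    nasalized: bool        # whether the whole syllable is nasalized


# Voiced counterparts of [k], [tɕ], [t], [p], [ts].
VOICED = {"g": "g", "j": "dʑ", "d": "d", "b": "b", "dz": "dz"}
NASALS = {"ng", "ny", "n", "m"}


def leading_effect(root: str, prefix: str = "", superscript: str = "") -> Effect:
    """Effect of a written prefix and/or superscript on the root consonant."""
    if not prefix and not superscript:
        return Effect(None, None, False)
    if superscript == "l" and root == "h":
        return Effect("l\u0325", None, False)      # voiceless l, as in Lhasa
    # Assumption: the exception is prefix d + root b ('db').
    if prefix == "d" and root == "b" and not superscript:
        return Effect("w", "high", False)
    if root in VOICED:
        return Effect(VOICED[root], "lower-low", False)
    if root in NASALS:
        return Effect(None, "high", True)
    return Effect(None, None, False)               # e.g. rka is just ka


print(leading_effect("g", superscript="s"))   # as in sgam
print(leading_effect("ny", superscript="r"))  # as in rnying
print(leading_effect("k", superscript="r"))   # as in rka
print(leading_effect("b", prefix="d"))        # as in dbang
```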

First suffix consonants

In formal contexts such as reading scripture, suffix consonants are pronounced, but in regular speech, they are not, or they are pronounced differently. The patterns seem to be pretty irregular in colloquial speech, but here are some general rules. The suffix <g> is not pronounced in monosyllabic words, but lengthens the vowel before it. In the first syllable of a word with multiple syllables, the <g> is pronounced. The <g> causes a high-tone syllable to sound high-falling, and a low-tone syllable to sound low-rising-falling.
The suffix <ng> is pronounced in monosyllabic words. In the first syllable of a word with multiple syllables, the <ng> is not pronounced but lengthens and nasalizes the vowel before it.
The suffixes <d>, <n>, <l>, and <s> lengthen the vowel and change it if it is <a>, <o>, or <u> (see Vowel section for more). <n> also nasalizes the vowel before it. They are not pronounced by themselves.
The suffix <m> is generally pronounced, and causes the vowel before it to shorten and change into a schwa (like 'a' in Benjamin).
The suffix <b> is also generally pronounced, and causes the vowel before it to shorten and change into a schwa (like 'a' in Benjamin).
The suffixes <r> and <'> lengthen the previous vowel, and the <r> is sometimes pronounced.
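
Here is a small sketch of the first-suffix rules as a function. The names SuffixEffect and first_suffix_effect are hypothetical, and the function only distinguishes the two cases this page describes (a monosyllabic word versus the first syllable of a longer word); it ignores the formal reading style and the occasional pronounced <r>.

```python
from typing import NamedTuple


class SuffixEffect(NamedTuple):
    pronounced: bool   # is the suffix itself heard?
    lengthens: bool    # does it lengthen the vowel before it?
    nasalizes: bool    # does it nasalize the vowel before it?


def first_suffix_effect(suffix: str, first_of_several: bool) -> SuffixEffect:
    """Colloquial effect of a first suffix, per the general rules above."""
    if suffix == "g":
        if first_of_several:
            return SuffixEffect(True, False, False)
        return SuffixEffect(False, True, False)
    if suffix == "ng":
        if first_of_several:
            return SuffixEffect(False, True, True)
        return SuffixEffect(True, False, False)
    if suffix in ("d", "l", "s"):
        return SuffixEffect(False, True, False)
    if suffix == "n":
        return SuffixEffect(False, True, True)
    if suffix in ("m", "b"):
        return SuffixEffect(True, False, False)   # vowel shortens to a schwa
    if suffix in ("r", "'"):
        return SuffixEffect(False, True, False)   # r is sometimes pronounced
    raise ValueError(f"{suffix!r} is not a first suffix covered here")


print(first_suffix_effect("g", first_of_several=False))
print(first_suffix_effect("g", first_of_several=True))
print(first_suffix_effect("ng", first_of_several=True))
```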

Second suffix consonants

The second suffixes can come after the first suffix consonant and are <s> and <d>. They are never pronounced.

Tone

There are only two tones in Lhasa Tibetan, high and low. The high tone is simply a flat high pitch, while the low tone is a flat low pitch. However, prefixes and suffixes can change the tone, either switching them from low to high or vice versa, or by changing how the tone is pronounced, as noted in other sections. The high tone can change to a falling tone, where it starts out high and falls down. The low tone can change to a low-rising-falling tone, where it starts low, goes slightly up, and then falls back down.
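
As a tiny illustration of the contour change, here is a sketch that only knows about the one trigger spelled out on this page, the suffix <g>. The function name tone_contour is made up, and other triggers would need to be added by hand.

```python
def tone_contour(base_tone: str, first_suffix: str = "") -> str:
    """Return the contour of a syllable from its base tone and first suffix."""
    if base_tone not in ("high", "low"):
        raise ValueError("base_tone must be 'high' or 'low'")
    if first_suffix == "g":
        # The suffix g turns the flat tone into a contour.
        return "high-falling" if base_tone == "high" else "low-rising-falling"
    return base_tone  # plain flat high or flat low


print(tone_contour("high", "g"))  # e.g. thug: root th is high, suffix g
print(tone_contour("low", "g"))
print(tone_contour("low"))
```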

Morphophonological changes

Some grammatical suffixes and words are not written as they are pronounced. The most common is the genitive/possessive marker, -gi/-'i.