Automatic Phonemic Transcriber

Danish version 1.3. New: in SAMPA, non-syllabic vowels are now marked with the postfixed diacritic '^' (u^, i^, 6^). The transcriber now distinguishes between the full vowel Q and the unstressed central vowel 6. Performance has been improved. Support for exceptions.

This free version allows only 40 word transcriptions per submission. A license is required for unrestricted access.


An automated phonetic/phonemic transcriber supporting English, German, and Danish. It outputs transcriptions in the International Phonetic Alphabet (IPA) or in the SAMPA alphabet, which was designed for speech recognition technology.

Note: The German transcriber has been trained on both word forms in cases where the spelling was changed by the orthography reform of 1996 (e.g. both "muss" and "muß", "Riss" and "Riß", etc.).

Performance statistics for German Transcriber

Updated 20.10.2020

German corpus (unique word forms): 22016

Training/test split   Words correct   Words incorrect   Correct   PHER* = (S+D+I)/N
100/100               21814           202               99.08 %   (37+139+37)/190698 = 0.11 %
90/10                 1713            488               77.83 %   (72+522+76)/19345 = 3.46 %
80/20                 3413            990               77.52 %   (144+1080+150)/38374 = 3.58 %

*) The common performance metric for an automated phonemic transcriber is the Phoneme Error Rate (PHER), defined as (S+D+I)/N, where
S is the number of phoneme substitutions,
D is the number of phoneme deletions,
I is the number of phoneme insertions,
N is the total number of phonemes in the reference (test set).
Valid PHER results presuppose that the transcriber is tested on words that do not occur in the training corpus. The table above shows the performance when 10 % or 20 % of the corpus is reserved for testing. When the full corpus is used for both training and testing (100/100), PHER instead indicates the limits of the modelling technique used.
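
The PHER definition above can be illustrated with a short Python sketch. It is not the evaluation code behind the figures in the table; it assumes that S, D and I are counted along a minimum-edit (Levenshtein) alignment between each reference transcription and the transcriber's output, which is the usual way such counts are obtained. The function names count_edits and pher are made up for this example.

```python
def count_edits(reference, hypothesis):
    """Return (S, D, I) along one minimum-edit alignment of two phoneme lists."""
    n, m = len(reference), len(hypothesis)
    # table[i][j] = (total_edits, S, D, I) for reference[:i] vs hypothesis[:j]
    table = [[None] * (m + 1) for _ in range(n + 1)]
    table[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        table[i][0] = (i, 0, i, 0)  # every reference phoneme deleted
    for j in range(1, m + 1):
        table[0][j] = (j, 0, 0, j)  # every hypothesis phoneme inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, ins = table[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                diagonal = (t, s, d, ins)          # match, no cost
            else:
                diagonal = (t + 1, s + 1, d, ins)  # substitution
            t, s, d, ins = table[i - 1][j]
            deletion = (t + 1, s, d + 1, ins)
            t, s, d, ins = table[i][j - 1]
            insertion = (t + 1, s, d, ins + 1)
            # Tuples compare on total_edits first, so min() picks a cheapest path.
            table[i][j] = min(diagonal, deletion, insertion)
    _, s, d, ins = table[n][m]
    return s, d, ins


def pher(pairs):
    """Compute (S+D+I)/N over (reference, hypothesis) phoneme-list pairs."""
    total_s = total_d = total_i = total_n = 0
    for reference, hypothesis in pairs:
        s, d, ins = count_edits(reference, hypothesis)
        total_s += s
        total_d += d
        total_i += ins
        total_n += len(reference)  # N counts reference phonemes only
    return (total_s + total_d + total_i) / total_n


# Arithmetic check against the 100/100 row reported above.
print(f"{100 * (37 + 139 + 37) / 190698:.2f} %")
```

Because N counts only the reference phonemes, insertions can in principle push PHER above 100 % on very poor output.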


The transcription tool is based on a decision tree derived from a training lexicon (a list of orthographic forms and their phonemic counterparts). It does not look up words in a lexicon; instead, it transcribes according to the general rules it "learned" from the training lexicon. More specifically, the decision tree is machine-generated code that decides how a grapheme should be transcribed phonemically given its left and right context. It is generated by a program that first aligns the graphemes and phonemes of the training lexicon using an Expectation–Maximization algorithm and then builds the tree structure from those alignments [1].
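
The idea of transcribing each grapheme from its left and right context can be sketched in Python as follows. This is not the tool's generated code: the sketch assumes the alignment step has already been done (one phoneme or an empty string per grapheme), and it replaces the decision tree with a much simpler stand-in, a lookup of the widest context seen in training with backoff to narrower contexts. The three-word lexicon, its simplified SAMPA-like symbols, and the names contexts, train and transcribe are illustrative assumptions only.

```python
from collections import Counter, defaultdict

PAD = "#"  # word-boundary symbol used to pad the context window


def contexts(word, i, max_width):
    """Yield (width, left, grapheme, right) keys, widest context first."""
    padded = PAD * max_width + word + PAD * max_width
    k = i + max_width
    for width in range(max_width, -1, -1):
        yield width, padded[k - width:k], padded[k], padded[k + 1:k + 1 + width]


def train(aligned_lexicon, max_width=2):
    """Count which phoneme each grapheme maps to in every context width."""
    model = defaultdict(Counter)
    for word, phonemes in aligned_lexicon:
        for i, phoneme in enumerate(phonemes):
            for key in contexts(word, i, max_width):
                model[key][phoneme] += 1
    return model


def transcribe(model, word, max_width=2):
    """Pick, per grapheme, the most frequent phoneme of the widest known context."""
    output = []
    for i in range(len(word)):
        for key in contexts(word, i, max_width):
            if key in model:
                output.append(model[key].most_common(1)[0][0])
                break
        else:
            output.append("?")  # grapheme never seen in training
    return [p for p in output if p]  # drop empty (silent) alignments


# Hypothetical, hand-aligned toy lexicon; "" marks a grapheme with no phoneme.
TOY_LEXICON = [
    ("sonne", ["z", "O", "n", "", "@"]),
    ("haus", ["h", "a", "U", "s"]),
    ("rose", ["r", "o:", "z", "@"]),
]

model = train(TOY_LEXICON)
print(transcribe(model, "sonne"))
print(transcribe(model, "hase"))  # unseen word, transcribed via backoff
```

A real decision tree generalizes better than this lookup because it learns which context positions matter for each grapheme, rather than requiring an exact match on the whole window.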

The transcription tool is not error-free. For "normal" native words it mostly produces correct results; for words of foreign origin, some proper names, abbreviations etc., however, it often fails. Other systems may be mainly lexicon-based and resort to machine-generated transcriptions only when a word is not found in the lexicon. Since version 1.2, the present system has used exception lists: small "lexica" of words (typically of foreign origin) that cannot be transcribed properly even if they are included in the training lexicon.
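
How an exception list might sit in front of the rule-based transcriber can be sketched in a few lines of Python. This is an assumption about the general design, not the tool's actual implementation: the function name transcribe_with_exceptions, the single SAMPA entry, and the dummy fallback function are all hypothetical.

```python
def transcribe_with_exceptions(word, exceptions, fallback):
    """Return the listed transcription if present, else use the fallback."""
    key = word.lower()
    if key in exceptions:
        return exceptions[key]
    return fallback(word)


# Hypothetical exception list with one loanword in SAMPA.
EXCEPTIONS = {"team": "ti:m"}


def dummy_rule_based(word):
    """Placeholder for the decision-tree transcriber (not implemented here)."""
    return f"<rules:{word}>"


print(transcribe_with_exceptions("Team", EXCEPTIONS, dummy_rule_based))
print(transcribe_with_exceptions("Haus", EXCEPTIONS, dummy_rule_based))
```

In contrast to the mainly lexicon-based systems mentioned above, the list here stays small: it only overrides the words the learned rules are known to get wrong.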

[1] The data-driven, predictive model is suited only to languages with alphabetic orthographies (where one grapheme largely corresponds to one phoneme). This excludes languages like Chinese (with a syllable-based orthography) and Hebrew (with a consonantal orthography). Moreover, even among languages with alphabetic orthographies, the problem of mapping graphemic symbols to phonemic ones is not equally complex. There are extremely "easy" languages like Turkish, where the problem can largely be solved by substituting orthographic symbols with phonemic ones without considering the context. And there are "difficult" languages like Danish, where certain historical sound changes (weakening of plosives, lowering of vowels in certain contexts, etc.) have resulted in a complex relation between orthography and pronunciation.