Danish Version 1.3. New: in SAMPA, non-syllabic vowels are now marked with the postfixed diacritic '^' (u^, i^, 6^). The transcriber now distinguishes between the full vowel Q and the unstressed central vowel 6. Performance has been improved. Support for exceptions.
An automated phonetic/phonemic transcriber supporting English, German, and Danish. It outputs transcriptions in the International Phonetic Alphabet (IPA) or in SAMPA, an alphabet designed for speech recognition technology.
Performance statistics for Danish Transcriber
Updated 20.10.2020
DANISH CORPUS (unique word forms): 38977
Phoneme Error Rate (PHER) = (S+D+I)/N*
| Training/test percentage of corpus | Words correct | Words incorrect | Words correct (%) | S | D | I | N | PHER |
| 100/100 | 37556 | 1421 | 96.35 % | 298 | 1106 | 212 | 290496 | 0.56 % |
| 90/10 | 2911 | 986 | 74.70 % | 163 | 1056 | 128 | 29006 | 4.64 % |
| 80/20 | 5838 | 1957 | 74.89 % | 314 | 2121 | 286 | 58053 | 4.69 % |
*) The common performance metric for an automated phonemic transcriber is the Phoneme Error Rate (PHER),
defined as (S+D+I)/N, where
S is the number of phoneme substitutions,
D is the number of phoneme deletions,
I is the number of phoneme insertions,
N is the total number of phonemes in the reference (test set).
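The definition above can be made concrete with a small sketch. It assumes that S, D and I are counted from a minimum-edit (Levenshtein) alignment between each reference transcription and the transcriber's output; the page does not say how the counts were obtained, so that choice is an assumption. The function names `edit_counts` and `pher` and the toy phoneme strings are hypothetical.

```python
def edit_counts(ref, hyp):
    """Return (S, D, I) for a minimum-edit alignment of hyp against ref.

    Assumption: every substitution, deletion and insertion costs 1.
    """
    n, m = len(ref), len(hyp)
    # table[i][j] = (total edits, S, D, I) for ref[:i] against hyp[:j]
    table = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        table[i][0] = (i, 0, i, 0)          # only deletions
    for j in range(1, m + 1):
        table[0][j] = (j, 0, 0, j)          # only insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, k = table[i - 1][j - 1]
            mismatch = ref[i - 1] != hyp[j - 1]
            diag = (t + mismatch, s + mismatch, d, k)   # match or substitution
            t, s, d, k = table[i - 1][j]
            delete = (t + 1, s, d + 1, k)               # reference phoneme missing
            t, s, d, k = table[i][j - 1]
            insert = (t + 1, s, d, k + 1)               # extra phoneme in output
            table[i][j] = min(diag, delete, insert, key=lambda c: c[0])
    return table[n][m][1:]


def pher(pairs):
    """PHER = (S+D+I)/N over (reference, hypothesis) phoneme-sequence pairs."""
    s = d = i = n = 0
    for ref, hyp in pairs:
        ds, dd, di = edit_counts(ref, hyp)
        s, d, i, n = s + ds, d + dd, i + di, n + len(ref)
    return (s + d + i) / n


# Toy, made-up phoneme sequences: one substitution and one deletion in 6 phonemes.
toy_pairs = [(list("kat"), list("ket")), (list("hus"), list("hu"))]
print(pher(toy_pairs))

# The 100/100 row of the table, recomputed from its S, D, I and N values.
print(round(100 * (298 + 1106 + 212) / 290496, 2))  # 0.56
```

When several alignments have the same total cost, the split between S, D and I can differ; this sketch prefers the diagonal (match or substitution) step on ties.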
Valid PHER results presuppose that the transcriber is tested on words that do not occur in the training corpus.
The table above shows the performance when 10 % or 20 % of the corpus is reserved for testing.
When the full corpus is used for both training and testing (100/100), PHER instead indicates the limits of the modelling technique used.
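A held-out split of this kind might be drawn as in the sketch below. How the actual split was made is not stated on this page, so the random shuffle, the seed and the function name `split_lexicon` are assumptions. The floor division does reproduce the test-set sizes implied by the table: the 90/10 row totals 2911 + 986 = 3897 words and the 80/20 row totals 5838 + 1957 = 7795 words.

```python
import random


def split_lexicon(entries, test_percent, seed=0):
    """Split a lexicon of unique word forms into disjoint (train, test) lists.

    Assumption: entries are unique, so no test word can also appear in training.
    """
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)      # reproducible shuffle
    n_test = len(shuffled) * test_percent // 100
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in lexicon: integers instead of real (word, transcription) entries.
train, test = split_lexicon(range(38977), test_percent=10)
print(len(train), len(test))  # 35080 3897
```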
The transcription tool is based on a decision tree derived from a training lexicon (a list of orthographic forms and their phonemic counterparts). It does not look words up in a lexicon but transcribes them according to the general rules it "learned" from the training lexicon. More specifically, the decision tree is machine-generated code that decides how a grapheme should be transcribed phonemically given its left and right context. It was generated by a program that first aligns the graphemes and phonemes of the training lexicon using an Expectation–Maximization algorithm and then builds the tree structure from those alignments [1].
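To illustrate what "deciding on a grapheme given its left and right context" can look like, here is a minimal sketch. It is not the generated tree used by the tool: the context width, the padding symbol `#`, the function names and the single hand-written rule are all assumptions made for illustration, and the sketch skips the alignment and learning steps entirely.

```python
def context_windows(word, width=2):
    """Yield (left, grapheme, right) for each letter, padding edges with '#'."""
    padded = "#" * width + word + "#" * width
    for i in range(width, width + len(word)):
        yield padded[i - width:i], padded[i], padded[i + 1:i + 1 + width]


def toy_tree(left, grapheme, right):
    """Hand-written stand-in for a learned tree; the rule is illustrative only."""
    if grapheme == "d":
        # Question 1: is the preceding letter 'n'?
        if left.endswith("n"):
            # Question 2: is this the last letter of the word?
            if right.startswith("#"):
                return ""          # emit nothing for this grapheme
        return "d"
    return grapheme                # default leaf: copy the letter unchanged


def transcribe(word):
    """Apply the toy tree to every grapheme and join the outputs."""
    return "".join(toy_tree(*window) for window in context_windows(word))


for window in context_windows("mand"):
    print(window)
print(transcribe("mand"))
```

A learned tree would ask many more questions per grapheme and would emit real phoneme symbols; the point here is only that each decision sees one letter plus a fixed window of neighbours.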
The transcription tool is not error-free. For "normal" native words it mostly produces correct results; for words of foreign origin, some proper names, abbreviations, etc., however, it often fails. Other systems may be mainly lexicon-based and only resort to machine-generated transcriptions when words are not found in the lexicon. Since version 1.2, the present system utilizes exception lists, small "lexica" with words (typically of foreign origin) that cannot be transcribed properly even if they are included in the training lexicon.
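One plausible way to combine an exception list with a rule-based transcriber is to consult the list first and fall back to the rules otherwise. The sketch below assumes exactly that order; the page does not describe how the exception lists are actually consulted, and the function name, the case-insensitive lookup and the placeholder entries are hypothetical.

```python
def transcribe_with_exceptions(word, exceptions, fallback):
    """Return the listed transcription if the word is an exception,
    otherwise whatever the rule-based fallback produces."""
    key = word.lower()             # assumption: lookup ignores case
    if key in exceptions:
        return exceptions[key]
    return fallback(word)


# Placeholder data: the values stand in for real transcriptions.
exceptions = {"weekend": "<listed transcription>"}


def rule_based(word):
    """Stand-in for the decision-tree transcriber."""
    return "<rule output for %s>" % word


print(transcribe_with_exceptions("Weekend", exceptions, rule_based))
print(transcribe_with_exceptions("hus", exceptions, rule_based))
```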
[1] The data-driven, predictive model is suited only for languages with alphabetic orthographies (where one grapheme largely corresponds to one phoneme). This excludes languages like Chinese (with a syllable-based orthography) and Hebrew (consonantal orthography). Moreover, among languages with alphabetic orthographies, the problem of mapping graphemic symbols to phonemic ones is not equally complex. There are extremely "easy" languages like Turkish, where the problem can largely be solved simply by substituting orthographic symbols with phonemic ones without considering the context. And there are "difficult" languages like Danish, where certain historical sound changes (weakening of plosives, lowering of vowels in certain contexts, etc.) have resulted in a complex relation between orthography and pronunciation.
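For an "easy" orthography, the context-free substitution described above amounts to a per-letter table lookup. The sketch below uses a small, partial and approximate mapping from Turkish letters to IPA symbols; the mapping is an assumption for illustration, it leaves out letters whose value depends on context (such as ğ), and it is not part of this tool, which supports only English, German and Danish.

```python
import unicodedata

# Partial, approximate letter-to-IPA table (assumption, for illustration only).
# Letters not listed are copied unchanged.
LETTER_TO_IPA = {
    "ç": "tʃ",
    "ş": "ʃ",
    "c": "dʒ",
    "j": "ʒ",
    "y": "j",
    "ı": "ɯ",
    "ü": "y",
}


def substitute(word):
    """Map each letter independently, without looking at its neighbours."""
    word = unicodedata.normalize("NFC", word)   # use precomposed letters
    return "".join(LETTER_TO_IPA.get(ch, ch) for ch in word)


print(substitute("çay"))
print(substitute("cam"))
```

Each input letter is looked up exactly once, so the output of one rule is never fed into another. The same approach breaks down for Danish, where the value of a letter depends on its neighbours.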