SI-PRON Pronunciation Lexicon: A New Language Resource for Slovenian.
Informatica 2006, Dec, 30, 4
Publisher Description
We present the efforts involved in designing SI-PRON, a comprehensive machine-readable pronunciation lexicon for Slovenian. It has been built from two sources and contains all the lemmas from the Dictionary of Standard Slovenian (SSKJ), the most frequent inflected word forms found in contemporary Slovenian texts, and a first pass of inflected word forms derived from SSKJ lemmas. The lexicon file contains the orthography, corresponding pronunciations, lemmas and morphosyntactic descriptors of lexical entries in a format based on requirements defined by the W3C Voice Browser Activity. The current version of the SI-PRON pronunciation lexicon contains over 1.4 million lexical entries. The paper describes the word list determination procedure, the generation and validation of phonetic transcriptions, and the lexicon format. Along with Onomastica, SI-PRON is a valuable language resource for linguistic studies and for research on speech technologies for Slovenian. The lexicon is already being used by the Proteus Slovenian text-to-speech synthesis system and for generating audio samples of the SSKJ headwords.
Summary: The article describes a new language resource for Slovenian, the SI-PRON pronunciation lexicon.