Information required to create TTS language module

Written by Sergiiy
Updated 1 year ago

The following rules must be defined for the target language before the TTS language module can be created:

  • Rules for number pronunciation.
  • Rules to pronounce ordinal forms of numbers.
  • Rules to pronounce a sequence of digits.
  • Rules to pronounce a time interval.
  • Rules for datetime pronunciation.

Rules for number pronunciation

This is the most difficult part of the text synthesizer. The rules for forming numbers vary widely between languages. The purpose of these rules is to convert a number into a sequence of prompts that represents the number audibly. For example, the number 12345 is converted in English into the phrase "twelve thousand, three hundred forty five".

It is impossible to give a universal set of required prompts that is valid for all languages. Some languages have grammatical cases and other modifications of number words. As an example, here is the set of prompts required for English; keep in mind that the set can be completely different for the target language.

one two three four five six seven eight nine ten eleven twelve thirteen fourteen fifteen sixteen seventeen eighteen nineteen twenty thirty forty fifty sixty seventy eighty ninety hundred thousand million minus "more than one billion" "less than minus one billion"
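As an illustration only, the English conversion could be sketched roughly as follows. The function names are hypothetical, the "zero" prompt is an assumption (it is not in the list above), and treating every value of one billion or more as out of range is also an assumption.

```python
# Sketch: convert an integer into a list of English prompt names.
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def below_thousand(n):
    """Prompts for 0..999; returns an empty list for 0."""
    prompts = []
    if n >= 100:
        prompts += [ONES[n // 100], "hundred"]
        n %= 100
    if n >= 20:
        prompts.append(TENS[n // 10])
        n %= 10
    if n > 0:
        prompts.append(ONES[n])
    return prompts


def number_to_prompts(n):
    """Hypothetical helper: integer -> list of prompt names."""
    # Assumption: anything at or beyond one billion uses the catch-all prompts.
    if n >= 1_000_000_000:
        return ["more than one billion"]
    if n <= -1_000_000_000:
        return ["less than minus one billion"]
    if n == 0:
        return ["zero"]  # assumption: "zero" is not in the prompt list above
    prompts = []
    if n < 0:
        prompts.append("minus")
        n = -n
    for value, name in ((1_000_000, "million"), (1_000, "thousand")):
        if n >= value:
            prompts += below_thousand(n // value) + [name]
            n %= value
    prompts += below_thousand(n)
    return prompts


print(" ".join(number_to_prompts(12345)))
```

A language with cases or gender agreement would need more than a lookup table like this, which is why the prompt set has to be specified per language.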

Rules to pronounce ordinal forms of numbers

This feature enables the TTS subsystem to pronounce ordinals such as "first", "twenty seventh", etc. Some languages can have cases and other modifications of the ordinals.

The following additional words in English are required for this feature to work:

first second third fourth

and so on.
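For English, one possible approach is to take the cardinal prompts and replace only the last word with its ordinal form. The sketch below assumes that approach; the function names and the irregular-form table are illustrative, not part of any existing module.

```python
# Sketch: turn a list of English cardinal prompts into ordinal prompts
# by converting only the final word ("twenty seven" -> "twenty seventh").
IRREGULAR = {
    "one": "first", "two": "second", "three": "third", "five": "fifth",
    "eight": "eighth", "nine": "ninth", "twelve": "twelfth",
}


def ordinal_word(word):
    """Ordinal form of a single English number word."""
    if word in IRREGULAR:
        return IRREGULAR[word]
    if word.endswith("y"):          # twenty -> twentieth
        return word[:-1] + "ieth"
    return word + "th"              # four -> fourth, hundred -> hundredth


def to_ordinal_prompts(cardinal_prompts):
    """Hypothetical helper: cardinal prompt list -> ordinal prompt list."""
    if not cardinal_prompts:
        return []
    return cardinal_prompts[:-1] + [ordinal_word(cardinal_prompts[-1])]


print(" ".join(to_ordinal_prompts(["twenty", "seven"])))
```

Each distinct ordinal word produced this way ("first", "twentieth", "hundredth" and so on) corresponds to an additional prompt that has to be provided.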

Rules to pronounce a sequence of digits

This is usually one of the simplest rules, but English, for example, uses a trick here: in a sequence of digits, zero is pronounced "oh" instead of "zero", and two zeros in a row are pronounced "double oh". The sequence "1 0 2" therefore sounds like "one oh two". Language-dependent rules of this kind have to be specified.
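The English rule described above might be sketched as follows. The function name is hypothetical, and the handling of three or more zeros in a row (pairing them from the left) is an assumption.

```python
# Sketch: pronounce a digit sequence the English way, using "oh" for a
# single zero and "double oh" for two zeros in a row.
DIGIT_PROMPTS = ["oh", "one", "two", "three", "four", "five", "six",
                 "seven", "eight", "nine"]


def digit_sequence_to_prompts(text):
    """Hypothetical helper: string of digits -> list of prompt names."""
    # Keep only ASCII digits, so "1 0 2" and "102" behave the same.
    digits = [ch for ch in text if ch in "0123456789"]
    prompts = []
    i = 0
    while i < len(digits):
        # Assumption: zeros are paired greedily from the left.
        if digits[i] == "0" and i + 1 < len(digits) and digits[i + 1] == "0":
            prompts.append("double oh")
            i += 2
        else:
            prompts.append(DIGIT_PROMPTS[int(digits[i])])
            i += 1
    return prompts


print(" ".join(digit_sequence_to_prompts("1 0 2")))
```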

Rules to pronounce a time interval

In English, an interval is pronounced, for example, as "one hour twenty four minutes and thirty seconds". For English, this feature requires the following words:

hour hours minute minutes second seconds

Please note that plural forms do not need to be translated if the target language has no plural forms. At the same time, all forms must be specified for languages that have one or more plural forms.
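A minimal sketch of the English interval rule, including the choice between singular and plural forms, could look like this. The function names are hypothetical; placing "and" before the last component, skipping zero components, and saying "zero seconds" for an empty interval are assumptions based on the example phrase.

```python
# Sketch: pronounce a time interval given in seconds (hours limited to 0..99).
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def small_number(n):
    """Prompts for 0..99."""
    if n < 20:
        return [ONES[n]]
    return [TENS[n // 10]] + ([ONES[n % 10]] if n % 10 else [])


def interval_to_prompts(total_seconds):
    """Hypothetical helper: interval in seconds -> list of prompt names."""
    if total_seconds < 0 or total_seconds >= 100 * 3600:
        raise ValueError("interval outside the range covered by this sketch")
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    units = ((hours, "hour", "hours"),
             (minutes, "minute", "minutes"),
             (seconds, "second", "seconds"))
    # English has two forms: singular for exactly one, plural otherwise.
    parts = [small_number(value) + [singular if value == 1 else plural]
             for value, singular, plural in units if value]
    if not parts:
        return ["zero", "seconds"]  # assumption for an empty interval
    prompts = []
    for index, part in enumerate(parts):
        if len(parts) > 1 and index == len(parts) - 1:
            prompts.append("and")   # assumption: "and" precedes the last part
        prompts += part
    return prompts


print(" ".join(interval_to_prompts(1 * 3600 + 24 * 60 + 30)))
```

For a language with more than two number forms, the singular/plural choice would be replaced by that language's own selection rule, with one prompt per form.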

Rules for datetime pronunciation

In English, a date and time are pronounced like this: "October twenty first nineteen eighty seven two fifty five PM". This feature usually requires at least the words for the months:

January February March April May June July August September October November December

It may also require the days of the week and the following words:

today yesterday tomorrow 
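
The English date and time pattern from the example above can be sketched as shown here. This is a simplified illustration: the function names are hypothetical, the "AM" prompt and the "oh" form for minutes 01 to 09 are assumptions, and only years whose last two digits are 10 or greater are handled.

```python
# Sketch: pronounce a date and time as
# "<month> <ordinal day> <year as two pairs> <hour> <minutes> AM/PM".
from datetime import datetime

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
IRREGULAR = {"one": "first", "two": "second", "three": "third",
             "five": "fifth", "eight": "eighth", "nine": "ninth",
             "twelve": "twelfth"}


def two_digits(n):
    """Prompts for 0..99."""
    if n < 20:
        return [ONES[n]]
    return [TENS[n // 10]] + ([ONES[n % 10]] if n % 10 else [])


def ordinal(n):
    """Ordinal prompts for 1..99, e.g. 21 -> ["twenty", "first"]."""
    words = two_digits(n)
    last = words[-1]
    if last in IRREGULAR:
        last = IRREGULAR[last]
    elif last.endswith("y"):
        last = last[:-1] + "ieth"
    else:
        last += "th"
    return words[:-1] + [last]


def datetime_to_prompts(dt):
    """Hypothetical helper: datetime -> list of prompt names."""
    century, rest = divmod(dt.year, 100)
    # Years such as 1900 or 2005 need extra rules not covered here.
    if not (10 <= century <= 99 and rest >= 10):
        raise ValueError("year form not covered by this sketch")
    prompts = [MONTHS[dt.month - 1]] + ordinal(dt.day)
    prompts += two_digits(century) + two_digits(rest)
    prompts += two_digits(dt.hour % 12 or 12)   # 12-hour clock
    if 0 < dt.minute < 10:
        prompts += ["oh", ONES[dt.minute]]      # assumption: "two oh five"
    elif dt.minute >= 10:
        prompts += two_digits(dt.minute)
    prompts.append("AM" if dt.hour < 12 else "PM")
    return prompts


print(" ".join(datetime_to_prompts(datetime(1987, 10, 21, 14, 55))))
```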