Talkey¶
Simple Text-To-Speech (TTS) interface library with multi-language and multi-engine support.
Getting Started¶
Installation¶
Install from PyPI:
pip install talkey
Engines supported¶
By default it will try to locate and use the local instances of the following TTS engines:
- Flite
- SVOX Pico
- Festival
- eSpeak
- mbrola via eSpeak
Installing one or more of those engines should allow the library to function and generate speech.
It also supports the following networked TTS engines:
- MaryTTS (needs hosting)
- Google TTS (cloud hosted), which requires:
pip install gTTS
Usage¶
In the simplest case:
import talkey
tts = talkey.Talkey()
tts.say('Old McDonald had a farm')
If you get a talkey.base.TTSError: No supported languages error, it means that you don't have a supported TTS engine installed. See Installing TTS engines below.
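The error described above can be caught so that a script reports the problem instead of crashing. This is a minimal sketch: say_or_report is a hypothetical helper name, and the lazy import is only there so the helper also copes with talkey itself being absent.

```python
def say_or_report(text):
    """Try to speak text; return None on success or a message on failure.

    say_or_report is a hypothetical helper, not part of talkey.
    """
    try:
        import talkey  # imported lazily so a missing package is reported too
    except ImportError as exc:
        return "talkey is not installed: %s" % exc
    try:
        tts = talkey.Talkey()
        tts.say(text)
    except talkey.TTSError as exc:
        # Raised, for example, when no supported TTS engine is installed
        return "TTS failed: %s" % exc
    return None
```

Calling say_or_report('Old McDonald had a farm') returns None when speech was produced and a short message otherwise.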
Simple configuration¶
For best results you should configure it:
import talkey
tts = talkey.Talkey(
    preferred_languages=['en', 'af', 'el', 'fr'],
    espeak={
        'languages': {
            'en': {
                'voice': 'english-mb-en1',
                'words_per_minute': 130
            },
        }
    })
tts.say('Old McDonald had a farm')
Installing TTS engines¶
Ubuntu/Debian:¶
For festival:
sudo apt-get install festival
For flite:
sudo apt-get install flite
For SVOX Pico:
sudo apt-get install libttspico-utils
For eSpeak:
sudo apt-get install espeak
For mbrola and the en1 voice (as an example; there are many other mbrola- packages):
sudo apt-get install mbrola-en1
Windows:¶
Install eSpeak:
Go to http://espeak.sourceforge.net/download.html and download and install setup_espeak-<version>.exe
For mbrola and its voices:
Go to http://espeak.sourceforge.net/mbrola.html, download and install MbrolaTools<version>.exe, and follow the directions to install voices from http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html
For Google TTS:
- Install the Python package gTTS.
- Download ffmpeg from http://ffmpeg.zeranoe.com/builds/
- Extract it with 7-Zip and add the bin folder to the PATH, e.g. extract to C:\ffmpeg and add C:\ffmpeg\bin to the PATH.
(If running ffmpeg in cmd.exe prints its information, it is working correctly.)
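After installing engines on any platform, a quick way to see which executables are reachable is to look them up on the PATH from Python. The sketch below uses the executable names mentioned in this documentation; find_executables is a hypothetical helper, and an engine installed outside the PATH will show up as missing here even if it is configured explicitly.

```python
import shutil

# Executable names taken from this documentation
EXECUTABLES = ['espeak', 'mbrola', 'festival', 'flite', 'pico2wave', 'ffmpeg']


def find_executables(names=EXECUTABLES):
    """Map each executable name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == '__main__':
    for name, path in find_executables().items():
        print('%-10s %s' % (name, path or 'not found on PATH'))
```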
Usage¶
talkey module:¶
talkey.enumerate_engines()¶
Returns list of engine SLUGs in order of preference
talkey.create_engine(engine, options=None, defaults=None)¶
Creates an instance of an engine. There is a two-stage instantiation process with engines.
options: The keyword options to instantiate the engine class
defaults: The default configuration for the engine (options often depend on the instantiated TTS engine)
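The two functions above can be combined to find out which engines are usable on the current machine. This is a sketch under assumptions: list_available_engines is a hypothetical helper, each engine is created with its default options, and the lazy import simply returns an empty list when talkey is not installed.

```python
def list_available_engines():
    """Return the SLUGs of engines that report themselves as available.

    list_available_engines is a hypothetical helper, not part of talkey.
    """
    try:
        import talkey
    except ImportError:
        return []  # talkey itself is missing
    available = []
    for slug in talkey.enumerate_engines():
        try:
            # options and defaults are left out, so engine defaults apply
            engine = talkey.create_engine(slug)
        except talkey.TTSError:
            continue  # engine could not be created or configured
        if engine.is_available():
            available.append(slug)
    return available
```

Passing options={...} or defaults={...} to create_engine would supply the initialization options and default voice options documented under Engine options and Voice options.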
class talkey.Talkey(preferred_languages=None, preferred_factor=80.0, engine_preference=None, **config)¶
Manages engines and allows multi-lingual say()
preferred_languages
- A list of languages that are weighted in preference. This is a weighting to assist the detection of language by classify().
preferred_factor
- The weighting factor to prefer the preferred_languages list. A higher number skews detection further towards the preferred languages.
engine_preference
- Specify preferred engines in order of preference.
**config
- Engine-specific configuration, e.g.:
# Key is the engine SLUG, in this case ``espeak``
espeak={
    # Specify the engine options:
    'options': {
        'enabled': True,
    },
    # Specify some default voice options
    'defaults': {
        'words_per_minute': 150,
        'variant': 'f4',
    },
    # Here you specify language-specific voice options
    # e.g. for english we prefer the mbrola en1 voice
    'languages': {
        'en': {
            'voice': 'english-mb-en1',
            'words_per_minute': 130
        },
    }
}
classify(txt)¶
Classifies text by language. Uses preferred_languages weighting.
get_engine_for_lang(lang)¶
Determines the preferred engine/voice for a language.
say(txt, lang=None)¶
Says the text. If lang is None, then uses classify() to detect language.
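To make the effect of preferred_languages and preferred_factor concrete, here is a toy model of preference weighting. It is not Talkey's implementation: the additive boost, the pick_language helper and the raw scores are all assumptions chosen for illustration, and classify() may combine scores differently.

```python
def pick_language(scores, preferred_languages, preferred_factor=80.0):
    """Return the best-scoring language after boosting preferred ones.

    scores: dict mapping language code -> raw detector score (higher is
    better). The additive boost is an illustrative assumption only.
    """
    preferred = set(preferred_languages or [])
    weighted = {
        lang: score + (preferred_factor if lang in preferred else 0.0)
        for lang, score in scores.items()
    }
    return max(weighted, key=weighted.get)


# Hypothetical raw scores for a short, ambiguous phrase
raw = {'nl': -40.0, 'af': -55.0, 'de': -90.0}
```

With these made-up scores, no preference selects 'nl', preferring 'af' with the default factor selects 'af', and a small factor such as 10 is not enough to overturn the raw ranking.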
exception talkey.TTSError(error, valid_set=None)¶
The exception that Talkey will throw if any error occurs.
Engine options:¶
espeak:¶
class talkey.engines.EspeakTTS(**_options)¶
Uses the eSpeak speech synthesizer.
Requires espeak and optionally mbrola to be available.
Initialization options:
enabled
- Is enabled?
  type: bool default: True
espeak
- eSpeak executable path
  type: exec default: ['espeak', 'c:\Program Files\eSpeak\command_line\espeak.exe']
mbrola
- mbrola executable path
  type: exec default: mbrola
mbrola_voices
- mbrola voices path
  type: str default: /usr/share/mbrola
passable_only
- Only allow languages of passable quality, as per http://espeak.sourceforge.net/languages.html
  type: bool default: True
festival:¶
class talkey.engines.FestivalTTS(**_options)¶
Uses the festival speech synthesizer.
Requires festival to be available.
Initialization options:
enabled
- Is enabled?
  type: bool default: True
festival
- Festival executable path
  type: str default: festival
flite:¶
class talkey.engines.FliteTTS(**_options)¶
Uses the flite speech synthesizer.
Requires flite to be available.
Initialization options:
enabled
- Is enabled?
  type: bool default: True
flite
- FLite executable path
  type: str default: flite
pico:¶
class talkey.engines.PicoTTS(**_options)¶
Uses the svox-pico-tts speech synthesizer.
Requires pico2wave to be available.
Initialization options:
enabled
- Is enabled?
  type: bool default: True
pico2wave
- pico2wave executable path
  type: str default: pico2wave
mary:¶
class talkey.engines.MaryTTS(**_options)¶
Uses the MARY Text-to-Speech System (MaryTTS). MaryTTS is an open-source, multilingual Text-to-Speech Synthesis platform written in Java. Please specify your own server instead of using the demonstration server (http://mary.dfki.de:59125/) to save bandwidth and to protect your privacy.
Initialization options:
enabled
- Is enabled?
  type: bool default: False
host
- Mary server address
  type: str default: 127.0.0.1
port
- Mary server port
  type: int default: 59125 min: 1 max: 65535
scheme
- HTTP scheme
  type: enum default: http values: http, https
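Because MaryTTS is disabled by default, it has to be switched on through its engine options. The sketch below builds such a configuration and checks it against the ranges documented above. mary_config is a hypothetical helper, and using mary as the configuration key assumes that this section heading is the engine SLUG.

```python
def mary_config(host='127.0.0.1', port=59125, scheme='http'):
    """Build an engine configuration dict that enables MaryTTS.

    mary_config is a hypothetical helper; the limits mirror the
    documented option table (port 1..65535, scheme http or https).
    """
    if not isinstance(port, int) or not 1 <= port <= 65535:
        raise ValueError('port must be an integer between 1 and 65535')
    if scheme not in ('http', 'https'):
        raise ValueError("scheme must be 'http' or 'https'")
    return {
        'options': {
            'enabled': True,  # MaryTTS is disabled unless enabled here
            'host': host,
            'port': port,
            'scheme': scheme,
        },
    }
```

With talkey installed, the result could then be passed as talkey.Talkey(mary=mary_config(host='192.168.0.10')), where the address stands in for your own server.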
Voice options:¶
generic:¶
language
- Language of voice
voice
- Specific voice to use
espeak:¶
Config options:
pitch_adjustment
- pitch_adjustment option
  type: int default: 50 min: 0 max: 99
variant
- variant option
  type: enum default: m3 values: , croak, f1, f2, f3, f4, f5, klatt, klatt2, klatt3, klatt4, m1, m2, m3, m4, m5, m6, m7, whisper, whisperf
words_per_minute
- words_per_minute option
  type: int default: 150 min: 80 max: 450
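The ranges above can be checked before a configuration is handed to Talkey. This is a small sketch, not part of the library: check_espeak_voice_options is a hypothetical helper that only knows the three options listed in this section, and the empty string in the variant set reflects the empty first entry in the documented values.

```python
ESPEAK_VARIANTS = {
    '', 'croak', 'f1', 'f2', 'f3', 'f4', 'f5',
    'klatt', 'klatt2', 'klatt3', 'klatt4',
    'm1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7',
    'whisper', 'whisperf',
}

# (min, max) limits copied from the documented option table
INT_RANGES = {
    'pitch_adjustment': (0, 99),
    'words_per_minute': (80, 450),
}


def check_espeak_voice_options(options):
    """Return a list of problems found; an empty list means all is fine."""
    problems = []
    for name, value in options.items():
        if name in INT_RANGES:
            low, high = INT_RANGES[name]
            if not isinstance(value, int) or not low <= value <= high:
                problems.append('%s must be an int in %d..%d' % (name, low, high))
        elif name == 'variant':
            if value not in ESPEAK_VARIANTS:
                problems.append('unknown variant %r' % (value,))
    return problems
```

For example, {'words_per_minute': 130, 'variant': 'f4'} passes, while a pitch_adjustment of 120 is reported as out of range.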
TTS Engines¶
Engine interface¶
class talkey.base.AbstractTTSEngine(**_options)¶
Generic parent class for all speakers
SLUG = None¶
The SLUG is used to identify the engine as text
classmethod _get_init_options()¶
AbstractMethod: Returns dict of engine options
_get_languages()¶
AbstractMethod: Returns dict of supported languages and voices
_get_options()¶
AbstractMethod: Returns dict of voice options
_is_available()¶
AbstractMethod: Boolean indicating whether the engine is available
_say(phrase, language, voice, voiceinfo, options)¶
AbstractMethod: Lets the engine actually say the phrase
Phrase: The text phrase to say
Language: The requested language
Voice: The requested voice
Voiceinfo: Data about the requested voice
Options: Extra options
configure(**_options)¶
Sets language-specific configuration.
Raises TTSError on error.
configure_default(**_options)¶
Sets default configuration.
Raises TTSError on error.
classmethod get_init_options()¶
Returns a dict describing the engine options.
Uses cls._get_init_options()
get_languages()¶
Returns dict of supported languages and voices.
Raises TTSError if not available.
get_options()¶
Returns dict of voice options.
Raises TTSError if not available.
is_available()¶
Boolean indicating whether the engine is available.
Checks if enabled, can output audio and self._is_available()
play(filename, translate=False)¶
Plays the sounds.
Filename: The input file name
Translate: If True, it runs it through audioread which will translate from common compression formats to raw WAV.
say(phrase, **_options)¶
Says the phrase, optionally allowing any voice options to be selected or overridden.
Creating your own engine¶
Subclass talkey.base.AbstractTTSEngine, and provide the abstract methods:
from talkey.base import AbstractTTSEngine


class SampleTTS(AbstractTTSEngine):
    SLUG = "sample"

    @classmethod
    def _get_init_options(cls):
        # Engine options
        return {
            'enabled': {
                'description': 'Disabled by default',
                'type': 'bool',
                'default': False,
            },
        }

    def _is_available(self):
        # Checks for engine availability/readiness
        return True

    def _get_options(self):
        # Same format as _get_init_options
        # These are the voice options
        return {
            'mooing': {
                'description': 'Cows sound effect',
                'type': 'bool',
                'default': False,
            },
        }

    def _get_languages(self):
        # Dict of languages containing voices
        return {
            'en': {
                'default': 'english',
                'voices': {
                    'english': {
                        # Any extra options describing this voice
                        # (for private use)
                    },
                    'cowlish': {
                        # Any extra options describing this voice
                        # (for private use)
                    }
                }
            },
            # ... (further languages)
        }

    def _say(self, phrase, language, voice, voiceinfo, options):
        # Actually run the phrase through the TTS Engine.
        # All parameters will always be provided for you
        ...