Wilson Mar

Make your Pi say things it doesn’t mean

Overview

This page describes ways to get your Raspberry Pi to turn text into speech you can hear on a speaker.

Cloud-based services leverage powerful servers to provide the most natural-sounding speech synthesis.

But programs can also run locally, on board the Pi itself.


Programs that run on-board the Pi to output voice include Festival and its derivative Flite.

Festival

Festival is written by The Centre for Speech Technology Research at the University of Edinburgh (UK). It offers a framework for building speech synthesis systems. It offers full text to speech through a number of APIs: from shell level, via a command interpreter, as a C++ library, from Java, and an Emacs editor interface.

Festival is multi-lingual (currently British English, American English, and Spanish). Other groups work to release new languages for the system. Festival is in the package manager for the Raspberry Pi, making it very easy to install.

  1. Install

    sudo apt-get install festival -y

    The -y flag skips the prompt asking you to confirm 19.2 MB of disk space usage.

  2. Identify voices from

    https://packages.debian.org/jessie/festival-voice

    NOTE: A 16 kHz sample rate is clearer than 8 kHz, but requires more disk space and takes up more CPU.

  3. Install a voice file specifically for processing by Festival on Debian:

    sudo apt-get install festvox-rablpc16k -y

    This “British English male speaker” voice takes 9 MB.

    PROTIP: Only one voice is needed.

  4. Send from command line:

    echo 'Hello Wilson!' | festival --tts

    NOTE: There may be some “electrical” sound behind a robot talking quickly.

  5. Use a Chrome browser to see the documentation and on-line demo:

    http://www.cstr.ed.ac.uk/projects/festival/

    The Firefox browser needs a plug-in to be installed.

  6. Python code to invoke TTS from text in a variable and in a file:

    import subprocess

    # Speak text held in a variable by piping it to Festival's standard input.
    text = 'Hello world'
    subprocess.run(['festival', '--tts'], input=text.encode(), check=True)

    # Write text to a file, then have Festival read the file aloud.
    text = 'You are listening to text to speech synthesis using the Festival package from the University of Edinburgh in the UK.'
    filename = 'hello'
    with open(filename, 'w') as f:
        f.write(text)
    subprocess.run(['festival', '--tts', filename], check=True)
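When the speech should be saved rather than played right away, the text2wave script that ships with Festival can render a text file to a WAV file. The sketch below is a minimal example, assuming festival and text2wave are on the PATH. The helper names (festival_say_command, text2wave_command, speak) are made up for illustration and are not part of Festival.

```python
import shutil
import subprocess


def festival_say_command():
    """Return the argv that makes Festival read text from standard input."""
    return ["festival", "--tts"]


def text2wave_command(text_file, wav_file):
    """Return the argv that renders a text file to a WAV file with text2wave."""
    return ["text2wave", text_file, "-o", wav_file]


def speak(text):
    """Speak text aloud; return False when Festival is not installed."""
    if shutil.which("festival") is None:
        return False
    subprocess.run(festival_say_command(), input=text.encode("utf-8"), check=True)
    return True
```

Calling speak("Hello from the Pi") plays the sentence, and subprocess.run(text2wave_command("hello", "hello.wav"), check=True) should write the audio to hello.wav instead. Building the command as a list avoids shell quoting problems when the text contains quotes or exclamation marks.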
    
    
    
    

Flite

Flite is a lighter version of Festival built specifically for embedded systems. It runs faster than Festival because it doesn’t have Festival’s complex scripting language or phoneme handling.
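Flite can be driven from Python in much the same way as Festival. This is a rough sketch, assuming the flite binary is installed (on Raspbian the package is likely also named flite) and that the built-in "slt" voice is available. The function names are hypothetical.

```python
import shutil
import subprocess


def flite_command(text, voice=None, wav_file=None):
    """Build a flite argv; with no wav_file, flite plays through the audio device."""
    cmd = ["flite"]
    if voice:
        cmd += ["-voice", voice]  # e.g. "slt"; "flite -lv" lists the voices
    cmd += ["-t", text]
    if wav_file:
        cmd += ["-o", wav_file]   # write a WAV file instead of playing
    return cmd


def flite_say(text, **options):
    """Run flite if it is installed; return False otherwise."""
    if shutil.which("flite") is None:
        return False
    subprocess.run(flite_command(text, **options), check=True)
    return True
```

For example, flite_say("Hello Wilson", voice="slt") speaks immediately, while passing wav_file="hello.wav" saves the audio for later playback.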

eSpeak

http://espeak.sourceforge.net/

Amazon Polly

https://aws.amazon.com/polly/ is a service that uses advanced deep learning technologies to synthesize speech across 24 languages. It emits sounds using 47 lifelike human voices.

Type text, select a language and voice, then click “Listen to speech” at
https://console.aws.amazon.com/polly/home/SynthesizeSpeech

The default English, US has more voices than other languages:

  • Ivy sounds like a young female
  • Justin sounds like a young male
  • Joey

I like hearing British Amy, who has a more breathy voice than British Emma.

Select “English, Indian” and the Raveena voice speaks with an Indian accent.

Edit the SSML to vary the sound, or upload a whole lexicon, which can hold up to 4,000 characters and 1,000 rules.
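To give a feel for what edited SSML looks like, here is a small sketch that wraps plain text in a prosody tag (to change the speaking rate) and an optional break tag (to add a pause). The build_ssml helper and its parameters are my own invention for illustration. The tags themselves are standard SSML that Polly is documented to accept.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, rate="medium", pause_ms=0):
    """Wrap plain text in SSML, escaping characters that are special in XML."""
    body = "<prosody rate={}>{}</prosody>".format(quoteattr(rate), escape(text))
    if pause_ms > 0:
        # Add a pause after the spoken text.
        body += '<break time="{}ms"/>'.format(int(pause_ms))
    return "<speak>{}</speak>".format(body)
```

The resulting string can be pasted into the SSML tab of the console, or sent through the SDK with the text type set to SSML. Escaping matters because a bare ampersand or angle bracket in the text would make the SSML invalid.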

What amazed me is that English text is translated before being spoken.

Payment is by the number of characters converted to speech. Sound files reused do not incur a cost. Sounds can be saved in MP3, OGG, and PCM formats (at 8,000, 16,000, and 22,050 Hz).
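Because charges are per character converted and reused sound files cost nothing, it can pay to keep each generated file and only call Polly for text not yet spoken. The following is a minimal sketch of that idea, assuming the boto3 SDK is installed and AWS credentials are configured. The cache directory name and the helper names are hypothetical.

```python
import hashlib
import os


def cache_path(text, voice_id, cache_dir="polly_cache"):
    """Derive a stable file name from the voice and the text."""
    digest = hashlib.sha256((voice_id + "\n" + text).encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".mp3")


def synthesize_cached(polly, text, voice_id="Amy", cache_dir="polly_cache"):
    """Return the path of an MP3 for text, calling Polly only on a cache miss."""
    path = cache_path(text, voice_id, cache_dir)
    if not os.path.exists(path):
        response = polly.synthesize_speech(
            Text=text, OutputFormat="mp3", VoiceId=voice_id, SampleRate="22050"
        )
        os.makedirs(cache_dir, exist_ok=True)
        with open(path, "wb") as f:
            f.write(response["AudioStream"].read())
    return path
```

Typical use would be polly = boto3.client("polly") followed by synthesize_cached(polly, "Hello Wilson"). The returned MP3 can then be played on the Pi with whatever player is installed.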

Use AWS Lambda to generate pre-signed Polly URLs based on events from the AWS IoT rules engine, then use Device Gateway to send these URLs to your IoT devices to allow them to request lifelike speech.
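A Lambda function for this might look roughly like the sketch below. It leans on the generic generate_presigned_url method that boto3 clients expose. Whether the resulting GET URL is accepted by Polly in your SDK version is something to verify, and the "message" key in the event is purely an assumption about what the IoT rule sends.

```python
def presigned_speech_url(polly, text, voice_id="Amy", expires_in=300):
    """Ask the SDK client for a time-limited URL that streams the spoken text."""
    return polly.generate_presigned_url(
        "synthesize_speech",
        Params={"Text": text, "OutputFormat": "mp3", "VoiceId": voice_id},
        ExpiresIn=expires_in,
        HttpMethod="GET",
    )


def lambda_handler(event, context):
    """Hypothetical handler: the 'message' key in the event is an assumption."""
    import boto3  # imported here so the helper above can be tested without AWS

    url = presigned_speech_url(boto3.client("polly"), event["message"])
    return {"speech_url": url}
```

The device then only needs to fetch the URL and play the MP3 it returns, so no AWS credentials have to live on the device itself.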

https://portal.aws.amazon.com/gp/aws/developer/registration/index.html

Polly is one of Amazon’s Artificial Intelligence services, which also include Lex (to build chatbots), Rekognition (to recognize objects and scenes), and Machine Learning.

Others

  • AT&T

  • IBM Watson

    https://dzone.com/articles/integrating-watson-text-to-speech-into-an-android

  • Google

  • Microsoft


More on IoT

This is one of a series on IoT:

  1. IoT Acronymns and Abbreviations

  2. IoT Apprentice school curriculum
  3. IoT use cases
  4. IoT reminders prevent dead mobile battery
  5. IoT ceiling dumper

  6. IoT text to speech synthesis
  7. IoT AWS button
  8. Intel IoT
  9. IoT Raspberry hardware
  10. IoT Raspberry installation

  11. IoT Clouds
  12. Samsung IoT Cloud

  13. Predix basics
  14. Predix installation
  15. Predix services
  16. Predix programming