The Studio of Eric Valosin

Saturday, October 15, 2016

Processing and FreeTTS speech synthesizer - voice, pitch, tempo, volume control

This summer I participated in an experimental sound art exhibition, and in my quest to create a computer-vision-based text-to-speech sketch in Processing, I discovered that there are really no good ready-made options out there.

The FreeTTS Java library is more than sufficient in theory, but I could not find a Processing-friendly wrapper that actually takes advantage of everything FreeTTS has to offer.

Luckily, deep in the recesses of the Processing forum, the user Kof offered this "messy class called Basnik" in response to another user baffled by the same problems.

Kudos to Kof and his brilliant tutorial. He said to "modify as you need," so I did.
I'm not a polished full-stack programmer by any stretch, but this should get the job done.

Here's a cleaned-up, beefed-up version of his Basnik class that allows for dynamic control of the voice, pitch, volume, tempo, range, transposition, and more.

  1. Follow his advice and download the FreeTTS speech synthesizer.

  2. Put all the .jar files you download inside a folder called "code" (lowercase, which matters on case-sensitive file systems) within your Processing sketch's folder.

  3. Open a separate tab in your Processing sketch and paste in this class:
  4. /* THANKS TO KOF FOR SHARING HIS "BASNIK" CLASS AND TUTORIAL SKETCH ON THE PROCESSING FORUM:
    https://processing.org/discourse/beta/num_1204335245.html
    REVISIONS MADE BY ERIC VALOSIN. FEEL FREE TO USE AND ADAPT.
    REVISIONS:
    - I put the Basnik (which, in case you were wondering, is Slovak for "poet") class in a separate tab to tidy up the sketch
    - separate instances of the class can be threaded together for polyphonic effects
    - added missing freeTTS controls for volume and rate.
    - moved setPitch, setPitchRange, setPitchShift, and setRate controls out of the constructor's setup phase
    and made them custom functions so each instance of the class can be dynamically and individually controlled
    in the draw phase by calling a function (ex: basnik1.setPitch(300); basnik2.setPitch(150);)
    - I know freeTTS allows for output speaker pan control with speakLeft() and speakRight() functions - I could not
    get these to work here. Maybe you can.
    */
    // CLASS CONSTRUCTOR /////////////////////////////////////////////////////////////////////////////////////////////////
    public class Basnik {
      String voiceName = "kevin16";
      VoiceManager voiceManager;
      Voice voice;
      Basnik(String name) {
        voiceName = name;
        this.setup();
      }
      void listAllVoices() {
        System.out.println();
        System.out.println("All voices available:");
        VoiceManager voiceManager = VoiceManager.getInstance();
        Voice[] voices = voiceManager.getVoices();
        for (int i = 0; i < voices.length; i++) {
          System.out.println(" " + voices[i].getName()
            + " (" + voices[i].getDomain() + " domain)");
        }
      }
      void setup() {
        listAllVoices();
        System.out.println();
        System.out.println("Using voice: " + voiceName);
        voiceManager = VoiceManager.getInstance();
        voice = voiceManager.getVoice(voiceName);
        // Check for a missing voice before calling anything on it (getVoice returns null for an unknown name)
        if (voice == null) {
          System.err.println(
            "Cannot find a voice named "
            + voiceName + ". Please specify a different voice.");
          System.exit(1);
        }
        // I haven't gotten setStyle to make any difference, but maybe it depends on the voice used
        voice.setStyle("casual"); //"business", "casual", "robotic", "breathy"
        // (The other control functions like setPitch were previously listed here. See below)
        voice.allocate();
      }
      // CUSTOM FUNCTIONS to allow instances to have their own pitch, range, tempo, volume control//////////////////////////
      // (FreeTTS takes floats for all of these, so fractional values like 261.626 are fine)
      void setPitch(float p) { // root pitch in Hz. (261.626 = Middle C)
        voice.setPitch(p);
      }
      void setPitchShift(float s) { // transposition. Same scale as setPitch
        voice.setPitchShift(s);
      }
      void setRange(float r) { // range of pitches within a phrase. 1 = monotone, 30 = roughly natural
        voice.setPitchRange(r);
      }
      void setVolume(float v) { // output volume. 0 = mute, 1 = full volume (.5 is still barely audible on computer speakers)
        voice.setVolume(v);
      }
      void setTempo(float t) { // rate of speech in words per minute. 150 = natural default.
        voice.setRate(t); // (Listed as "setSpeakingRate()" in the FreeTTS API documentation)
      }
      // PLAYBACK FUNCTIONS ////////////////////////////////////////////////////////////////////////////////////////////////////
      void say(String _a) {
        if (_a == null) {
          _a = "nothing";
        }
        voice.speak(_a);
      }
      ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
      void exit() {
        voice.deallocate();
      }
    }
  5. In your main code, use this as a basic template:
  6. import com.sun.speech.freetts.Voice;
    import com.sun.speech.freetts.VoiceManager;
    import com.sun.speech.freetts.audio.JavaClipAudioPlayer;
    Basnik voice1;
    Basnik voice2;
    void setup() {
      size(640, 480);
      voice1 = new Basnik("kevin16");
      voice1.setRange(1);
      voice2 = new Basnik("kevin16");
      voice2.setRange(1);
    }
    void draw() {
      // call functions to set the voice controls. Use variables to set them dynamically
      voice1.setPitch(220);
      voice2.setPitch(330);
      voice1.setVolume(1);
      voice2.setVolume(1);
      voice1.setTempo(150);
      voice2.setTempo(150);
      // speech
      voice1.say("this is voice 1");
      voice2.say("this is voice 2");
      // harmony
      thread("voice2"); //second voice threaded to get both voices to speak at once
      voice1.say("this is harmony");
    }
    void voice2() {
      // delay(50); //may or may not need a small delay to get the voices in sync
      voice2.say("this is harmony");
    }
  7. Create as many instances of Basnik as you want. You can thread them together to create polyphonic effects, use variables for dynamic control of the voices, and iterate through string arrays ("String[]") of words so you can change those controls mid-sentence.
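Here's a rough sketch of that last idea, changing a control mid-sentence by speaking one word at a time. It's plain Java so it runs without FreeTTS installed; the class name WordByWord and its two helper methods are my own illustrative names, not part of FreeTTS or Kof's class. It only works out the per-word pitch, and the comments show where the Basnik calls would go in a sketch.

```java
public class WordByWord {
  // Splits a sentence on whitespace so each word can be spoken separately.
  public static String[] splitWords(String sentence) {
    String trimmed = sentence.trim();
    if (trimmed.isEmpty()) {
      return new String[0];
    }
    return trimmed.split("\\s+");
  }

  // Linearly interpolates a pitch in Hz for word i of n, from startHz to endHz.
  public static float pitchForWord(int i, int n, float startHz, float endHz) {
    if (n <= 1) {
      return startHz;
    }
    return startHz + (endHz - startHz) * i / (n - 1);
  }

  public static void main(String[] args) {
    String[] words = splitWords("this voice slides upward as it speaks");
    for (int i = 0; i < words.length; i++) {
      float hz = pitchForWord(i, words.length, 220f, 330f);
      // In the Processing sketch, this is where the Basnik instance would be used:
      //   voice1.setPitch(Math.round(hz));
      //   voice1.say(words[i]);
      System.out.println(words[i] + " -> " + hz + " Hz");
    }
  }
}
```

Speaking word by word will likely sound choppier than a single say() call, since each word is synthesized as its own phrase, so treat this as a starting point for experimenting.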

I hope you find this useful. Feel free to use and adapt this and make it even better.
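In that spirit, one small adaptation to try: since setPitch takes a root pitch in Hz, a helper that converts note numbers to Hz makes it easier to think in musical intervals. This is a minimal sketch in plain Java using the standard equal-temperament formula with MIDI note 69 as A at 440 Hz; the class and method names (NoteToHz, midiToHz) are my own and not part of FreeTTS.

```java
public class NoteToHz {
  // Equal-tempered frequency in Hz for a MIDI note number (note 69 = A at 440 Hz).
  public static float midiToHz(int midiNote) {
    return (float) (440.0 * Math.pow(2.0, (midiNote - 69) / 12.0));
  }

  public static void main(String[] args) {
    // Note 60 is middle C (roughly 261.63 Hz).
    System.out.println("Middle C: " + midiToHz(60));
    // Notes 57 and 64 are a perfect fifth apart, close to the 220 and 330 used in the template.
    System.out.println("A below middle C: " + midiToHz(57));
    System.out.println("E above middle C: " + midiToHz(64));
    // In the Processing sketch, the result would be passed along like this:
    //   voice1.setPitch(Math.round(midiToHz(57)));
    //   voice2.setPitch(Math.round(midiToHz(64)));
  }
}
```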
