Unsupported — We cannot guarantee that this software will work properly on Mac OS 10.8 and above. Please be careful.

Release: Speech Synthesis, v0.3

Release Type: Production
Version: 0.3
Release Notes

Speech Synthesis Patch, v0.3

This update to the Speech Synthesis Patch includes:

[ More info on this patch ]

Attachment                          Size
SpeechSynthesisPatch-0.3.zip        15.36 KB
SpeechSynthesisPatch-0.3-src.zip    55.71 KB

tobyspark
Not leopard compatible?

looking at the release date, this looks like a tiger-era patch. i just tried it in leopard, and the patch created itself on screen, but on trigger no sound came out.

...i used speech synth as part of the hillegass cocoa book, dating from 10.3, so i guess it's something silly that's stopping this from working, as opposed to some huge synth api change.

toby

cwright
speech

Most of our Tiger patches don't work in Leopard out of the box because of some subtle internal API changes in QC (we've since learned how to make plugins more cross-platform friendly, but this, as you said, predates that research).

The reason we haven't updated this patch is the built-in (or maybe sample?) plug-in that comes with QC on Leopard. If there's some obvious difference between the two, or you just think ours is cooler or something, let me know and I can spend a bit of time freshening the plugin for Leopard usage.

tobyspark
of course...

...i forgot it was a leopard example, thanks.

don't waste a second, there are far more glamorous plugins to work on!

toby

scalf
Can't join the reindeer games

I really thought I knew QC better than this.

I downloaded the plugin and source and moved the plugin to exactly the right spot, but it will not work. I tried all the plugins offered on the page, in both 32-bit and 64-bit mode, in the patch folder at both user level and system level. No go...

Every time I load QC, the patch is not listed in the library. I can only get the stock Mac speech synth, which just says the words. I cannot see the one illustrated at the top. I have made sure the folder names are right, and other plugins in the same patch folder work, so it is definitely a working folder.

Did anyone else have this much trouble?

I have worked through my solutions from before and am stuck; I am not sure what could be causing this trouble.

Casey

dust
Re: Can't join the reindeer games

here is source to an official api speech synth that outputs phonemes like this patch.

http://kineme.net/composition/dust/phonemeopcodespeechsynthvideoavatar

you will need to build the plugin for your computer. the example is a talking video. it doesn't have word output just phonemes as that is better for animation.

however you could queue up the opcode output and then compare it to phonemes from your language model.

see the list posted in the thread.

if phoneme opcode "14" is followed by phoneme opcode "2" then that means the word "bat" has been spoken. this type of phoneme language model is how you do speech recognition as well.

the basic idea: "kineme" is composed of these phonetic symbols: "k IH N EH m EH EY". these symbols all have an opcode or number, so your interpreter or recognizer basically waits for a k followed by an IH followed by an N, etc.
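
As a rough illustration of the matching idea described above, here is a minimal Python sketch. It assumes the plugin's phoneme output arrives one symbol at a time; the `PhonemeRecognizer` class, the `recognize` helper, and the one-word model are hypothetical names made up for this example and are not part of any Kineme or Apple API. Only the "kineme" spelling is taken from the post. Integer opcodes could be queued instead of symbols, since the matcher only tests equality.

```python
from collections import deque

# Language model: word -> phoneme sequence. The "kineme" entry uses the
# spelling quoted in the post; everything else here is illustrative.
LANGUAGE_MODEL = {
    "kineme": ("k", "IH", "N", "EH", "m", "EH", "EY"),
}


class PhonemeRecognizer:
    """Queues incoming phonemes and reports a word when its sequence completes."""

    def __init__(self, model):
        self.model = model
        # Only the most recent phonemes matter: keep as many as the longest word.
        longest = max(len(seq) for seq in model.values())
        self.recent = deque(maxlen=longest)

    def feed(self, phoneme):
        """Add one phoneme; return the word it completes, or None."""
        self.recent.append(phoneme)
        tail = tuple(self.recent)
        for word, seq in self.model.items():
            # A word is recognized when the queue ends with its full sequence.
            if tail[-len(seq):] == seq:
                return word
        return None


def recognize(stream, model=LANGUAGE_MODEL):
    """Run a whole phoneme stream through the recognizer, collecting words."""
    recognizer = PhonemeRecognizer(model)
    return [w for w in map(recognizer.feed, stream) if w is not None]
```

Feeding the seven symbols of "kineme" in order yields the word on the last symbol; a partial sequence yields nothing. A real recognizer would also need to handle timing and mispronunciation, which this sketch ignores.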

scalf
Re: Can't join the reindeer games

Thanks Dust, that clarified the opcode a bit, which is rather useful I must say. Definitely got that one to work.

With that said, I was hoping for a bit of clarification on the Kineme SpeechSynth patch. Does it automatically convert the phonemes to words? If you had any idea, that would be great.

Also, do I need to build Kineme's Speech patch for my computer? I assume you mean go into Xcode and run/build? Are those all of the operations necessary?

Casey

dust
Re: join the reindeer games, with spoken word plugin

Not sure what system you're running. I tried to build this speech plugin but it didn't show up. It seems to be a Tiger-era patch, and that's way before my time. I'm thinking you would have to download the latest skanky sdk and then convert this plugin to be Snow Leopard compatible.

You are just looking for the words output?

No, the speech synth doesn't convert phonemes to words. It takes text input in the form of words, then speaks and generates the phoneme output. Speech recog converts phonemes to words though ;).

I'm thinking the word output is just the word being spoken. To me the opcode is more interesting, as I already know the words because I typed them in. I do however suppose it would be useful to know when a particular word is being spoken, to then conditionally do something based on that info.

I can add word output to the official sdk phoneme plugin I made. I'm thinking that may be a lot easier than trying to refactor this patch. Shouldn't really be any trouble at all just got to add an output and query the speech synth manager.

(edit) i had a look at the kineme patch here and i don't think it would function the way you want, meaning it is returning a number value for the words output much like opcode output.

however the speech manager does provide a delegate method for spoken words that i have implemented into a plugin so no worries it is possible. it will now output the word as a string right before it is synthesized. not sure how you want to use this but you can now visualize as text. ;)
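
The post does not name the delegate method. In Cocoa, the `NSSpeechSynthesizer` delegate callback `speechSynthesizer:willSpeakWord:ofString:` reports the upcoming word as a character range into the full string, so a plugin along these lines presumably slices the word out of the text before publishing it. The Python below only models that slicing step under that assumption; `word_from_range` and `SpokenWordOutput` are hypothetical names for illustration, and a real implementation would live in the Objective-C plugin (where the range counts UTF-16 units rather than Python characters).

```python
def word_from_range(text, location, length):
    """Return the substring of `text` covered by (location, length)."""
    if location < 0 or length < 0 or location + length > len(text):
        raise ValueError("range falls outside the text")
    return text[location:location + length]


class SpokenWordOutput:
    """Stand-in for a plugin output port that holds the word being spoken."""

    def __init__(self):
        self.current_word = ""

    def will_speak_word(self, location, length, text):
        # Mirrors the shape of the delegate callback: a range plus the string.
        self.current_word = word_from_range(text, location, length)
        return self.current_word
```

For the text "hello world again", a callback with location 6 and length 5 would set the output to "world" just before that word is synthesized.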

see attachment for plugin source, plugin, example etc. you may have to rebuild it for your machine. quit quartz composer and open the src for the plugin in Xcode 4. then just build/run. it will install the plugin for you and open qc to test ;)

(end edit)

Attachment               Size
spokenWordsPlugin.zip    49.46 KB

chiara
Re: Release: Speech Synthesis, v0.3

Hi there,

Is there any news regarding a Leopard version of this patch? Any suggestions for similar patches to use?

Thanks in advance, Chiara

monobrau
Re: Release: Speech Synthesis, v0.3

I'm still using the Xcode 3.1 sample plugin for speech synthesis. It works on SL (32-bit mode) and should work on Leopard as well. I've enclosed the source and build.

Attachment             Size
SpeechSynthesis.zip    2.39 MB

gtoledo3
Re: Release: Speech Synthesis, v0.3

Here's a universal build... it should work in 32-bit/64-bit, Leopard or SL. I've only tested it in SL 64-bit though.

Attachment                    Size
SpeechSynthesis.plugin.zip    8.17 KB

cybero
Re: Release: Speech Synthesis, v0.3

Works in both 32-bit and 64-bit.

Coupled that up with your OpenCL LERP Morph setup, GT, and the Kineme Audio Tools Input patch. :-)

& a GL Tools friendly version [for usefuldesign.au and other Leopard users]

Attachment         Size
WigWamBamCL.qtz    168.63 KB
WigWamBamGL.qtz    82.62 KB

chiara
Re: Release: Speech Synthesis, v0.3

Hi Cybero, do you know how to record voice (from the audio input, i.e. the internal mic)? And how to visualize the words/phrases as output (for example on a billboard or sprite)? I've tried the speech synthesis and recognition patches, but they don't seem to work so well. The recognition patch crashes :-( I work on Leopard. Thanks in advance for your suggestions. Best, Chiara

cybero
Re: Release: Speech Synthesis, v0.3

Quartz Composer is built to visualize input data, in this case audio, and it has built-in patch support for audio input. I prefer to use Kineme Audio Tools to capture audio and push the data into the graphic patch setup to be visualized.

If the audio is auto-generated by Speech Synthesis, or input via a mic, then I think SoundFlower provides a means of recording it: route the audio through SoundFlower and record with QuickTime Player or Audacity.

The setup, in this case with a track running in iTunes: use an Audio Tools Audio Input patch to provide the audio data, set the System Sound Preferences to SoundFlower, and set the recording preferences in QuickTime Player to SoundFlower. Then play the music, visualize in Quartz Composer, and record the audio in QuickTime Player.

Funny thing is, the composition visualizes, and the track records and plays back with excellent audio quality. It works with SpeechSynthesis too, though a little clipped; I'm still working to resolve that. It even records when I arrange matters so that the track plays, to all apparent intent, quite mute: it still visualizes and still records.

Putting this through a dedicated Audio Recording program is going to work better than what I've done, but with straight audio file playthrough with input and output set to SoundFlower, the QuickTime Player method has gleaned good results.

chiara
Re: Release: Speech Synthesis, v0.3

Thank you very much for your suggestions. I'll try QuickTime to record and SoundFlower to pipe back the sound.

best regards, Chiara

cybero
Re: Release: Speech Synthesis, v0.3

You could also try Audacity or GarageBand.

chiara
Re: Release: Speech Synthesis, v0.3

I'm doing some tests but... no good results yet. What about AU Lab? We have it by default on OS X. I've attached an image showing that the audio input, with spectrum etc., doesn't visualize the voice but numbers from 0 to 9, because it is a 'string'.

Attachment        Size
Immagine 1.png    301.27 KB

dust
Re: Release: Speech Synthesis, v0.3

I think you're a bit confused. The audio input patches provide you with spectrum data, so you can create visualizations based on peak amplitude or audio spectrums. These are provided in a mimetic format so you can hook the spectrum up to most things in qc. If you want to visualize the voice as words or spoken words, you will need the speech recognition patch, or you may have to make one of your own that suits your needs.

Basically, with the kineme recognition patch you add your words as a language model. When a word is recognized, it gives you a Boolean truth value letting you know that it has detected your word in the language model. You would then use this Boolean truth value to trigger a synth patch to speak the word, and also to trigger a sprite with a string to image patch on it that has the word from your model already loaded up to display the word.
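
The trigger logic described above can be sketched in Python as a rough model. It assumes the recognition patch exposes one Boolean per word in the language model, sampled once per frame; the `WordTrigger` class is a hypothetical name for this example, and firing only on the False-to-True transition is an assumption added here so that a word is spoken and displayed once per detection rather than on every frame the flag stays high.

```python
class WordTrigger:
    """Fires once each time a word's recognition flag goes from False to True."""

    def __init__(self, words):
        # Remember the previous flag for every word in the language model.
        self.previous = {word: False for word in words}

    def update(self, flags):
        """Take this frame's {word: bool} flags; return the words that just fired."""
        fired = []
        for word in self.previous:
            now = bool(flags.get(word, False))
            if now and not self.previous[word]:
                fired.append(word)  # rising edge: speak and display this word
            self.previous[word] = now
        return fired
```

Each word returned by `update` would then be sent both to the synth patch and to the string input of the image patch on the sprite.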

chiara
Re: Release: Speech Synthesis, v0.3

Thank you for the reply, Dust.

Yes, with the kineme speech recognition patch you have to WRITE the words in order to create a model for the recognition. Is there no way to bypass this model in Q.C.? I think not... only by using Processing with the VOICE library. Anyway, as you can see from the composition I've attached, it doesn't work, even though the multiplexer is Boolean. I have problems with Speakable Items, which, even when activated, doesn't recognize the words I pronounce. Thanks again, C.

Attachment    Size
voice.qtz     3.5 KB

Installation Instructions

Place the plugin file in
/Users/[you]/Library/Graphics/Quartz Composer Patches/
(Create the folder if it doesn't already exist.)
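
For anyone who prefers to script this step, here is a minimal Python sketch of the same instructions. The `install_plugin` function and its `home` parameter are hypothetical helpers written for this example; only the folder path comes from the instructions above. It assumes Python 3.8 or later, and it handles both a single file and a `.plugin` bundle, which is a directory on disk.

```python
import shutil
from pathlib import Path

# Folder named in the installation instructions, relative to the home directory.
PATCH_DIR = Path("Library") / "Graphics" / "Quartz Composer Patches"


def install_plugin(plugin, home=None):
    """Copy `plugin` into the user's Quartz Composer Patches folder."""
    plugin = Path(plugin)
    home = Path(home) if home is not None else Path.home()
    target_dir = home / PATCH_DIR
    # Create the folder if it doesn't already exist.
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / plugin.name
    if plugin.is_dir():
        # Bundles are directories, so copy the whole tree.
        shutil.copytree(plugin, destination, dirs_exist_ok=True)
    else:
        shutil.copy2(plugin, destination)
    return destination
```

Quartz Composer would need to be restarted afterwards before the patch can appear in the library.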