"Waveform" Structure.. what does that data represente

mrpackethead's picture

I'm just looking at the output of the Kineme Audio File patch, in particular the 'Waveform' structure.

What does the data in that structure represent?

cwright's picture
meanings

The structure is arranged as a bunch of channels, which are arranged as a bunch of samples.

Channels typically mean "Left" and "Right" (though they're numbered 0-n, to support arbitrary numbers of channels), so the substructures are named "channel00" - "channelxx".

Inside each "channelxx" structure, there are 512 numbers, 0 - 511. These represent sample data in order (sample 0 is the first sample, sample 1 is the next, sample 511 is the last).
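To make that layout concrete, here is a minimal Python sketch of the same shape: a set of "channelNN" entries, each holding 512 samples in order. It is only a stand-in, not the patch's actual output. The sample values are assumed to be floats between -1 and 1 (the thread doesn't say), and make_test_waveform and channel_rms are hypothetical helper names.

```python
import math

SAMPLES_PER_CHANNEL = 512  # per the description above: samples 0 - 511


def make_test_waveform(num_channels=2, cycles=4.0):
    """Build a stand-in for the Waveform structure (hypothetical helper).

    Returns a dict mapping "channel00", "channel01", ... to a list of
    512 samples. Each channel is filled with a sine wave; the -1..1
    float range is an assumption, not something the patch guarantees.
    """
    waveform = {}
    for ch in range(num_channels):
        name = "channel%02d" % ch
        waveform[name] = [
            math.sin(2.0 * math.pi * cycles * i / SAMPLES_PER_CHANNEL)
            for i in range(SAMPLES_PER_CHANNEL)
        ]
    return waveform


def channel_rms(samples):
    """Root-mean-square level of one channel (a simple loudness measure)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


waveform = make_test_waveform()
for name in sorted(waveform):
    samples = waveform[name]
    # samples[0] is the first sample, samples[511] is the last.
    print(name, len(samples), round(channel_rms(samples), 3))
```

Walking the channels in sorted order gives "channel00" first, and indexing into a channel gives the samples in time order.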

If you're unsure what sample data means, you should read up on PCM here: http://en.wikipedia.org/wiki/Pulse-code_modulation

mrpackethead's picture
structure..

Oh, ok, that's sort of sensible. From that data I might be able to do something useful with it, like pass that data to a Fast Fourier Transform..

I'm starting to get a handle on Quartz. It's a lot more powerful than I had ever thought it would be.

cwright's picture
FFT

In a future version, we'll probably be providing FFT data as an output from these patches -- doing it by hand in QC right now is largely academic, and almost certain to be too slow to be useful.

TheRandomDude's picture
tone

Even taking it to tone detection would be a step up (if not a giant leap) from the rather one-dimensional volume detection. Is there any way that could be easily implemented?

cwright's picture
still FFT

Tone detection requires checking the frequency domain, rather than the time domain, of the audio -- thus, FFT is still necessary :) It's not exceptionally difficult (Accelerate.framework has some FFT functions), we just need to learn how to use them (in our ever-decreasing amount of free time...)
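As a rough illustration of what "checking the frequency domain" means for one 512-sample channel, here is a Python sketch that finds the strongest frequency bin. It uses a plain, slow discrete Fourier transform rather than a real FFT or Accelerate.framework, so it only shows the idea. The 44100 Hz sample rate is an assumption, and dominant_frequency is a hypothetical name.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate=44100.0):
    """Return the frequency (Hz) of the strongest bin in `samples`.

    Naive O(n^2) DFT, written for clarity; a real implementation would
    use an FFT. Only bins 1 .. n/2 - 1 are checked (DC is skipped).
    """
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):
        acc = 0j
        for i, s in enumerate(samples):
            acc += s * cmath.exp(-2j * math.pi * k * i / n)
        mag = abs(acc)
        if mag > best_mag:
            best_bin, best_mag = k, mag
    # Each bin is sample_rate / n wide (about 86 Hz for 512 samples at 44.1 kHz).
    return best_bin * sample_rate / n


# Test tone that completes exactly 10 cycles in 512 samples.
n = 512
rate = 44100.0
tone = [math.sin(2.0 * math.pi * 10 * i / n) for i in range(n)]
print(dominant_frequency(tone, rate))
```

With only 512 samples the bins are coarse, so this resolves a tone to the nearest bin at best. Finer pitch estimates would need more samples or interpolation between bins.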

mfreakz's picture
Tone Detection !

Yeah! Tone detection could be a great feature for the next version. Everybody can imagine what we could do with this output... Could it work in real time (with the audio input patch)?

cwright's picture
of course :)

FFTs aren't too expensive, so this could probably work perfectly fine in realtime.

toneburst's picture
Scaled Pitch Output

A pitch output scaled to MIDI note values 0 to 127 would also be potentially quite cool. I could envisage a quantised output as an integer between 0 and 127, plus pitch-bend data on a separate output, which could optionally be combined with the MIDI note data to more accurately reflect the pitch of the signal.
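A sketch of that mapping in Python, assuming a detected frequency in Hz is already available. It uses the standard equal-temperament relation (A4 = 440 Hz = MIDI note 69) and a 14-bit pitch-bend value centred on 8192. The +/- 2 semitone bend range is an assumption (a common default, but it is configurable on real devices), and frequency_to_midi is a hypothetical name.

```python
import math


def frequency_to_midi(freq_hz, bend_range_semitones=2.0):
    """Split a frequency into (quantised MIDI note, 14-bit pitch bend).

    note: integer 0..127, nearest equal-tempered note.
    bend: integer 0..16383, where 8192 means "no bend". The bend range
    is an assumed +/- 2 semitones unless told otherwise.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # Exact (fractional) MIDI note number: 69 is A4 at 440 Hz.
    exact = 69.0 + 12.0 * math.log2(freq_hz / 440.0)
    note = max(0, min(127, int(round(exact))))
    # Leftover distance from the quantised note, in semitones.
    offset = exact - note
    bend = 8192 + int(round(offset / bend_range_semitones * 8192))
    bend = max(0, min(16383, bend))
    return note, bend


print(frequency_to_midi(440.0))                       # A4, no bend
print(frequency_to_midi(440.0 * 2 ** (0.25 / 12.0)))  # A4, 25 cents sharp
```

Keeping the note and the bend as separate values matches the two-output idea: the integer alone is usable for coarse triggering, and the bend can be added when the finer pitch matters.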

Just random thoughts...

a|x http://machinesdontcare.wordpress.com