QuartzCrystal: Non-Linear recorder

mfreakz

Hi there, here is an idea for a new feature in QuartzCrystal.

THE IDEA: QuartzCrystal is the best way to translate QC compositions into QuickTime movies, but there is a limitation (as with QuickTime itself): there is no more interactivity in our compositions. I'm not talking about interaction in the final movie (I know, video is a linear medium ;) but it would be great to record our interactions live before rendering the resulting movie. Most of our compositions are interactive. The input could be a MIDI-in signal, keyboard keys, a video input, an audio spectrum, etc.

THE FEATURE: It could be a sort of event sequencer, with start/pause, stop and reset options (and customizable keyboard shortcuts), that records our interactions/modulations before rendering.

WHY IT'S ESSENTIAL:

  • For those who make linear animations there is no need for this event sequencer feature, but then there is also no need for Quartz Composer itself! QC doesn't generate more or better animations than basic compositing software. QC is great for interactivity!

  • For the others (I mean most of us): either we make compositions only for interactivity (and then we don't use QuartzCrystal's high-quality rendering just to put a crappy video on YouTube!), or we want to render a high-quality movie to integrate into another composition or another piece of software, and we want to "keep our hands on" our interactivity during the process.

In my opinion, given that for plain 2D or 3D compositing you could use a better compositing package, interactivity is the only reason to use QC to create linear videos. Seen that way, QC and QuartzCrystal could make a great pair of real-time compositing tools.

THE LIMITATION: Recording audio and video inputs before rendering them through the composition into a QuickTime movie could be difficult at first (but it should work, for sure), so we could begin by recording some simple events like keyboard strokes, mouse movement, MIDI in, etc.

I'm not a programmer, so I can't help build this feature... I would like to know what you think about this idea?

Mr Freakz.

PS: Excuse my poor "Frenchy English" !!!

cwright
pseudo

This is on our road map, don't worry.

The problem (and it's a huge one) comes when we try to determine a few things:

First, is a composition at all interactive? We can check for interactive patches manually, but for unknown ones (3rd party plugs, for example), we'd just be guessing, and likely wrong.

Second, even if we know what patches are interactive, we don't know where they get their interactivity from (camera, mouse, keyboard, midi, osc, etc), and we'd have to write a wrapper for every single one to record every kind of data. This is possible for simple built in ones (keyboard, mouse, video, audio), but impossible again for plugs.

Third, even if we're able to find all interactive patches with certainty, and record input of every kind perfectly (or, with acceptable quality), we have no clean way of patching that into the composition. We'd have to modify the composition (in-memory most likely) to put in our fake patches in place of the real ones, and get all the noodles right. And all the side effects (quick: does video go blank when capture is disabled, or does it retain its last captured frame? Was it that way for every version?).

Combine all of these problems, and an entirely automatic solution doesn't seem realistic (QC wasn't designed for this. If they had added a few layers that allowed interception/injection of interactive data, this would be easy, but they didn't, so it's not).

But fear not: I've got an alternative solution. It's not as elegant, for sure, but it works around all of the above problems. It will be a new patch called "Value Historian". It will look like a large pass-through patch (1 input for 1 output), with a couple of control inputs (Record, Play, Pass-Through). You, as the composition designer, will have to determine where your interactive stuff comes from, and will then pass it all through the Historian. It will record all the input values for each frame. In play mode (which can be triggered by using some currently undocumented QuartzCrystal features ;) it will ignore the real input present, and will instead output what it recorded. It will interpolate between frames, of course (either by lerping or by nearest neighbor, depending on which makes the most sense), allowing interactive stuff to be captured live and then re-rendered offline with impossibly cool settings :)
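To make the idea concrete, here's a minimal sketch of that record/play-back behaviour in Python. It's purely illustrative (the real patch would be a QC plugin, and the class and method names here are made up), but the logic is the same: pass values through while recording, then on playback return the stored value nearest to, or interpolated at, the requested composition time.

    from bisect import bisect_left

    class ValueHistorian:
        """Toy stand-in for the Value Historian idea (all names illustrative)."""

        def __init__(self):
            self.history = []              # (time, value) pairs, in recording order

        def record(self, time, value):
            # Record pass: remember the live value and pass it straight through.
            self.history.append((time, value))
            return value

        def play(self, time, interpolate=True):
            # Play pass: ignore live input, return what was recorded near this time.
            if not self.history:
                return None
            times = [t for t, _ in self.history]
            i = bisect_left(times, time)
            if i == 0:
                return self.history[0][1]
            if i == len(self.history):
                return self.history[-1][1]
            (t0, v0), (t1, v1) = self.history[i - 1], self.history[i]
            if not interpolate or isinstance(v0, bool) or not isinstance(v0, (int, float)):
                # nearest neighbour for booleans, strings, structures...
                return v0 if (time - t0) <= (t1 - time) else v1
            # ...and a plain lerp for numbers
            return v0 + (v1 - v0) * (time - t0) / (t1 - t0)

During a live run you'd call record() every frame with whatever the interactive patches produce; during the offline render you'd call play() with each frame's time instead.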

Thoughts on this analysis?

gtoledo3

Chris, I think that is a good approach.

What would be ideal (for me) is if you could throw a .qtz into something like QuartzCrystal (GUI-wise) and "see" what is published.

The rest would resemble audio automation, where you could even do multiple passes. So I could do all of the "fader" automation in one pass, but also "punch in" at a particular frame, and re-adjust.

Just being able to render with something of the quality of QuartzCrystal, with a similar interface, and to control whatever is published on the fly as the .qtz plays, would be great, even without the idea of it resembling audio mixing board automation, i.e. "flying faders".

But what you are describing, in its exact implementation, is amazing. I am really dying for that.

mfreakz
What about a well-managed Root Macro Patch?

Thank you cwright! Well, it's late at night in France now, so my "brain frame rate" is dropping! But I think I understand what you plan to add to QuartzCrystal, and it could be great. Just one last idea before sleeping: if I follow your explanation, one of the major difficulties is finding what is interactive in a composition and what kind of input is in use.

Well, maybe I'm wrong, but when a composition designer works on a screensaver, for example, he lets you set boolean and numerical values in the screensaver settings. It's the same when you play a composition in "Quartz Desktop" or "Desksaver": you can adjust it through settings. Those parameters are the first inputs to the composition, in the Root Macro Patch. Why not follow that route? Those "settings" could be varied in real time, no? (Maybe I don't understand something!)

If all the variable inputs of the composition are in the Root Macro Patch (using Input Splitters) and connected to specific controllers, QuartzCrystal could receive and record that data, no? Even audio/video inputs could be specially included in the root, couldn't they?

Well, I'm sure you have already thought about that... Sorry... I'm a dreamer...

tobyspark
seconded

hello, i've been writing a near identical proposal to crystal in my head, just haven't had the time to put pen to proverbial paper.

through the lens of my particular usage case, crystal is useless to me unless it can render audio-analysis, at which point it enables a whole plateau of production possibilities. i've been able to tinker a bit with that - recording the screen with ishowu was ultra painful for the one paying job i said i could render out reactive visuals for. i swore not to try it again unless there was a proper frame-by-frame renderer.

interactive control data is relatively easily solved as you outlined - you could possibly even hack it yourself as a user by dumping a queue patch to xml - but the audio analysis is slightly more involved as we've discussed previously around here. that case might warrant a dedicated patch you swap out the native audio input patch for.

but yep, qcrystal+audio+recordinginput would make it a professional tool and i'd pay an order of magnitude more for it.

fwiw, i had great hopes for apple's motion in this regard. currently, it will render out a non-interactive qc comp, and if you get noise industries' fxfactory you can publish inputs into motion's interface, including audio analysis. however, motion's audio analysis implementation is backwards for patching in, making it all an exercise in futility.

cwright
difference

How is audio any different from any other input data? It's a tiny structure and a number, as far as I recall, and saving that to disc is pretty simple, just like the other forms of input.

The only difference I can think of would be feeding it perfect audio data for analysis instead of what's recorded from the mic. This would require a separate patch, as you mentioned, as well as some reverse engineering to find out the frequency ranges of the current patch (I'm guessing each band is about the same size, and covers the full spectrum, making it simple to acquire via an FFT). And even this can be accomplished by recording with Value Historian and SoundFlower, without any new patches or composition fiddling.
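For what it's worth, here is a rough sketch (Python/numpy, all names hypothetical) of that guess: take one frame's worth of samples, run an FFT, and average the magnitudes into equal-width bands. Whether the stock Audio Input patch's bands really are equal-width is, as said above, only a guess.

    import numpy as np

    def frequency_bands(samples, num_bands=16):
        spectrum = np.abs(np.fft.rfft(samples))     # magnitude spectrum
        usable = spectrum[1:]                       # drop the DC bin
        chunks = np.array_split(usable, num_bands)  # assume equal-width bands
        return [float(chunk.mean()) for chunk in chunks]

    # e.g. at 60 fps and 48kHz, one frame covers ~800 samples:
    # bands = frequency_bands(recorded_audio[frame * 800 : (frame + 1) * 800])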

Are there any other reasons why it would need to be treated specially?

tobyspark
...perfect audio from a file

it needs to be treated specially if the aim is to create perfect audio-reactive rendered output. my use would be motion graphics, but let's take an oscilloscope as an example. as the temporal resolution of audio is orders of magnitude greater than human input, capturing it live in a live qc patch would tend towards meaningless data: the sample rate would fluctuate, and a best case of 60hz is nowhere near 48000hz.

so my suggestion would be an equivalent of the audio analysis patch that computes its output from a file, knowing what time it's being called at ('cos it's a plugin) and what framerate the movie is being rendered at (so it can gather the data for a video frame's worth of audio samples).
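something like this sketch, say (python; the pre-decoded audio array and the sample rate are just stand-ins for whatever the real patch would read from the file):

    def samples_for_frame(audio, frame_time, fps, sample_rate=48000):
        # slice out the audio covering one rendered frame, given the frame's
        # start time and the movie's frame rate
        start = int(round(frame_time * sample_rate))
        count = int(round(sample_rate / fps))
        return audio[start:start + count]

    # at 30fps and 48kHz each frame covers 1600 samples; at 60fps, 800.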

and talking about framerate, have you considered motion blur - shutter degrees and number of subsamples?

toby

cwright
timespace

Re: Motion Blur (for 1.1 release, relatively soon-ish)

The AudioVideo plugin already does close to sample-correct recording, so ~60fps is possible with that. If you disable VBL sync, you can get significantly higher framerates if the composition is fast enough. 48000Hz of sample-rate data in the stock Audio Input patch would be meaningless, unless you're suggesting rendering 48000 frames per second and blurring them (that's about 48 times more than our current highest motion blur setting, and would take embarrassingly long to render). This also plays hell with compositions that attempt to track frame counts (infiniteSprites does, for example).

plugins don't know the framerate they're running at (time is an input, but they can only know how long it's been since the last frame, not how long this frame will last. There's no guarantee that time will move forwards, or that it will move in uniform steps).

Then we also run into the degenerate audio case:

assuming an output video of 60fps, and 512 samples of motion blur, the composition is rendered at an effective frame rate of 60*512 = 30720 fps. At that kind of frame rate, the number of audio samples per frame would be about 2 (48000 / 30720 is roughly 1.6). If the audio were 96kHz, it would be 3 or 4. This is laugh-out-loud ridiculous for demonstrating audio visualizations :) I think AudioVideo's model is perhaps best suited for this: a constant number of frames that represents the last N samples recorded. This is frame-rate invariant, and doesn't degenerate under any form of load.
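The same back-of-envelope arithmetic as a tiny script, just to make the degenerate case explicit (nothing here beyond the numbers above):

    def samples_per_subframe(fps, blur_subsamples, sample_rate):
        effective_fps = fps * blur_subsamples      # frames actually rendered per second
        return sample_rate / effective_fps

    print(samples_per_subframe(60, 512, 48000))    # ~1.6  -> about 2 samples per subframe
    print(samples_per_subframe(60, 512, 96000))    # ~3.1  -> 3 or 4 samples
    print(samples_per_subframe(60, 1,   48000))    # 800.0 -> a useful window without blur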

psonice
Motion blur / audio levels

2 quick points: first, your example there with the motion blur would be great! You'd get much more realistic movement for some cases. A common one: you're animating a speaker. You normally blur it by an average (or worse, point) sample for the frame. If you rendered a frame per sample and combined them, you'd actually get very realistic motion blur. I guess though that this should be optional :D
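Roughly this kind of accumulation loop is what I mean (Python/numpy as a toy sketch; render_subframe() is a stand-in for whatever actually draws the composition at a given time):

    import numpy as np

    def motion_blurred_frame(frame_index, fps, subsamples, render_subframe):
        # average `subsamples` renders spread evenly across one frame's duration
        frame_duration = 1.0 / fps
        accum = np.zeros_like(render_subframe(frame_index * frame_duration), dtype=np.float64)
        for i in range(subsamples):
            t = (frame_index + i / subsamples) * frame_duration
            accum += render_subframe(t)
        return (accum / subsamples).astype(np.uint8)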

Second: are you taking into account audio volume? The audio patches in QC do - you set something up to react nicely to sound, then turn the volume down and the effect is gone. You can get around it by scaling to the peak audio level, but it's far from perfect. Any way of getting around that for your playback stuff? Or would it be better to capture the output of the audio level patch say, and replace that patch for playback?

tobyspark
"doesn't degenerate under any form of load"

supersupercool to see motion blur working

on the audio tip, it was largely the discussion that led to the audiovideo patch i was thinking of, but i hadn't got that far as i actually haven't used that patch in anger yet - that should actually change later today!

the thing that confuses me here is that, as i read it, you seem concerned by varying framerate and load in the context of qcrystal. to me, the point of qcrystal's existence is that it doesn't render according to realtime constraints: you can create, from any kind of composition, a movie with each frame perfectly rendered and each frame perfectly positioned in time.

so, in essence, i want to be able to get super-whack with an audio-reactive comp, not worry that i get a framerate of 5fps running in realtime (iterator caused or not!), and then render it out using the highest quality form of the audio i have, ie the source file.

psonice
Absolutely

Absolutely. The whole non-realtime thing actually makes QC a much more useful tool too.

As an example, the slitscan videos i've been doing are very likely impossible in realtime. I'm getting 1fps at the moment, and that's with a low-quality 320x240 version... for a more reasonable 640x480 it'd be at LEAST 1/2 speed, likely a lot more because of extra bus transfers etc. It's of no use to anyone unless I can render it out as a video and play it back at real speed.

The other thing is frame-rate dependent effects. For these same videos, I have a video playing, and a frame from the video gets added to a queue each time a frame is rendered in QC. If QC is running at 1fps, and the video is playing at 30fps, that means only 1 in 30 frames get added to the queue, and the resulting slit scan is 1/30 of the length it should be.. and looks predictably crap. In crystal, this just isn't an issue because I set it to render at 30fps, and that's what it does, even if it does take an age to do it ;)

Speaking of which, any chance of an fps counter in crystal? It'd be handy for working out how long it'll take (careful with any 'predicted time' type things though - for this composition the frame rate decreases exponentially with time for the first 20 seconds or so and I guess there will be plenty of similar cases :)

cwright
wrong reference

I'm not concerned with the render load in QuartzCrystal (ram pressure is my current worry in that application) -- what I am referring to is your "frame's worth of samples" statement a post or two up.

If you render a composition at 48000fps (one frame per audio sample at 48kHz), you lose all frequency information because each frame sees only 1 audio sample, and 1 audio sample has no frequency information: it's just 1 sample.

Example:
Here's the audio value: 42. What's the frequency of band 0-15? Show your work.

This is what I meant by "degenerate" -- lots of information is simply not available in this use case because of how the math works.

however, if it's rendered at less than 1 frame per audio sample (at 60fps, we get about 800 audio samples per frame), we can derive frequencies because we have a set of samples to work with.

we're trying to solve 2 different problems: 1) getting raw sample data into QC (hint: Audio Input is not how to accomplish that; AudioVideo does that far more usefully), and 2) getting meaningful audio data on a per-frame basis (Audio Input does this well, as does AudioVideo, but not in offline mode)... I'm trying to solve problem 2, you're focusing on 1 :)

tobyspark
recrossedwirescrossing

not entirely crossed wires - didn't mean rendering at 48000hz, just that that quantity of information had to be processed: "to gather the data for a video frame's worth of audio samples"

so the ideal scenario would be if the audiovideo patch magically realised it was within quartz crystal, or perhaps with 'quartz crystal' as a device that was hard coded into the list. [edited for sanity]

toby

psonice
Tangent

On a slight tangent... is there any way to change task priority easily on OS X? If not, could the priority that crystal runs at be changed in some future version? I say that because right now it's absolutely killing my system, even typing is lagging, and I think it'll take about a day to finish this render :/

If there's no obvious reason for crystal to be doing that, could it be something else? I'm using probably 1GB of textures (per frame) on a 256MB video card, so I suspect the bus is panting hard too. And there's a 640-iteration iterator, so it's also straining the CPU hard :/

tobyspark
time for a render bender

more seriously, look up "nice" or "renicer", but to be honest, i'd just accept your fate.

i wonder if software rendering would actually be quicker, if it has to shuttle 1gb of data onto the card every frame.

cwright
wrong tree

A sluggishly responding system won't be restored (much) by task priority: sluggish means that you're paging (using swap on disk) hard, which means you're actually not using much CPU (because it's sitting there waiting for the hard drive to keep up). Lowering task priority typically just lowers CPU usage (but that's already low due to swappage), and sometimes lowers task I/O. I've not seen any convincing demonstrations of I/O throttling on low-priority tasks though. (Linux nerds were waving their hands about this like 5 years ago, and I don't think anything came from it.)

To add insult to injury, Leopard apparently did away with the task priority handling stuff, so even low-priority stuff takes an even load on the system (when it shouldn't). Lots of complaints there...

To re-nice it (as toby suggests), fire up Terminal, and then find the QuartzCrystal pid (ps aux | grep QuartzCrystal -- the pid is the number right after your user name), and then do renice 20 -p <pid>. This will make it as low priority as possible. Maybe it'll help a bit...

[edit re: software render -- It usually only hits swap if you're doing high levels of supersampling. In those situations, the renderer usually drops to software rendering automatically because of the huge surfaces it draws on (>4096 pixels high or wide). shuffling GBs of data across the bus would suck for performance, but it wouldn't realistically stall your system unless it was paging (which actually stalls processes other than the one doing work)]

psonice
No paging

It's not paging that causes it. I have 4GB of RAM, 1.1GB free, no paging going on. When I wrote earlier, crystal was running at ~100% CPU time (1 core fully occupied) so it's definitely hitting the CPU very hard. I quit the render after that (noticed a bug... which reminds me, I fixed one and forgot the other ;( ).

I've tried renice and it's rendering away, let's see what happens.

Update - the system is now sluggish again, including typing. I added 2x AA to the render this time; it shouldn't push it into software GL (I'm rendering at 640x480) but I think it's now GPU limited, as crystal is now only using 70-90% CPU time. Disk activity is fairly flat (the odd read, and regular but small writes as it writes the video out), and I have 600MB+ (it varies) of RAM left. Page ins are at 776MB, page outs at 826MB, but neither is changing and the Mac hasn't been rebooted for ages, so I don't think it's actually paging at all.

Perhaps it is just saturating the bus and that's crippling everything else?

cwright
impressive

That's impressive... I've never seen any kind of workload drop a system without cheating and paging stuff. I guess it's possible then, and I stand corrected (what an interesting resource limit, having bus saturation stall the system...)

what happens if you try again without AA at all? Just curious to know how that affects things.

gtoledo3

How is this coming along... I am actually trying to decide if I want to wait on rendering a bunch of stuff!

smokris
QuartzCrystal 1.2 (with Motion Blur) released

We just released QuartzCrystal 1.2, which includes the motion blur feature (among several other things).

gtoledo3

I have been really liking the motion blur effect, which is used on the stuff I have posted recently. One of the BEST ideas ever.

The other thing that is truly awesome is the anti-aliasing.

If I may pry, what settings do you and Chris usually use.... is there a preferred video format or fps, or are you all finding it is different for each render?

cwright
depends

For me, I typically use between 24 and 60 fps -- 60 looks beautiful to me, but sometimes it's just too crisp. Motion blur at 60fps can look really convincing sometimes :)

for AA, it's all over the place. I typically don't go higher than 4x (16x now), simply because it takes too long, and doesn't improve quality as much. I typically use 1.5x-2.5x -- that gives a fair amount of smoothing to edges and textures, without taking much longer. I didn't use any AA for the Clouds video.

If you're using motion blur, and everything is moving, you don't need to set AA very high because temporal bluring will take care of that for you. :)

smokris is a huge fan of the PNG codec (because it's lossless); I typically use MPEG-4 or H.264. However, we've recently uncovered some weird behaviour in QTKit (our video output backend) that prevents temporal compression, which makes the video files much larger than they need to be (for temporally compressed codecs like MPEG-4). Looking into a good solution for this...

gtoledo3

That makes sense, because I rendered to mp4 for the first time last night, and the file seemed larger than I would have expected.

I figured as much about the motion blur/anti-aliasing, and I am glad to see you reconfirm it. Some compositions do not seem to benefit from anti-aliasing at all...

I notice that the one thing that can cause errors is jacking up the anti-aliasing a decent amount on any composition that is particularly intensive anyway, ie. low frame rates in real time. Another thing that can sometimes cause errors is if I am doing a high quality render, and open up Quartz Composer with a fairly intensive file. I am far from whining, and this is actually expected behavior on anything that is having to crunch numbers like this. Any other tool I have had like this has been inferior...

Now you have me wondering about QTKit, and I am going to check this problem out, since I have been using this extensively as well.

cwright
first, do no harm

Some compositions (specifically, ones that use fixed pixel sizes and line art with the Line patch) actually get pretty mangled by supersampling, since it makes them artificially smaller. The FPS patch also does this, which is kinda cute in some ways :)

We've isolated (and very recently found a fix for) a Core Image bug that causes it to use more memory than it's supposed to when motion blur is enabled. The overuse is proportional to the amount of motion blur, so 1024x gets really bad really fast. I'm doing regression testing on the fix we have, and will have a 1.2.1 out shortly if it all goes well. This should help with the intensive rendering stuff a bit.

In a nutshell, QTKit doesn't temporally compress images that are added with [QTMovie/QTTrack addImage:forDuration:...]. When you call these methods, it creates the specified encoder, encodes exactly one frame, and then closes the encoder. This makes every frame individually compressed ("I-frames", in MPEG parlance), which is actually kinda nice for quality, but really bad for fast decoding and low file size. The correct way to get temporal compression with QTKit is rather roundabout -- you have to addImage:forDuration: all your frames into a temporary movie, then [QTMovie writeTo:...], which will then take your temporary movie and run the whole thing through the encoder (not just each frame individually). This is a bit tricky to implement in QuartzCrystal's model, so I'm not sure how we'll handle that yet...

smokris
Celluloid Freak

gtoledo3 wrote:
If I may pry, what settings do you and Chris usually use.... is there a preferred video format or fps, or are you all finding it is different for each render?

I tend toward 24fps -- in general I dislike the 'hyperreal' feel of 60fps.

..and given the lower framerate, more steps of temporal supersampling are necessary. I've typically used between 32x and 128x so far, depending on the content. As for spatial supersampling, typically 16x (in QuartzCrystal 1.2 units).

And, yeah, PNG is nice because it's lossless (though it's reallllly slow), so I often use it for intermediate work.

Then, typically, for my VJing work, I render the final version as "Motion JPEG", as I've found it to be one of the more efficient-to-decode codecs, while still providing good compression ratios and video quality. Plus it doesn't do temporal compression, which is good for random-seeking in video files (which is sometimes useful when VJing).