OpenCL Image Processing Routines

cybero's picture

Please find attached an example file that exhibits the use of six OpenCL image processing routines.

They have all been ported over from the noise_kernels.cl example in Apple's developer code examples.

They will happily take a static image or a moving one.

The posted example uses a static image. The online screen grab posted to Vimeo uses a video feed [some HD footage of my own], and there the output of the kernels is used as a mask image for the main image, which is itself the product of a CoreImage processing routine that inverts the source image.
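A minimal sketch of that invert-then-mask chain, written as plain C on the CPU rather than as the actual OpenCL or CoreImage kernels. The Pixel type, the function names, and the choice of Rec. 601 luma as the mask strength are all assumptions made for illustration; the attached composition may combine the images differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pixel type: RGBA channels in the range 0.0 to 1.0. */
typedef struct { float r, g, b, a; } Pixel;

/* Invert the colour channels and leave alpha untouched. */
Pixel invert_pixel(Pixel p) {
    Pixel out = { 1.0f - p.r, 1.0f - p.g, 1.0f - p.b, p.a };
    return out;
}

/* Rec. 601 luma, used here as the mask strength (an assumption). */
float luma(Pixel p) {
    return 0.299f * p.r + 0.587f * p.g + 0.114f * p.b;
}

/* For each pixel: invert the source, then scale it by the mask's luma. */
void invert_and_mask(const Pixel *src, const Pixel *mask,
                     Pixel *dst, size_t count) {
    assert(src && mask && dst);
    for (size_t i = 0; i < count; ++i) {
        Pixel inv = invert_pixel(src[i]);
        float m = luma(mask[i]);
        dst[i].r = inv.r * m;
        dst[i].g = inv.g * m;
        dst[i].b = inv.b * m;
        dst[i].a = inv.a * m;
    }
}
```

With a white mask the inverted image passes through unchanged; with a black mask the output is fully transparent black.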

Attachment: OpenCLImageRoutines.qtz (203.22 KB)

toneburst's picture
Re: OpenCL Image Processing Routines

These look cool, thanks!

I wonder what a side-by-side speed comparison between those and CIFilter and GLSL versions of the same algorithms would show. Since they're all supposed to run on the GPU, in theory, there should be no practical speed difference between them. Wonder if this is actually the case.

a|x

cybero's picture
Re: OpenCL Image Processing Routines

Do you know what, I haven't actually got a CL routine and a CI routine that are, or would be, directly comparable.

Optical Flow, for instance; my OpenCL might take the same number of image inputs as the CI version, but it has a whole lot of different parameters to it.

I don't think there would be too much discernible difference until you asked the routines to deal with large amounts of data, then I would expect OpenCL to begin to win out.

Be interesting to create some well designed CoreImage versus OpenCL experiments.

I had been trying to see what sort of stable equivalents to CoreImage processes I could make and have a fair old number of them under the project Tonic, which I could post some of the results of here.

However, that was simply to create a CoreImage variant of an idea that had its beginnings as an OpenCL image visualizer.

Too complex [& somewhat procedurally different] to compare.

Also, with Core Image one is pushing pixels from start to finish, more or less.

With OpenCL, one might well be creating a mesh derived from image data and then outputting that to a mesh, upon which the image can be redrawn from buffer. OpenCL can __rd and __wr, but it also does outputs that CoreImage doesn't do, even if both are being looked at purely from a graphics creation / processing perspective.
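To make the image-to-mesh idea concrete, here is a rough sketch in plain C of deriving a grid of vertices from greyscale image data, with brightness driving the z displacement. The Vertex type, the -1..1 grid layout, and the depth parameter are assumptions for illustration; a QC OpenCL kernel would do the per-pixel work in parallel and write to its own mesh output format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vertex type for the output mesh. */
typedef struct { float x, y, z; } Vertex;

/* Fill verts (width * height entries) with a grid spanning -1..1 in x and y.
   Each vertex's z is the matching pixel's grey level scaled by depth. */
void image_to_mesh(const float *grey, size_t width, size_t height,
                   float depth, Vertex *verts) {
    assert(grey && verts && width > 1 && height > 1);
    for (size_t row = 0; row < height; ++row) {
        for (size_t col = 0; col < width; ++col) {
            size_t i = row * width + col;
            verts[i].x = 2.0f * (float)col / (float)(width - 1) - 1.0f;
            verts[i].y = 2.0f * (float)row / (float)(height - 1) - 1.0f;
            verts[i].z = grey[i] * depth;
        }
    }
}
```

The same source image can then be drawn back onto the displaced grid as a texture, which is the "redrawn from buffer" step described above.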

///////////////////////////////////////////////////////////////////

& now I've had my first cup of tea and a spot of breakfast - no wonder there'd be little discernible difference when it comes to the sort of things CoreImage does, for wasn't it upgraded with OpenCL in 10.6.x?

Still, some simple comparatives would be interesting, say which simple iterated [CI or CL] image works best over time. Same source image input.
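One possible shape for such a like-for-like test, sketched in plain C: run each candidate routine repeatedly on its own output, starting from the same source image, and time the loop. Everything here (the Filter signature, the two toy filters, time_filter) is hypothetical, and clock() only measures CPU time, so this stands in for the idea rather than for a real CI versus CL measurement, where the frame rate reported inside QC would be the more honest figure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical filter signature: read count values from src, write to dst. */
typedef void (*Filter)(const float *src, float *dst, size_t count);

/* Two toy stand-ins for the routines being compared. */
void invert_filter(const float *src, float *dst, size_t count) {
    for (size_t i = 0; i < count; ++i) dst[i] = 1.0f - src[i];
}

void halve_filter(const float *src, float *dst, size_t count) {
    for (size_t i = 0; i < count; ++i) dst[i] = 0.5f * src[i];
}

/* Copy src into work, then feed the filter its own output for the given
   number of iterations. Returns elapsed CPU seconds; the final image is
   left in work. */
double time_filter(Filter filter, const float *src, float *work,
                   float *scratch, size_t count, int iterations) {
    assert(filter && src && work && scratch && iterations >= 0);
    memcpy(work, src, count * sizeof *work);
    clock_t start = clock();
    for (int i = 0; i < iterations; ++i) {
        filter(work, scratch, count);
        memcpy(work, scratch, count * sizeof *work);
    }
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_filter once per routine with the same src and iteration count gives two timings and two final images to compare.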

toneburst's picture
Re: OpenCL Image Processing Routines

I know sometimes the three formats aren't directly comparable. In this case though (2D image manipulation), they could be compared quite easily, since you should be able to do exactly the same things in all three languages (subject to CoreImage Slang limitations, obviously).

My instinct would be to go with CIFilters for strictly 2D-only image manipulation where possible, use GLSL for vertex-manipulation and per-pixel lighting, and OpenCL for Mesh-manipulation and creation, possibly particle systems, for dealing with large arrays of data, and converting between image and mesh formats.

I'm also considering attempting to port a raycast shader to OpenCL at the moment. I'm thinking I could then write to multiple output textures and do some kind of fancy deferred-rendering type of thing. Got a few more things to test with my GLSL implementation first, though.

a|x

cybero's picture
Re: OpenCL Image Processing Routines

I've just taken another look at the CoreImage Optical Flow presented by gtoledo3 [works great with webcam-sized footage] and at mine, and to be honest, they just don't do the same thing at all.

My OpticalFlow really thrashes out with video and does a great job even at a 1:1 pixel ratio with 1280 * 720 HD footage. It can, in certain constructs, take a while to load [1 - 13 secs], but it just keeps on chugging away, flowing out at 1280 * 720 - makes for great footage [sometimes].

It doesn't do all the wacky and wonderful stuff that the flow step and iteration construct do in the CI OpticalFlow, because none of that has been built into it.

Some kind of compare CI / GLSL to CL / Mesh as you proposed sounds like an idea though.

However, I think the real meat and potatoes is in looking at how both language formats deal with images.

Nice and simple, 'like for like'.

If possible :-)

cybero's picture
Re: OpenCL Image Processing Routines

What you're saying raises another question in my mind that I hadn't thought to explore directly in QC before.

Does QC OpenCL support

cl_APPLE_gl_sharing

?
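Outside QC, a host program can answer this kind of question by asking the device for its extension list: clGetDeviceInfo with CL_DEVICE_EXTENSIONS returns a space-separated string of extension names. As far as I know, Apple's implementation spells this one cl_APPLE_gl_sharing. Whether QC exposes that query to a composition is not something this sketch can settle. The helper below is hypothetical and only shows the string check, in plain C with no OpenCL dependency; the sample strings in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if name appears as a whole space-separated token in extensions,
   0 otherwise. A plain substring search is not enough, because one
   extension name can be a prefix of another. */
int has_extension(const char *extensions, const char *name) {
    assert(extensions && name);
    size_t len = strlen(name);
    if (len == 0) return 0;
    const char *p = extensions;
    while ((p = strstr(p, name)) != NULL) {
        int starts_token = (p == extensions) || (p[-1] == ' ');
        int ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token) return 1;
        p += len;
    }
    return 0;
}
```

For example, has_extension("cl_khr_fp64 cl_APPLE_gl_sharing", "cl_APPLE_gl_sharing") returns 1, while a string containing only a longer, differently named token does not match.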

gtoledo3's picture
Re: OpenCL Image Processing Routines

Apples and oranges when you compare it to the CI Optical Flow in the compositions I posted recently...

The OpenCL filter you posted is analogous (sorta) to Vade's Optical Flow (which is used in my composition as a component of the GPU Liquid Sim portion)... it isn't analogous to the CI Optical Flow in function (which, I'm sure you know because you state this, I'm just clarifying).

When one replaces the v002 Optical Flow patch in my compositions that were all in that vein with the OpenCL variant that performs a similar function, the frame rate drops by a couple of frames per second. Not a deal breaker, but it's no better in function.

One plus, which is sort of conceptual, is that OpenCL can be ported to other operating systems, whereas Core Image is closed.

So, if one had a chain that was video->OpenCL kernel->sprite, it might be more feasible to build a system that could analyze the qtz and compile a related app that could run on iOS or Windows (again, theoretical).

cybero's picture
Re: OpenCL Image Processing Routines

That's a good point about the higher likelihood of portability of a construct.

As it happens I've just downloaded the incredible Core Image Droste.qtz from the Developer code section [ requires 10.6.x ].

http://developer.apple.com/mac/library/samplecode/Droste/Introduction/In...

This does some beautiful things to an image or video feed.

Given my 'local' and current experience to date with OpenCL image processing, I'd almost persuaded myself of a total falsehood, namely that CI did images up to a certain point really well, whilst OpenCL was better with video.

http://kineme.net/forum/Discussion/SnowLeopard/DrosteSampleCode

As it happens my most recent results are that CI is way faster and cleaner.

OpenCL is interesting and shines when there's a lot of work to do and the routine is good solid code [a point I can fall short of achieving].

However, it is much more inclined to knock other perfectly good CL and CI kernels off their perches when there is a failure to find a kernel, to compile LLVM, or something similar. Those incorrect kernel routines only need to be on the Editor stage; they don't even need to be active.

Their presence alone seems sufficient to create a problem!

That's the really annoying thing about it: it can leak memory, creep into the cache, knock out really good, solidly coded kernels and scripts, glitch the GUI, and clutter the GPU. The list goes on, and I'll just finish up with the flickering fluorescent light bulb of OpenCL-created GPU failure [my personal favourite].

Still, once one has got used to what makes for tidy and solid kernel code, and maybe checks kernels at the command line, one can usually avoid the more frequently encountered inconveniences that [my :-)] working with OpenCL in QC sometimes provokes.

BTW, it's kind of ironic that you should pick up on the similarity betwixt the OpenCL Flow and vade's Flow patches.

There's a very, very good reason for that.

The GLSL code used in that patch was the original porting base for Marek Bazera, who did the command line / Xcode app version of that routine, and I did the really trivial work of porting that routine into Quartz Composer.

How these things do go the rounds.