OpenCL- Great, but Flaky as F*ck?

toneburst's picture

I'm aware I'm coming to the OpenCL party weeks after everyone else has (leaving me surrounded by empty beer bottles and the lingering smell of cigarettes and stale vomit), but OpenCL is GREAT.

I just managed to achieve something using OpenCL that I've been wrestling with (unsuccessfully) for ages using GLSL shaders- make a radially-displaced heightmap sphere, with working normals! My joy at this lasted until my machine locked up completely, about two minutes later, necessitating a force-shutdown.

Hopefully, future OS updates will make QC's OpenCL support a bit less flaky, and widen GPU support a little.

a|x

toneburst's picture
Re: OpenCL- Great, but Flaky as F*ck?

Here's the QTZ, if anyone's interested.

You'll need a machine with a GPU that supports OpenCL to run it, unfortunately.

a|x

Attachment: tb_Radial_VDM_OpenCL_071009_01.qtz (26.98 KB)

cwright's picture
Re: OpenCL- Great, but Flaky as F*ck?

Quote:
Hopefully, future OS updates will make OS X's OpenCL support a bit less flaky, and widen GPU support a little.

T,FTFY ;)

dust's picture
Re: OpenCL- Great, but Flaky as F*ck?

welcome to the party. thought you would get down with open cl. it's kind of nice to be able to get values out of the gpu when you compare cl to glsl. the crashing part sucks; the worst part is it doesn't crash qc but the whole system. too bad there's not a warning bell that says force quit now before you pass the point of no return.

toneburst's picture
Re: OpenCL- Great, but Flaky as F*ck?

dust wrote:
welcome to the party.

Thanks, man!

Quote:
thought you would get down with open cl. it's kind of nice to be able to get values out of the gpu when you compare cl to glsl.

Yeah, true. I'll have to look at structure in/out options, too. Have you tried getting an arbitrary structure out of an OpenCL kernel yet?

Quote:
the crashing part sucks; the worst part is it doesn't crash qc but the whole system. too bad there's not a warning bell that says force quit now before you pass the point of no return.

My guess is it's the GPU that crashes, which brings down the OS. I certainly got some graphics glitches when working with OpenCL patches, and then my machine crashed solid.

a|x

cybero's picture
Re: OpenCL- Great, but Flaky as F*ck?

I had begun to respond to your post and example composition, then crashed :-( Rendering in situ, what I found was that it would run AOK unless one was trying to change the video settings whilst rendering live. Rendering without altering any of the inbuilt settings did not cause a crash on my machine, but if I, say, required a clean aperture, the system froze up glacially, as it were; if I left it running, it simply ran and ran.

Now, having said that, altering other parameters on other patches did not adversely affect the running of the composition.

Stopping the render, altering the video input, then re-commencing the render did not provoke such an error.

I did try using the Apple Gaussian blur, and this fared no better or worse with regard to altering the video input parameters. Same rough and ready rule of thumb: altering video input settings when not rendering is OK; altering them when rendering is dicey, to say the least.

The freeze-up did not always occur with either type of blur patch, but would occur with both, so the type of patch is not especially an issue.

The inconsistency regarding the amending of video input settings is interesting: when non-fatal, the render would cease, except for black empty space, until the composition had settled into the new video input settings.

If you want, I could send you the Console messages, but I'm pretty sure you've got loads of your own to examine that will evidence just how altering a video input setting results in the failure of OpenCL execution before the composition re-settles upon the new video setting. Even starting a render results in the following Console message, both in your original composition and in the one where I used Apple's Gaussian blur instead of vade's lovely blurs plugin:

07/10/2009 16:24:55   Quartz Composer[225]   *** <QCOpenCL = 0x1540E3B0 "OpenCL_1">: Error creating output stream for index 0

Finally, if you have been getting freeze-ups without altering a thing, then I suspect either a hardware difference (I'm running on an nVidia 9400, iMac Intel Core 2 Duo), or perhaps some 'unwanted' patches / plugins that do not pre-flight correctly. Look in your Console for such indicative advice as listed at application boot time.

Is it partly to do with 'nice' values, I can't help but wonder, oh and still rendering AOK [BTW, this is with your composition, can and have alter(ed) displacement, reflection type, thrown in an Interpolation to make the little baby rotate - all good - lovely work a|x {aka toneburst}]

gtoledo3's picture
Re: OpenCL- Great, but Flaky as F*ck?

I haven't looked at this yet, because I'm booted in Leopard, but...

I was noticing that when working with the displacement and video, I had to make sure to render to something first, and then have the h/w values dictated by something other than the video source (like a crop patch that isn't actually attached to an image dimensions output, just cropped to the correct dimensions).

I also had to make sure to do the same to the h/w of the render in image. If not, I would get the same problems you described. In fact, my machine wouldn't even boot again until the eprom was cleared. It would just stall on the blue opening screen.

toneburst's picture
Re: OpenCL- Great, but Flaky as F*ck?

Think it might choke on large input images. If you look at the OpenCL code, what's actually going on is that each pixel of the input image becomes a vertex of the mesh, so with larger images, it's quite easy to end up pumping out several million vertices. As far as OpenCL is concerned, I suspect, 2,000,000 verts is no different to 2,000,000 pixels/texels (one of the cool things about OpenCL, in fact), but something further down the chain might have problems with that.

Just a guess.
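
For rough numbers on the one-vertex-per-pixel idea, here is a minimal C sketch. The function names are made up for illustration, and the layout of one float4 position plus one float4 normal per vertex is an assumption, not something read out of the composition.

```c
#include <assert.h>
#include <stddef.h>

/* One vertex per input pixel, as described in the post above. */
static size_t vertex_count(size_t width, size_t height)
{
    assert(width > 0 && height > 0);
    return width * height;
}

/* Bytes needed for the mesh, ASSUMING one float4 position and one
   float4 normal per vertex (hypothetical layout for illustration). */
static size_t mesh_bytes(size_t width, size_t height)
{
    const size_t bytes_per_float4 = 4 * sizeof(float);
    return vertex_count(width, height) * 2 * bytes_per_float4;
}
```

With these assumptions a 640x480 input gives 307,200 vertices, while a 1920x1080 input gives just over two million, which is the "several million vertices" territory mentioned above.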

a|x

toneburst's picture
Re: OpenCL- Great, but Flaky as F*ck?

Hiya cybero,

thanks for the detailed reply. I've not had any more crashes lately. I've taken to stopping rendering before connecting/disconnecting stuff, then restarting afterwards. I think you might be right about changing input image sizes. I think OpenCL is sensitive about texture memory, so maybe changing image dimensions on-the-fly causes memory issues....

No idea what I'm on about though... ;)

a|x

dust's picture
Re: OpenCL- Great, but Flaky as F*ck?

actually haven't really tried getting a struct out of a qc cl kernel, well, other than some of the mesh patches. my initial thought is to use an image output and store, let's say, your xyz variables in the rgb channels, then extract the struct from the rgb. or maybe output your cl kernel as float4 pos, kind of like this.

__kernel void init(__global float4 *pos)
{
    // one work-item per output element
    int tid = get_global_id(0);
    pos[tid] = (float4)(0.0f, 0.0f, 0.0f, 1.0f);
}
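
The "store xyz in the rgb channels" idea can be sketched on the CPU side in plain C. This is only an illustration: Vec3, PixelRGBA and the function names are hypothetical, and it assumes a float-per-channel image (an 8-bit image would quantize the values).

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } Vec3;          /* hypothetical struct to smuggle out */
typedef struct { float r, g, b, a; } PixelRGBA;  /* one float RGBA pixel (assumed format) */

/* Store x, y, z in the r, g, b channels; alpha is left at 1. */
static PixelRGBA pack_vec3(Vec3 v)
{
    PixelRGBA p = { v.x, v.y, v.z, 1.0f };
    return p;
}

/* Recover the struct from a pixel written by pack_vec3. */
static Vec3 unpack_vec3(PixelRGBA p)
{
    Vec3 v = { p.r, p.g, p.b };
    return v;
}

/* Unpack a whole image, one struct per pixel. */
static void unpack_image(const PixelRGBA *pixels, size_t count, Vec3 *out)
{
    assert(pixels != NULL && out != NULL);
    for (size_t i = 0; i < count; ++i)
        out[i] = unpack_vec3(pixels[i]);
}
```

The packing half would live in the kernel and the unpacking half wherever the image is read back; anything that does not fit in four channels would need a second image or a wider buffer.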

im also thinking there must be a way to use the cl kernel info patch and some sort of javascript patch, combined with maybe an apple script patch, to save and quit qc before you blow up the gpu.

honestly have only done a bit of testing, seeing as the standard apple default qc kernel crashes my machine when im running my nvidia in 256. i have actually just edited the gpu kernel plist or energy saver plist so my machine will default to 512. so im hoping that will stop the crashing so i can learn more about cl.

im excited about the 3d image feature that is in the convert cl patch but negated from the kernel.

gtoledo3's picture
Re: OpenCL- Great, but Flaky as F*ck?

It is definitely the case that one is getting a vertex per pixel (like you said, it's resoundingly obvious from the code... it might even be commented).

In the particular problem scenario I'm talking about, it's just that kernels (CI and OpenCL) can have problems when they are deriving the h/w from an external source, in certain circumstances. If it was only a vertex count issue, then 640x480 simply wouldn't work on a given computer... it would not be able to be resolved by using a render in image or crop (or combo as it were).

I think this could very well be a bit of a remnant from an old Leopard bug that was only somewhat resolved. QC would go crazy and start fluttering the image when getting h/w from the video patch in some scenarios involving core image, but be "ok" when using a different source, or doing the "render in image" workaround. I haven't had the time to look at it too much to see if it's related or just coincidental.

toneburst's picture
Re: OpenCL- Great, but Flaky as F*ck?

dust wrote:
im excited about the 3d image feature that is in the convert cl patch but negated from the kernel.

Not sure what you mean here. I've just noticed that there does seem to be support for 3D images as input to CL Kernels in QC, though, which might be very cool. Any ideas as to how to use this, or which file format might be usable for 3D textures? I notice when you add a __rd image3d_t tex3d input, it also creates ports for Size X, Y and Z, which does happen when you create a 2D texture port.

a|x

dust's picture
Re: OpenCL- Great, but Flaky as F*ck?

i think the 3d image is some sort of image array, like a med scan, or maybe a volumetric pixel representation like a voxel. im thinking in qc with the open cl converter it would be some sort of immediate 3d rendering, with vert and color data. i have been doing some tutorials on open cl from mac research and so far i have learned a bit about parallel computing in regards to open cl. so far i am just getting the basics down and understanding the differences between cuda and open cl, plus things like what tasks are computationally expensive, what tasks will run faster on the cpu, when to use the gpu, how to cache on the gpu, and when to use multiple cores and gpu.

the pci bus is a bottleneck in open cl, and although vector and array type calculations or cross products etc are inherently faster to calculate on the gpu, sometimes a multiple core solution is just as good, because getting your data off the gpu via the pci slows everything down. so far the tutorials i have watched and read only mention image3d as a data type and haven't gone into any great detail, but to answer your question, image3d takes a square image whose size is a power of 2 (so it can be repeatedly doubled or halved), as far as size is concerned. the actual encapsulation or encoding of the image data, i don't think matters, meaning a .jpg will work. this is coming from an ocl cuda perspective, which is a bit different in qc. as far as getting an image3d to volumetrically appear in qc? i don't know as of yet. it is something that is interesting to me.
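
the pci point can be put as back-of-envelope arithmetic. a minimal C sketch, assuming you already have rough timings; the function names and every number you feed in are hypothetical, not measurements from any real machine.

```c
#include <assert.h>
#include <stdbool.h>

/* Seconds to move `bytes` across a bus with the given bandwidth in bytes/s. */
static double transfer_seconds(double bytes, double bus_bytes_per_sec)
{
    assert(bus_bytes_per_sec > 0.0);
    return bytes / bus_bytes_per_sec;
}

/* The GPU only wins if compute time plus upload plus readback
   is still less than doing the whole job on the CPU cores. */
static bool gpu_is_worthwhile(double cpu_sec, double gpu_compute_sec,
                              double bytes_up, double bytes_down,
                              double bus_bytes_per_sec)
{
    double gpu_total = gpu_compute_sec
                     + transfer_seconds(bytes_up, bus_bytes_per_sec)
                     + transfer_seconds(bytes_down, bus_bytes_per_sec);
    return gpu_total < cpu_sec;
}
```

so a kernel that is ten times faster than the cpu can still lose overall if the readback takes longer than the cpu version would have taken in total, which is why keeping results on the gpu (for rendering, say) helps.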

right now im thinking of how i can do a quaternion in parallel with opencl. i have some c functions for this that im going to try and implement in opencl so i can mess with some openal spatial sound, which i have found seems to like 8bit 41khz mono wav files.
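
as a starting point, here is a minimal plain C sketch of a quaternion product applied across arrays. it is not dust's code; Quat and the function names are hypothetical. the loop body is what would become the kernel body, with the loop index replaced by get_global_id(0).

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float w, x, y, z; } Quat;  /* hypothetical layout: w is the scalar part */

/* Hamilton product a * b. */
static Quat quat_mul(Quat a, Quat b)
{
    Quat r;
    r.w = a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z;
    r.x = a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y;
    r.y = a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x;
    r.z = a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w;
    return r;
}

/* Multiply n independent pairs. Each iteration touches only index i,
   so the iterations can run in parallel, one per work-item. */
static void quat_mul_batch(const Quat *a, const Quat *b, Quat *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i)
        out[i] = quat_mul(a[i], b[i]);
}
```

since no element depends on any other, this is the easy kind of parallelism; whether it beats the cpu depends on how many quaternions there are and whether the results have to come back over the bus.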

cybero's picture
Re: OpenCL- Great, but Flaky as F*ck?

Wow, I really am back tracking today.

Enquiries from across the globe [10.4, 10.5 & 10.6 stuff] whereby I find my straggling way to re-discovering this little gem.

[Just re-read the thread - something has definitely changed since this thread was commenced and no mistake, this now performs without a hitch and speedily too].

What I'd quite forgotten was that this example had yet another one of your OpenCL variants of the rgba8888 kernel, made to take a width and height input in addition to the displacement and image inputs and outputting a sphere.
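
For readers who have not opened the composition, the general shape of such a kernel can be sketched in plain C. This is not toneburst's kernel: the names are hypothetical, and it assumes a unit sphere whose radius grows by the sampled height times a displacement amount.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } Vec3;

/* Flat index of pixel (x, y) in a row-major image `width` pixels wide;
   this doubles as the index of the vertex generated from that pixel. */
static size_t pixel_to_index(size_t x, size_t y, size_t width)
{
    assert(x < width);
    return y * width + x;
}

/* Push a point on the unit sphere outward along its own radius.
   `unit_dir` is the undisplaced position, `height` the heightmap
   sample for that pixel, `amount` the displacement input. */
static Vec3 displace_radially(Vec3 unit_dir, float height, float amount)
{
    float radius = 1.0f + height * amount;
    Vec3 p = { unit_dir.x * radius, unit_dir.y * radius, unit_dir.z * radius };
    return p;
}
```

The width and height inputs matter because the flat index only maps back to the right pixel, and so the right place on the sphere, when the kernel is told the true image dimensions.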

Given a goodly dose of Kineme Audio Input & variegating the meshes, mixing meshes & structures, an incredible result ensues.

This has definitely helped point me in a refreshed direction.

I really like making imagery with OpenCL and am still finding out just how amazing some of the facilities for sampling within an image are; I'm looking forward to getting this setup to "play" nicely with some dynamic, abstract imagery.

Post Script

I have found that bringing in some of my dynamic image kernels has forced me to re-examine how and why several of those kernels are constructed: within this particular construct you posted some time ago, I find only one of four kernels produces imagery in a consistent and regular fashion.

For some reason currently obscure to me, the other kernels that seemed to be working AOK in other constructs and making imagery with ease seem destined by default to devolve down to a little grey blip even when they are actually receiving shed loads of "stimulus".

It isn't actually the fault of your posted routine and skipping the CI Macro doesn't make a bit of difference where needed by me :-). Still have 3 half on / half off examples.

In addition, this type of construct, it seems, is best suited to processing a static image [size here is less of an issue, except for a speed hit], texturing to a dynamically animated sphere mesh.

However, if a video, movie or stable dynamic graphic kernel is sized sensibly down prior to committing to the CI or even going straight to the CL pipeline point, then we have a far happier result.
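
"Sized sensibly down" can be made concrete with a small C sketch. The function name and the idea of a fixed vertex budget are assumptions for illustration; the right budget for a given GPU would have to be found by experiment.

```c
#include <assert.h>
#include <stddef.h>

/* Halve both dimensions until width * height fits within max_vertices
   (one vertex per pixel). Halving keeps the aspect ratio roughly intact. */
static void fit_to_vertex_budget(size_t *width, size_t *height, size_t max_vertices)
{
    assert(width != NULL && height != NULL && max_vertices > 0);
    while (*width * *height > max_vertices && *width > 1 && *height > 1) {
        *width /= 2;
        *height /= 2;
    }
}
```

For example, with a hypothetical budget of 307,200 vertices a 1920x1080 frame would be halved twice to 480x270, while a 640x480 frame would pass through untouched.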

Which leaves me to seriously re-examine those other dynamic image kernels.