NBody Simulations

dust's picture

So here are some screenshots of my N-body simulation benchmarks. It seems the Apple software renderer really boosts my gigaflops. I think I got over 30 gigaflops with the Apple software renderer and the sim running on a GeForce 9600.

I would love to see other people's benchmarks in comparison. I found the internal PCI LCD seemed to perform better than the external PCI, but there really isn't much difference between the 512 and 256, as we have been discussing earlier. There seems to be an 8 gigaflop difference between the two cards with the Apple software renderer.

cwright's picture
Re: NBody Simulations

Hey dust, where do you find this benchmark?

(shocking that GPU sim -> GPU rendering is slower than GPU sim -> CPU rendering... wonder who dropped the ball on that one? Most of the nvidia performance quirks from Leopard seem to be fixed (finally), which is nice....)

dust's picture
Re: NBody Simulations

Here, run this program. If you want to check out the render file, it's in the OpenCL documentation. I can't upload the source because of file size restrictions; maybe it will work compressed. Let me know. I thought it was shocking that I was getting more gigaflops on a software render setting. Maybe I'm wrong, but I'm getting the highest flop rate with the Apple software renderer and the simulation running on my graphics card. The max I have hit is 32 gigaflops. There could be some tweakage. This also shows that with the Apple software renderer and the sim on a single core it is possible to get maybe 7-10 gigaflops, which leads me to believe that OpenCL can be used without a graphics card. But like I said, you would know more about that stuff. Here are the directions.

OpenCL Parallel Reduction Example

===========================================================================
DESCRIPTION:

This example performs an NBody simulation which calculates a gravity field and corresponding velocity and acceleration contributions accumulated by each body in the system from every other body. This example also shows how to mitigate computation between all available devices including CPU and GPU devices, as well as a hybrid combination of both, using separate threads for each simulator.

Click on the corresponding buttons to select the active simulation and render devices, and/or press the following keyboard shortcuts:

1-6: Select an N-Body System Configuration
g: Select the next Graphics Device
s: Select the next Simulation Device
r: Enable/Disable Auto Rotation
d: Show/Hide Dock UI
h: Show/Hide HUD UI
u: Show/Hide Simulation Updates Meter
f: Show/Hide FPS Meter
space: Pause/Unpause Simulation

Note that the .cl compute kernel file(s) are loaded and compiled at runtime. The example source assumes that these files are in the same path as the built executable.

Attachment: Galaxies.zip (5.38 MB)

SteveElbows's picture
Re: NBody Simulations

I get about 24 gigaflops on an 8600M with the Apple software renderer. The framerate is terrible in that mode though, due to the software renderer. I get the best visual results using 8600M OpenGL with one of the CPU modes for the OpenCL stuff. I like the idea of using 2 graphics cards, one for the visual output and the other for OpenCL; the MacBook Pros with 2 GPUs seem appealing right now.

I will try it on my Mac Pro later even though my GPU doesn't support OpenCL; the 8 CPU cores might be good.

SteveElbows's picture
Re: NBody Simulations

Some interesting OpenCL benchmarks:

http://www.barefeats.com/opencl.html

I want a GTX285.

cybero's picture
Re: NBody Simulations

I've found one other posted example of this test / benchmark application. It states it is VSYNC_OFF. Is your example VSYNC_OFF too, dust?

dust's picture
Re: NBody Simulations

I'm not sure; I just built this app from the developer example and tested some settings. It is interesting that the app shows how you can do a software render to speed things up. I have not got there yet, I'm still working on OpenCL in QC, but I assume the vsync is off. Isn't vsync for older CRT monitors? How would one go about turning it off and on? I have seen some options to do it programmatically, but in QC is it an options/prefs setting?

I must have vsync on if that site is offering the vsync-off galaxy version. I didn't do anything but build the app. Whatever the default is, that is what my pics are from. I found out later there are controls you can trigger.

SteveElbows's picture
Re: NBody Simulations

The thing I've found with that version is that only simulation 6 works; keys 1 to 5 don't do anything. And I don't know about anyone else, but simulation 6 behaves differently on my GPU: the particles fly out, but on the CPU they attract.

Edited to clarify that it's the vsync-disabled one that only has simulation 6, but both versions have the same issue with my 8600 giving different particle behaviour when running sim 6.

dust's picture
Re: NBody Simulations

I tried hitting the numbers as well; they didn't do anything. Another Apple developer example partially implemented. All the same, this app proves that you can run OpenCL on the CPU.

SteveElbows's picture
Re: NBody Simulations

The numbers 1-6 work for me with the version you provided. It's a version floating around elsewhere, which someone modified to disable vsync, that they don't work on. Maybe it was based on an earlier version of the example, I'm not sure.

Anyway, this demo certainly works quite well on the CPU. I think I got around 60 gigaflops on a 2008 Mac Pro 2.8 GHz, can't quite remember.

Do you get the updates meter if you press u? The visual result is what counts at the end of the day, which is why I was slightly confused by your initial comments about the software renderer. It may free up your GPU to deliver more gigaflops for the OpenCL stuff, but the framerate sucks.

cybero's picture
Re: NBody Simulations

One thing's for sure, your compilation of the application gives us all far more in the way of feedback. I've been spending a fair amount of time pressing Command + Shift + 3.

I have yet to do a thorough comparison of the two versions of the application, but the one linked to via http://www.barefeats.com/opencl.html seems to run slower, fps-wise, than your compilation.

SteveElbows's picture
Re: NBody Simulations

With that said, I am interested in your comparison of the 9600 vs the 9400 used for the OpenCL sim (with the other GPU being used for the OpenGL). If you can get the updates meter working, how do the updates/sec compare?

Plus, I suppose my assumption that on 2-GPU systems you should use one GPU for OpenCL and the other for OpenGL may not hold in every scenario. Isn't there a feature in OpenCL that allows some OpenCL data to pass to OpenGL on the card without having to go back to the rest of the system? Not sure if this demo makes any use of that.

dust's picture
Re: NBody Simulations

Well, hitting U works, and for some reason all the numbers work as well. Last time I was running two monitors; now I'm only running one, and you're right, the frames are much better with OpenGL on the 9600 and a hybrid CPU + GPU sim. I used all the default settings on this, but if you cycle 1-6 you get various results. Here are a bunch of screenshots; I'm not sure what to make of them. Obviously for the kind of work I do a better fps = happy, but from a scientific standpoint maybe a higher flop rate is better. I'm not sure how the fps is calculated. Does a better fps mean more calculations per second even though the flop rate is smaller? Maybe the updates per second is the better measure. With both cards (the 9600 plus the 9400 for the sim) the U meter got buried: it kept bouncing off 60. Apparently 60 is the max, but I could tell it wanted to go higher. Check these shots out.

Attachment: galaxy-U-Settings.zip (7.54 MB)
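On the question of whether a better fps means more calculations: for an all-pairs N-body sim the gigaflop figure would normally follow from simulation updates per second rather than from rendered frames. Here is a rough sketch in C of how such a number can be estimated. The demo's own formula and body count aren't given in this thread, so the function name, the `bodies` input and the cost of about 20 floating-point operations per body-body interaction (a commonly quoted convention) are all assumptions.

```c
/* Estimated GFLOPS for an all-pairs N-body simulation.
 * bodies:                number of bodies in the system (hypothetical input)
 * updates_per_sec:       full simulation steps completed per second
 * flops_per_interaction: assumed cost of one body-body interaction */
double nbody_gflops(double bodies, double updates_per_sec,
                    double flops_per_interaction)
{
    /* Every body interacts with every body, so N * N interactions per step. */
    double interactions_per_update = bodies * bodies;
    return interactions_per_update * flops_per_interaction
           * updates_per_sec * 1e-9;
}
```

With purely illustrative inputs of 16384 bodies, 6 updates per second and 20 flops per interaction, `nbody_gflops(16384, 6, 20)` comes out at roughly 32.2. Under this model the flop rate scales directly with the updates meter, so a slow renderer only lowers the gigaflops if it holds up the simulation steps.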

SteveElbows's picture
Re: NBody Simulations

Ta for the info. Regarding the 60 fps limit, that's where the vsync-disabled version comes into play: it won't hit that brick wall, but as mentioned earlier it has other issues. What we really need is the vsync disabling in the source code of the more complete version.