GL Tools Feature Request

toneburst's picture

How about the ability to use 2-dimensional input arrays for the Line/Triangle/Quad Structure patches?

This way, you could have the patch generate several discrete lines, for example (I mean a number of unconnected multi-point lines). Or, you could make multiple unconnected quad strips without the need to create multiple patches. I know it's possible to do effectively the same thing by creating overlapping coordinates, but it would be much easier to simply feed the patch a number of structures/arrays, each of which would create one mesh/line.

I know we're edging towards the much-discussed 'spreads' functionality here, but I still think it would be worth doing, if it's relatively easy to set up.

What d'ya think, guys?

a|x http://machinesdontcare.wordpress.com

usefuldesign.au's picture
Re: GL Tools Feature Request

Yes, sounds reasonable.

Kinda unrelated, but I'm wondering why the line family patches (APPL and Kineme) are so much faster than feeding a structure to Kineme GL Line Patch (which can pair co-ordinates into multiple discrete lines but not multi-point-lines). For an unchanging set of lines, the fps on GL line family swamps GL Line patch on my old set up.

Unfortunately the Line Family patch delivers the lines in ways that don't always go near where you want them.

Was wondering what the "spreads" are you're referring to, Toneburst?

toneburst's picture
Re: GL Tools Feature Request

Quote:
Yes, sounds reasonable.

Kinda unrelated, but I'm wondering why the line family patches (APPL and Kineme) are so much faster than feeding a structure to Kineme GL Line Patch (which can pair co-ordinates into multiple discrete lines but not multi-point-lines). For an unchanging set of lines, the fps on GL line family swamps GL Line patch on my old set up.

I'm guessing it's as much to do with slow structure handling in QC as anything else, since I imagine both options make the same OpenGL calls behind the scenes. cwright?

Quote:
Was wondering what the "spreads" are you're referring to, Toneburst?

It's something there was a lot of discussion of on the Kineme site a while back. It's a concept used a lot (and to great effect) in the Windows-only modular video application vvvv. The idea is that you feed a patch (or node, in vvvv-speak) a single value, and it will create a single instance of the node. Feed it instead an array (or spread) of values, and it will create an instance of the node for each value in the array. It's essentially an alternative way of making lots of copies of things, but is more elegant than QC's Iterator patch, in many ways. And faster too, I assume...

a|x

gtoledo3's picture
Re: GL Tools Feature Request

I agree... Structures are to be avoided, as is directory scanning and extrapolating to structure. However, once you are in structure land, it makes no difference how you sort. The fps counter and performance inspector confirm that. When something is hard wired you only tend to see the performance hit once. The iterator blows because it is constantly evaluating. When you use spreads it seems that the hit is really minimal. Loading structure that is dynamic is iffy. Things that have a struct set in stone are always faster...

cwright's picture
Re: GL Tools Feature Request

usefuldesign.au wrote:
Kinda unrelated, but I'm wondering why the line family patches (APPL and Kineme) are so much faster than feeding a structure to Kineme GL Line Patch.

Line family just reads some end points, and generates the intermediates itself. Structure patches have to poll the structures at every point, and structure lookups are painfully slow (several message-sends, sometimes a key/value search for each item (X, Y, Z, U, V, R, G, B) -- it all adds up very quickly).

Spreads are kinda like C arrays (orders of magnitude smaller/faster than QCStructures). They don't exist in QC, but we're not developing them till we see what Snow Leopard does when it hits the scene.

cwright's picture
Re: GL Tools Feature Request

toneburst wrote:
How about the ability to use 2-dimensional input arrays for the Line/Triangle/Quad Structure patches?

This way, you could have the patch generate several discrete lines, for example. Or, you could make multiple unconnected quad strips without the need to create multiple patches.

The "Line Type" input and "Quad Type" input will switch between line strips and line segments, and quad strips/segments. I'm pretty sure you know this already though (but want finer grained control of them).

Multi-level structure support is possible, but it adds an additional performance hit (having to interrogate more structure elements), and makes the logic more convoluted (what happens if you mix strip segments, and then raw points at the top level...)

  • root
    • quad strip
    • quad strip
    • point1 (current setup)
    • point2 (current setup)
    • point3 (current setup)
    • point4 (current setup)
    • quad strip

If your drawing is that sophisticated, a custom patch is probably a much better way to deal with that than trying to hack it in currently (until JS picks up the pace, and structures are more nimble, using them for this is like using Image Pixel in an iterator -- theoretically correct, but practically wrong).

gtoledo3's picture
Re: GL Tools Feature Request

I recently did an install of VVVV (running with VMware Fusion), and I LIKE it. I think that the spread system has tremendous strengths vs QC, and I've been seeing that recently, because of specific things I'm doing.

The concept of spreads and "boygrouping" is fascinating. It seems tremendously more efficient than QC at certain things. I am a little nonplussed at structure handling in QC, the more I start to consider it. The licensing scenario strikes me as a bit odd... haven't delved into it enough, and I'm not a big Windows guy. Using it with Fusion is funny because it almost looks like it's running in QC with that window interface!

I also find it a bit interesting that it's xml based.

Have you messed with VVVV any Chris? I remember you did that noodle thing that looked like VVVV, so I'm thinking so?

Do you find that there is a hit from simple structure sorting, or is it from actual manipulation of the structure values, and repacking? It seems like it is the manipulation of values that is the hit. I took a structure and did a test where I unpacked/repacked (I think I'm using the right terminology on that), and I wasn't seeing a hit, but maybe my methodology wasn't valid. However, when I tweak structure and then put it back in a structure, it doesn't seem too efficient.

I like how the shader system works, and the concept that it updates when you change something... it doesn't wait, ala "lazy evaluation". I think that the lazy evaluation thing makes certain seemingly reasonable ideas a bit hairy.

cwright's picture
Re: GL Tools Feature Request

gtoledo3 wrote:
Have you messed with VVVV any Chris? I remember you did that noodle thing that looked like VVVV, so I'm thinking so?

About 15 seconds worth. I really try to avoid windows at all costs.

gtoledo3 wrote:
Do you find that there is a hit from simple structure sorting, or is it from actual manipulation of the structure values, and repacking? It seems like it is the manipulation of values that is the hit. I took a structure and did a test where I unpacked/repacked (I think I'm using the right terminology on that), and I wasn't seeing a hit, but maybe my methodology wasn't valid. However, when I tweak structure and then put it back in a structure, it doesn't seem too efficient.

It's nothing to do with sorting/"repacking" etc (I don't know what your terms mean in these contexts). It's all about the overhead.

In C, an array is a contiguous block of bytes. You address its elements using a base pointer and an offset (an index). So to compute where to look for a piece of data, the compiler does this: "BaseValue + offset * sizeof data_element", where sizeof data_element is usually known in advance, so it's simply built into the offset (premultiplied). If you're accessing elements contiguously, you can just take your previous address and add the sizeof value (a constant), for very fast, very cheap data access. This boils down to a single addition (and a single memory fetch), which takes about 1 ns or less to execute (most of the time spent on this is waiting for the RAM chips to warm up and reply to you, or the L2 cache if the data's already cached, which is often). RAM delays can be up to a couple hundred ns; cache delays are typically a dozen ns or so. The CPU also starts the access as far in advance as possible (before it actually needs the data), so the memory/cache delay is usually hidden. There are programmer tricks (__builtin_prefetch) to help guide this process, but the Core2 architecture is actually very good at doing this automatically (Intel's engineers did a fantastic job on automatic prefetching).
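A minimal sketch of that address arithmetic in plain C, assuming an array of float. The function names (element_address, sum_contiguous) are made up for illustration and are not part of any QC or Kineme API.

```c
#include <stddef.h>

/* Address of element `index`, spelled out the way the compiler computes it:
   base address + index * sizeof(element). Equivalent to &base[index]. */
const float *element_address(const float *base, size_t index)
{
    const unsigned char *bytes = (const unsigned char *)base;
    return (const float *)(bytes + index * sizeof(float));
}

/* Contiguous walk: each step only adds a constant (sizeof(float))
   to the previous address, then does one memory fetch. */
float sum_contiguous(const float *base, size_t count)
{
    float total = 0.0f;
    const float *p = base;
    for (size_t i = 0; i < count; ++i) {
        total += *p;   /* single fetch */
        ++p;           /* single addition: advances by sizeof(float) bytes */
    }
    return total;
}
```

element_address(v, 2) yields the same pointer as &v[2]; the explicit byte arithmetic is only there to make the "base + offset * size" step visible.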

QCStructures contain a GFList object (so we've got 2 objects). A GFList contains 2 C arrays of pointers to ids (generic ObjC objects), _keys and _values.

So, when you want to access an item from a QCStructure, you say "QCStructure element at index _blah" (similar to C array[index], conceptually). However, whereas C arrays have a cheap single-addition overhead, here's what QCStructure does:

  • get reference to GFList object (1 message send)
  • get list's _values array (1 message send)
  • check bounds (a couple ns)
  • calculate offset (base+offset C array stuff, 1-4ns)
  • return indexed object.

Also, you can't store raw integers/floats in the array (they're ids, not void *'s), so you have to wrap everything in NSNumber or NSValue objects. So to access the returned indexed object, you then have to do [returnedObject pointerValue] or [returnedObject integerValue] -- these incur a single message send (or more), and a possible type-check/conversion (it has to compare, even if it doesn't convert, to know if it should or not).

So after all that, you've got 4+ message sends (at ~12-24ns apiece), some address calculation (1-4ns), some object value extraction (2-12ns), for a grand total of 15-40ns, not to mention all the extra cache bandwidth (using 4x more stack at least, doing several extra loads/stores), and you see how QCStructure is ~15x slower than a C array Best Case (and if you search by key, it has to pull the _keys array, search it, then do the address calculation into the _values array, making it 20-70x slower or more). So it's the fundamental design of QCStructures (currently) that makes them slow -- nothing about sorting/searching/repacking (those are all pretty simple, since they operate just 1 message-send above the C-array level, on the GFList arrays). But when anything actually needs to access that data, the structure cost is incurred (so synthetic benchmarks that don't do anything with the structure data appear cheap, since they're not factoring in the actual interface access costs).
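The chain of hops described above can be modelled in plain C. This is only an analogy: Box, List and Structure are hypothetical stand-ins for NSNumber, GFList and QCStructure, and ordinary function calls stand in for Objective-C message sends, so it shows the shape of the lookup rather than its real cost.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; NOT Quartz Composer's real types. */
typedef enum { BOX_INT, BOX_DOUBLE } BoxType;

typedef struct {
    BoxType type;
    union { long i; double d; } as;
} Box;                      /* plays the role of an NSNumber wrapper */

typedef struct {
    const char **keys;      /* parallel arrays, like _keys and _values */
    Box **values;
    size_t count;
} List;                     /* plays the role of GFList */

typedef struct {
    List *list;
} Structure;                /* plays the role of QCStructure */

/* Unboxing: a type check, then a possible conversion. */
double box_double_value(const Box *box)
{
    return box->type == BOX_DOUBLE ? box->as.d : (double)box->as.i;
}

/* Indexed access: structure -> list -> values array -> box -> raw value. */
int structure_get_index(const Structure *s, size_t index, double *out)
{
    const List *list = s->list;             /* hop 1: fetch the list */
    if (index >= list->count) return 0;     /* bounds check */
    const Box *box = list->values[index];   /* hop 2: base + offset */
    *out = box_double_value(box);           /* hop 3: unbox */
    return 1;
}

/* Keyed access: search the keys first, then do the indexed access. */
int structure_get_key(const Structure *s, const char *key, double *out)
{
    const List *list = s->list;
    for (size_t i = 0; i < list->count; ++i) {
        if (strcmp(list->keys[i], key) == 0)
            return structure_get_index(s, i, out);
    }
    return 0;
}
```

Compared with a raw array[index], every read here passes through two extra pointers and an unboxing step, and the keyed path adds a string search on top.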

gtoledo3 wrote:
I like how the shader system works, and the concept that it updates when you change something... it doesn't wait, ala "lazy evaluation". I think that the lazy evaluation thing makes certain seemingly reasonable ideas a bit hairy.

Lazy-evaluation is "optimal" for render-driven processing (what QC is) -- top-down (non-lazy) is better for supply-driven processing (audio/video filtering, or anything where external "events" push through a bunch of filter-like things that may or may not produce output). I personally prefer lazy-eval (though I admit it's very weird to think that way at first) for 90% of what QC does, as it makes more sense. The other 10% makes no sense at all in a lazy-eval system though, which is a shame. So the engineering tradeoff in this scenario is "do top-down, and waste a ton of processing power on stuff that might not matter", or "waste as little power as possible, but make some things unsightly" -- almost every other tool chooses option 1, so I like the diversity of having option 2 realized in a working system like QC.
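A toy sketch of that tradeoff in C, assuming a single costly processing stage. Chain, push_input, lazy_input and lazy_render are invented names for illustration; this is not how QC schedules patches internally.

```c
/* Hypothetical one-stage processing chain, for illustration only. */
typedef struct {
    double latest_input;   /* most recent value fed in */
    double result;         /* last computed output */
    unsigned work_done;    /* how many times the costly stage ran */
} Chain;

/* Stand-in for an expensive patch: doubles its input and counts the call. */
double expensive_stage(Chain *c, double v)
{
    c->work_done++;
    return v * 2.0;
}

/* Top-down (supply-driven): every incoming value is processed immediately. */
void push_input(Chain *c, double v)
{
    c->latest_input = v;
    c->result = expensive_stage(c, v);
}

/* Lazy (render-driven): an incoming value is only recorded... */
void lazy_input(Chain *c, double v)
{
    c->latest_input = v;
}

/* ...and the costly stage runs only when a frame asks for output. */
double lazy_render(Chain *c)
{
    c->result = expensive_stage(c, c->latest_input);
    return c->result;
}
```

Feeding ten values and then rendering one frame runs the stage ten times in the push version and once in the lazy version, with the same final output. The flip side is that nothing runs in the lazy version unless something downstream asks for it.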

gtoledo3's picture
Re: GL Tools Feature Request

I'm curious about other people's observations and if they are consistent with mine on structure based things...

It seems like in QC you can have a structure loaded, change one element, and it has to evaluate the whole structure again, instead of the one structure member. This makes sense, given the way that QC works. However, if you are drawing from an xml file (using the xml loader plugin), or something like directory scanner, it doesn't actually update the info from the original source. So, it's as if you get all of the negatives, without any upside.

It would seem as though with a true "lazy" evaluation, it would go all the way back to the front of the chain, and truly "reload" the structure.

For instance, if I put a bunch of jpegs in a folder and load them with a scanner, add a new one, cut the feed to the renderer via a multiplexer, and then pop it on (while in runtime), it won't load a new image source that has been put in the folder. I believe that it won't even if you turn off the consumer patch (have to double check that). What you actually need to do is turn off the renderer, and turn it on again. That doesn't seem quite right.

It also seems wrong that pink patches that aren't renderers will cause numbers to evaluate, even when something isn't rendering. I understand WHY, but it's not friendly!

It's amazing that something like what toneburst is talking about, would be so unruly! It seems kind of like a cakewalk in some ways. Is this because at the underpinnings of it, QC wasn't designed to be much of a "generator"? It seems that this all carries over from the paradigm of CI, but that it doesn't quite work with what most people want to do with QC.

gtoledo3's picture
Re: GL Tools Feature Request

ahhhhhh! Great post Chris. You really broke this down for me. I wrote what will be "below" this post while you were writing this.

Yep. That 10% would be great if it could be handled differently when putting certain types of patch combos together. I obviously agree about your take on lazy evaluation, and that it is optimal for most scenarios. It's a no brainer that it helps and is an extremely beneficial setup for most things.

What I meant about the thing with the lines and "repacking" (I know that's not the right word), is that if you have a structure in QC, it seems like you can re-order it via various methods (kineme structure maker, javascript, the structure patches) and not get a hit as long as everything is in a stand still mode. So, it would seem that setting up a bunch of points wouldn't be too taxing, it would just be doing something to change the structure/manipulate that is costly... however, your post clarified things a bit for me.

It's like, if I load something with Kineme 3D, or pull a structure from XML or whatever, I just see the hit once. I can take the structure patches/ javascript, or the kineme patch, re-shuffle, and it doesn't seem to be a big deal. When I setup something like a picture loader, even though the structure is static once it's loaded, it seems to constantly be "churning"... but your post totally breaks down how things work, and has given me a bit to chew on. I think I'm a little tired too! :o)

franz's picture
Re: GL Tools Feature Request

Thanks for that detailed explanation, Chris. For these extra 10%, I usually use a DUMMY renderer to get my chain ALWAYS evaluating. Plug a structure count into some out-of-view Sprite's rotation (for instance), and the structure will be evaluated constantly.

cwright's picture
Re: GL Tools Feature Request

When I need an always-eval'd branch, I tie it to the enable port of an iterator with a count of zero -- this might be a tiny bit cheaper than a sprite?

(sprites do some texture work/configuration, even if no textures are attached. It's next-to-nothing, but every little bit helps, right?)

franz's picture
Re: GL Tools Feature Request

right ! every ns counts... thanks for the tip !

usefuldesign.au's picture
Re: GL Tools Feature Request

Thanks for this post, Chris. We are (well, I am at least) really fortunate to have somebody with the programming chops to shed light on these mysteries.

Sounds like C arrays are something to look forward to in Snow Leopard or beyond. I gather they are fixed byte length per array element ie. Float, Int, Text256Byte etc. whereas QC structures are completely open which is often an unnecessary luxury in an animation context.

Quote:
QCStructures contain a GFList object (so we've got 2 objects). A GFList contains 2 C arrays of pointers to ids (generic ObjC objects), _keys and _values...
...
Also, you can't store raw integers/floats in the array (they're ids, not void *'s), so you have to wrap everything in NSNumber or NSValue objects.

Does this mean QC structures are using memory handles on top of pointers (that's about the extent of my C++ conceptualisation I'm afraid)?

May be an academic question (depending on answer) but once a QC structure of floats has been copied to the GPU is it processed as quickly there as any other float data, say the computational Line Family Point Data into start and end points of segments? If so sending over quite a few sets and switching b/w them at renderer could be a way to manifest speed gains.

For that matter is JS Patch all CPU based or somewhat GPU'ish?

Guess Snow Leopard stands to add tools to the set, seeing as OpenCL is a fanfare feature promise. Will QC be the beneficiary? (Rhetorical. NDA assumed) Time to ditch my venerable G5 if it is!

One last question or two about Arrays in JS, Is it possible to do matrix maths on Arrays in JS patch eg myNewArray = originalArray * transformationArray or must one iterate?

I think pre-processing large float data sets into multiple useful subsets for animation multiplexing is the goal I've had in mind for some time now, and this discussion seems to confirm it at least until 10.7. I can guess a little C programming for the pre-processing of data; how difficult is the Xcode environment for a non-modern programmer to get going in?

usefuldesign.au's picture
Re: GL Tools Feature Request

I guess if creating programmers GUI elements was the main reason for bringing QC (whatever it was called before QC) over to OS X it's a pretty sweet tool. Then like all 'disruptive' technologies a bunch of under-employed hacks (me and by choice, not you guys!) start saying "Cheap tool that's cool, I can do this and that with it..." until it's pushed to the limits and beyond intended use.

Moore's Law comes to the rescue and before you know it you have a Flash killer for free (sorta) and a technology that enables all kinds of undesired applications that become desirable with more Hardware power and consumer interest.

On structures: doesn't XML patch cache the file until the Update signal is True (although I usually have it ticked which contradicts what I just said, duh)?

gtoledo3's picture
Re: GL Tools Feature Request

I feel like I sidetracked the conversation by my structure wha-who-zits comments.

This is definitely a cool feature request. The idea of setting something up like that is desirable.

It will be interesting to see how OpenCL really works with the next OS, and if that impacts the way that structures are handled, or the current QC paradigm of the way that structure is handled... or if the thought is simply that the GPU will be able to pick up slack, and nothing really changes.

(Interesting comment about plugging into an iterator. I have done the sprite thing way in the past. My usual habit to get evaluation going is to plug into a 3D transform.)

usefuldesign.au's picture
Re: GL Tools Feature Request

Thanks for the explanation, tb.

toneburst's picture
Re: GL Tools Feature Request

Thanks as ever for the excellent explanation, cwright.

a|x

cwright's picture
Re: GL Tools Feature Request

usefuldesign.au wrote:
Sounds like C arrays are something to look forward to in Snow Leopard or beyond. I gather they are fixed byte length per array element ie. Float, Int, Text256Byte etc. whereas QC structures are completely open which is often an unnecessary luxury in an animation context.

I don't think they're fixed length. Rather, they're just slabs of contiguous bytes, and you interpret them however you want (I'm guessing 32-bit per element, for integers/floats will be the most common/optimized, with 8-bit per element and 64-bit per element being the runners up).

usefuldesign.au wrote:
Does this mean QC structures are using memory handles on top of pointers (that's about the extent of my C++ conceptualisation I'm afraid)?

Please don't ever say "Handle" ever again. (I'm not angry/upset, I just find Handle to be an unnecessarily vague term in programming. Win32 uses them all over, and ancient Mac OS (pre-X) stuff used them a bit too. Most unhelpful.) QCStructures are objects (a pointer + some smallish chunk of memory) that contain other objects/data, one of which is a GFList (another object), which in turn contains 2 C-Arrays (keys, and values). So it's triply-indirect, but with well-defined types all the way down. "Handles" are typically used in place of "Opaque objects" (objects/structures you can't look inside, like almost all of CoreImage), or in place of "generics" (void*, id) -- both of the latter terms are more informative than the vague Handle term.

But in plain C, a triply indirect pointer (unsigned int ***i) is still relatively cheap (3-6x as expensive as a singly-indirect pointer, which is actually just an array). In this setup, two of those layers (QCStructure, GFList) are ObjC indirection, so they're a bit more expensive.
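For reference, this is what the plain-C case looks like: a small sketch with hypothetical function names, showing that a triply indirect load is three dependent memory fetches where a singly indirect load is one.

```c
/* Follow three pointer hops to reach the value. Each hop is a memory
   fetch that must complete before the next one can start. */
unsigned int load_triply_indirect(unsigned int ***ppp)
{
    unsigned int **pp = *ppp;   /* hop 1 */
    unsigned int *p = *pp;      /* hop 2 */
    return *p;                  /* hop 3 */
}

/* The singly indirect case: one fetch through one pointer. */
unsigned int load_singly_indirect(const unsigned int *p)
{
    return *p;
}
```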

usefuldesign.au wrote:
May be an academic question (depending on answer) but once a QC structure of floats has been copied to the GPU is it processed as quickly there as any other float data, say the computational Line Family Point Data into start and end points of segments? If so sending over quite a few sets and switching b/w them at renderer could be a way to manifest speed gains.

For that matter is JS Patch all CPU based or somewhat GPU'ish?

Very little non-graphical processing happens on the GPU right now (in fact, I'd say it's essentially none at all). JS has absolutely no connection with the GPU, and QCStructures are not at all currently GPU-backed (stored in vram/operated on in gpu).

Ever since people heard of GPGPU, and OpenCL, there's been a huge "is it GPU?" push for all kinds of silly stuff (one of the larger time-consuming projects I'm working on has a bunch of sprite work, and the project owner occasionally asks me if I'm processing sprite position stuff "on the GPU" -- since they're not moving most of the time, and not moving in any uniform way (they're all conditional, which GPU sucks at), wasting any time making it work on the GPU is stupid, and it's such a small simple dataset that there's honestly no benefit from it -- less than 15,000 data elements, and you're Guaranteed to lose if you insist on GPU due to upload/download overhead + data swizzling). If QCStructures were "raw C structures", you could deal with literally hundred-thousand element structures on the CPU (single Core even) and not break a sweat. The overhead is seriously that dramatic. To get any GPU benefit (currently, in 2009), your datasets need to be non-trivial (>16k elements or more), and need to have rather expensive maths done on each element, and all the maths for each element need to be exactly the same. If all those conditions are met, you've got a good GPU candidate, but if any of them is false, doing GPU work isn't a win.

usefuldesign.au wrote:
One last question or two about Arrays in JS, Is it possible to do matrix maths on Arrays in JS patch eg myNewArray = originalArray * transformationArray or must one iterate?

JS has no notion of matrix math, so you have to do that yourself.
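"Doing it yourself" means writing the loops explicitly. A minimal sketch in C, assuming a row-major 3x3 matrix and a flat x,y,z point array; transform_points is a made-up name, and the same nested-loop structure would apply in a JavaScript patch.

```c
#include <stddef.h>

/* Apply a row-major 3x3 matrix `m` (9 floats) to `count` 3D points.
   `in` and `out` each hold count * 3 floats (x, y, z per point)
   and must not overlap. */
void transform_points(const float *m, const float *in, float *out, size_t count)
{
    for (size_t p = 0; p < count; ++p) {
        for (size_t row = 0; row < 3; ++row) {
            float acc = 0.0f;
            for (size_t col = 0; col < 3; ++col)
                acc += m[row * 3 + col] * in[p * 3 + col];
            out[p * 3 + row] = acc;
        }
    }
}
```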

usefuldesign.au wrote:
I think pre-processing large float data sets into multiple useful subsets for animation multiplexing is the goal I've had in mind for some time now, and this discussion seems to confirm it at least until 10.7. I can guess a little C programming for the pre-processing of data; how difficult is the Xcode environment for a non-modern programmer to get going in?

I don't know any standard units of measurement on difficulty, nor a base metric for "non-programmer" -- If you're familiar with JS and other scripting languages, it shouldn't be too difficult, but it could be almost impossibly difficult if your brain isn't wired to think "like a programmer" (similar to me trying to write music or paint pictures... my brain simply isn't wired to do that, so it sucks and I'll probably never learn to do it better).

toneburst's picture
Re: GL Tools Feature Request

cwright wrote:
Multi-level structure support is possible, but it adds an additional performance hit (having to interrogate more structure elements), and makes the logic more convoluted

Yes, but to draw a mixture of joined-up and discrete meshes/points using the current setup also requires either multiple instances of the patches (which would tend to necessitate the use of an Iterator patch, with all the overhead that implies, if you wanted a dynamic number of meshes), or some much more complicated code to create the meshes in the first place. For example, to create several discrete lines, you'd currently need to use Line Segment rendering-mode, and make twice as many points, and overlap start and end points. Much more complicated to set up, and quite inefficient, since you're using many more points. With Triangles and Quads, the problem would be compounded.
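The point-doubling workaround described here can be sketched in C. This assumes points are stored as flat x,y,z floats; the function names are hypothetical and not part of GL Tools.

```c
#include <stddef.h>
#include <string.h>

/* Points needed in segment mode for one multi-point line of n points:
   each of the n - 1 segments carries its own start and end point. */
size_t segment_point_count(size_t n)
{
    return n < 2 ? 0 : 2 * (n - 1);
}

/* Expand one multi-point line (n points, 3 floats each) into segment
   pairs. `out` must have room for segment_point_count(n) * 3 floats.
   Returns the number of points written. */
size_t polyline_to_segments(const float *line, size_t n, float *out)
{
    size_t written = 0;
    for (size_t i = 0; i + 1 < n; ++i) {
        /* segment start: point i */
        memcpy(out + 3 * written, line + 3 * i, 3 * sizeof(float));
        /* segment end: point i + 1 (repeated as the next segment's start) */
        memcpy(out + 3 * (written + 1), line + 3 * (i + 1), 3 * sizeof(float));
        written += 2;
    }
    return written;
}
```

A three-point line A, B, C becomes A, B, B, C. Several discrete lines can be written one after another into the same output buffer, since segment mode never joins the end of one pair to the start of the next.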

Quote:
...(what happens if you mix strip segments, and then raw points at the top level...)

I'd guess the same as now: if it's a Line Structure patch, you get a line, if the structure contains 2 or more points, if it's a Triangle Strip, you get triangles if the structure contains a multiple of 3 points etc.

Quote:
If your drawing is that sophisticated, a custom patch is probably a much better way to deal with that than trying to hack it in currently (until JS picks up the pace, and structures are more nimble, using them for this is like using Image Pixel in an iterator -- theoretically correct, but practically wrong).

I did a nice composition with an Image Pixel patch inside an Iterator once, actually. It was slow though....

I think speed is important (obviously), and it would be great to be able to make multi-million-vert meshes/particle systems, but there's a lot you can do with just a few hundred points, before structure-related slowness really becomes an issue...

a|x

cwright's picture
Re: GL Tools Feature Request

toneburst wrote:
Yes, but to draw a mixture of joined-up and discrete meshes/points using the current setup also requires either multiple instances of the patches (which would tend to necessitate the use of an Iterator patch, with all the overhead that implies, if you wanted a dynamic number of meshes), or some much more complicated code to create the meshes in the first place. For example, to create several discrete lines, you'd currently need to use Line Segment rendering-mode, and make twice as many points, and overlap start and end points. Much more complicated to set up, and quite inefficient, since you're using many more points. With Triangles and Quads, the problem would be compounded.

I agree entirely with you - my current stance on this though is that 1) fixing this using existing tools (QCStructure and friends) is a no-go (because there's no possible way to get significant amounts of mesh data at a reasonable framerate -- you can load static geometry using XML, which is what I spent a year tuning Kineme3D to do much more gracefully, and you can use JS, which has bad performance; even your fantastic Random Walk composition was crippled by JS, not by any actual rendering load), and 2) the playing field can change radically when SL arrives, and I don't feel like burning lots of effort when I'm quite certain there will be dramatically superior, Apple-approved ways of solving this in the very near future.

If you don't believe me that JS is the slow poke in RandomWalk, I'll show 2 pieces of evidence:

All drawing disabled, still only getting 20 fps. Drawing enabled, PI shows JS consuming 75% of the frame time.

(I'm pretty sure you know this, but to make my point: I stand by my claim that there's a complete and utter lack of ways to generate significant amounts of data.)

toneburst wrote:
I'd guess the same as now: if it's a Line Structure patch, you get a line, if the structure contains 2 or more points, if it's a Triangle Strip, you get triangles if the structure contains a multiple of 3 points etc.

However, adding this makes it uniformly 12.5-25% slower (because of the additional interrogation to see if an item in the structure is a substructure or a point, and keeping track of point state. Also, what happens in triangle mode, with 2 root-level verts, and then a substructure? If we support single-deep structures, why not doubly deep? Why not arbitrarily deep? Code-wise it's not difficult, but performance-wise it's Really Bad). I generally don't like to make everything measurably slower across the board without adding something to show for it, and sadly I don't know that sub-structure strips are that feature. (People can vote with their comments if they disagree, and I'm fully prepared to reverse my stance on this if many noisy people would like this addition.)

toneburst wrote:
I did a nice composition with an Image Pixel patch inside an Iterator once, actually. It was slow though....

That's what I meant by "practically wrong" -- it's functional and correct, but not at all practical for real-world usage.

toneburst wrote:
I think speed is important (obviously), and it would be great to be able to make multi-million-vert meshes/particle systems, but there's a lot you can do with just a few hundred points, before structure-related slowness really becomes an issue...

I agree, to some degree. Adding degenerate points right now works around the multi-patch needs, and won't be significantly slower (in fact, I'd almost expect a tie between degenerate data, and additional structure interrogation), so it's awfully tempting for me to do nothing (in light of all the reasons above, and that there's a functional workaround already that isn't too horrible).

Attachments:
RandomWalkNoDrawing.png (117.36 KB)
RandomWalkPIView.png (87.77 KB)

toneburst's picture
Re: GL Tools Feature Request

Ah, but (there's always a 'but')... that was my first Random Walk example, I think. In this one, it was the JavaScript patch that was storing the structure of points. After that, I decided to use the Queue patch to store the structure data, and things sped up a lot! Having JS store and manipulate largeish structures definitely cripples performance, but that's exactly as-expected, surely.

I take your point(s) though- a 12.5-25% speed hit clearly isn't acceptable.

On a more general point, and I'm thinking about custom QC patches here, what's the best way of generating and manipulating arrays of values in a plugin that will eventually spit out a structure at an output port in QC? C arrays seem to be the way to go generally, speed-wise, but what's the best way to convert them into structures for output (given QC in its current form not having direct support for passing-around other kinds of data)?

Or should I just wait until after Snow Leopard has moved the goalposts again, before I even think about it?

a|x

cwright's picture
Re: GL Tools Feature Request

C arrays are fast and cheap to make, but they're sadly completely incongruent with cocoa -- everything needs to get wrapped in an NSNumber or NSValue (both of which are ridiculously expensive for what they do), and then everything needs to get added to an NSArray or NSDictionary -- NSArray (for index-only lists) is pretty fast for this, but NSDictionary (for keyed structures) is really slow.

(We fight with this issue in Audio tools, where we can get a bunch of samples quickly and cheaply, but converting them into QC's QCStructure (and intermediate NSArray/NSDictionary forms) is embarrassingly expensive. As in, I don't think it's possible to create a dictionary with a million NSNumbers from floats in under a second, even though it's only scooting around a mere 4MB of data -- my quickly-whipped-up test app takes 3.7 seconds to do this, which is unspeakably slow -- just the C array part takes 0.055 seconds, so the remaining 3.645 sec is cocoa suckage.)

If you can wait till SL, I'm sure life will be much better for you. If not, there aren't many workarounds, other than subclassing NSDictionary and cheating (and that has its own pitfalls and caveats).

toneburst's picture
Re: GL Tools Feature Request

Here's another idea:

How about some simple mechanism for 'breaking' the mesh/line, and causing another discrete mesh/line to be drawn? Say, you have a structure of point data

Point 0
 
Point 1
 
Point 2
etc.

would draw a line with 3 points. Then, maybe if the next item was a blank structure (with no values for X, Y and Z), it would 'break' the line, and begin drawing again with the next point.

Or, a better alternative might be to have an optional property for each point which would cause a new mesh/line segment to be created. That way, you're not dealing with multi-dimensional arrays, and the overhead is (presumably) just extra OpenGL drawing calls.

Point 0 XYZ         |
Point 1 XYZ         | Line 1
Point 2 XYZ Break   |
Point 3 XYZ         |
Point 4 XYZ         | Line 2
Point 5 XYZ         |

etc.

Excuse the mangled formatting, but I'm sure you get the point.
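The per-point break property could be consumed along these lines. A sketch in C under stated assumptions: the FlaggedPoint and Strip types and the break_after field are hypothetical, not an existing GL Tools structure layout.

```c
#include <stddef.h>

typedef struct {
    float x, y, z;
    int break_after;   /* hypothetical flag: nonzero ends the current line */
} FlaggedPoint;

typedef struct {
    size_t first;      /* index of the strip's first point */
    size_t count;      /* number of points in the strip */
} Strip;

/* Walk a flat point list once and record where each discrete line starts
   and how long it is. Writes at most max_strips entries into `strips`.
   Returns the number of strips found, which may exceed max_strips. */
size_t split_into_strips(const FlaggedPoint *pts, size_t n,
                         Strip *strips, size_t max_strips)
{
    size_t found = 0;
    size_t first = 0;
    for (size_t i = 0; i < n; ++i) {
        int is_last = (i + 1 == n);
        if (pts[i].break_after || is_last) {
            if (found < max_strips) {
                strips[found].first = first;
                strips[found].count = i - first + 1;
            }
            ++found;
            first = i + 1;   /* next point begins a new line */
        }
    }
    return found;
}
```

Each resulting Strip could then be drawn as its own connected line (for example, one line-strip batch per Strip), so the structure stays one level deep and the extra work is a single flag check per point.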

Just a thought...

a|x

toneburst's picture
Re: GL Tools Feature Request

Incidentally, did you see my other query re. Point/Line Structure patches and GLSL shaders?

http://kineme.net/Release/Beta/GLToolsLinePointStructureGLSLWeirdness

a|x

usefuldesign.au's picture
Re: GL Tools Feature Request

Hi toneburst,

I don't wanna be the smarty pants who tells you something you already know but I think you can already do what you're describing (as cwright and I have already mentioned the "line type" input I'm unsure if we're on same page).

Perhaps I'm missing your point completely. Anyway here is a comp drawing discrete lines (not a strip like used in random walk) that all have start point (0,0,0) and random end points.

Use the keyboard to interact:
3-3D point data
z-down by 10
x-up by 10
↓-pause
↑-step
←-down by 1
→-up by 1

Attachment:
Kineme Line structure.qtz (16.59 KB)

toneburst's picture
Re: GL Tools Feature Request

Hiya,

I'm aware of this. What I'm talking about is an easy way to draw several discrete multi-point lines. It's possible to simulate this by using the Line Segment mode, but you have to make the start-point of each new line the same as the end-point of the previous line to get this to work. This obviously means defining twice as many points, and makes the coding needlessly complicated.

a|x

usefuldesign.au's picture
Re: GL Tools Feature Request

Right, sorry. I had realised what you were talking about yesterday and then forgot it all today when I just read the latest posts :)