White Detector Grid + troubles with LockBufferRepresentationWithPixelFormat

Lango's picture

Hey all

I have nearly finished a custom patch that I am happy to use in my project. However, I've come across two stumbling blocks and am hoping for some guidance from the intelligent people here.

First, what the patch does:

Aim: Detects the intensity of white in different parts of an image

Why: A difference image shows motion in black and white. With this patch you can pick out general areas of movement. I have found other patches that detect movement to be too precise and finicky; this one tries to simplify things.

How: The patch creates a grid of cells, and each cell generates a 'weight' from the amount of white in the corresponding part of the image. This grid is output as an array. The amount of white is currently determined by the intensity level that has the highest pixel count in the red channel.
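
As a rough illustration of the "How" above (not the plug-in's actual code), the per-cell weighting can be sketched in plain C. The function name `grid_weights`, its parameters, and the tie-breaking rule (the lowest intensity wins) are assumptions made for this sketch; the real patch works on the buffer handed over by Quartz Composer and uses vImage for the histogram.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: lays a rows x cols grid over a packed image with
   4 bytes per pixel and stores, for each cell, the red intensity (0-255)
   that occurs most often inside that cell. red_offset is the byte position
   of the red channel within a pixel (1 for ARGB8, 2 for BGRA8). */
static void grid_weights(const uint8_t *pixels, size_t width, size_t height,
                         size_t row_bytes, size_t red_offset,
                         size_t rows, size_t cols, uint8_t *weights)
{
    for (size_t r = 0; r < rows; r++) {
        /* Integer division keeps every cell inside the image bounds. */
        size_t y0 = r * height / rows, y1 = (r + 1) * height / rows;
        for (size_t c = 0; c < cols; c++) {
            size_t x0 = c * width / cols, x1 = (c + 1) * width / cols;
            size_t histogram[256];
            memset(histogram, 0, sizeof histogram);

            /* Count how often each red intensity occurs in this cell. */
            for (size_t y = y0; y < y1; y++) {
                const uint8_t *row = pixels + y * row_bytes;
                for (size_t x = x0; x < x1; x++)
                    histogram[row[x * 4 + red_offset]]++;
            }

            /* The weight is the intensity with the highest pixel count. */
            size_t best = 0;
            for (size_t i = 1; i < 256; i++)
                if (histogram[i] > histogram[best])
                    best = i;
            weights[r * cols + c] = (uint8_t)best;
        }
    }
}
```

The `weights` array must have room for rows * cols entries; it is filled row by row, so cell (r, c) ends up at index r * cols + c.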

Video of it in basic use:

Bugs:

  1. If the Y origin is set to 25% and the height is set to 100%, Quartz Composer crashes with a bad access error. However, this doesn't happen with the same combination of X origin and width. The last time I did memory management was about 3 years ago at uni, so I'm probably doing something really stupid here.

In the attached QC file, set Y Origin (%) to 25 and Height of Grid(%) to 100 to reproduce this bug.

  2. The next one is a method call that causes lag: the lockBufferRepresentationWithPixelFormat call. It takes twice as long as the same call in the HistogramOperation tutorial. I have tracked this down to the fact that in my patch, lockBufferRepresentationWithPixelFormat goes on to call createPixelBufferFromImageBuffer. That call is never made in the HistogramOperation tutorial, and I don't know why. (That was very complicated; the attached profiles.png illustrates what I mean better.)

Any help would be so very much awesome

Cheers

Lango

** EDIT: attached the actual profile files as a zip. One that is the HistogramOperation tutorial and one is from my patch. **

** EDIT 2: Corrected the source attachment **

dust's picture

This looks very interesting. I like to see how people do motion detection. I am afraid that your Xcode project is not entirely zipped: there are no class files or resources bundled inside it, so I am unable to build the required plugin to see what is going on. I'm not sure I could solve your problem anyway, but just thought you might need to include your source if you want people to help. This is the classic error you get with QC when the plugin is not present:

(null) : Patch with name "QCPlugInPatch:White_Detector_GridPlugIn" is missing

Overlay Cannot create node of class "QCPlugInPatch" and identifier "White_Detector_GridPlugIn"

Overlay Cannot create connection from ["outputArray" @ "PlugInPatch_White_Detector_GridPlugIn_1"] to ["inputStructure" @ "StructureCount_1"]

Overlay Cannot create connection from ["output" @ "Splitter_1"] to ["inputImage" @ "PlugInPatch_White_Detector_GridPlugIn_1"]

Overlay Cannot create connection from ["outputArray" @ "PlugInPatch_White_Detector_GridPlugIn_1"] to ["Rows" @ "Iterator_1"]

Overlay Cannot publish input port ["inputNumberOfRows" @ "PlugInPatch_White_Detector_GridPlugIn_1"]

Overlay Cannot publish input port ["inputNumberOfColumns" @ "PlugInPatch_White_Detector_GridPlugIn_1"]

Overlay Cannot publish input port ["inputHeightInPercent" @ "PlugInPatch_White_Detector_GridPlugIn_1"]

Overlay Cannot publish input port ["inputWidthInPercent" @ "PlugInPatch_White_Detector_GridPlugIn_1"]

Overlay Cannot publish input port ["inputYOriginInPercent" @ "PlugInPatch_White_Detector_GridPlugIn_1"]

Overlay Cannot publish input port ["inputXOriginInPercent" @ "PlugInPatch_White_Detector_GridPlugIn_1"]

Macro Patch State restoration failed on node "Patch_2"

(null) State restoration failed on

cybero's picture

Well spotted, dust. I hadn't got around to looking at the project file, but knowing now that it lacks some crucial code, I shall await further information prior to investigating this further.

Thanks Lango for the Shark output. However, I do think that an Instruments dtrace [my preference] might well have been more informative, as would any errors [or caveats], however slight, popping up in Xcode. Did you get a totally clean build in Xcode for the White Detector patch?

Shark's very good for optimizing, and does provide insight into underlying code problems though. Instruments, I find, is just that much quicker, generally useful and it doesn't mind being asked to focus upon one application, although you could point that at everything else. I find Shark demands rather of one's patience, although it is pretty thorough.

I think I'd still need to have the actual code or the problematic segments to better figure out what's vexing your carefully crafted, solid coding.

One thing is perfectly obvious: your Composer process is hogging a whole lot of CPU time, so it is a definite candidate for improvement, and I can understand you choosing Shark.

Another thing that is obvious is that nil CPU time is spent upon LockBufferRepresentationWithPixelFormat, which your patch obviously needs in order to work upon a subregion of the image source using the provided pixel format and color space. I'm sure you've pointed us in the right general direction though.

Have you checked upon the endianness of your code? The Developer samples have got a few problems; see the quartz developer list for most examples of these. Indeed, I often find that if I have a niggling problem with QC & other code stocks, a quick visit to the dev lists often reaps rewards.

#if __BIG_ENDIAN__
    pixelFormat = QCPlugInPixelFormatARGB8;
#else
    pixelFormat = QCPlugInPixelFormatBGRA8;
#endif

Are all the unknowns shown the result of a lack of related code / patch /app in place at our end?

Lango's picture

dust wrote:
... i am afraid that your x code project is not entirely zipped meaning there are no class files or resources bundled or encapsulated inside your x code project therefore i am unable to build the required plugin to see what is going...

Ahhh! What a simple mistake, not sure what I was thinking. How embarrassing. I will update with the correct files this afternoon (in about 8 hours). The profiles have the code embedded in them, but that is a cumbersome way to view the project as a whole.

dust wrote:
...im not sure i could solve your problem anyways...

Any advice would be helpful. I believe the first bug is a memory leak, and I'm just not using release somewhere, or I try to when I shouldn't. I have no idea how to fix the second issue.

Thanks for letting me know I uploaded the incorrect file!

Lango's picture

Hi cybero

For the first bug I was trying to use Instruments with the memory leak template, however I couldn't get much information from it. I think it was my lack of understanding of what I was looking at.

For the second bug/issue, the lag, I was using Shark to see what was taking most of the time. From memory, that was the lockBufferRepresentationWithPixelFormat call, which took 6 seconds of my 30 second profile. But you said it didn't take any time, so I may be wrong. I'm at work on a PC at the moment, so I can't look at the profiles till I get home.

The endianness didn't even occur to me. I thought it wouldn't have been an issue since Apple produces their own CPUs. Wouldn't every make have the same endianness?

Thanks for the help so far! Though I don't think much more can happen till I upload the correct files, which I will do tonight.

Cheers

Lango

Lango's picture

I have updated my post with the correct source and yes, this is a bump, since the edit wasn't picked up by the Recent Topics :)

cwright's picture

Lango wrote:
The endianness didn't even occur to me. I thought it wouldn't have been an issue since Apple produces their own CPUs. Wouldn't every make have the same endianness?

Apple does not produce their own CPUs, and never has -- they've had Motorola and IBM do much of that, and Intel too in more modern times. Using x86 basically requires x86 endian (opposite of PPC endian) due to a number of conventions used in the instruction set. It could be changed, but it would require a zillion tiny changes, and if one was missed, it'd risk causing a kernel panic or a hard system lock.

And there's the problem of rewriting most of the low-level x86 code that's available. And applications that make use of such low level code (out of apple's control). And etching your own silicon (Very Expensive).
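
To make the endianness point concrete: a pixel packed as a 32-bit 0xAARRGGBB word sits in memory as the bytes A, R, G, B on a big-endian CPU (PPC) and as B, G, R, A on a little-endian one (x86). That is presumably why the #if check quoted elsewhere in this thread pairs QCPlugInPixelFormatARGB8 with big-endian and QCPlugInPixelFormatBGRA8 with little-endian. A minimal C sketch; the helper names are made up for illustration and are not part of any Apple API.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host (x86), 0 on a big-endian one (PPC). */
static int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    uint8_t first_byte;
    memcpy(&first_byte, &probe, 1);   /* look at the lowest-addressed byte */
    return first_byte == 1;
}

/* Copies the in-memory byte layout of a packed 0xAARRGGBB pixel into out.
   Little-endian hosts yield B, G, R, A; big-endian hosts yield A, R, G, B. */
static void argb_word_to_bytes(uint32_t argb, uint8_t out[4])
{
    memcpy(out, &argb, 4);
}
```

For the word 0x11223344 (A = 0x11, R = 0x22, G = 0x33, B = 0x44), the first byte in memory is the blue value 0x44 on an Intel Mac and the alpha value 0x11 on a PPC Mac.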

cwright's picture

cybero wrote:
Shark's very good for optimizing, and does provide insight into underlying code problems though. Instruments, I find, is just that much quicker, generally useful and it doesn't mind being asked to focus upon one application, although you could point that at everything else. I find Shark demands rather of one's patience, although it is pretty thorough.

Shark can focus on one app as well. We do that all the time when optimizing our release builds. It just requires some dropdown selection to make it only look at the one app.

Instruments, on the other hand, is very useful (as you've already stated) for finding numerous problems (file IO bottlenecks, memory leaks, etc).

cybero's picture

Well, maybe it's me, but I find that Shark takes a long while doing a thorough 'scan' of all active processes.

Am I simply missing the obvious, that I could have saved myself a lot of time by simply using the drop down menu to force Shark to focus solely upon the process queue of interest?

cybero's picture

Firstly, so long as one has placed Histogram.h into the process queue, your Grid compiles AOK.

Findings - Initial

Both x and y percentage values can cause a crash - first finding after compiling your grid patch.

x inputs OK up to 200 percent, thereafter it also crashes

y crashes with even 1 percent input.

will communicate more on this later today.

cwright's picture

cybero wrote:
Well, maybe it's me, but I find that Shark takes a long while doing a thorough 'scan' of all active processes.

Am I simply missing the obvious, that I could have saved myself a lot of time by simply using the drop down menu to force Shark to focus solely upon the process queue of interest?

I honestly see almost no point in doing a "full" scan of all processes. So I've only done it maybe once, out of curiosity. I don't know how long it takes, since it's not a fresh operation to me.

Single-process may be faster (much less stuff to deal with), but it's also very very slow in some cases. Apps that have lots of loaded frameworks+plugins (QC) seem to take ages to "process" once the profiling is complete. I generally compile, test in QC, shark it, stop the rendering once the profile "pops" (the sound it makes when it stops sampling) to free up some CPU, and then go read up on some blogs on another space while I wait ;) (that's how I can post on the mailing list so frequently, and post here too when necessary -- Thanks Shark! ;)

Our stand-alone apps (QuartzCrystal, QuartzBuilder, internal projects) don't take very long at all to process, even though they may use QC (and thus have the same frameworks+plugins loaded) -- I'm not sure exactly why that is...

In any event, single-process sharking gives us a very detailed view of what's taking time/blocking in our software, and there aren't any instruments that provide that kind of detail (I'll often go so far as to look at the actual assembly instructions that are taking large amounts of time, and figure out how to restructure the calculations to operate more smoothly, or have the compiler work for us, rather than against us) -- there's a recent perpendiculo.us post that I wrote that stemmed from that kind of research. And for that kind of work, Shark blows Instruments out of the water. Conversely, for leaks, Instruments blows Shark out of the water. It all comes down to using the correct tool for the job :)

Lango's picture

cybero wrote:
Firstly, so long as one has placed Histogram.h into the process queue, your Grid compiles AOK.

Ahh, I'm not having much luck getting my code to other people. That is leftover code; it can be deleted. I was trying to call the histogram class that the tutorial uses.

cybero wrote:
x inputs OK up to 200 percent, thereafter it also crashes

So it does. It has been suggested to me that it might be a rounding issue when I try to cap the values. This may cause one of my cells to access a part of the buffer that doesn't exist (some pixels that are outside the bounds).
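
A minimal sketch of the kind of capping being discussed, in plain C. `PixelRect`, `percent_to_pixels` and `clamp_region` are hypothetical names, not taken from the plug-in. The idea is to convert each percentage against its own axis (X and width against the buffer width, Y and height against the buffer height) and then shrink the rectangle so that origin plus size never runs past the edge of the buffer.

```c
#include <stddef.h>

/* Hypothetical rectangle in pixel coordinates. */
typedef struct { size_t x, y, width, height; } PixelRect;

/* Converts a percentage to pixels along one axis, capped to [0, extent].
   The first test also rejects NaN, which fails every comparison. */
static size_t percent_to_pixels(double percent, size_t extent)
{
    if (!(percent > 0.0))
        return 0;
    if (percent >= 100.0)
        return extent;
    return (size_t)(percent / 100.0 * (double)extent);   /* truncates */
}

/* Builds a region from percentage inputs and clamps it to the buffer. */
static PixelRect clamp_region(double x_pct, double y_pct,
                              double w_pct, double h_pct,
                              size_t buf_width, size_t buf_height)
{
    PixelRect r;
    r.x = percent_to_pixels(x_pct, buf_width);
    r.y = percent_to_pixels(y_pct, buf_height);
    r.width = percent_to_pixels(w_pct, buf_width);
    r.height = percent_to_pixels(h_pct, buf_height);

    /* r.x <= buf_width and r.y <= buf_height, so these cannot underflow. */
    if (r.width > buf_width - r.x)
        r.width = buf_width - r.x;
    if (r.height > buf_height - r.y)
        r.height = buf_height - r.y;
    return r;
}
```

With a 640 x 480 buffer, a Y origin of 25% and a height of 100% come out as y = 120 and height = 360, so the last row touched is row 479.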

I was also told that if I change from NSNumbers to C types it will be faster. Is this valid? Nothing came up in Shark to say that the NSNumbers were consuming time. I still don't want to change over to C types till I know what is wrong with my current code, though.

I've been given instructions on how to use OCUnit with custom patches, but I haven't had time to go through them yet. Hopefully I will have it done today, which will allow me to do more thorough testing.

Thanks for the help so far.

Lango

Lango's picture

cybero wrote:
Have you checked upon the endianness of your code? The Developer samples have got a few problems; see the quartz developer list for most examples of these. Indeed, I often find that if I have a niggling problem with QC & other code stocks, a quick visit to the dev lists often reaps rewards.

#if __BIG_ENDIAN__
    pixelFormat = QCPlugInPixelFormatARGB8;
#else
    pixelFormat = QCPlugInPixelFormatBGRA8;
#endif

I have put this into my code just before my lock buffer call, and it turns out I was using the wrong pixel format for my computer's endianness.

...
// Get the correct pixel format (this reduces the time needed when generating the histogram)
   NSString* pixelFormat;
   #if __BIG_ENDIAN__
      pixelFormat = QCPlugInPixelFormatARGB8; // Big-endian; this is the tutorial's one
   #else
      pixelFormat = QCPlugInPixelFormatBGRA8; // Little-endian; this is my iMac
   #endif

   // Get a buffer representation from the image (this is an expensive call, keep it outside the loop)
   if (![imageToUse lockBufferRepresentationWithPixelFormat:pixelFormat
                                                 colorSpace:colorSpace
                                                  forBounds:[imageToUse imageBounds]])
      return NO;
...

This has pushed my framerate (on the settings I need) up from 10 - 15 fps to 20 - 25 fps, which is fantastic.

Two things I have noticed from this.

1/ I tried to do something similar for my call to generate the histogram, along the lines of:

   vImage_Error err; // The histogram calls return an error code, not a pixel format
   #if __BIG_ENDIAN__
      err = vImageHistogramCalculation_ARGB8888(&buffer, histograms, 0);
   #else
      err = vImageHistogramCalculation_BGRA888(&buffer, histograms, 0); // This call doesn't exist
   #endif

But there is no vImageHistogramCalculation_BGRA888(), so I was unable to do this and left it with the vImageHistogramCalculation_ARGB8888() call.
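
On the missing BGRA variant: as far as I know, the four-channel vImage histogram routine simply builds one 256-bin histogram per byte position and does not care what the channels mean, so with BGRA8 data the red counts should just turn up in the third histogram instead of the second. The sketch below is a portable C stand-in that shows that indexing; `histogram_4x8` and the `BGRA_*` constants are made-up names, and this is not the vImage call itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte position of each channel inside a BGRA8 pixel. */
enum { BGRA_BLUE = 0, BGRA_GREEN = 1, BGRA_RED = 2, BGRA_ALPHA = 3 };

/* Portable stand-in for a four-channel histogram (NOT the vImage call):
   histograms[k] counts the values found at byte position k of each pixel,
   whatever that position happens to mean in the chosen pixel format. */
static void histogram_4x8(const uint8_t *pixels, size_t width, size_t height,
                          size_t row_bytes, size_t histograms[4][256])
{
    memset(histograms, 0, 4 * sizeof histograms[0]);
    for (size_t y = 0; y < height; y++) {
        const uint8_t *row = pixels + y * row_bytes;
        for (size_t x = 0; x < width; x++)
            for (size_t k = 0; k < 4; k++)
                histograms[k][row[x * 4 + k]]++;
    }
}
```

With BGRA8 input the red-channel counts are read from `histograms[BGRA_RED]`; with ARGB8 input they would be at index 1.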

I thought staying with the ARGB call may have increased the time of my histogram call, since it may need to change the pixel format. Looking at a new Shark profile, the histogram call consumes 3 seconds more of my 30 second profile than it did before I did the endian check.

However, this could be misleading. Overall the framerate is higher, which means the histogram call now runs more often, and that could make it appear to take longer. What I really need from Shark, besides the amount of time spent in the method, is the number of times the method was called. That could clear this up for me.
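
One way to separate "called more often" from "each call got slower" without relying on the profiler is to count calls and accumulate time by hand, then compare the mean time per call between the two builds. A minimal C sketch, assuming nothing about Shark itself; `CallStats` and its helpers are hypothetical names, and `clock()` measures CPU time for the whole process, so treat the numbers as rough.

```c
#include <time.h>

/* Hypothetical accumulator: how many times a call ran and for how long. */
typedef struct {
    unsigned long calls;
    clock_t ticks;
} CallStats;

/* Records one call that started and ended at the given clock() readings. */
static void call_stats_record(CallStats *stats, clock_t start, clock_t end)
{
    stats->calls++;
    stats->ticks += end - start;
}

/* Mean time per call in seconds, or 0 if nothing has been recorded yet. */
static double call_stats_mean_seconds(const CallStats *stats)
{
    if (stats->calls == 0)
        return 0.0;
    return (double)stats->ticks / (double)CLOCKS_PER_SEC / (double)stats->calls;
}
```

In use, take `clock()` just before and just after the histogram call, pass both readings to `call_stats_record`, and log the mean every few hundred frames. If the total goes up while the mean stays flat, the extra time comes from the higher framerate rather than from the call itself.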

2/ The lockBufferRepresentationWithPixelFormat call no longer calls createPixelBufferFromImageBuffer. Something about this doesn't make sense to me.

From this experience you could assume that the pixel format caused this. However, the HistogramOperation tutorial uses QCPlugInPixelFormatARGB8, which is the wrong pixel format for my computer, yet its call to lockBufferRepresentationWithPixelFormat never goes on to call createPixelBufferFromImageBuffer. Perhaps it's a combination of the pixel format and when the lock call is being made?

Either way the second issue has been solved for me, which leaves me with just my memory leak. Hopefully I can solve that one today.

One more question now that I think of it. Does anyone know how you can get the pixel to unit ratio inside the custom patch? Right now my dimensions are in pixels; it would also be handy to be able to return them in units.

Thanks everyone again for your help!

Cheers

Lango

Lango's picture

I have found the fix for my first bug.

It wasn't a memory leak as such. My cell would try to access a part of the image that isn't there. (Tom Butterworth from the newsgroup pointed out this may be the case).

The line that caused this is

self.yOffsetPixels = [ [NSNumber alloc] initWithFloat:[imageToUse bufferPixelsWide]*[self.yOffsetNormalised floatValue] ];

It should be using bufferPixelsHigh, not bufferPixelsWide:

self.yOffsetPixels = [ [NSNumber alloc] initWithFloat:[imageToUse bufferPixelsHigh]*[self.yOffsetNormalised floatValue] ];

Very silly mistake indeed, but these things happen, especially in the world of copy and paste :)

Thanks for everyone's help.

I'm going to have a bit of a break, then try and hook up OCUnit to it, then look at how to return the dimensions in units instead of pixels.

Cheers

Lango

jersmi's picture

Just to say, been following your progress with this. Exciting work, great idea, very promising!

Lango's picture

Thanks jersmi

I'm happy with how it is turning out. A few more tweaks and I should be able to release a stable version in the next couple of weeks.

Peter's picture

Regarding the memory leak: http://web.archiveorange.com/archive/v/jFOq6Li93zn734F7727w

For some reason QC doesn't release the memory when using an input image but no output one. Very annoying.

Does anyone know a way around this?

-Peter