Visual Attention System in OpenCL

Hi all, could someone with some OpenCL experience please check out the following two videos and comment on whether this would be possible in QC (in real time)?
http://www.loria.fr/~rougier/research/movies/INRIA.mpeg
http://www.loria.fr/~rougier/research/movie/lemmons.mpeg
You can learn more about the effect below, but in general it's a way of simulating a human's visual processing.
http://www.loria.fr/~rougier/research/attention.html
http://www.loria.fr/~rougier/research/DNF.html
http://hal.inria.fr/inria-00000143/PDF/nn.pdf
I've got a code example (outside QC) if anyone's interested in trying to implement it in OpenCL.
I'm interested! Couldn't it work with OpenCV?
(the lemmons.mpeg is down btw, and the INRIA.mpeg is only 2 seconds long, so I can't see anything relevant)
Looks like that second link should be: http://www.loria.fr/~rougier/research/movies/lemmons.mpeg
I'm not well versed in shaders or OpenCL, and am cursed with an HD 6750M, so I will watch from the sidelines and cheer.
Yeah, it's some matrix math, yay! Do you have the code sample? It looks doable in QC, just mildly involved.
Hi all - sorry for the delay in getting back to this.
There are two big research groups looking into neuromimetic models of attention, both of which have some code (see below)
If you're not 100% sure what I'm talking about, I'd highly suggest reading the following - http://ilab.usc.edu/bu/theory/index.html - it gives a good understanding of the process and should make it obvious why QC might be a good environment for this, i.e. easy/fast creation of input maps (using image filters).
1) The CORTEX group from INRIA has some python implementations
http://www.loria.fr/~rougier/coding/index.html
http://www.loria.fr/~rougier/research/DNF.html
http://www.loria.fr/~rougier/research/src/DNF.py (this is the simplest to understand and probably the most useful - essentially it's a winner-take-all algorithm)
http://webloria.loria.fr/~rougier/coding/software/DNF.py (this is a more complex version of the DNF that takes transmission speeds into account)
For a more complex set of examples look at
http://www.loria.fr/~rougier/research/attention.html - This page has a few movies and links to some more complex Python code.
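To give a feel for the winner-take-all behaviour, here's a minimal NumPy sketch of a 1-D Amari-style dynamic neural field. This is not Rougier's actual DNF.py - the parameters and the sigmoid/global-inhibition form are my own simplification - but the idea is the same: local excitation plus broad inhibition makes the field settle on the strongest input.

```python
import numpy as np

def dnf_step(u, stimulus, exc_kernel, gi=0.004, dt=0.05, tau=1.0, h=-0.2):
    """One Euler step of a 1-D dynamic neural field (Amari-style):
    tau * du/dt = -u + local excitation - global inhibition + stimulus + h."""
    f = 1.0 / (1.0 + np.exp(-5.0 * (u - 0.3)))   # sigmoid firing rate
    excitation = np.convolve(f, exc_kernel, mode="same")
    inhibition = gi * f.sum()                    # global inhibition -> competition
    return u + dt * (-u + excitation - inhibition + stimulus + h) / tau

n = 100
x = np.arange(n)
# narrow Gaussian excitation kernel: nearby neurons reinforce each other
k = np.exp(-(x - n // 2) ** 2 / (2 * 3.0 ** 2))
k = 0.6 * k / k.sum()

# two competing stimuli; the bump at 70 is slightly stronger
stimulus = 0.8 * np.exp(-(x - 30) ** 2 / 50.0) + 1.0 * np.exp(-(x - 70) ** 2 / 50.0)

u = np.zeros(n)
for _ in range(400):
    u = dnf_step(u, stimulus, k)

# the field's peak settles on the stronger of the two stimuli
peak = int(np.argmax(u))
```

With stronger global inhibition the weaker bump would be suppressed entirely, which is the "winner takes all" regime the DNF.py code demonstrates.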
2) The iLab at the University of Southern California
http://ilab.usc.edu/research/ (a good overview of all of their research into attention)
http://ilab.usc.edu/bu/ (this is their bottom-up attention research group, which is most relevant to people round here... in my opinion, anyway)
http://ilab.usc.edu/bu/movie/index.html (some movies of their work)
http://ilab.usc.edu/toolkit/ (their code)
It would work in conjunction with OpenCV - have a look at the following movies and you will have a much better understanding of how it could be VERY useful for object tracking.
http://www.loria.fr/~rougier/research/movies/noise-high.avi
http://www.loria.fr/~rougier/research/movies/distractors-high.avi
http://www.loria.fr/~rougier/research/movies/saliency-high.avi
All of which have code available from - http://www.loria.fr/~rougier/research/attention.html
This approach could significantly improve the reliability of hand detection when using OpenCV and a Kinect. Instead of always needing to keep your body and hand in different planes, you could lock onto your hand and then move around a lot more freely.
That's some very interesting stuff on visual attention. Most of the mapping seems possible in Quartz already, like color intensity and orientation. Optical flow could be used to compensate for the eye movement - left, right, up, down...
Once you get all the maps and make your first pass, you could sequentially take a picture at each attention location to train with OpenCV/CUDA/GPU by brute force of an ORB recognition. Use the trained model's planar homography matrix to update and recalibrate your orientation map with positional and rotation maps, which would then update the saliency map, possibly making the system able to learn and remember.
Those are just my initial thoughts. Depending on how robust the search is, it could potentially be done. Adding the Kinect's depth map, disparity map, normal maps, etc., plus point maps, would definitely improve on the biological model.
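The "combine the maps into a saliency map" step is the easy part to prototype. Here's a toy NumPy sketch of the combination stage - note it skips the multi-scale center-surround pyramids and orientation filters a real Itti-Koch model uses, and the opponency formulas are my own simplification:

```python
import numpy as np

def normalize(m):
    """Scale a feature map to [0, 1] so the maps are comparable before summing."""
    m = m - m.min()
    return m / m.max() if m.max() > 0 else m

def saliency(rgb):
    """Toy bottom-up saliency: average of normalized intensity,
    red-green opponency, and blue-yellow opponency feature maps."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    intensity = (r + g + b) / 3.0
    rg = np.abs(r - g)                # red-green opponency
    by = np.abs(b - (r + g) / 2.0)    # blue-yellow opponency
    maps = [normalize(intensity), normalize(rg), normalize(by)]
    return normalize(sum(maps) / len(maps))

# a gray image with one red patch: the patch stands out in the color maps,
# so it should end up as the most salient region
img = np.full((64, 64, 3), 0.5)
img[20:28, 40:48] = [1.0, 0.1, 0.1]
s = saliency(img)
```

In QC each of these feature maps could come straight from an image filter, which is exactly why it looks like a good fit.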
Here is a video of some finger detection and tracking. I'm using it to menu-pick augmented objects. Next is to add the sub-menus on the rings, skeleton tracking, etc...
I have attention deficit disorder, so I'm all for any type of machine-learning computer vision that will tell me what I'm supposed to be paying attention to.
Hence my thoughts that QC might be good for this
This made me think of one of my favorite quotes - "There's a fine line between genius and insanity - think of a circle with a fine split in it, on one end of the circle is genius and on the other insanity" (please don't read into this)
What's the application doing the finger tracking?
Check your email from the 14/4/11
I have heard that quote as well. It was explained to me online...
looks like genius is far away from crazy...
but if you make that line a circle...
then there is a fine line between the two.
I have been told both, actually, so it doesn't matter if I read too much into that statement. lol
Just from looking at the visual attention theory mind map, it seems to me that what you are asking is possible. It wouldn't be as fast using just a depth map, given that you need orientation, intensity, and color maps.
I have not been able to load up any of the movies to see the working robot ;( but it's definitely very interesting and I plan on revisiting some of this...
I'm tracking fingers in openFrameworks and sending the data to QC via OSC, using OpenCV to do the detection. I'm using a convex-hull method based on segmented contours, so although I'm using the Kinect, the function should work with any camera provided you pass in the correct image. I have a QC plugin version of this, but it runs very slowly - I think it has something to do with converting the QC image into a CV image and back again?
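For anyone curious what the convex-hull step actually does to a contour, here's a small self-contained sketch of the same geometry (Andrew's monotone chain). In practice you'd just call OpenCV's cv2.convexHull on the segmented contour; this is only to illustrate the idea:

```python
import numpy as np

def convex_hull(points):
    """Andrew's monotone chain convex hull - the same geometric step
    OpenCV's convexHull performs on a segmented hand contour.
    points: (N, 2) array; returns hull vertices in counter-clockwise order."""
    pts = sorted(map(tuple, points))
    if len(pts) <= 2:
        return np.array(pts)

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); > 0 means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:                       # build the lower boundary
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):             # build the upper boundary
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # drop the duplicated endpoints when joining the two halves
    return np.array(lower[:-1] + upper[:-1])

# contour points of a square plus an interior point;
# the hull keeps only the four corners
contour = np.array([[0, 0], [4, 0], [4, 4], [0, 4], [2, 2]])
hull = convex_hull(contour)
```

Fingertips then show up as hull points far from the contour (convexity defects), which is how the hull-based finger detection works.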
Here is a paper explaining how to use skin-color maps with orientation maps to detect hand gestures. It seems the hidden Markov model has the best positive results, but a much longer training time.
http://research.microsoft.com/en-us/um/people/awf/bmvc02/project.pdf