Controlling an ARDrone with the Kinect at #Dev8D

Posted on February 19, 2011


This was a bit of fun that @davetaz and I cooked up on the first morning of the event – I had been working on getting the Kinect data (specifically the skeleton mapping data) out so that ordinary code can use it easily, and Dave had been working on the drone control beforehand. It took about 200ms before we both said that it’d be cool to combine the two 😉

The controlling code works by using the hand and shoulder locations and has some simple binary logic to work out if a position counts as a gesture or not. For instance, as long as the hands are reasonably separated, raising them past a certain point above your shoulder would cause an ‘up’ command to be sent, and lowering them past a threshold would send a ‘stop going up’ command. All the commands required the hands to be quite far apart so that you could drop your arms to your side and the drone would ignore you.
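To make the idea concrete, here is a minimal sketch of that kind of binary gesture logic for the ‘up’ gesture only. It is not the code we ran at Dev8D: the `Joint` type, the `update_climb` function, the command strings and all of the threshold values are made up for illustration, and it assumes coordinates in millimetres with y increasing upwards.

```python
from dataclasses import dataclass


@dataclass
class Joint:
    """A joint position; units and axes are assumptions (mm, y points up)."""
    x: float
    y: float
    z: float


# Illustrative thresholds only, not values from the original code.
MIN_HAND_SEPARATION = 600.0  # hands must be at least this far apart
RAISE_THRESHOLD = 150.0      # hands this far above shoulders -> 'up'
LOWER_THRESHOLD = 50.0       # hands dropping below this -> 'stop_up'


def hands_apart(left_hand, right_hand):
    """True when the hands are separated enough to count as a gesture."""
    return abs(left_hand.x - right_hand.x) >= MIN_HAND_SEPARATION


def update_climb(climbing, left_hand, right_hand, left_shoulder, right_shoulder):
    """Return (new_climbing_state, command), where command may be None."""
    if not hands_apart(left_hand, right_hand):
        # Arms dropped to the sides: ignore the user entirely.
        return climbing, None
    # Use the lower of the two hands so both must be raised.
    lift = min(left_hand.y - left_shoulder.y, right_hand.y - right_shoulder.y)
    if not climbing and lift > RAISE_THRESHOLD:
        return True, "up"
    if climbing and lift < LOWER_THRESHOLD:
        return False, "stop_up"
    return climbing, None
```

Using two different thresholds for raising and lowering is my own choice here: it stops the command flickering when the hands hover around a single cut-off.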

After a half hour of hacking in the main hall, we had gestures for all three axes working, and had a fourth type of gesture for rotate. It’s easier to watch the video than to describe them, but they should be pretty intuitive (hint: think of schoolkids pretending to be fighter planes).

Luckily, there are no videos of earlier control mechanisms I wrote, where you had to flap your arms to get the drone to go up and stay up 😉

I recently wrote a gradual, more analogue controller, where you should be able to control the drone more responsively and reflectively and in more than one axis at a time, but I don’t have the drone so can’t test or even run the code 😉
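As a rough illustration of what ‘analogue’ means here (this is not that controller, just a sketch), a hand offset can be mapped to a proportional value between -1 and 1 with a dead zone around the rest position. The function names and the `dead_zone` and `full_scale` numbers are hypothetical.

```python
def axis_value(offset, dead_zone=100.0, full_scale=500.0):
    """Map a hand offset (assumed mm) to a speed in the range [-1.0, 1.0].

    Offsets inside the dead zone give 0.0; offsets at or beyond
    full_scale give full speed in that direction.
    """
    magnitude = abs(offset)
    if magnitude <= dead_zone:
        return 0.0
    scaled = min((magnitude - dead_zone) / (full_scale - dead_zone), 1.0)
    return scaled if offset > 0 else -scaled


def analogue_command(lift, lean):
    """Combine two independent axes into one (vertical, sideways) command."""
    return axis_value(lift), axis_value(lean)
```

Because each axis is computed independently, raising your hands while leaning would give a command with both components non-zero, which is the ‘more than one axis at a time’ part.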

In fact, the next day, @davetaz started to get a little cocky with his flying skills, but backed it up with a bit of stunt-flying 😉 (I wasn’t able to see this, as I was running a workshop at Dev8D when this was filmed :()

(That’s @juliancheal with his finger on the emergency land button btw!)

Code is at btw with the hacky gesture stuff done in from line 181 or so onwards, and the interpretation of this done in

The code is quite general, and it is something I am trying to build into the OSCeleton project so that it would perform the same role. I wrote the osc…py code to fix the drawbacks I saw in OSCeleton, but without delving into C code, staying instead in light and fluffy Python land.


The OpenNI, SensorKinect and (probably) the PrimeSense stack – I struggled to find good and reliable instructions here, which is why my thoughts turned to finding a simple method of getting the interesting stuff from the Kinect, without having to understand or even install all this awkward stuff.

A combination of this, this and this seems to cover enough of the things you are required to do to get this stack working on Ubuntu.

The key component of the plan for simple use of skeleton data is the OSC protocol. When Dave Murray-Rust originally suggested this protocol at the #pmrhack event in Cambridge, my first instinct was to google for it. Lo and behold, someone had already created a working project to start from: OSCeleton.

This handles opening the Kinect, setting up a basic skeleton workflow in OpenNI and looping round, detecting people, trying to fit joint skeletons to them and then broadcasting out the 3D coordinates for the joints using the OSC protocol, which is very lightweight and well suited for this ‘real-time-centric’ type of use.

The python code I have written is my adaptation of where I’d like to take this project, providing it with an OpenGL visualisation and a means to retain the last known position of the hands and shoulder joints, as the underlying code will simply stop broadcasting joint coordinates if it cannot interpret where they are, which is fair enough but awkward for simple re-use.
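The ‘retain the last known position’ part can be sketched as a small cache keyed by joint name. This is a simplified stand-in rather than the actual code; the `JointCache` class, its methods and joint names such as `"l_hand"` are assumptions made for the example.

```python
import time


class JointCache:
    """Remember the most recent position seen for each joint."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the cache can be tested
        self._joints = {}     # name -> ((x, y, z), time last seen)

    def update(self, name, x, y, z):
        """Record a fresh position, e.g. from an incoming joint message."""
        self._joints[name] = ((x, y, z), self._clock())

    def get(self, name, max_age=None):
        """Return the last known (x, y, z), or None if never seen.

        If max_age (seconds) is given, positions older than that are
        treated as missing rather than returned as stale data.
        """
        entry = self._joints.get(name)
        if entry is None:
            return None
        position, seen_at = entry
        if max_age is not None and self._clock() - seen_at > max_age:
            return None
        return position
```

The optional `max_age` is my own addition: without some limit, a hand that left the frame a minute ago would still look like a valid reading.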

The file also rebroadcasts this hand/shoulder information to multiple endpoints, also using the OSC protocol, allowing multiple computers to make use of the joint data, without having to install anything apart from an OSC client, such as liblo or pyliblo.
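For a feel of how little is involved in rebroadcasting, here is a sketch that builds an OSC message by hand and sends the same packet to several endpoints over UDP. It deliberately uses only the standard library instead of pyliblo, so it is not how the file above does it; the `/joint` address, the `sfff` argument layout and the function names are assumptions for illustration.

```python
import socket
import struct


def _pad(data):
    """Null-terminate and pad to a multiple of four bytes, as OSC requires."""
    data += b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def encode_joint(name, x, y, z, address="/joint"):
    """Build an OSC message: address, type tags, a string and three floats."""
    return (_pad(address.encode("ascii"))
            + _pad(b",sfff")
            + _pad(name.encode("ascii"))
            + struct.pack(">fff", x, y, z))  # OSC floats are big-endian


def rebroadcast(packet, endpoints, sock=None):
    """Send one packet to every (host, port) pair in endpoints."""
    own_socket = sock is None
    if own_socket:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for host, port in endpoints:
            sock.sendto(packet, (host, port))
    finally:
        if own_socket:
            sock.close()
```

Usage would be something like `rebroadcast(encode_joint("l_hand", 0.1, 0.5, 2.0), [("192.168.0.10", 7110), ("192.168.0.11", 7110)])`, where the addresses and port are placeholders for whichever machines are listening.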

One example of this was the addition of a simple OSC listener to a program called JMol, allowing for the rotation and zoom applied to a rendered molecule or protein to be controlled by gesture.

The python visualisation does require three libraries to run: pyliblo, PyOpenGL and PyGame, but hopefully this is not too hard to sort out, as the normal packages for these within Ubuntu 10.04 and 10.10 work fine.
