This has been lying around on my desk for at least two weeks, but I just couldn't motivate myself to document it a bit. Until now.
Seeing that I got something to move in my previous project, I kind of wanted to try using a servo. However, just buying a servo and getting it to work only to see it working seemed kind of pointless. The idea of what to do with it came during one of my daily Skype calls, made with my notebook's built-in camera, when I moved out of the frame yet again. The camera should just follow me.
This needs three ingredients: a movable camera, a mechanism that moves the camera, and software that controls the mechanism that moves the camera.
The first is easy to acquire. I went to a shop and bought a Buffalo webcam. The quality is mediocre, but the price (¥1850) is just right for this kind of project. The only heavy part of it is the foot, so I snapped that off and got a nice and light camera. (I could have just opened the camera casing and removed two little screws instead, but I only figured that out five minutes later.)
The second ingredient, a mechanism that moves the camera, was also relatively easy. I bought a micro servo off Amazon. It has no branding whatsoever, but I suspect it is a TowerPro SG90, or maybe an imitation of one. Nevertheless, it works nicely. I've got it wired up in my usual, wacky way. (This is probably the worst soldering I've done so far.)
To liven it up, I've also added a wacky cardboard holder that keeps the whole thing in place.
While you can't see it that well in the image: yes, those are screws driven into cardboard. And you'd be surprised how sturdy this whole thing is.
With the third part, the software that controls the whole thing, the real problems began.
From what I read, the SG90 servo wants a 50 Hz PWM signal with pulses between 1 and 2 ms. That translates to a duty cycle of 5 to 10 percent. My servo does seem to move around a bit even with duty cycles of 2 to 13 percent, but in general, the SG90 specs seem to apply. (Remember, I still don't know what kind of servo I actually have.) In theory, the Raspberry Pi, which drives the servo, has a hardware PWM pin that allows accurate PWM signals. In practice, I was not able to get that to work at all.
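For reference, the pulse-width arithmetic is simple enough to sketch in Python. The 1 to 2 ms endpoints and the 180-degree travel are assumptions about my unlabeled servo, not measured facts:

```python
PERIOD_MS = 20.0  # 50 Hz PWM period

def pulse_to_duty(pulse_ms):
    """Convert a servo pulse width in ms to a PWM duty cycle in percent."""
    return pulse_ms / PERIOD_MS * 100.0

def angle_to_pulse(angle_deg, min_pulse=1.0, max_pulse=2.0, travel=180.0):
    """Map a servo angle (0..travel degrees) to a pulse width in ms.
    min_pulse/max_pulse/travel are assumed values for an SG90-like servo."""
    return min_pulse + (max_pulse - min_pulse) * angle_deg / travel
```

So the nominal endpoints come out to 5 and 10 percent duty, matching the spec.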
Software PWM, on the other hand, is just entirely unreliable. You might think: "Ooh, it's not a big problem if one of the pulses is a bit off." But here is the thing: with just one pulse, the servo can move by about a fifth of its range. So if you keep software PWM continuously on, you end up with a twitching mess.
Playing around with the Pi and PWM, I discovered that the first pulse generated by the Pi after PWM activation has the right length fairly reliably. So I wrote a shell script that turns on a 50 Hz PWM (20 ms cycle time) for 10 ms, then keeps it off for around 80 ms. The 10 ms are obviously half of the cycle time, to be sure to catch the first pulse but miss the second. The 80 ms were found by trial and error: any shorter and it gets unreliable, much longer and the movement becomes really jumpy.
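The same timing logic can be sketched in Python (the real thing was a shell script; here `start_pwm`/`stop_pwm` are placeholders for whatever PWM interface is actually in use, e.g. RPi.GPIO's `pwm.start`/`pwm.stop`):

```python
import time

CYCLE_MS = 20.0           # 50 Hz PWM period
BURST_MS = CYCLE_MS / 2   # catch the first pulse, miss the second
PAUSE_MS = 80.0           # found by trial and error

def single_pulse(start_pwm, stop_pwm):
    """Emit roughly one servo pulse: enable software PWM for half a cycle,
    then keep it off long enough for the next activation to be reliable.
    Returns the (on, off) durations in ms for inspection."""
    start_pwm()
    time.sleep(BURST_MS / 1000.0)
    stop_pwm()
    time.sleep(PAUSE_MS / 1000.0)
    return BURST_MS, PAUSE_MS
```

One consequence of this scheme: the servo gets at most one pulse every ~90 ms instead of one every 20 ms, which is why the movement tends toward jumpy.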
This works relatively decently:
In a few rare situations it still jumps uncontrollably, though. Ultimately I've decided that this is not good enough and needs an improvement in the next episode.
Unfortunately, getting the servo to move where I want it is just one part of the software. The other is figuring out where that actually is. OpenCV to the rescue: it ships Haar cascade classifiers for face detection, which are supposedly efficient. "Efficient" still means that even on my i7-4750HQ, a single core can't keep up with the camera's full 30 FPS in real time. Not a problem: dropping a few frames, it works nicely in a single thread. The accuracy I've been getting so far is decent. (As long as I don't tilt my head; then detection fails completely.)
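Once a face is detected, its position still has to be turned into a servo command. A minimal proportional-control sketch of that step, where the frame width, the gain, and the 180-degree servo travel are all assumptions for illustration:

```python
def pan_correction(face_x, frame_width, current_angle, gain=0.1, travel=180.0):
    """Nudge the servo so the camera turns toward the detected face.

    face_x: horizontal pixel coordinate of the face center in the frame.
    Returns the new servo angle, clamped to [0, travel].
    """
    # Error in [-0.5, 0.5]: how far the face sits from the frame center.
    error = face_x / frame_width - 0.5
    new_angle = current_angle + gain * error * travel
    return max(0.0, min(travel, new_angle))
```

A small gain keeps the correction gentle: a face centered in the frame leaves the angle alone, while a face at the edge only moves the servo a few degrees per detection, so dropped frames don't cause wild swings.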
But that leaves one problem: How am I going to feed the webcam to some recorder programme, e.g. Skype, and the recognition software at the same time? v4l2loopback should be the solution to that, but getting all of this to work together is certainly the stuff for another episode.
It's been a while.
A solution to the movement accuracy problem arrived a day after I wrote the last episode, but I could not find the motivation to write about it. Maybe because of university classes, maybe because the thing is so unmotivating in itself. (Note before anyone sues me: anything I write about it is based on my personal experience and does not necessarily reflect the general truth.) The solution is called the Cypress PSoC® 4 Prototyping Kit. (Someone teach me how to use ™®℠.) It came in a flat letter without the pins soldered on, but that was not too much of a hassle.
How did I even? A friend told me that there should be cheap and easily programmable FPGAs out there. I thought that having one would be the perfect thing to generate a PWM signal; communication with the Pi would happen over some self-hacked GPIO-driven protocol. So I searched for cheap FPGAs and stumbled upon a list that contained said prototyping kit. Since it was cheap (¥910), I ordered it right away without any further reading. Five minutes of background checking would have told me that this is indeed not an FPGA, but a microcontroller with some additional configurable circuits on the chip. Cypress says that you can use and connect them in any arbitrary way without worrying, but I couldn't find out how many of the individual circuits are available, how the connection actually works in the background, or what "worrying" would mean. For my purposes, there is probably enough of everything. And even if there weren't, I could do it in software.
All this may or may not be so nice. However, one thing that really bothers me is how you program it: there is this colourful point-and-click thing ("Klickibunti", as the Germans say), from which C source is generated, inside a nice and slow MSVC-style IDE… Dah. Really not my style.
The actual programming happens over a serial bus. As you can see, the circuit board is separated into two parts: one is a USB-to-serial bridge (with some GPIO, too), the other the actual SoC. The idea is probably that you can quickly plug in ten of these, program them, take them out, snap off the bridge, and end up with a relatively tiny prototyping board. How this goes together with their constant advertising that, if you made a mistake in wiring, you can just fix it in software, I do not know. (I guess via the second way to program it: the 5-pin programming/debugging connector in the lower right corner of the picture, whose kit obviously costs more than the board itself.)
What this programming method is not meant for is frequent reprogramming. They let you do it, and having such a cheap board allow it is kind of fancy, but the way it works is kind of scary: to be reprogrammed, the software (or is it firmware?) in the microcontroller needs to read the new program from the serial bus. This mode is activated by pressing the small button on the board while powering it up. However, if you fail to embed this loader functionality in what you programmed, or if the programming fails, that's it until you buy that debugging kit. Meh.
Nevertheless, it does what I wanted to do.
The details turned out a bit different, though. Instead of devising some sort of communication protocol and implementing it over GPIO, I just used the USB-to-serial bridge. One plus side is that I don't even need the Pi anymore. The other plus side is that it is dead easy. The minus side is that I have no idea how to do serial communication properly: sometimes a byte just gets lost, and I don't seem to be able to send a 0x00 byte either.
Thus, I have made the communication protocol the following:
0xff 0xff 0xff 0xff 0xff 0xff 0x01 [index:1] [data:2]
First, I send a sequence of 0xff bytes. Any message that ends in 0xff is ignored, and so are all subsequent bytes until a 0x01 is read. Then I send an address byte; currently only 17 and 16 do anything sensible: 17 is where I connected the servo, and 16 (i.e., pin 1.6) is the tiny blue LED. An integer i between 0 and 65024 is encoded into the two data bytes by:
data[0] = i / 254 + 1
data[1] = i % 254 + 1
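A Python sketch of that framing; `decode_value` is included only to check the round trip (the real decoder runs on the PSoC side):

```python
SYNC = b"\xff" * 6

def encode_message(index, value):
    """Frame one message: six 0xff sync bytes, the 0x01 start byte, the
    address byte, then two data bytes shifted by +1 so that neither can
    be 0x00, which my serial link apparently refuses to carry."""
    data = bytes([value // 254 + 1, value % 254 + 1])
    return SYNC + b"\x01" + bytes([index]) + data

def decode_value(data0, data1):
    """Invert the +1-shifted base-254 data encoding."""
    return (data0 - 1) * 254 + (data1 - 1)
```

Since any run of 0xff bytes resets the receiver, a lost or corrupted byte can ruin at most the current message; the next sync sequence puts both sides back in step.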
This works reasonably well for now.
What's left is the software side. Next episode…