Fiducial detection based on topological region adjacency information with identification by angle information

In my last post we discussed blob extraction and event tracking. We will continue with that project by adding support for two-dimensional fiducial tracking. We will attempt to implement the fiducial detection algorithm used on the Topolo Surface1. We will first describe the fiducials and how their properties are encoded in their structure, and we will add a class to our project to support fiducial detection and rendering. When finished we will obtain the following renderings:

fiducial_balance
Fiducial detected with identification number 12345 (balance mode).
fiducial_threshold
Fiducial detected with identification number 12345 (threshold mode).

Below is a fiducial we will be attempting to detect in our project.  We will be using the blob extractor developed in my previous post to generate a container hierarchy of the blobs in the fiducial.  From this we will evaluate the properties of the fiducial including location, orientation, and identification. Go here to generate fiducials in SVG format.

fiducial
Example fiducial with identification number 12345.

Evaluation of the center black bob and its white child provide the necessary information to determine location and orientation.  In the diagram below the center of the black blob yields the fiducial center and the center of the white blob provides the offset necessary to generate an orientation vector.

fiducial
View of center blobs.

The remaining 16 child blobs (in fiducial number 12345 there are 6 white and 10 black) provide the 16-bit encoding to yield an identification number. We will encode the white blobs as ones and the black blobs as zeroes. These blobs will be evaluated counter-clockwise beginning at the orientation vector. For fiducial 12345 this encoding yields 00110000001110012 = 1234510.

We will first modify our cBlob class to overload the strictly less than operator for sorting purposes. We will initially sort our blobs from smallest to largest area (smaller blobs must be contained in larger blobs). In addition we will add two properties to this class, a children vector and node height property. Lastly, the contains() method determines whether a node is contained within another and, if so, appends itself to the children vector and updates the height property accordingly.

#ifndef BLOB_H
#define BLOB_H

#include <vector>
#include <math.h>

enum { BLOB_NULL, BLOB_DOWN, BLOB_MOVE, BLOB_UP }; // event types

struct point {
	double x, y;
};

class cBlob {
  private:

  protected:

  public:
	point location, origin;	// current location and origin for defining a drag vector
	point min, max;		// to define our axis-aligned bounding box
	int event;		// event type: one of BLOB_NULL, BLOB_DOWN, BLOB_MOVE, BLOB_UP
	bool tracked;		// a flag to indicate this blob has been processed

	std::vector<int> children;
	int height;

	bool operator<(const cBlob& b) const;
	bool contains(const cBlob& b, int i);
};

#endif

The definitions of the two methods are fairly straightforward. We use the area of the axis-aligned bounding box to sort the blobs. We could use the area of a circle, but this would yield additional overhead (a smaller circle will yield a smaller bounding box). The contains() method uses the property that a smaller circle is contained within a larger circle provided the distance between center points plus the smaller radius is less than the larger radius.

#include "blob.h"

bool cBlob::operator<(const cBlob& b) const {
	return ( (max.x-min.x)*(max.y-min.y) < (b.max.x-b.min.x)*(b.max.y-b.min.y) );
}

bool cBlob::contains(const cBlob& b, int i) {
	double radius0 = (max.x-min.x)/2 > (max.y-min.y)/2 ? (max.x-min.x)/2 : (max.y-min.y)/2;
	double radius1 = (b.max.x-b.min.x)/2 > (b.max.y-b.min.y)/2 ? (b.max.x-b.min.x)/2 : (b.max.y-b.min.y)/2;

	double dist = sqrt(pow(b.location.x-location.x,2.0) + pow(b.location.y-location.y,2.0));

	if (dist < radius0 - radius1) {
		children.push_back(i);
		if (!(b.height < height)) height = b.height + 1;
		return true;
	}
	return false;
}

Below we define a fiducial data type to store a fiducial's properties. The cFiducialIdentifier is a helper class to store the 16-bit encodings and sort them by orientation (relative to fiducial orientation). Lastly, the cFiducial object declares two properties, a vector of detected fiducials and a vector of identifiers (for evaluating identification via the generateID() method) and a method, extractFiducials(), which accepts the blob vector generated by our tracker and returns a reference to the detected fiducials.

#ifndef FIDUCIAL_H
#define FIDUCIAL_H

#include <algorithm>
#include <vector>
#include <math.h>
#include "blob.h"

struct fiducial {
	point center;
	point dimensions;
	double orientation;
	int id;
};

class cFiducialIdentifier {
  private:
	double orientation;
	bool   value;

  protected:

  public:
	cFiducialIdentifier(double orientation, bool value);
	~cFiducialIdentifier();
	bool operator<(const cFiducialIdentifier& f) const;

	friend class cFiducial;
};

class cFiducial {
  private:
	std::vector<fiducial> fiducials;
	std::vector<cFiducialIdentifier> identifier;

  protected:

  public:
	cFiducial();
	~cFiducial();

	std::vector<fiducial>& extractFiducials(std::vector<cBlob>& blobs);
	int generateID();
};

#endif

In our extractFiducials() method we first clear the fiducials from the previous pass. We need at least five blobs to detect a fiducial (no encoding bits yields an identification of zero). We then build the container hierarchy and process our blobs. In the loop we search for blobs with a node height of four (the white fiducial border used to isolate the fiducial from the background). The first child of this blob (the larger black circle) contains the white blobs (ones encoding) and the middle white circle. The middle white circle contains the black blobs (zeroes encoding) and the center black blob. At this point we evaluate the center of the blob and move on to the child white blob (the offset) to evaluate the orientation. Then we proceed to process the encoding by evaluating the orientation of the encoding blobs relative to the fiducial orientation. We push these onto the identifier for sorting, generate an id, and push the fiducial onto the vector.

#include "fiducial.h"

cFiducialIdentifier::cFiducialIdentifier(double orientation, bool value) : orientation(orientation), value(value) {
}

cFiducialIdentifier::~cFiducialIdentifier() {
}

bool cFiducialIdentifier::operator<(const cFiducialIdentifier& f) const {
	return (orientation < f.orientation);
}

cFiducial::cFiducial() {
}

cFiducial::~cFiducial() {
}

std::vector<fiducial>& cFiducial::extractFiducials(std::vector<cBlob>& blobs) {
	fiducials.clear();

	if (blobs.size() < 5) return fiducials;

	sort(blobs.begin(), blobs.end());

	for (int i = 0; i < blobs.size() - 1; i++) {
		if (blobs[i].event == BLOB_UP) continue;
		for (int j = i + 1; j < blobs.size(); j++) {
			if (blobs[j].event == BLOB_UP) continue;
			if (blobs[j].contains(blobs[i], i)) break;
		}
	}

	fiducial temp; // center, dimensions, orientation, id
	double theta;
	double x, y; int child1, child2, child3, child4;

	identifier.clear();
	for (int i = 0; i < blobs.size(); i++) {
		if (blobs[i].height == 4) {										// possible fiducial
			child1 = blobs[i].children[0];									// first child (ones)
			if (blobs[child1].height != 3) break;

			child2 = blobs[child1].children[blobs[child1].children.size()-1];				// second child (zeroes)
			if (blobs[child2].height != 2) break;

			child3 = blobs[child2].children[blobs[child2].children.size()-1];				// third child
			if (blobs[child3].height != 1) break;

			temp.center.x = blobs[child3].location.x;							// evaluate center
			temp.center.y = blobs[child3].location.y;

			child4 = blobs[child3].children[0];
			x = blobs[child4].location.x - temp.center.x;
			y = blobs[child4].location.y - temp.center.y;
			temp.orientation = atan2(y, x);									// evaluate orientation
			if (temp.orientation < 0) temp.orientation += 2 * M_PI;

			//process id
			for (int j = 0; j < blobs[child1].children.size() - 1; j++) {
				x = blobs[blobs[child1].children[j]].location.x - temp.center.x;
				y = blobs[blobs[child1].children[j]].location.y - temp.center.y;
				theta = atan2(y, x); if (theta < 0) theta += 2 * M_PI; theta -= temp.orientation; if (theta < 0) theta += 2 * M_PI;
				identifier.push_back(cFiducialIdentifier(theta, true));  // ones
			}
			for (int j = 0; j < blobs[child2].children.size() - 1; j++) {
				x = blobs[blobs[child2].children[j]].location.x - temp.center.x;
				y = blobs[blobs[child2].children[j]].location.y - temp.center.y;
				theta = atan2(y, x); if (theta < 0) theta += 2 * M_PI; theta -= temp.orientation; if (theta < 0) theta += 2 * M_PI;
				identifier.push_back(cFiducialIdentifier(theta, false));  // zeroes
			}

			sort(identifier.begin(), identifier.end());
			temp.id = generateID();
			fiducials.push_back(temp);
		}
	}
	return fiducials;
}

int cFiducial::generateID() {
	int id = 0;
	for (int i = 0; i < identifier.size(); i++) id += (identifier[i].value ? 1 : 0)*pow(2, i);
	return id;
}

We will add an OpenGL helper function for rendering the identification number. We will use SDL_ttf for rendering text.

void renderText(const TTF_Font *font, const GLubyte& r, const GLubyte& g, const GLubyte& b, const double& x,  const double& y, const double& z,  const std::string& text) {
	SDL_Color color = {r, g, b};
	//SDL_Surface *Message = TTF_RenderText_Solid(const_cast<TTF_Font*>(font), text.c_str(), color);
	//SDL_Surface *Message = TTF_RenderText_Shaded(const_cast<TTF_Font*>(font), text.c_str(), color);
	SDL_Surface *message = TTF_RenderText_Blended(const_cast<TTF_Font*>(font), text.c_str(), color);
	GLuint texture;
	setupTexture(texture);
	glTexImage2D(GL_TEXTURE_2D, 0, 4, message->w, message->h, 0, GL_BGRA, GL_UNSIGNED_BYTE, message->pixels);
	glEnable(GL_BLEND);
	glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
	glBegin(GL_QUADS);
	glTexCoord2f(0, 0); glVertex3f(x,              y,              z);
	glTexCoord2f(1, 0); glVertex3f(x + message->w, y,              z);
	glTexCoord2f(1, 1); glVertex3f(x + message->w, y + message->h, z);
	glTexCoord2f(0, 1); glVertex3f(x,              y + message->h, z);
	glEnd();
	glDisable(GL_BLEND);
	deleteTexture(texture);
	SDL_FreeSurface(message);
}

In our main function we create an instance of cFiducial and a vector of fiducials (returned from extractFiducials()). We have added the initializing of SDL_ttf and loading a font. Finally, we call our extractFiducials() method near the end of the event loop, render circles about the fiducials' centers, draw a line identifying the orientation, and render the identification number.

#include <sstream>

#include "src/filter.h"
#include "src/glhelper.h"
#include "src/tracker.h"
#include "src/fiducial.h"

// capture parameters
static const int CDEVICE = 0;
static const int CWIDTH  = 640;
static const int CHEIGHT = 480;

// filter parameters
static const int    FKERNELSIZE = 7;	// for gaussian blur
static const double FSTDDEV     = 1.5;
static const int    FBLOCKSIZE  = 63;	// for adaptive threshold
static const int    FC          = -5;

// display (native resolution of projector)
static const int WIDTH  = 640; //1280
static const int HEIGHT = 480; //800
static const int BPP    = 32;

// tracker parameters
static const double TMINAREA   = 16;	// minimum area of blob to track
static const double TMAXRADIUS = 24;	// a blob is identified with a blob in the previous frame if it exists within this radius

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// main
int main(int argc, char *argv[]) {
	bool active = true, balance = true, fullscreen = false;
	int render = 0;

	cFilter filter(CDEVICE, CWIDTH, CHEIGHT, FKERNELSIZE, FSTDDEV, FBLOCKSIZE, FC);
	filter.balance(balance);

	cTracker tracker(TMINAREA, TMAXRADIUS);

	cFiducial fid; vector<fiducial> fiducials;

	SDL_Init(SDL_INIT_EVERYTHING);
	SDL_Surface *screen = SDL_SetVideoMode(WIDTH, HEIGHT, BPP, (fullscreen ? SDL_FULLSCREEN : 0) | SDL_HWSURFACE | SDL_OPENGL);
	SDL_Event event;

	TTF_Init();
	TTF_Font *font = TTF_OpenFont("media/Coalition.ttf", 18);

	setupOrtho(WIDTH, HEIGHT);
	glClearColor(0.0f, 0.0f, 0.0f, 0.0f);

	GLuint texture;
	setupTexture(texture);

	while (active) {
		while (SDL_PollEvent(&event)) {
			switch (event.type) {
			case SDL_QUIT:
				active = false;
				break;
			case SDL_KEYDOWN:
				switch (event.key.keysym.sym) {
				case SDLK_f:			// toggle fullscreen
					fullscreen ^= true;
					screen = SDL_SetVideoMode(WIDTH, HEIGHT, BPP, (fullscreen ? SDL_FULLSCREEN : 0) | SDL_HWSURFACE | SDL_OPENGL);
					break;
				case SDLK_b:			// toggle balance
					balance ^= true;
					filter.balance(balance);
					break;
				case SDLK_1:			// captured frame
					render = 0;
					break;
				case SDLK_2:			// filtered frame
					render = 1;
					break;
				case SDLK_3:			// gaussian blur
					render = 2;
					break;
				case SDLK_4:			// adaptive threshold
					render = 3;
					break;
				}
				break;
			}
		}

		filter.filter(); // capture, filter, blur, (balance), threshold

		switch (render) {
		case 0:
			glClear(GL_COLOR_BUFFER_BIT);
			renderTexture(texture, filter.captureFrame(), false, WIDTH, HEIGHT);
			break;
		case 1:
			glClear(GL_COLOR_BUFFER_BIT);
			renderTexture(texture, filter.filterFrame(), true, WIDTH, HEIGHT);
			break;
		case 2:
			glClear(GL_COLOR_BUFFER_BIT);
			renderTexture(texture, filter.gaussianFrame(), true, WIDTH, HEIGHT);
			break;
		case 3:
			glClear(GL_COLOR_BUFFER_BIT);
			renderTexture(texture, filter.thresholdFrame(), true, WIDTH, HEIGHT);
			break;
		}

		tracker.trackBlobs(filter.thresholdFrame(), true);

		fiducials = fid.extractFiducials(tracker.getBlobs());

		renderBlobs(tracker.getBlobs(), 0.0f, 1.0f, 1.0f, 1.0f, 1.0f, 0.0f);

		for (int i = 0; i < fiducials.size(); i++) {
			renderCircle(fiducials[i].center.x, fiducials[i].center.y, 16.0f, 30, 1.0f, 0.0f, 1.0f);
			glColor4f(0.0f, 1.0f, 0.0f, 1.0f);
			glBegin(GL_LINES);
			glVertex3f(fiducials[i].center.x, fiducials[i].center.y, 0.0f);
			glVertex3f(fiducials[i].center.x+128*cos(fiducials[i].orientation), fiducials[i].center.y+128*sin(fiducials[i].orientation), 0.0f);
			glEnd();
			std::stringstream out;
			out << fiducials[i].id;
			std::string s = out.str();

			glPushMatrix();
			glTranslatef(fiducials[i].center.x, fiducials[i].center.y, 0.0f);
			glRotatef(fiducials[i].orientation*180/M_PI, 0.0f, 0.0f, 1.0f);
			renderText(font, 127, 255, 127, 0.0f, 0.0f, 0.0f, s);
			glPopMatrix();
		}

		SDL_GL_SwapBuffers();
	}

	deleteTexture(texture);

	TTF_CloseFont(font);
	TTF_Quit();

	SDL_Quit();

	return 0;
}

The updated Makefile follows:

all: main.cc lib/capture.o lib/filter.o lib/glhelper.o lib/blob.o lib/tracker.o lib/fiducial.o
	g++ main.cc lib/capture.o lib/filter.o lib/glhelper.o lib/blob.o lib/tracker.o lib/fiducial.o cvblobslib/libblob.a -o bin/main -L/usr/lib `sdl-config --cflags --libs` -lSDL_ttf `pkg-config opencv --cflags --libs` -lGL

lib/capture.o: src/capture.h src/capture.cc
	g++ src/capture.cc -c -o lib/capture.o

lib/filter.o: lib/capture.o src/filter.h src/filter.cc
	g++ src/filter.cc  -c -o lib/filter.o

lib/glhelper.o: lib/blob.o src/glhelper.h src/glhelper.cc
	g++ src/glhelper.cc -c -o lib/glhelper.o

lib/blob.o: src/blob.h src/blob.cc
	g++ src/blob.cc -c -o lib/blob.o

lib/tracker.o: lib/blob.o src/tracker.h src/tracker.cc
	g++ src/tracker.cc -c -o lib/tracker.o -I/usr/local/include/opencv/

lib/fiducial.o: lib/blob.o src/fiducial.h src/fiducial.cc
	g++ src/fiducial.cc -c -o lib/fiducial.o

clean:
	@rm -f *~ src/*~ lib/* bin/*

In my next post I may revisit the algorithm used by cvBlobsLib, Connected Component Labeling, and offer an implementation using the Disjoint Set data structure. As always, comments are welcome.

Don't forget to run make in the cvblobslib folder to generate the libblob.a file required to link this project.

Download this project: fiducial.tar.bz2

References:
1. Nishino, Hiroki (2010). Topolo Surface: A 2D Fiducial Tracking System Based on Topological Region Adjacency Information and Angle Information. Journal of Information Processing, 18:16-25.

Leave a Reply

Your email address will not be published.