Showing posts with label PLT. Show all posts

Monday, 2 May 2011

Fascinating complexity


Here we go, another little step towards the Master's degree! "Complex systems" is done, and with full marks, thanks to a crowd simulation based on Craig Reynolds' BOIDS and rendered by my own 3D engine, "PLT".

I love emergent models - as one can easily foresee by reading the name of this blog - and BOIDS is no exception. A few rules and parameters, two constraints and a bit of randomness, and we have a hell of an algorithm, almost chaotic and showing a lot of unexpected, non-designed behaviours: formation of queues, spontaneous groups, and so on. Michele and I did a validation study, building on the previous work of the Crystals Project, a joint effort in crowd simulation. Our model worked well, matching the expected timings and behaviours, at least at reasonable agent densities; since BOIDS is a simple reactive multi-agent system, our pedestrians don't embed enough "intelligence" and can easily fall into subtle culs-de-sac. It's a good result anyway, and leaves a lot of room for further investigation.

Friday, 4 March 2011

The BOIDS are back in town!



During the last lesson of the complex systems course, prof. Vizzari showed a crowd simulation based on BOIDS, an intriguing model invented by Craig Reynolds in 1986 that simulates bird flocks, fish schools and all kinds of self-organizing masses of living stuff. The model rests on three simple assumptions:
  1. each boid moves toward the other flockmates (cohesion)
  2. each boid moves away from crowded situations (separation)
  3. each boid follows the direction of the rest of the flock (alignment)
Even in its simplest formulation the model works very well, and gives a pleasant feeling of believability. I turned the model from 3D to 2D by putting all boids on the y=0 plane, and there we go: we have a simple crowd simulator.
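The three rules above fit in a few lines of C++. This is a toy 2D sketch with illustrative weights and names (not the actual PLT code), but it shows how little machinery the model needs:

```cpp
#include <cassert>
#include <vector>

// Minimal 2D boid sketch; struct names and rule weights are illustrative.
struct Vec2 { float x = 0, y = 0; };

static Vec2 operator+(Vec2 a, Vec2 b) { return {a.x + b.x, a.y + b.y}; }
static Vec2 operator-(Vec2 a, Vec2 b) { return {a.x - b.x, a.y - b.y}; }
static Vec2 operator*(Vec2 a, float s) { return {a.x * s, a.y * s}; }

struct Boid { Vec2 pos, vel; };

// Rule 1: steer toward the centroid of the flock (self included, for brevity).
Vec2 Cohesion(const Boid& self, const std::vector<Boid>& flock) {
    Vec2 center;
    for (const Boid& b : flock) center = center + b.pos;
    center = center * (1.0f / flock.size());
    return (center - self.pos) * 0.01f;   // gentle pull toward the center
}

// Rule 2: move away from neighbours that get too close (separation).
Vec2 Separation(const Boid& self, const std::vector<Boid>& flock) {
    Vec2 push;
    for (const Boid& b : flock) {
        Vec2 d = b.pos - self.pos;
        float dist2 = d.x * d.x + d.y * d.y;
        if (dist2 > 0 && dist2 < 1.0f) push = push - d;  // repel inside radius 1
    }
    return push;
}

// Rule 3: match the average heading of the flock (alignment).
Vec2 Alignment(const Boid& self, const std::vector<Boid>& flock) {
    Vec2 avg;
    for (const Boid& b : flock) avg = avg + b.vel;
    avg = avg * (1.0f / flock.size());
    return (avg - self.vel) * 0.125f;
}
```

Each step, the three contributions are simply summed into the boid's velocity; everything else (queues, clusters, lanes) emerges on its own.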

There are several approaches to crowd simulation (cellular automata or multi-agent systems, for example), but I do think BOIDS has many interesting capabilities.

In fact, one of the open issues in crowd simulation is the spontaneous formation of groups due to common goals, shared customs, socio-political aspects, family relationships and so on. BOIDS leaves a lot of room for embedding this wide spectrum of behaviours, while the model itself maintains the proxemic distances between people and lets them gather peacefully, by the rules described above.

The video I embedded shows my model in a simple scenario made of four groups (yellow, red, green and blue). Green people have their own goal to follow; red and blue share one. Yellow people just walk around the space, with no specific purpose. In the middle of the screen there's a strong repulsor which keeps boids away. The dynamics work as I expected: clusters form, and boids belonging to the same group tend to aggregate and find their way even in overcrowded situations. What's fascinating is the emerging, unexpected behaviour of the yellow ones: some hold still, some dodge the crowd, and many others enter the moving groups and follow them wherever they go, conditioned by the first rule. Very nice.

Hajj - the annual pilgrimage to Mecca - is one of the most interesting scenarios to simulate
Still, there's a lot of work to do, which I'll share with my brilliant coursemate Michele. For instance, we'll try to identify the parameters that fit the experimental results, incorporating Elias Canetti's proxemic dynamics as well. We could also find a way to introduce different kinds of non-penetrable constraints (walls, doors, and such). There's the problem of instructing complex goals (like finding the way to an exit in non-straight situations, or following a leader), with an eye to stability and realism. Finally, there's a lot of optimization to do: I wrote the code as general classes, and new rules can easily be introduced by pushing a new function pointer onto a "rules" vector. That's very flexible and allows very rapid prototyping, but it's very inefficient: the naive algorithm is O(n²), because every boid's position needs to be compared with the rest of the flock, and every rule is computed on its own, adding a constant factor. Too slow for serious business like sport events, concerts, and religious pilgrimages, just to name a few of the typical simulations that require millions of individuals at once.
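The "rules vector" design mentioned above can be sketched like this (names are illustrative, not PLT's actual classes); it also makes the cost visible: each rule rescans the whole flock, hence O(n²) per rule:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Each steering rule is a callable pushed into a vector; the update loop
// sums their contributions. Illustrative sketch, not PLT's actual code.
struct Vec2 { float x = 0, y = 0; };
struct Boid { Vec2 pos, vel; };

using Rule = std::function<Vec2(const Boid&, const std::vector<Boid>&)>;

struct Flock {
    std::vector<Boid> boids;
    std::vector<Rule> rules;   // push a new behaviour here to extend the model

    void Step() {
        for (Boid& b : boids) {
            for (const Rule& r : rules) {      // constant factor: one pass per rule...
                Vec2 f = r(b, boids);          // ...and each rule scans the flock: O(n^2)
                b.vel.x += f.x; b.vel.y += f.y;
            }
            b.pos.x += b.vel.x;
            b.pos.y += b.vel.y;
        }
    }
};
```

A spatial partition (grid or k-d tree) limiting each rule to nearby neighbours is the usual way out of the O(n²) trap.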

Friday, 11 February 2011

A softer way of life

Soft computing course: completed! A sparkling 30/30 cum laude closes an astonishing, inspiring experience.

I chose this course primarily because of simulated annealing and neural networks, as I had studied these techniques in a very formal way (cybernetics) and from a philosophical point of view (epistemology). I was interested in a more concrete approach, and I found it here. But there was much more!

I discovered a wide range of special neural networks, Hopfield's and Kohonen's in particular, which show several interesting capabilities I'd like to experiment with soon. I finally studied the infamous back-propagation, able to learn non-linearly separable tasks, and that was good. But despite my initial expectations, these weren't the most interesting parts.

Genetic algorithms, for instance. It's funny to think of data structures "sharing genetic material", maybe discriminating between partners according to their own sexual tastes (Booker's model), with their children subject to random mutations. Well, believe it or not, that funny stuff works! My experimental data clearly show that GAs are a strong, robust way to tackle NP problems. The drawback is the difficulty of modelling the solution search space with simple codified strings. Moreover, a genetic algorithm can find a global optimum for a specific problem, but not for whole classes of problems.
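For the curious, the whole selection/crossover/mutation loop fits in a toy GA on bit strings (OneMax: fitness = number of 1 bits). Operators and parameters here are generic textbook choices, not the course's exact setup:

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Toy GA: tournament selection, one-point crossover, per-bit mutation.
using Genome = std::vector<int>;

static int Fitness(const Genome& g) { return std::count(g.begin(), g.end(), 1); }

Genome Evolve(int genome_len, int pop_size, int generations, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> bit(0, 1);
    std::uniform_int_distribution<int> pick(0, pop_size - 1);
    std::uniform_int_distribution<int> cut(1, genome_len - 1);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    auto fitter = [](const Genome& a, const Genome& b) { return Fitness(a) < Fitness(b); };

    // Random initial population.
    std::vector<Genome> pop(pop_size, Genome(genome_len));
    for (auto& g : pop) for (auto& b : g) b = bit(rng);

    for (int gen = 0; gen < generations; ++gen) {
        std::vector<Genome> next;
        while ((int)next.size() < pop_size) {
            // Tournament selection: the fitter of two random candidates wins.
            const Genome& p1 = std::max(pop[pick(rng)], pop[pick(rng)], fitter);
            const Genome& p2 = std::max(pop[pick(rng)], pop[pick(rng)], fitter);
            // One-point crossover: child takes a prefix of p1 and a suffix of p2.
            int c = cut(rng);
            Genome child(p1.begin(), p1.begin() + c);
            child.insert(child.end(), p2.begin() + c, p2.end());
            // Random mutation: flip each bit with small probability.
            for (auto& b : child) if (unit(rng) < 0.01) b = 1 - b;
            next.push_back(std::move(child));
        }
        pop = std::move(next);
    }
    return *std::max_element(pop.begin(), pop.end(), fitter);
}
```

Swapping `Fitness` for a decoder of "codified strings" into candidate solutions is exactly where the modelling difficulty mentioned above shows up.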

So here comes Genetic Programming. Yep, double upper case here. Whereas GAs were funny because of data sharing genetic information, what we have here are copulating programs... so nice! Massive derivation trees create brand new offspring generation after generation, letting new abilities emerge and spontaneously dropping useless ones (a process called automatic feature selection). The way GP works immediately calls for a LISP implementation, and that's what I tried to do... even though I'm still stuck on a purely functional, recursive crossover function.

Finally, my personal favourite model, the one I dedicated my blog to: Particle Swarm Optimization. Yeah, triple upper case. As I wrote in the past, I love the spectrum of emergent behaviours this bunch of simple coordinates shows. Just like Conway's Game of Life, there seem to be unpredictable lifeforms inside the model, and that is cool. Moreover, it can be quickly extended:


This example is an interesting variant of PSO that prof. Vanneschi proposed, named Multiswarm Repulsive PSO (MRPSO). It features more than one swarm, living on different "islands", all looking for the same global optimum. Every few generations, the best individuals are cloned and replace the worst ones by migrating to the next island. Moreover, on some islands the particles run away from the best individual instead of following its "pheromone trails". This approach addresses two possible drawbacks of PSO: the lack of diversity, and the individuals' tendency to blindly follow the leader, thus staying away from interesting unknown zones of the search space. The performance boost is stunning, and there's still a lot of room for inventing new ways to improve it!
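As a baseline for comparison, here's what a bare-bones single-swarm PSO looks like, minimizing the sphere function f(x, y) = x² + y². The inertia and acceleration coefficients are common textbook values, not necessarily the course's; MRPSO's islands, migration and repulsion would be layered on top of this loop:

```cpp
#include <cassert>
#include <random>
#include <vector>

// Plain PSO on a 2D sphere function; parameter values are textbook defaults.
struct Particle {
    double x, y, vx, vy;
    double best_x, best_y, best_f;   // personal best
};

double Sphere(double x, double y) { return x * x + y * y; }

double RunPSO(int n_particles, int iterations, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> pos(-10.0, 10.0), unit(0.0, 1.0);

    std::vector<Particle> swarm(n_particles);
    double gx = 0, gy = 0, gf = 1e100;   // global best ("the leader")
    for (auto& p : swarm) {
        p.x = pos(rng); p.y = pos(rng); p.vx = p.vy = 0;
        p.best_x = p.x; p.best_y = p.y; p.best_f = Sphere(p.x, p.y);
        if (p.best_f < gf) { gf = p.best_f; gx = p.x; gy = p.y; }
    }

    const double w = 0.72, c1 = 1.49, c2 = 1.49;   // inertia + accelerations
    for (int it = 0; it < iterations; ++it) {
        for (auto& p : swarm) {
            // Velocity: inertia + pull toward personal best + pull toward global best.
            p.vx = w * p.vx + c1 * unit(rng) * (p.best_x - p.x) + c2 * unit(rng) * (gx - p.x);
            p.vy = w * p.vy + c1 * unit(rng) * (p.best_y - p.y) + c2 * unit(rng) * (gy - p.y);
            p.x += p.vx; p.y += p.vy;
            double f = Sphere(p.x, p.y);
            if (f < p.best_f) { p.best_f = f; p.best_x = p.x; p.best_y = p.y; }
            if (f < gf) { gf = f; gx = p.x; gy = p.y; }
        }
    }
    return gf;   // best fitness found
}
```

The repulsive variant flips the sign of the global-best term on some islands, which is precisely what keeps the swarm from collapsing onto the leader.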

Monday, 19 October 2009

3D collision detection


3D libraries are conceived as mere sets of drawing primitives, and nothing more. That's why graphics programmers are asked to reinvent the wheel several times.

One of those wheels is the "world space" partitioning. In particular, the collision detection of meshes.

I don't know how serious engines handle it on a per-pixel/per-vertex basis - I think they just do miracles. I implemented something simpler for my PLT, using cheap bounding spheres.

The improved geometry loader "keeps in mind" the minimum and maximum coordinates on each axis while it builds the vertex arrays; at the end of the loading process, it uses those coordinates to find the middle point of the mesh and the farthest vertex from it: these are the bounding sphere's center and radius, respectively.

Checking if a mesh collides with another is straightforward:
  • transform the bounding centers from model to world space, and compute the distance (call it "D") between them
  • sum the two bounding radii (call it "S")
  • if D > S, the meshes don't collide
It's kind of a cheat, but it works.
Of course, using complex meshes will reveal its nature: it's far from accurate. But it's so fast!
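Both steps - building the sphere in the loader and the D-versus-S check - can be sketched in a few lines. Structure names here are illustrative, not PLT's (and in PLT the centers would first be transformed from model to world space):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

struct BoundingSphere { Vec3 center; float radius; };

// Center = midpoint of the axis-aligned extents; radius = farthest vertex from it.
BoundingSphere ComputeBoundingSphere(const std::vector<Vec3>& verts) {
    Vec3 lo = verts[0], hi = verts[0];
    for (const Vec3& v : verts) {
        lo.x = std::min(lo.x, v.x); hi.x = std::max(hi.x, v.x);
        lo.y = std::min(lo.y, v.y); hi.y = std::max(hi.y, v.y);
        lo.z = std::min(lo.z, v.z); hi.z = std::max(hi.z, v.z);
    }
    Vec3 c = {(lo.x + hi.x) / 2, (lo.y + hi.y) / 2, (lo.z + hi.z) / 2};
    float r2 = 0;
    for (const Vec3& v : verts) {
        float dx = v.x - c.x, dy = v.y - c.y, dz = v.z - c.z;
        r2 = std::max(r2, dx * dx + dy * dy + dz * dz);
    }
    return {c, std::sqrt(r2)};
}

bool Collide(const BoundingSphere& a, const BoundingSphere& b) {
    float dx = a.center.x - b.center.x, dy = a.center.y - b.center.y,
          dz = a.center.z - b.center.z;
    float D = std::sqrt(dx * dx + dy * dy + dz * dz);  // distance between centers
    float S = a.radius + b.radius;                      // sum of the radii
    return D <= S;                                      // D > S means no collision
}
```

Comparing D² with S² would even avoid the square root, since both sides are non-negative.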

I think there's a couple of possible optimizations.

The first one is, well, "CAD driven": I could simply cut every model into several "groups", and give each one its own bounding sphere. The drawback, obviously, is that the designer is asked to think of meshes as collections of separate chunks, and everything works better if each partition fills out its "space". Moreover, this solution makes LOD design slightly more complex.

The second one is more challenging: compute the partitions on the fly, during geometry loading, using a stochastic approach: areas denser in polygons are more likely to be relevant to the mesh, and should be enclosed in their own bounding spheres.

Friday, 9 October 2009

The evolution of parallax

Once upon a time, videogames were bidimensional, and it was fine.
One day, in order to give some depth to the scene, programmers introduced a technique called parallax scrolling: by moving the elements of the background at different speeds, a sense of spatiality and immersion emerged.

The sensation was outstanding for the age, but the drawback was a very high computational cost, which kept this strategy confined to 16-bit machines. With the arrival of 3D graphics, parallax scrolling was abandoned.

In 2001, a Japanese researcher named Tomomichi Kaneko introduced a brilliant modification of texture mapping able to provide "the capability to represent the motion parallax effect to the textures placed on 3D objects". His algorithm is basically normal mapping, whose strategy is to use a texture's RGB channels to encode a 3D normal, overriding the vectors on surfaces; Kaneko basically suggested enhancing the feeling of movement by displacing textures as a function of the view angle in tangent space.

Tangent space itself is the trick behind it all, and it lets us compute things in a comfortable space: instead of doing tons of computations in world space, we "look at things" from the top of the texture.

The generation of the special coordinates (normal, tangent and binormal) is done "client side" during geometry loading, and they are sent to the server as vertex attributes.
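That client-side generation usually boils down to solving a small 2×2 system per triangle, relating object-space edges to texture-space edges. A generic sketch (not PLT's actual loader; the normal is assumed to come from the mesh, and the binormal is computed the same way as the tangent):

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };
struct Vec2 { float u, v; };

// Per-triangle tangent/binormal from positions and texture coordinates.
void ComputeTangentBasis(const Vec3 p[3], const Vec2 uv[3],
                         Vec3& tangent, Vec3& binormal) {
    // Edges of the triangle in object space and in texture space.
    Vec3 e1 = {p[1].x - p[0].x, p[1].y - p[0].y, p[1].z - p[0].z};
    Vec3 e2 = {p[2].x - p[0].x, p[2].y - p[0].y, p[2].z - p[0].z};
    float du1 = uv[1].u - uv[0].u, dv1 = uv[1].v - uv[0].v;
    float du2 = uv[2].u - uv[0].u, dv2 = uv[2].v - uv[0].v;

    // Solve [e1 e2] = [T B] * [[du1 du2], [dv1 dv2]] for T and B.
    float r = 1.0f / (du1 * dv2 - du2 * dv1);
    tangent  = {(e1.x * dv2 - e2.x * dv1) * r,
                (e1.y * dv2 - e2.y * dv1) * r,
                (e1.z * dv2 - e2.z * dv1) * r};
    binormal = {(e2.x * du1 - e1.x * du2) * r,
                (e2.y * du1 - e1.y * du2) * r,
                (e2.z * du1 - e1.z * du2) * r};
}
```

Per-vertex attributes are then obtained by averaging (and re-orthogonalizing) the bases of the triangles sharing each vertex.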

Normal mapping itself requires at least a couple of textures to be sent and, guess what, I faced the limits of DirectX9-class boards again. I wanted to keep at least 4 lights in the scene and use three textures (diffuse, normal map, and gloss map). I quickly overflowed the capabilities of my video card and learnt an interesting thing.

GLSL is paranoid

I'm not sure it's a GLSL thing, but that's how it is: it's paranoid. It expects the worst scenario to happen, and therefore does not optimize a thing. Let me explain.

Suppose you have a code similar to this:

void DoSomething(int number_of_light) {
  ...
  Use( gl_LightSource[number_of_light] );
  ...
}

void main() {
  ...
  for (int i = 0; i < TOTAL; i++) DoSomething(i);
  ...
}
What do you expect? You may think your GLSL compiler will smartly recognize the boundaries of your request and fetch from memory just the information about the TOTAL lights you're going to use. Well, wrong: it will allocate the information for all gl_MaxLights light sources as uniforms. A mess. By keeping your code so general, you instantly reach the limits of the card and the program does not link anymore. I didn't know about this (I'd never read about it) and there's no way to optimize it. You must do separate fetches from each specific light source you want to use in the scene; that way you keep the uniform count down. Weird.

Tuesday, 6 October 2009

Sprites, animations, and 3D textures


During the 80s, real 3D was just a chimera. Apart from a few pioneering, embryonic vector engines, computer graphics was achieved using small bitmap images stored in contiguous memory locations: the sprites.

Borrowing from traditional animation techniques, sprites were animated by drawing the frames in separate images (a classic example) and displaying them in sequence, creating a feeling of dynamism.

Before the transition from 16- to 32-bit systems, many hardware engineers introduced solutions for smoothly rendering lots of sprites on screen (e.g. DMA, dedicated co-processors, and so on), but they were soon forgotten as full 3D engines appeared and stole the scene.

So, why should anyone bother bringing sprites back from the past? Well, they can be useful even today, because some aspects of computer graphics work better pre-rendered: explosions, flames, debris, smoke and all that stuff that's too expensive to compute in real time.

A higher dimension
As I wrote, sprite animations used to be stored in different memory locations, and the pointer to the current frame was given to the scanline renderer. We could do the same now, creating several bitmaps and rendering them on screen. But this idea has a couple of drawbacks: we need to create many different texture objects for the same "thing" and, in order to obtain a smooth transition, we must load plenty of slightly different images.

Another idea could be to load one huge texture featuring all the frames, and display the current one by playing with offsets: this fixes the first issue, but not the second.

A nice way to achieve both is to use a 3D texture. A 3D texture is, basically, a stack of ordinary 2D textures. The number of "slices" is called "depth", and it's a parameter we must provide at initialization. Pay attention here: as with ordinary textures, even the depth must be a power of 2.

Why do 3D textures overcome both drawbacks? First, they aggregate lots of textures into one single texture object. Second, when you request an intermediate slice, you are given the interpolated texture. This comes transparently by turning on linear filtering, and does not impact performance.
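Addressing frames inside the stack is then just arithmetic on the third texture coordinate. These two helpers are illustrative (not PLT's API): one enforces the power-of-two rule, the other maps an animation frame index - possibly fractional, so that linear filtering blends two adjacent slices - to the r coordinate:

```cpp
#include <cassert>

// Depth of a 3D texture must be a power of two (like the other dimensions).
bool IsPowerOfTwo(unsigned n) { return n != 0 && (n & (n - 1)) == 0; }

// Texture-space r coordinate of animation frame `frame` in a stack of `depth`
// slices: slice i is centered at (i + 0.5) / depth. A fractional frame lands
// between two slices, where GL_LINEAR filtering returns the blend.
float FrameToR(float frame, unsigned depth) {
    return (frame + 0.5f) / depth;
}
```

The texture itself would be uploaded once with glTexImage3D (with GL_LINEAR min/mag filters), and the renderer only animates r from 0 towards 1.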

A nice explanation of this technique can be found here. The next step is billboarding, that is, keeping the sprites facing the viewer. There's a cool NeHe tutorial about it. If you want to play with sprites but don't have the time/skill to produce your own explosions, check this out.

Tuesday, 29 September 2009

Fixed functionality beats programmable pipeline 10:0?


Something's terribly wrong somewhere... the Blinn-Phong shader is wasting all the GPU's power!

I've done some optimization, but I'm afraid that writing general code, good for all seasons, is not the best way to achieve performance.

I can't imagine how many compromises serious 3D engines hide under the hood!

Monday, 14 September 2009

Meshes from points

It's been a hard job, but an interesting experience. My 3D engine "PLT" now features its own Delaunay triangulation. Now I can turn an arbitrary set of points into a continuous, smooth, three-dimensional mesh.

The algorithm is well explained on Wikipedia, and essentially consists in breaking an encompassing triangle into smaller ones, ensuring that no triangle "contains" any other vertex.

This check is the heart of the algorithm, and it's achieved by verifying that every other vertex lies outside the circumcircle identified by the three points of each triangle.
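That circumcircle test is typically implemented as the sign of a 3×3 determinant, avoiding any explicit circle construction. A sketch (without the robust arithmetic a production triangulator would want):

```cpp
#include <cassert>

struct Pt { double x, y; };

// In-circle predicate: for a counter-clockwise triangle (a, b, c), the result
// is positive iff d lies strictly inside their circumcircle, negative if
// outside, zero if on it. Standard computational-geometry determinant.
double InCircle(Pt a, Pt b, Pt c, Pt d) {
    double ax = a.x - d.x, ay = a.y - d.y;
    double bx = b.x - d.x, by = b.y - d.y;
    double cx = c.x - d.x, cy = c.y - d.y;
    return (ax * ax + ay * ay) * (bx * cy - cx * by)
         - (bx * bx + by * by) * (ax * cy - cx * ay)
         + (cx * cx + cy * cy) * (ax * by - bx * ay);
}
```

With plain doubles the predicate can misjudge nearly-cocircular points; exact or adaptive arithmetic (à la Shewchuk) is the usual cure.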

There's a lot of geometry, trigonometry and even a little bit of calculus inside this piece of software; the discrete algebra and calculus exams found a practical, consistent application.

The next step is to speed it up: the algorithm is far from its potential O(n log n) and must be optimized. It currently seems even worse than the expected O(n²), probably due to the STL's internal sorting algorithms.

Thanks to Erazor for the idea and Gerry for the math support. =)

Saturday, 29 August 2009

The farm of convexity

I'm working on my first scientific visualization "job" these days; it's a computational fluid dynamics simulation, which has to be displayed in order to make some inferences.

Instead of importing the data into VTK, I loaded it into my 3D framework. And it was a fail, because of three conceptual errors:
  • my engine was written to display polygons; here, instead, we have ~3000 point samples which have to be manipulated in different ways
  • the data was in double precision! my whole framework used single-precision floats, and the loss of data was unacceptable
  • a single point dispersed in a sea of points is not recognizable; it's impossible to make inferences. Therefore, I thought it could be nice to partition the data into sets, and display a convex hull of similar points.

Convex envelopes
Given a set of points, the convex hull is the minimum convex boundary that contains them.

There exist many algorithms for computing it, but my choice fell on Jarvis's march: also called "the gift wrapping algorithm", it consists in... well... rotating around your set trying to keep your sheep inside the corral!
  1. the algorithm starts by choosing a single point which surely lies on the hull (p0 in the image)
  2. you trace a vector from your point to every other point in the set
  3. [important] you compute the angle formed by an axis and each traced vector, and you pick the minimum one by using the arctangent
  4. now you have the second point; iterate from step 2, except that at step 3 you won't just pick the minimum, but also check that the new angle is greater than the last one
If "h" is the number of points in the convex hull, and "n" the number of points in the set, the complexity of this algorithm is O(nh), which is cool.
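The steps above can be sketched as follows; a common variant (used here) replaces the arctangent comparison with the sign of a cross product, which says "left turn or right turn" without ever computing an angle:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct P { double x, y; };

// Cross product of (a - o) and (b - o): positive = b is left of o->a.
static double Cross(P o, P a, P b) {
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// Jarvis's march ("gift wrapping"): O(n*h), as noted above.
std::vector<P> ConvexHull(const std::vector<P>& pts) {
    // 1. Start from a point guaranteed to be on the hull (lowest y, then x).
    std::size_t start = 0;
    for (std::size_t i = 1; i < pts.size(); ++i)
        if (pts[i].y < pts[start].y ||
            (pts[i].y == pts[start].y && pts[i].x < pts[start].x))
            start = i;

    std::vector<P> hull;
    std::size_t cur = start;
    do {
        hull.push_back(pts[cur]);
        // 2-3. Among all candidates, keep the most clockwise one: whenever a
        // point lies to the right of the current edge, it becomes the new pick.
        std::size_t next = (cur + 1) % pts.size();
        for (std::size_t i = 0; i < pts.size(); ++i)
            if (Cross(pts[cur], pts[next], pts[i]) < 0)
                next = i;
        cur = next;   // 4. iterate until we wrap back to the start
    } while (cur != start);
    return hull;
}
```

Degenerate inputs (all points collinear, duplicates) would need extra care; this sketch assumes points in general position.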


Going back to the CFD job, I used the convex hull to display the set of points featuring the highest values. Well, the first screenshot shows how useless it is: the points are too far apart, and this representation, again, does not help us in any way. A better way would probably be to group the closest points together, using some kind of "distance" function. I'll work on it.

Wednesday, 22 July 2009

Wavefront .OBJ files & Vertex Buffer Objects


OBJ files are handy for many reasons:
  • they're text files
  • well documented
  • easy to parse
  • and many professional software packages can export them

This is why they are so popular as standard files for 3D graphics.

Their structure is basically a (long) list of vector definitions (positions, normals, texture coordinates, and so on) followed by a (long) list of structures defining faces. What are faces? Faces are tuples of indexes pointing to specific vertexes, normals and texture coordinates in the previous lists. At least three vertexes are needed for each face, since they form a triangle, the smallest polygon we can create.

Why is this needed? Well, because a single vertex can be shared by many faces (= polygons). That's important in real-time 3D programming, because avoiding redundancy is good for performance!

This is the very same idea behind OpenGL's indexed vertex arrays: rather than sending the video card a list of vertexes, normals and texture coordinates (the so-called immediate mode), we prepare a couple of arrays filled with information, and another array of indexes which tells OpenGL which vertex to use in a certain primitive.

Sounds good, uh? Well, actually there are a couple of drawbacks.

The biggest is that every index points simultaneously into the vertex, normal and texture arrays! For instance, if our first index is "1", that face uses vertex[1], normal[1] and texture_coord[1]. That's a real nightmare, because there's no guarantee that our modeling software will produce .OBJ files featuring correctly ordered lists! Moreover, there are often just a few vertexes (each shared by many faces) and lots of normals!

So what? Well, OpenGL does not aid us in any way: you've got to solve it by hand. My solution is the following:
  • parse the OBJ file, loading vertexes, normals, and faces;
  • from the faces, create a definitive vertex index and temporary arrays for normals and texture coordinates;
  • for each i in number_of_indexes do
    definitive_normal_array[ index_array[i] ] :=
    temporary_normals_array[ temporary_normal_indexes[i] ]
And there you go: you have the correct normals bound to the right vertexes. You even have time to re-normalize them, if needed.
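In C++, the re-indexing loop above is just this (names are illustrative, not PLT's loader; a vertex referenced with two different normals would need to be duplicated first, which this sketch ignores):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Normal { float x, y, z; };

// Copy each face's normal into the slot of the vertex it belongs to, so that
// one shared index can address both the vertex array and the normal array.
std::vector<Normal> ReorderNormals(const std::vector<unsigned>& vertex_indexes,
                                   const std::vector<unsigned>& normal_indexes,
                                   const std::vector<Normal>& temporary_normals,
                                   std::size_t vertex_count) {
    std::vector<Normal> definitive(vertex_count);
    for (std::size_t i = 0; i < vertex_indexes.size(); ++i)
        definitive[vertex_indexes[i]] = temporary_normals[normal_indexes[i]];
    return definitive;
}
```

The same trick works for texture coordinates, with a second temporary array.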

The most interesting way to draw this series of arrays is to use Vertex Buffer Objects (VBOs). They're similar to indexed vertex arrays, but with interesting features... like dropping geometries into the fastest video card memory and using it! The performance boost is impressive, and they are easy enough to deal with.

My PLTmesh::Draw() method is this:

// PLT_VBOnames are references to the VBOs - see glGenBuffers()

glEnableClientState(GL_NORMAL_ARRAY);
glBindBuffer(GL_ARRAY_BUFFER, PLT_VBOnames[2]);
glNormalPointer(GL_FLOAT, 0, NULL);

glEnableClientState(GL_VERTEX_ARRAY);
glBindBuffer(GL_ARRAY_BUFFER, PLT_VBOnames[0]);
glVertexPointer(3, GL_FLOAT, 0, NULL);

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, PLT_VBOnames[1]);

glDrawElements(GL_TRIANGLES, this->IndexesArray.size(), GL_UNSIGNED_INT, NULL);

glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_NORMAL_ARRAY);

glBindBuffer(GL_ARRAY_BUFFER, 0);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
Quite simple and clear! :)

Monday, 20 July 2009

We meet again, dear wheel

Soon after my experience at CINECA, I decided to implement my own 3D engine.

I'm not trying to develop the ultimate graphics experience, of course. I just wanna own a set of libraries for easily handling and displaying scientific information, without too much effort.

Actually, I tried to avoid the huge effort of programming a whole 3D framework from scratch by using OpenSceneGraph; well, I tried, but it turned out that making it work was going to take more time than coding my own! Furthermore, I probably won't always use a scene graph for my representations, so here we go.

The first thing I focused on is shaders: PLT is going to be a GLSL-centric engine. The fixed pipeline is still supported, more or less, but there's no future in it, and it's time to leave it behind.

Besides the qualitative leap that shaders allow (the screenshot shows a comparison between the fixed functionality and Blinn-Phong lighting), there's another huge benefit: the engine can easily be ported to embedded systems. That's the second aspect I'm focusing on: portability. I want PLT to run on every system I could code on: Windows, Linux, and Windows Mobile.

What's missing at the moment? A true geometry loader, for instance. There's already full support for vertex buffer objects, and I can manually load vertexes, normals and so on, but I still have to code at least a handy .OBJ loader.

As I said, there's no scene graph, not even a rudimentary one. I'll surely need a bunch of matrix functions for manipulating my objects loaded server-side.

And naturally, I'm going to extend the PLTlights classes with shadow map support as soon as possible! :)