Monday, 19 October 2009

3D collision detection


3D libraries are conceived to do just one thing: draw primitives, and nothing more. That's why graphics programmers are asked to reinvent the wheel time and again.

One of those wheels is "world space" partitioning; in particular, collision detection between meshes.

I don't know how serious engines handle it on a per-pixel/per-vertex basis, but I suspect they just do miracles. I implemented something simpler for my PLT, using cheap bounding spheres.

The improved geometry loader "keeps in mind" the minimum and maximum coordinates along each axis while it builds the vertex arrays; at the end of the loading process, it uses those coordinates to find the middle point of the mesh and the vertex farthest from it: these are the bounding sphere's center and radius, respectively.
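A minimal sketch of that pass, assuming the vertices are already available as an array (types and names are mine, not the actual PLT code):

#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

struct BoundingSphere {
  Vec3 center;
  float radius;
};

// Assumes a non-empty mesh.
BoundingSphere ComputeBoundingSphere(const std::vector<Vec3>& vertices) {
  // Track the axis-aligned extremes while walking the vertices.
  Vec3 lo = vertices[0], hi = vertices[0];
  for (const Vec3& v : vertices) {
    lo.x = std::min(lo.x, v.x); hi.x = std::max(hi.x, v.x);
    lo.y = std::min(lo.y, v.y); hi.y = std::max(hi.y, v.y);
    lo.z = std::min(lo.z, v.z); hi.z = std::max(hi.z, v.z);
  }
  // The center is the middle point of the extremes...
  Vec3 c = { (lo.x + hi.x) * 0.5f, (lo.y + hi.y) * 0.5f, (lo.z + hi.z) * 0.5f };
  // ...and the radius is the distance to the farthest vertex.
  float r2 = 0.0f;
  for (const Vec3& v : vertices) {
    float dx = v.x - c.x, dy = v.y - c.y, dz = v.z - c.z;
    r2 = std::max(r2, dx * dx + dy * dy + dz * dz);
  }
  return { c, std::sqrt(r2) };
}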

Checking whether a mesh collides with another is straightforward:
  • transform the bounding centers from model to world space, and compute the distance (let's call this value "D") between them
  • sum up the two bounding radii (let's call it "S")
  • if D > S, the meshes don't collide
It's kind of a cheat, but it works.
Of course, using complex meshes will reveal its nature: it's far from accurate. But it is so fast!
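In code, the whole test boils down to a few lines; a sketch, reusing the Vec3 type above and comparing squared distances so no square root is needed:

// Assumes both centers have already been transformed to world space.
bool SpheresCollide(const Vec3& center_a, float radius_a,
                    const Vec3& center_b, float radius_b) {
  float dx = center_b.x - center_a.x;
  float dy = center_b.y - center_a.y;
  float dz = center_b.z - center_a.z;
  float s = radius_a + radius_b;  // "S"
  // D > S means no collision; equivalently, D^2 > S^2.
  return dx * dx + dy * dy + dz * dz <= s * s;
}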

I think there are a couple of possible optimizations.

The first one is, well, "CAD driven": I could simply cut every model into several "groups", and give each one its own bounding sphere. The obvious drawback is that the designer is asked to think of meshes as collections of separate tranches, and everything works better if each partition fills out its own "space". Moreover, this solution makes LOD design slightly more complex.

The second one is more challenging: compute the partitions on the fly, during geometry loading, using a stochastic approach: areas with a higher polygon density are more likely to be relevant to the mesh, and should be enclosed in their own bounding spheres.

Friday, 9 October 2009

The evolution of parallax

Once upon a time, videogames were two-dimensional, and that was fine.
One day, in order to give some depth to the scene, programmers introduced a technique called parallax scrolling: by moving the layers of the background at different speeds, a sense of space and immersion emerged.

The sensation was outstanding for the age, but the drawback was a very high computational cost, which kept this strategy confined to 16-bit machines. With the arrival of 3D graphics, parallax scrolling was abandoned.

In 2001, a Japanese researcher named Tomomichi Kaneko introduced a brilliant modification of texture mapping able to provide "the capability to represent the motion parallax effect to the textures placed on 3D objects". His algorithm builds on normal mapping, whose strategy is to use a texture's RGB channels to encode a 3D normal, overriding the vectors on surfaces; Kaneko's addition was to enhance the feeling of movement by displacing the texture coordinates as a function of the view angle in tangent space.

Tangent space itself is the trick behind it all, and lets us compute things in a comfortable frame: instead of doing tons of computations in world space, we "look at things" from the top of the texture.
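In tangent space the displacement becomes almost a one-liner. A fragment-shader sketch (names are mine; I'm assuming the height lives in the normal map's alpha channel, and scale/bias are the usual tuning knobs):

uniform sampler2D normal_height_map;  // RGB = normal, A = height (an assumption)
uniform float scale;
uniform float bias;
varying vec3 eye_ts;  // view vector, already rotated into tangent space

void main() {
  vec3 eye = normalize(eye_ts);
  float height = texture2D(normal_height_map, gl_TexCoord[0].st).a;
  // Kaneko-style offset: shift the lookup along the view direction,
  // proportionally to the height sampled at the original coordinates.
  vec2 uv = gl_TexCoord[0].st + (height * scale - bias) * eye.xy;
  // Demo only: show the displaced fetch; real shading would continue
  // with the normal read from the same displaced coordinates.
  gl_FragColor = texture2D(normal_height_map, uv);
}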

The generation of the special coordinates (normal, tangent, and binormal) is done "client side" during geometry loading, and they are sent to the server as vertex attributes.
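A sketch of the per-triangle computation, under the usual assumptions (the normal comes from the mesh itself; per-vertex tangents are then obtained by averaging and renormalizing over the triangles sharing each vertex):

struct Vec2 { float u, v; };
struct Vec3 { float x, y, z; };

static Vec3 Sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static Vec3 Scale(const Vec3& a, float s) { return { a.x * s, a.y * s, a.z * s }; }

void ComputeTangentSpace(const Vec3 p[3], const Vec2 uv[3],
                         Vec3* tangent, Vec3* binormal) {
  // Triangle edges, in model space and in texture space.
  Vec3 e1 = Sub(p[1], p[0]);
  Vec3 e2 = Sub(p[2], p[0]);
  float du1 = uv[1].u - uv[0].u, dv1 = uv[1].v - uv[0].v;
  float du2 = uv[2].u - uv[0].u, dv2 = uv[2].v - uv[0].v;
  // A degenerate UV mapping would make this determinant zero.
  float r = 1.0f / (du1 * dv2 - du2 * dv1);
  *tangent  = Scale(Sub(Scale(e1, dv2), Scale(e2, dv1)), r);
  *binormal = Scale(Sub(Scale(e2, du1), Scale(e1, du2)), r);
}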

The normal mapping itself requires at least a couple of textures to be sent and, guess what, I faced the limits of DirectX9-compliant boards once again. I wanted to keep at least 4 lights in the scene and use three textures (diffuse, normal map, and gloss map). I quickly overflowed the capabilities of my video card and learnt an interesting thing.

GLSL is paranoid

I'm not sure it's a GLSL thing, but that's how it is: it's paranoid. It expects the worst scenario to happen, and therefore does not optimize a thing. Let me explain.

Suppose you have code similar to this:

void DoSomething(int light_index) {
  ...
  // Dynamic indexing into the built-in light array.
  Use( gl_LightSource[light_index] );
  ...
}

void main() {
  ...
  // TOTAL is a compile-time constant, well below gl_MaxLights.
  for (int i = 0; i < TOTAL; i++) DoSomething(i);
  ...
}
What do you expect? You may think that your GLSL compiler will smartly recognize the boundaries of your request, and fetch from memory just the information about the TOTAL lights you're going to use. Well, wrong: it will allocate the information for all gl_MaxLights light sources as uniforms. A mess. By keeping your code this general you instantly reach the limits of the card, and the program does not link anymore. I didn't know about this (I had never read about it), and there's no way to have it optimized automatically. You must write separate fetches for each specific light source you want to use in the scene; that way you keep the uniform count down. Weird.
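A sketch of the workaround, with a hypothetical per-light routine; what matters is that gl_LightSource is only ever indexed with literal constants, so the compiler can see exactly which lights (and which uniforms) are really used:

varying vec3 normal_ts;  // interpolated normal, from the vertex shader

// Diffuse term only, for brevity; a real shader would add the
// specular and gloss-map terms as well.
vec4 OneLight(gl_LightSourceParameters light, vec3 n) {
  vec3 l = normalize(light.position.xyz);
  return light.diffuse * max(dot(n, l), 0.0);
}

void main() {
  vec3 n = normalize(normal_ts);
  vec4 color = vec4(0.0);
  // Literal indices only: four lights referenced, four lights allocated.
  color += OneLight(gl_LightSource[0], n);
  color += OneLight(gl_LightSource[1], n);
  color += OneLight(gl_LightSource[2], n);
  color += OneLight(gl_LightSource[3], n);
  gl_FragColor = color;
}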

Tuesday, 6 October 2009

Sprites, animations, and 3D textures


During the 80s, real 3D was just a chimera. Apart from some pioneering, embryonic vector engines, computer graphics was achieved by using small bitmap images stored in contiguous memory locations: the sprites.

Borrowing from traditional animation techniques, sprites were animated by drawing the frames in separate images (a classic example) and displaying them in sequence, creating the feeling of dynamism.

Before the transition from 16- to 32-bit systems, hardware engineers introduced several solutions for smoothly rendering lots of sprites on screen (e.g. DMA, dedicated co-processors, and so on), but these were soon forgotten as full 3D engines appeared and stole the scene.

So, why should anyone bother bringing sprites back from the past? Well, they can be useful even today, because some aspects of computer graphics work better pre-rendered: explosions, flames, debris, smoke, and all that stuff that is too expensive to compute in real time.

A higher dimension
As I wrote, sprite animations used to be stored in different memory locations, and the pointer to the current frame was given to the scanline renderer. We could do the same now, creating several bitmaps and rendering them on screen. But this idea has a couple of drawbacks: we need to create many different texture objects for the same "thing", and, in order to obtain a smooth transition, we must load plenty of slightly different images.

Another idea could be to load one huge texture featuring all the frames, and display the current one by playing with offsets: this fixes the first issue, but not the second one.

A nice way to fix both is to use a 3D texture. A 3D texture is, basically, a stack of normal 2D textures. The number of layers in the stack is called "depth", and it's a parameter we must provide at initialization. Pay attention here: as with normal textures, the depth too must be a power of 2.

Why do 3D textures give us no drawbacks? First, they aggregate lots of textures into one single texture object. Second, when you request an intermediate layer, you can be given the interpolated texture. This happens transparently by turning on linear filtering, and does not impact performance.
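A minimal upload sketch, assuming the frames are already decoded as tightly packed RGBA pixels (function and parameter names are mine):

#include <GL/gl.h>
// Note: glTexImage3D is core since OpenGL 1.2, but on some platforms
// it must be fetched through the extension mechanism.

// Packs `depth` frames of size w*h into one 3D texture object.
// Remember: w, h, and depth must all be powers of 2.
GLuint CreateSpriteVolume(const unsigned char* frames, int w, int h, int depth) {
  GLuint tex = 0;
  glGenTextures(1, &tex);
  glBindTexture(GL_TEXTURE_3D, tex);
  // Linear filtering across R (the depth coordinate) is what blends
  // two adjacent frames into a smooth in-between image.
  glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
  glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
  glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_R, GL_CLAMP_TO_EDGE);
  glTexImage3D(GL_TEXTURE_3D, 0, GL_RGBA8, w, h, depth, 0,
               GL_RGBA, GL_UNSIGNED_BYTE, frames);
  return tex;
}

In a shader, sampling the volume with texture3D(sprite_volume, vec3(uv, t)) then scrubs through the whole animation as t goes from 0 to 1.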

A nice explanation of this technique can be found here. The next step is billboarding, that is, keeping the sprites facing the viewer. There's a cool NeHe tutorial about it. If you want to play with sprites but don't have the time/skill to produce your own explosions, check this out.

Thursday, 1 October 2009

Does your shader compile seamlessly, but inexplicably fail to link?

Maybe it's due to the GL_MAX_VARYING_FLOATS limit.

If you're going crazy over the error "Fragment shader(s) failed to link", try reducing your explicit -and implicit- varying variables.

It turns out that, for instance, you cannot combine 8 textures and 8 "Blinn-Phong light sources" inconsiderately: if you want to interpolate the texture coordinates and the light vectors, you end up using (8+8)*4 = 64 varying floats, since every built-in varying is a full vec4!

By keeping out the useless texture coordinates (z, w) and the alpha channel, that is, by using custom varyings instead of the built-in ones, you can go down to 8*2 + 8*3 = 40, which is below 44, the limit of the most common DX9-compliant boards.
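The declarations alone tell the story; a hypothetical sketch, shared between the vertex and fragment shaders (names are mine):

// Built-in route: gl_TexCoord[8] costs 8 vec4s, and interpolating 8
// light vectors as vec4s costs 8 more: (8+8)*4 = 64 floats.
// Custom route: declare only the components actually interpolated.
varying vec2 uv[8];        // 8 * 2 = 16 floats, no unused z/w
varying vec3 to_light[8];  // 8 * 3 = 24 floats, no unused w
                           // total: 40 < 44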