Simple LOD

Introduction

Level Of Detail technique (or further simply “LOD”) is a graphics technique when an object has more than one visual representations. One of these representations is picked to be drawn in any single time. In most cases the less visible object is, the less detail level should be picked. Assumes LOD 0 is most detailed level. It is up to developer to pick number of LODs to use. Detailization can involve polygon count, different materials and additional effects; because when some object is far far away from camera and the drawing results couple of pixels on the screen, there is no need to draw it with shader materials or spend hundred of polygons.

When use LOD?

In order LOD should be used everywhere if it gives overall performance boost. This technique is very useful for games like TES 3-4, Gothic 1-4, Dragon Age 2; also quite crucial for MMOs like WOW, Lineage 2, EVE Online and so on. LOD can be less prioritized for games like Starcraft 2, Doom 3 and other games where camera position or level design help to control amount of visible objects.

In this article i would like to show you how to use LOD in practical way. We will develop small application, and will try to solve one by one problems with performance.

Our case

Lets take space theme, where planets, asteroids and stars; there is no huge variety of object types and the distances quite large. For the simplicity (to skip asteroids), we will draw only planets. Sphere is a model of a planet, so we can use embedded GeometryCreator to make it. Also we will animate each sphere by rotating it in its own direction, to make impossible just to batch all the planet geometry into mesh buffer and draw it all with few calls very fast.

For the start, lets generate all LOD levels for our object; we are going to store them in single list and use only references for every planet. That will save us memory greatly.

// generate all LODs of mesh

List<Mesh> lodMesh = new List<Mesh>();

int[] p = new int[] { 100, 50, 32, 20, 12, 6, 3 };

for (int i = 0; i < p.Length; i++)

{

Mesh m = scene.GeometryCreator.CreateSphereMesh(50, p[i], p[i]);

MeshBuffer mb = m.GetMeshBuffer(0);

mb.Material.Type = MaterialType.Solid;

mb.Material.SetTexture(0, driver.GetTexture("../../media/earth.jpg"));

m.SetMaterialFlag(MaterialFlag.Lighting, false);

lodMesh.Add(m);

}

lodMesh stores list of meshes for each LOD level. p array defines detalization level for each level; particular value defines how much rows and columns the sphere will contain. Its not hard to calculate number of triangles: sphere is been generated from square polygons, each one consists of two triangles; this means that sphere with LOD 0 contains 100 * 100 * 2 = 20,000 triangles -- this is our most detailed level. We have 7 levels in total (from 0 to 6 inclusive), this number was picked by me, but it can be literally anything.

How to pick total number of LODs? Well, it up to you, but I would recommend to keep in mind next rules:

in order we have to minimize number of LODs, since more levels takes more memory to store them;
we have to pick the number high enough, because one frame you can draw LOD 0 and next frame you can draw LOD 1, and if there is a big difference (a noticeable difference) between rendering results, then user will see nasty effect (like object was replaced by another object);
try to test picked value in real project and pick the “golden mean”;

So we picked 7, and now we can estimate how many triangles the minimum and the maximum we would need to draw for certain amount of planets. Lets define number of planets to be 5,000, well this value I picked only to make sure that drawing 5000 spheres with 20000 triangles each should be a nonsense without LOD. The minimum and the maximum will be next:

5,000 * 20,000 = 100,000,000 triangles. Exactly this amount we need to draw all the planets with LOD 0. It is insanely a lot if we want to draw them 40+ times per second. Indeed it can happen and we will need to draw that amount even with LOD -- it depends how the objects will be placed in the space, yes yes, if we gather all of them in one place (like in same position) then LOD won't help us. So it is very important to arrange objects evenly in space (or on the terrain in case of shrubs or trees). The person who is a level designer should understand this, but in our case we will arrange all the planets randomly and evenly in space;
5,000 * 18 (which came from 3 * 3 * 2) = 90,000 triangles. This is the minimum amount which will be drawn in any case. This is quite a lot if we take into account the fact that on the last LOD the object will draw into couple of pixels on the screen. It is 18 triangles! Indeed we can draw only 2 triangles which forms a billboard (its a quad which is faced the camera all the time), but there are some hidden issue: drawing of 18 triangles is faster than calculating rotation for a billboard and regenerating its texture since planet actually rotates. Other thing is that it can be optimized too, we can recalculate rotation and regenerate the texture quite rarely because the planet is too far away to notice the change. So it is two ends of the same stick.

As we have list of all LOD levels for our future planets, we can define a class that will hold all related info to a single LOD object (a planet in our case).

class LODItem

{

IrrlichtDevice device;

VideoDriver driver;

List<Mesh> meshLODs;

Matrix transformation;

Vector3Df rotationVector;

int currentLOD;

static public LODItem Create(IrrlichtDevice device, List<Mesh> meshLODs, Matrix transformation, Vector3Df rotationVector)

{

LODItem n = new LODItem();

n.device = device;

n.meshLODs = meshLODs;

n.transformation = transformation;

n.rotationVector = rotationVector;

n.currentLOD = meshLODs.Count - 1;

n.driver = device.VideoDriver;

return n;

}

readonly float[] lodDistance = new float[] {

150, // LOD 0: 0 ... 150

300, // LOD 1: 150 ... 300

600, // LOD 2: 300 ... 600

1200, // LOD 3: 600 ... 1200

2500, // LOD 4: 1200 ... 2500

6000, // LOD 5: 2500 ... 6000

-1 // LOD 6: 6000+; value "-1" doesn't play any role,

// because anything that is further than 6000 will stay at the last available LOD, which is 6

};

public void Draw(uint time, Vector3Df cameraPosition)

{

// animation

transformation.Rotation = rotationVector * time;

// recalculate current LOD

currentLOD = meshLODs.Count - 1;

float distance = (transformation.Translation - cameraPosition).Length;

for (int i = 0; i < lodDistance.Length - 1; i++)

{

if (distance < lodDistance[i])

{

currentLOD = i;

break;

}

// drawing

driver.SetTransform(TransformationState.World, transformation);

MeshBuffer mb = meshLODs[currentLOD].GetMeshBuffer(0);

driver.SetMaterial(mb.Material);

driver.DrawMeshBuffer(mb);

}

As you see, we have list of meshes, transformation matrix (where we store position and rotation) and rotationVector which defines rotation direction and rotation speed. We will create planets by calling Create() method. lodDistance array defines a distance for every LOD level when it will be actually used for representing a planet on the screen.

Draw() method takes a timestamp and camera position, first one we need to properly animate the object, second one need to calculate correct LOD level. Next is a drawing part: setting up world matrix, the material and drawing single mesh buffer. If our model would be more difficult (more than one mesh buffer), we would go through all the mesh buffer and draw them all (but for simplicity, I draw only one occupied mesh buffer in code above).

Now lets fill the space with the planets (LODItem objects).

// generate world,

// we generate a lot of objects with random positions in 20k x 20k x 20k virtual cube

int lodItemCount = 5000;

int virtualCubeSide = 20000;

LODItem[] lodItems = new LODItem[lodItemCount];

Random r = new Random(5555565);

for (int i = 0; i < lodItemCount; i++)

{

Matrix tmat = new Matrix(

new Vector3Df( // translation

r.Next(virtualCubeSide) - virtualCubeSide / 2,

r.Next(virtualCubeSide) - virtualCubeSide / 2));

Vector3Df rvect = new Vector3Df(

(float)r.NextDouble() / 200.0f,

(float)r.NextDouble() / 200.0f);

lodItems[i] = LODItem.Create(device, lodMesh, tmat, rvect);

}

We generated randomly 5,000 planets in space (which is a virtual cube with length from -10,000 to +10,000 on each axis); also each planet got an unique rotation vector.

Next we initialize standard FPS camera, the only important thing about the camera is that we set up FarValue to be 30,000 (because default value is 3,000 and it is not good for us, we don’t want to cut output for very far objects). Also we load the font to draw some stats in future.

Next is going a rendering loop, like so:

while (device.Run())

{

driver.BeginScene();

scene.DrawAll();

uint timer = device.Timer.Time;

Vector3Df cameraPosition = camera.AbsolutePosition;

for (int i = 0; i < lodItemCount; i++)

lodItems[i].Draw(timer, cameraPosition);

// … the code which draws info goes here …

driver.EndScene();

}

The result of our efforts will look like shown below.

In the start camera position (which is 0,0,0) we are drawing 137,578 triangles 16 times per second. Not a big deal. Well if we will take in account the fact that without LOD we would draw 100,000,000 triangles in 50 seconds and therefore would have like 0.02 fps then “yes” we did good job, but i’m sure we can make it run faster if we try to optimize our algorithm. So basically LOD helped us to greatly reduce the amount of drawing triangles, exactly on 99.86% (so we draw actually ~0.14% -- it is percent of 137,578 from 100,000,000). Now our task is to draw them more optimal and faster.

Optimization

Lets look into simple thing first. How much times do we call driver to set up world matrix and the material? 16 fps * 5,000 objects = 80,000 times we call LODItem.Draw() every second, which sets up the same material every time. We cannot do anything with need to set up world matrix, because each object has it own position and changing rotation. But setting the material we can optimize easy, we will set it once in the main rendering loop.

This small change already gives a boost.

It is 5 fps (+25% boost from nothing)! A bit better than before, but still far away from 40+ fps which we aimed for. Now our LODItem.Draw() is been called 21 * 5,000 = 110,000 times per second. If we look closely into it we see:

float distance = (transformation.Translation - cameraPosition).Length;

This is distance calculation. Actually for Length to return a value, it calculates a square root, which is slow (not mention that we do it 110,000 times). But indeed we don’t need the distance itself, we need a value that will help us to define which LOD level should we use now. We can easily use square of distances, so we will not need to extract square root. We will change lodDistance array to contain square distances and we will use LengthSQ instead of Length property of Vecrtor3Df class. After implementing this tiny approach we get 2 more fps.

Next I would like to add code to draw Wireframe mode and Stats mode. The stats will help us to understand what else can be improved. So after those changes we can see next picture.

Lets look on LOD level distribution:

LOD 5: 554 (11%)
LOD 6: 4,405 (88%)
1% - all other LOD levels

So looks like we have 99% of all objects that are quite far away and we update all of them at same speed like for 1% that are close to us. For 99% we calculate LOD value and animation on each frame. We should change this, in the way when we update planets which more far away less frequent. Indeed the object of LOD 6 is so far away that we can update them easily once per second, since we cannot see the change because of their size on the screen. As we have time-based animation (it is when we receive timestamp (any moment in time) and we calculate the state of the object), we don’t have to worry that far away planets will rotate slower than they would be close, no, they will rotate correctly, we just calculate the actual value of the rotation more rarely.

For this change we will add an array into LODItem which will hold time interval for each LOD -- how often it should be updated. If the object should not be updated, it will just draw its previous state.

Why it is an array? Why do we need time interval for each LOD? We could take single value, like 20ms (so its 1000/20 = 50 times per second), which means if our fps is less than 50, we will calculate the state each frame and will not notice and speed up at all. We could take 200ms (so its 1000/200 = 5 updates per second), which will result to good boost, but there will be a problem -- very close objects will be updated also only 5 times per second and it is quite noticeable for them, since they are big on the screen, they will look antsy. That is why we use an array and define own time interval for each LOD.

readonly uint[] updateIntervals = new uint[] {

10,

20,

40,

80,

120,

200,

1500 };

As you see, we use very small value for LOD 0 and 1, to make them to update very very often, for animation to be smooth. We should not be worry that its too often, because as we saw in stats there are too few objects with LOD 0 and 1 we need to draw (in most cases it is less than 5). From other side, we have 4990+ objects that we can draw faster than we spend time on calculation of their rotation vector and LOD value to use. The last LOD is going to get updated once per 1.5 sec -- this is very rare, but we win in performance mostly because of it.

So, after those changes, our state update will look like:

if (time > nextUpdateAt)

{

// animation

transformation.Rotation = rotationVector * time;

// recalculate current LOD

currentLOD = meshLODs.Count - 1;

float distanceSQ = (transformation.Translation - cameraPosition).LengthSQ;

for (int i = 0; i < lodDistanceSQ.Length - 1; i++)

{

if (distanceSQ < lodDistanceSQ[i])

{

currentLOD = i;

break;

}

nextUpdateAt = time + updateIntervals[currentLOD];

}

This should boost overall performance greatly. Lets check it.

Wow 54 fps! Thats is awesome.

But lets try to improve result even more. Right now we draw every planet every frame (doesn’t matter where it is located related to the camera: we can see it or its behind). From the screenshot we see that in current camera position we draw 137,578 triangles, but indeed we may not see even ⅓ part of them. As we all know: the fastest drawn polygon is that one, which we don’t draw at all. So lets try to optimize our application it that way too.

We actually see only polygons that is inside camera’s view frustum. I differ 3 zones:

Visible zone (small zone).
Invisible side zone (it is zone which is located at the from of the camera but invisible since view angle is not 180 degree; it is large zone, located around the visible screen).
Invisible back zone (everything which is behind the camera; it is most large zone).

A visual representation of the zones described above is shown below (simplified, in two dimensions).

So, that is why we actually draw a lot of invisible polygons. We spend time on calling video driver methods to draw them. Most easy here is to exclude from drawing “invisible back zone”. To implement this, we need to provide Draw() method with cameraViewBox argument, which is simply a bounding-box of ViewFrustum of our camera. And straight before the drawing, we will check if the planet position is not behind the camera:

if (!cameraViewBox.IsInside(transformation.Translation))

return;

Checking the result:

Now we draw less triangles indeed, but overall performance not increased a bit, its even dropped. The problem here is that we are checking every single object for its hitting into camera view box. So basically we moved a half of a load from video driver to CPU. That is maybe not bad optimization for some software renderer but not for hardware one.

We do too much math every frame. To avoid that, we have to do it more efficiently, and obviously do less math. We have to make decisions about “what to draw and what not” not for every planet, but for group of planets. Now we will try to group our planets into some sort of sectors, so we could check visibility of the sector and we will make conclusion roughly but faster.

The idea is to split all the planets on sectors, where a “sector” is a virtual cube. Each planet will belong to single particular sector. We will not call Draw() for our planets from main rendering loop as before, but instead we are going to call some Draw() of our sector, which will compute own visibility and if it is not completely invisible -- it will draw all the planets it has.

For those sectors we define a new class -- LODSector. The Draw() method will look like:

public void Draw(uint time, Vector3Df cameraPosition, AABBox cameraViewBox)

{

if (cameraViewBox.IsInside(d1) ||

cameraViewBox.IsInside(d2) ||

cameraViewBox.IsInside(d3) ||

cameraViewBox.IsInside(d4) ||

cameraViewBox.IsInside(d5) ||

cameraViewBox.IsInside(d6) ||

cameraViewBox.IsInside(d7) ||

cameraViewBox.IsInside(d8))

{

for (int i = 0; i < lodItems.Count; i++)

{

lodItems[i].Draw(time, cameraPosition);

}

We test: if any single corner of the cube (from d1 to d8 -- 8 vertices of our virtual cube) is located inside cameraViewBox -- we draw whole the sector. After this change, the drawing in main rendering loop will look like:

for (int i = 0; i < lodSectorSide; i++)

{

for (int j = 0; j < lodSectorSide; j++)

{

for (int k = 0; k < lodSectorSide; k++)

{

lodSectors[i, j, k].Draw(timer, cameraPosition, cameraViewBox);

}

Basically we go through all the sectors and “draw” them. lodSectorSide variable defines the size of lodSectors array. As you see, it is three-dimensional one. In this example i use value 6. It directly influences how many we are going to call cameraViewBox.IsInside() when checking visibility.

When we pick lodSectorSide we should remember that real number of sectors is a cube value (for example, if we would implement LOD for trees on the terrain we could easily use two-dimensional array for sectors). So for value 6 we have 6 ^ 3 = 216 sectors; which means that each sector contains 5,000 / 216 ~ 23 planets in average, meaning that after 8 calls of cameraViewBox.IsInside() we cut off about 23 planets immediately, which is very good for us. For value like 10, it will be 1000 sectors and each will contain 5 planets in average, and that will not give us performance boost we want. In any case, for our particular example i recommend to use values from 5 (125) to 8 (512).

This optimization gives us next boost:

As for me, it is great result! Of course, without sectors we had performance about 52-55 fps independently where we are looking at. Now after flying around, we get fps value from 52 (when we are looking on all the planets from the side, so we see all the sectors) and up to 900+ when we don’t see anything. So in average, we can say that fps had grown on 15-20% -- we have 60-70 fps in most cases.

It looks like shown below.

That is all!

Indeed, there are some ways to optimize even more. There are some of them below:

using the fact that we are processing each sector separately. We can group all sectors into more large sectors. We will win more performance, since we will do decisions that something is invisible much faster by cost of less checks.
using the fact that all our sectors is defined by 8 utmost vertices and we are checking visibility for all of them for every single sector. But indeed, the vertex will be joint for 2, 4 or 8 cubes (and in most cases, exactly for 8). Using this approach, we can do decision about sector visibility much faster. All we need is to regenerate kind of cloud of these vertices, checking visibility of each (which we are doing right now, but we are doing it for the same vertex 8 times in most cases, which is non efficient of cause). Having that cloud of vertices, some particular sector is not visible if 8 certain vertices from that cloud is not visible. Simple and clear.

Illustration of distribution of joint vertices among 2 x 2 x 2 virtual cubes (we have used 6 x 6 x 6 in our application) shown below. You may see that all internal points are joint to 8 cubes.

OK, hope my research with LOD and optimization overall was helpful to you!

Two-minutes video of how it all works you can check here: http://www.youtube.com/watch?v=gobgy9_AM98.

The source code for this article is an example L09.SimpleLOD of the Irrlicht Lime SDK.