On the limitations of the id Tech 3 renderer
-
Rookie One.pl
- Site Admin
- Posts: 2752
- Joined: Fri Jan 31, 2003 7:49 pm
- Location: Nowa Wies Tworoska, Poland
- Contact:
As per Tourist-tam's request.
I mainly conducted practical experiments with a lot of objects being rendered simultaneously.
The test system is a Pentium 4 3GHz on an Intel i915G chipset-based mainboard with 1.5 GB of DDR400 RAM and a Galaxy GeForce 8600GT with 256 MB of DDR RAM. It can pull off Crysis on High Quality settings at an average FPS of 30.
I created maps consisting of a small, single room filled with multiple objects (either detail brushes or entities, to ensure no impact on VIS) that produced the required number of triangles and vertices. I deliberately used single-stage shaders to minimize the workload. The results were rather disappointing - once the number of triangles rendered exceeded about 50 thousand, the FPS started to drop dramatically, falling to 24 at a triangle count of 125,000. The performance of both engines was almost identical.
Now for the theoretical part. First, a bit of glossary. In OpenGL terminology, the client is the software that uses OpenGL, together with its environment (i.e. the CPU and system RAM), while the server is the 3D video device with its environment (the GPU and the video RAM).
The id Tech 3 engine uses vertex arrays to render stuff. This technique was one of three available back in 1999 (the others being immediate mode and display lists; you can read about them at the aforementioned website, too). Vertex buffer objects were invented later to tackle the performance problems. Vertex arrays were actually the only way to go. Display lists were out of the question because they are static (i.e. their vertex data cannot be changed once they're compiled), and immediate mode is extremely inefficient, as it creates a tremendous amount of overhead by making individual function calls for each piece of vertex information. Why do function calls create overhead? Because each one requires at least one additional CPU tick. Multiply those ticks by the number of vertex data calls (one for the vertex position, one for texture coordinates and one for colour data, so that's at least 3 per vertex, and there are situations where it takes more) and you'll end up losing a considerable amount of precious microseconds.
But back to vertex arrays. The whole point of this technique is that while all the geometry data (not the textures, just the vertices and triangle indices) remains in client memory (the system RAM), the number of calls is reduced - you only give the server a couple of pointers to arrays of relevant data and then tell it which ranges of that data it should use to draw stuff. This lets the application keep modifying the vertex data on the fly while reducing the number of function calls, often to a single glDrawElements() call.
And the ability to modify that data is the key to the whole shader system. All those nifty texture and vertex deformations - they either change the texture coordinates or the vertex positions. Of course, in more or less sophisticated ways, but still. That's all there is to it.
I should also mention that all those shader modifier calculations are done by the CPU. So, even on modern systems, a complex scene (with, say, >100k triangles) can put quite a lot of load on your CPU while the GPU sits almost idle.
Another thing is the way multi-stage shaders are handled. I don't know if you were aware of this, but for each shader stage, all the elements in the scene bearing that shader must be drawn again. Yes, you read that right - if you have, let's say, a weapon with a 4-stage shader, it will be drawn 4 times. So, 4 times the memory consumption, 4 times the calculation time. See, power comes at the cost of performance. Of course, the engine tries to collapse multi-stage shaders into single-stage ones, but that isn't always possible.
And last but not least, there is a HORRIBLE duplication of data. When, let's say, a player model is supposed to be rendered into the scene, all of its triangle and vertex data is copied from a memory pool containing the data loaded from the model file into the vertex array used for rendering. And it's like that with everything - the world brushes, patch meshes, static models, everything - and it's done every single frame. It 1) is time-consuming and 2) eats ridiculously large amounts of memory. Once again, the shader system is to blame, because if the world were static (i.e. had no dynamic vertex data), it could be turned into a display list compiled at map load time, speeding things up by several orders of magnitude.
As I said, all of this was quite acceptable back in 1999, but on today's hardware it doesn't perform very well on high-complexity scenes. Of course, there is a way to modernize the renderer - rewrite it to use vertex buffer objects and vertex programs (a.k.a. vertex shaders). The latter could be used to emulate most, if not all, of the Quake shader features. However, this would drastically raise the hardware requirements (relative to the original game), as the technology in question requires hardware a few generations newer than what was available in 2002. Plus, there are still some pitfalls, as you can see in the performance of the XreaL engine (indisputably the most advanced and up-to-date of all the id Tech 3-derived renderers) - while superior to all the other id Tech 3 solutions at rendering high-complexity scenes, it is still unsatisfactory for the large outdoor environments that modern game engines can pull off (it starts choking at 250k triangles).
With all of this said, modernizing the renderer would be a job which is well over my head. I lack the necessary experience and knowledge, there is no one else to do the work and the time expense also would not be justified IMHO. That's all I have to say on the subject.
-
morris
I know at least one thing already made with Quake and Ogre3D:

http://www.ogre3d.org/gallery/albums/al ... stein1.jpg (sorry image too large)
Maybe this example works better:
http://www.ogre3d.org/gallery/albums/al ... apping.jpg (large too)
I'll post a list later; I've found a few GPL renderers.
Apparently the key is "Vertex Buffer Objects".
//edit
Maybe there are more things needed, or maybe I misunderstood it.
Here's a list of projects using OGRE3D (http://www.ogre3d.org/index.php?set_alb ... _album.php).

And it has that "parallax mapping (offset mapping, virtual displacement mapping)" thing (someone asked for it)...
Quake3 BSP Level Renderer wrote: Here's a few shots of the Quake3 level renderer inside OGRE, which has recently been enhanced to support curved surfaces thanks to the bezier patch support added recently to the engine (see below). It also supports advanced texture effects, animated textures, and efficient render state management, meaning the frame rates are pretty good. All the specifics of rendering an indoor level of this kind are hidden within a SceneManager subclass called BspSceneManager, which is automatically used if you request a scene manager specialised for indoor rendering by calling Root::getSceneManager(ST_INDOOR). You then load a level by using the SceneManager::setWorldGeometry(...) method. All the features like curves, texture effects and render state management are the same as in the rest of the core engine; there are no 'special implementations' for this type of scene.
At the moment the code loads the Quake3 level from a .bsp file and converts it on the fly to OGRE's internal BSP format.
I chose to write a Quake3 level renderer now (rather than extending more general features of the engine like particles, collision detection etc) because I wanted to test out my theories that you could generalise a scene-oriented rendering engine like this and still support very specialised rendering approaches like those used by Quake3.
The renderer is not 100% perfect - MD3 files for things like gargoyles are not being loaded, a few shader scripts are not displaying properly, and there are a few other minor issues. However, I didn't intend to get this Q3A-perfect - this is just a demonstration of what Ogre is capable of.
Apparently OGRE3D uses "Vertex Buffer Objects", which upload entire arrays to the graphics card and render them there, making the process far faster than "Vertex Arrays".
About wrote: Want to render indoor levels fast? Fine, use the BSP/PVS plugin scene manager which has already been written. Want an outdoor landscape? Again, use another plugin scene manager. The rest of the engine continues to function exactly as before.
Last edited by morris on Sat Jul 19, 2008 7:28 am, edited 2 times in total.
-
Rookie One.pl
Yeah, I know of those 3D renderers, already tried a couple of them out. I especially like OGRE3D, IMHO by far the best free 3D renderer (most flexible, gives most control over the 3D scene), very easy to use. Crystal Space is fine, too, but it's quite monolithic and crafted specifically for CEL, which makes it a tad difficult to use solely as a rendering module (as opposed to a whole game engine).
But anyhow it is not so easy to integrate a renderer like that into an existing engine. Language mismatch (C vs C++ - getting the Q3 source to even compile at all in C++ mode is a major pain in the bum), architecture differences. It's not just drag-and-drop, it requires serious engineering.
-
morris
I don't know, maybe we could discuss more? There's no release date, btw.
Rookie One.pl wrote: Yeah, I know of those 3D renderers, already tried a couple of them out. I especially like OGRE3D, IMHO by far the best free 3D renderer (most flexible, gives most control over the 3D scene), very easy to use. Crystal Space is fine, too, but it's quite monolithic and crafted specifically for CEL, which makes it a tad difficult to use solely as a rendering module (as opposed to a whole game engine).
But anyhow it is not so easy to integrate a renderer like that into an existing engine. Language mismatch (C vs C++ - getting the Q3 source to even compile at all in C++ mode is a major pain in the bum), architecture differences. It's not just drag-and-drop, it requires serious engineering.
MoHAA has a large community; we just can't focus on things that would be good for everyone.
I don't know why EA makes things so difficult...
-
Rookie One.pl
@Morris: sure we can discuss. If you think you can do it, go ahead and integrate that renderer into OMoHAA. I know I can't.
@Dan: qcommon/qfiles.h contains what we know so far about the data structs of the MoHAA BSP format. Also, qcommon/cm_load.c and renderer/tr_bsp.c contain the code that loads the various parts of the BSP.
-
brendank310
- Corporal
- Posts: 27
- Joined: Mon Apr 07, 2003 11:02 am
- Contact: