Techniques demonstrated:
- Ultra-fast font rendering
- Ultra-fast bitmap scaling
- Ultra-fast bitmap rotation
- Projection of 3D structures to 2D for rendering
- Invisible surface detection
- Ultra-fast rendering of lines and polygons directly into video RAM
- Gouraud shading
- Ultra-fast dithering
- A voxel rendering experiment
- A tiled floor rendering experiment
Summary (just give me the TL;DR)
A collection of very short YouTube videos of graphics demos that I made by myself, at home, for fun, back in the mid-nineties (when I was in my twenties). The demos render pixels directly into the video RAM of the 320x200-pixel, 256-color VGA without the use of any libraries. Everything is in C or C++, with the crucial routines written in 80386 assembly, in some cases generating machine code on the fly.
It was the end of 1993, the peak time of the 320x200-pixel, 256-color VGA. Around that time the world saw some legendary game releases such as Doom and Prince of Persia 2. The year before that, we had seen Wolfenstein 3D and Indiana Jones and the Fate of Atlantis. Wing Commander and Wing Commander II had already been released as early as 1990 and 1991 respectively. Most of those games were raster-graphics-based. Wolfenstein 3D was a weird thing: we did not quite know whether to call it raster-based or vector-based. Doom looked rather vector-based, so it was beginning to seem that vector graphics might be the way of the future; but when LucasArts released Star Wars: X-Wing, even the slightest doubt went away.
My available time for experimentation with vector graphics was limited, since I was attending classes at the University and holding a full-time job, but I managed to put a few demos together. In order to run these demos today one would need a DOS emulator, but I will spare the reader that hassle by providing short videos of the demos instead. (Each video is only a few seconds long.)
It is worth noting that all these demos are DOS applications and that they do not make use of any ready-made graphics libraries. A direct BIOS call puts the graphics adapter in 320x200x256 mode (regular scan lines instead of the exotic Mode X), and from that moment on it is just routines written entirely by me in 80386 assembly and C++, directly accessing the video RAM. Rendering is done top to bottom, left to right, which is inadvisable because it increases contention with video readout by the VGA hardware, but I did not care about that yet: I wanted to first get everything to work correctly before tweaking things to squeeze the maximum performance out of it.
Distortions, jaggedness, and/or fuzziness that you might observe in the following videos are mostly due to the imprecise nature of interpolation via integer arithmetic. The skips that you will notice in the movement are partly due to the emulation of a DOS-era system under modern Windows, and mostly due to the video capturing process. Natively, the demos used to run as smoothly as silk.
I found a way to do interpolation by addition and extraction of the high-order word (division by 65536), which is essentially fixed-point arithmetic. This has the advantage of not requiring a comparison and a conditional jump within the interpolation loop: the only jump is the loop jump. This would be even more advantageous on modern hardware (if we were still doing vector graphics on the CPU, which we are not), since it would not cause branch prediction to fail once every few pixels. Still, even back then, the fewer instructions made a considerable difference.
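To make the idea concrete, here is a minimal sketch in C of interpolation by repeated addition in 16.16 fixed point. It is an illustration written for this post, not the original routine (which was in assembly); the function name `interpolate`, the half-unit rounding bias, and the restriction to small non-negative values are all assumptions of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Fills out[0..n-1] with values running from a to b inclusive.
   The accumulator holds the value multiplied by 65536; the integer part is
   the high-order word. The loop body contains no comparison and no branch. */
static void interpolate(int a, int b, int n, int *out)
{
    /* Assumption of this sketch: small non-negative inputs, so that the
       accumulator never overflows and is never negative when shifted. */
    assert(a >= 0 && a <= 16383 && b >= 0 && b <= 16383);
    assert(n >= 1 && n <= 32768);

    int32_t acc = (int32_t)a * 65536 + 32768;   /* start value plus 0.5 */
    int32_t step = (n > 1) ? (int32_t)(b - a) * 65536 / (n - 1) : 0;

    for (int i = 0; i < n; i++) {
        out[i] = (int)(acc >> 16);              /* take the high-order word */
        acc += step;                            /* one addition per sample */
    }
}
```

Calling `interpolate(0, 319, 200, columns)` would, for example, give the x coordinate of a line on each of 200 scan lines. The single division happens once, outside the loop.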
Text Scrolling and bitmap scaling
This is the very first demo that I ever made. It might seem childish nowadays, but believe me, back then, getting something like this to run so fast was quite something.
Bitmap Rotation and Scaling
The following demo shows rotation and scaling of a bitmap. The underlying algorithm performs line interpolations across the source bitmap to visit the pixels that it copies to the video RAM.
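The following C sketch shows one common way to organize such a routine: for every destination scan line, a straight line is walked across the source bitmap in 16.16 fixed point. The function name `rotate_scale`, its parameters, and the assumption of a square power-of-two source that wraps around at its edges are choices made for this illustration, not details of the original demo.

```c
#include <assert.h>
#include <stdint.h>

/* Copies a rotated and scaled view of src into dst.
   src is src_size x src_size pixels, src_size a power of two (assumption).
   (u0, v0) is the source position of the top-left destination pixel and
   (dudx, dvdx) is the source step per destination pixel, all 16.16 fixed
   point. Negative steps are passed as their two's complement bit pattern. */
static void rotate_scale(const uint8_t *src, int src_size,
                         uint8_t *dst, int dst_w, int dst_h,
                         uint32_t u0, uint32_t v0,
                         uint32_t dudx, uint32_t dvdx)
{
    assert(src_size > 0 && (src_size & (src_size - 1)) == 0);
    uint32_t mask = (uint32_t)src_size - 1;

    for (int y = 0; y < dst_h; y++) {
        uint32_t u = u0, v = v0;           /* start of this source line */
        for (int x = 0; x < dst_w; x++) {
            uint32_t su = (u >> 16) & mask; /* integer part, wrapped */
            uint32_t sv = (v >> 16) & mask;
            dst[y * dst_w + x] = src[sv * (uint32_t)src_size + su];
            u += dudx;                      /* walk along the source line */
            v += dvdx;
        }
        u0 -= dvdx;   /* the next line starts one perpendicular step away */
        v0 += dudx;
    }
}
```

For a rotation by angle θ with zoom factor z, the caller would pass `dudx = cos(θ)/z` and `dvdx = sin(θ)/z`, each scaled by 65536. Unsigned arithmetic is used so that wrapping past the edges of the source is well defined in C.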
"Ark"
This was going to be a 3-D Arkanoid, but it never got made. I no longer remember how I produced the sounds; it is very likely that I was controlling the PC speaker. (Remember that?)
Animated 3-D objects - Wireframe
The next demo is all about vector graphics. You will see various 3-D objects being animated against a procedurally generated backdrop which is made so as to give the illusion of a 3-dimensional room.
On the first run of the demo, the objects are rendered as wire frames. Edges that belong to surfaces that face towards (are visible to) the viewer are drawn in color, while edges that belong to surfaces facing away from (are not visible to) the viewer are drawn in black. Also, black edges are drawn before colored edges, so the former never cross over the latter. Double buffering is utilized to prevent flicker.
Animated 3-D objects - Solid
On the second run of the demo, the objects are rendered as solids. Without reading it anywhere, I discovered that if I was careful to define the points of each surface in clockwise order as seen when the surface is visible, then after rotation and projection the points would remain clockwise if the surface was still visible, but they would turn out counter-clockwise if the surface had become invisible. So, detection of "clockwiseness" is the simplistic mechanism that I am employing to detect and refrain from drawing invisible surfaces in this demo.
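This test is generally known as back-face culling. A minimal C sketch of the "clockwiseness" check follows; it looks at the sign of the cross product of the first two projected edges. The `Point` type and the function name `is_front_facing` are made up for this illustration, and the sketch assumes convex surfaces, screen coordinates with y growing downward, and first three points that are not collinear.

```c
#include <assert.h>

typedef struct { int x, y; } Point;   /* projected 2-D screen coordinates */

/* Returns 1 if the projected points appear in clockwise order on screen
   (y axis pointing down), which by the convention described above means
   that the surface faces the viewer; returns 0 otherwise. */
static int is_front_facing(const Point *p, int n)
{
    assert(n >= 3);
    long ax = (long)p[1].x - p[0].x;   /* first edge */
    long ay = (long)p[1].y - p[0].y;
    long bx = (long)p[2].x - p[1].x;   /* second edge */
    long by = (long)p[2].y - p[1].y;
    long cross = ax * by - ay * bx;    /* z component of the cross product */
    return cross > 0;
}
```

The check costs two multiplications and a subtraction per surface, and it works for closed convex objects without any depth sorting.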
Animated 3-D objects - Shaded
That was all very nice, but then in the summer of 1994 LucasArts released Star Wars: TIE Fighter, which I immediately bought, and I was astonished to find out that surfaces in that game were shaded in a way which created a very convincing illusion of curvature. Some sort of never-before-seen random dithering effect gave them a very computationally expensive look. I thought "this is impossible!"
At that point I decided that I could not keep reinventing everything by myself, and I had to hit the books. Luckily, in the options of the game there was a hint: the check box for enabling or disabling the shading read "Enable Gouraud shading". So, I went to the University library to look up the term, and very quickly I had a pretty clear idea of how to do it. Unfortunately, another problem still remained: with the 256-color palette, which was the best that you could hope for on a PC at that time, a simple interpolation across brightness values would yield awfully ugly results. (Banding.) I needed to somehow add dithering without spending more than a couple of clock cycles per pixel. Achieving such a feat seemed inconceivable, but obviously TIE Fighter was doing it (hence the computationally expensive look), so there had to be a way.
The solution came in the form of a nifty assembly language hack: I loaded a 16-bit register with a random pattern of zeros and ones, and every time I had a pixel value in the AL register and I needed to add noise to it, I would execute a circular rotation instruction on the random-pattern register, which would essentially yield a random bit in the carry flag. Then, I would execute a special instruction which adds the carry flag to the AL register, thus incrementing or not incrementing the AL register at random, in effect selecting or not selecting the next (brighter) palette entry at random. The result was just awesome.
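The trick can be emulated in C as follows. This is only a sketch of the idea: the helper names `rol16` and `dither_span` are invented here, the two operations are presumably the x86 ROL and ADC instructions, and the sketch assumes that the palette is arranged as a brightness ramp with room above `base`, so that adding one selects the next brighter entry.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates a 16-bit rotate-left by one: the bit that leaves at the top
   re-enters at the bottom and is also reported as the carry. */
static uint16_t rol16(uint16_t v, unsigned *carry)
{
    *carry = (v >> 15) & 1u;
    return (uint16_t)((v << 1) | *carry);
}

/* Writes n pixels of brightness 'base', each one randomly bumped to the
   next palette entry according to successive bits of *pattern. */
static void dither_span(uint8_t *dst, int n, uint8_t base, uint16_t *pattern)
{
    assert(dst != 0 && pattern != 0 && n >= 0);
    for (int i = 0; i < n; i++) {
        unsigned carry;
        *pattern = rol16(*pattern, &carry);   /* rotate the noise register */
        dst[i] = (uint8_t)(base + carry);     /* add the carry to the pixel */
    }
}
```

Because the register is 16 bits wide, the noise sequence produced this way repeats every 16 pixels, whatever the initial pattern.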
Despite what the demo says, I did not implement proper Gouraud shading. I just wrote a routine which draws surfaces of variable brightness, which is the basis of Gouraud shading, and I made this routine use dithering so as to overcome the limited palette of the 256-color display. In order to do it properly I would have needed a proper light source, and I would have had to compute the normal at each vertex and then interpolate brightness values along each edge. Instead, I just used the Y coordinate of each scan line as a relative brightness value. The result is pretty cool, though it is a pity that I did not have the time to take it one step further and make it proper.
Here is a screen capture showing the shaded sphere up close and personal, magnified from 320x200 to 1280x800 using pixel-resize to avoid blur, allowing detailed inspection of the quality of the rendered artifact. A careful look reveals that the randomness repeats periodically, and the period is exactly 16 pixels, that is, the width of the register which is being rotated.
The shaded sphere frozen in time and magnified for inspection.
Flying in a voxel world
At some point voxel-based games came out, and they too looked very interesting. I tried to figure out how they work, came up with a rough idea, and tried to implement it. Please excuse the crudeness of the following demo; I did it in a weekend:
Textured floor rendering
Just some doodling around with floor rendering.