The X windows DMA hack

The accelerated XFree86 server for Matrox cards controls the graphics
chip by programming its memory-mapped registers directly, and it assumes
that it is the only process interacting with the chip.

If a 3D driver just left the chip set up for 3D drawing, then when X
tried to draw something on the screen, it would wind up on the back
buffer, possibly depth buffered and blended, which is not what X wants.

In a programmed I/O 3D driver, this is avoided by setting things back the
way X expects them after rendering each primitive if there is a chance X
may try to draw something. This is wasteful of command bandwidth.

Even worse, if a DMA (direct memory access) command transfer is used to 
program 3D operations and X tries to program the accelerator, it usually 
crashes the machine. I'm not sure of the exact details, but it appears
that a programmed I/O write to a register takes priority over a DMA
command stream, and just gets mixed in randomly with whatever was coming
from DMA. Boom.

In an ideal world, X would generate DMA command buffers that could be
nicely scheduled between 3D DMA command buffers. 

If X just HAD to do programmed I/O to the graphics accelerator, it would
do something like this:

	if ( iDontHaveTheAccelerator ) {
		WaitForAccelerator();
	}
	ProgramAccelerator();

In this world, we make do with a hack.  When a DMA command buffer is
started, the region of memory that contains the hardware registers is
mprotect()ed so that if X tries to write to a register, we catch a signal
and wait until it is safe (all DMA has completed) before allowing the 
write to complete.
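
A minimal sketch of the trap, under loud assumptions: an anonymous
mmap()ed page stands in for the register aperture, a countdown stands in
for polling the hardware until DMA is idle, and every name here
(fake_regs, start_dma, on_fault, dma_polls_left) is made up for
illustration. It is not the driver's code. It assumes Linux, where a
write to a read-only page raises SIGSEGV, and it calls mprotect() from
the signal handler, which POSIX does not formally list as
async-signal-safe.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Stand-in for the memory-mapped register page (hypothetical). */
static volatile uint32_t *fake_regs;
static size_t page_size;

/* Stand-in for "DMA still running"; a real driver would poll a
 * hardware status register instead of counting down. */
static volatile sig_atomic_t dma_polls_left;
static volatile sig_atomic_t traps_taken;

static void on_fault(int sig, siginfo_t *info, void *ctx)
{
	char *addr = (char *)info->si_addr;
	char *base = (char *)fake_regs;

	(void)sig;
	(void)ctx;
	if (addr < base || addr >= base + page_size)
		_exit(1);		/* a genuine crash, not our trap */

	traps_taken++;
	while (dma_polls_left > 0)	/* spin until "DMA" has completed */
		dma_polls_left--;

	/* Unprotect; returning from the handler restarts the write. */
	mprotect((void *)fake_regs, page_size, PROT_READ | PROT_WRITE);
}

int setup_trap(void)
{
	struct sigaction sa;
	void *p;

	page_size = (size_t)sysconf(_SC_PAGESIZE);
	p = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	fake_regs = p;

	memset(&sa, 0, sizeof sa);
	sa.sa_sigaction = on_fault;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGSEGV, &sa, NULL);
}

/* Called when a DMA command buffer is started. */
void start_dma(int polls)
{
	dma_polls_left = polls;
	mprotect((void *)fake_regs, page_size, PROT_READ);
}

/* Plays the part of X: writes a register while "DMA" is in flight.
 * Returns the number of traps taken, or -1 if setup failed. */
int demo(void)
{
	if (setup_trap() != 0)
		return -1;
	start_dma(1000);
	fake_regs[0] = 0x1234;	/* faults, waits in on_fault(), then retries */
	return traps_taken;
}
```

Calling demo() from a main() should take exactly one trap: the store
faults, the handler waits and unprotects the page, and the same store
then goes through on the retry.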

This incurs the overhead of two mprotect() calls for every command
buffer transfer: one to protect the registers when the transfer starts,
and another to unprotect them so DMA can be programmed for the next
buffer or X can draw. This does not seem to be a huge overhead, but it is
unfortunate in principle.

The registers that control the mouse cursor are left unprotected, so
moving the mouse cursor does not cause signals to be fired. Thankfully,
Matrox has these on a separate 4 KB page from the drawing registers, and
the two sets don't interfere with each other.
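
Because mprotect() works on whole pages, leaving the cursor alone only
needs the cursor registers to live on their own page. The sketch below
assumes a made-up two-page layout (drawing registers on the first page,
cursor registers on the second) in ordinary anonymous memory. The names
and the layout are hypothetical and do not reflect the real Matrox
register map. It assumes Linux-style SIGSEGV delivery on a write to a
read-only page.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static unsigned char *aperture;		/* two pages of fake registers */
static size_t page_len;
static volatile sig_atomic_t trap_count;

static void drawing_page_fault(int sig, siginfo_t *info, void *ctx)
{
	unsigned char *addr = (unsigned char *)info->si_addr;

	(void)sig;
	(void)ctx;
	if (addr < aperture || addr >= aperture + page_len)
		_exit(1);		/* not a fault on the drawing page */
	trap_count++;
	/* A real handler would wait for DMA to finish here. */
	mprotect(aperture, page_len, PROT_READ | PROT_WRITE);
}

/* Maps both pages, installs the handler, protects only page 0. */
int trap_setup(void)
{
	struct sigaction sa;
	void *p;

	page_len = (size_t)sysconf(_SC_PAGESIZE);
	p = mmap(NULL, 2 * page_len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	aperture = p;

	memset(&sa, 0, sizeof sa);
	sa.sa_sigaction = drawing_page_fault;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGSEGV, &sa, NULL) != 0)
		return -1;

	return mprotect(aperture, page_len, PROT_READ);
}

/* Writes a "cursor register" on page 1; returns traps taken so far. */
int write_cursor_reg(uint32_t value)
{
	volatile uint32_t *cursor = (volatile uint32_t *)(aperture + page_len);

	cursor[0] = value;
	return trap_count;
}

/* Writes a "drawing register" on page 0; returns traps taken so far. */
int write_drawing_reg(uint32_t value)
{
	volatile uint32_t *draw = (volatile uint32_t *)aperture;

	draw[0] = value;
	return trap_count;
}
```

After trap_setup(), a cursor write should leave the trap count at zero,
while the first drawing write should bump it to one.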

Whenever X tries to draw something to the screen when 3D rendering is 
happening, it will hit the signal handler and basically spin in place
until all the 3D drawing completes. This ruins asynchronous performance,
and in the worst case, can cut 3D framerates in half if the load was
evenly balanced between the CPU and the graphics accelerator.
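
The "cut in half" figure falls out of a very simple model, sketched
below. It assumes the CPU and the accelerator normally overlap
completely, so the slower of the two sets the frame time, and that in
the worst case the stalls serialize them, so the two times add. The
function names and the millisecond figures are illustrative only, not
measurements.

```c
/* Frame rate when CPU and accelerator work fully overlap: the slower
 * of the two sets the frame time. */
double fps_overlapped(double cpu_ms, double gpu_ms)
{
	double frame_ms = cpu_ms > gpu_ms ? cpu_ms : gpu_ms;

	return 1000.0 / frame_ms;
}

/* Frame rate when the two are forced to run one after the other. */
double fps_serialized(double cpu_ms, double gpu_ms)
{
	return 1000.0 / (cpu_ms + gpu_ms);
}
```

With an evenly balanced 10 ms on each side, the overlapped case gives
100 frames per second and the serialized case gives 50. With an
unbalanced load, say 2 ms against 10 ms, the loss is much smaller.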

If the GLX_MGA_BOXES debugging tool is enabled, a blue box will flash on
the screen if the signal handler was hit one or more times in the
previous frame.  You can watch this by moving the mouse cursor over
various windows while a 3D animation (like gears) is running. Every time
you cross into or out of a title bar, it redraws, forcing a signal 
trap.

If ANYTHING else is drawing, it will impact 3D performance. Any text
printing to a terminal, and any taskbar clocks or CPU meters will cause
slight hitches.  In terms of benchmarking, a once-a-second interruption
to update a clock or CPU bar isn't going to make much difference to the
average, but it is a noticeable glitch in the smoothness. It makes a
difference for gameplay.

X does seem to be pretty good about not hitting the registers if nothing
is going to actually draw. If your 3D window covers a printing terminal
window, it won't get interrupted. If you must have lots of little
doodads running on your normal desktop, consider setting up another
virtual desktop without anything on it for high performance 3D use.

Interestingly, multiple 3D apps get along much better than mixed 3D and
2D apps, because their commands are all queueable in the same buffers.

We do not currently mprotect() the framebuffer, so if X draws something
with CPU instructions instead of with the accelerator, its drawing can
get out of sync with the 3D rendering. This sometimes happens with pop-up
tip windows overlapping 3D windows. We probably should protect the
visible screen (there is no need to protect any undisplayed memory).
Unlike the registers, we would not need to change its protection for
every command buffer, just when X tries to draw.

John Carmack (johnc@idsoftware.com)
$Date: 1999/08/22 22:21:18 $

(see http://lists.openprojects.net/pipermail/glx-dev/1999-August/000141.html
 for the original reference)
