PROFILING GLX.SO
----------------

glx.so can now be profiled without any support from the X server, both when
it is loaded by the server and when it is used by a direct client.  The
procedure is:

1)	  Configure using '--enable-profiling'

2)	  Invoke either the X server or the direct client with the environment variable 'GLX_SO_MON' set, e.g.:

	  [root] # GLX_SO_MON=t ./gears


	  This step should produce two files in the working directory: glx_lowpc and gmon.out.  Both are needed by the steps that follow.

3)	  Copy or link the two files produced above into <glx-dir>/servGL.

	  
4)	  Invoke 'make profile' in servGL.

	  $ make profile

	  This will produce a lot of output, such as:

	  	  rm -f libglx.a
		  ar ruv libglx.a lowpc.o  serverglx/?*.o mesaglx/?*.o ...
		  a - lowpc.o
		  a - serverglx/glx_clients.o
		  a - serverglx/glx_dispatch.o
		  a - serverglx/glx_log.o
		  [...]
		  ranlib libglx.a
		  ld -o glxsyms -noinhibit-exec --whole-archive ...
		  gprof glxsyms < gmon.out > profile
	  

5)	  The results should appear in a file called 'profile'.
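
Steps 3 and 4 can be wrapped in a small shell function.  The following is
only a sketch and is not part of the glx tree: the function name
'stage_profile_data' and its two arguments (the directory the profiled
program ran in, and the servGL directory) are assumptions made for
illustration.  It refuses to continue unless both files from step 2 are
present.

```shell
#!/bin/sh
# Hypothetical helper (not part of the glx tree): link the two files
# produced in step 2 into servGL so that 'make profile' can find them.
#   $1 = directory the profiled program was run in
#   $2 = path to <glx-dir>/servGL
stage_profile_data() {
    # Use an absolute path so the symbolic links resolve from servGL.
    run_dir=$(cd "$1" && pwd) || return 1
    servgl_dir=$2

    # Both files are required; stop early if either is missing.
    for f in glx_lowpc gmon.out; do
        if [ ! -f "$run_dir/$f" ]; then
            echo "missing $run_dir/$f (was GLX_SO_MON set?)" >&2
            return 1
        fi
    done

    # Replace any stale copies or links left over from an earlier run.
    for f in glx_lowpc gmon.out; do
        rm -f "$servgl_dir/$f"
        ln -s "$run_dir/$f" "$servgl_dir/$f" || return 1
    done
}
```

It might be used as follows, with both paths adjusted to the local setup:

	  $ stage_profile_data ~/q3test ~/work/glx/servGL
	  $ cd ~/work/glx/servGL && make profile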


This is the output from a run of gears (as a direct client):


  %   cumulative   self              self     total           
 time   seconds   seconds    calls  us/call  us/call  name    
 52.06      1.39     1.39   433608     3.21     3.21  triangle_flat
  7.87      1.60     0.21     2136    98.31    98.31  rs_w
  5.24      1.74     0.14    42720     3.28     3.28  triangle_smooth
  3.75      1.84     0.10                             gl_x86_cliptest_points4
  3.37      1.93     0.09     3738    24.08    24.08  shade_fast_rgba_one_sided_compacted
  2.25      1.99     0.06     3204    18.73   196.11  render_vb_tri_strip_raw
  1.87      2.04     0.05     3738    13.38    13.38  normalize_normals_masked
  1.87      2.09     0.05                             gl_3dnow_transform_points3


This is the output from a run of q3 (direct client):

  %   cumulative   self              self     total           
 time   seconds   seconds    calls  us/call  us/call  name    
 45.91      7.02     7.02  1269376     5.53     5.54  triangle_smooth_texture
  6.74      8.05     1.03   160748     6.41     8.26  rs_wgt
  4.77      8.78     0.73     2025   360.49   360.49  image_to_texture
  3.60      9.33     0.55   144025     3.82    14.34  viewclip_polygon_4
  2.81      9.76     0.43     2025   212.35   351.59  CopyImage
  2.35     10.12     0.36    28252    12.74   325.07  indexed_render_tris
  2.22     10.46     0.34   336952     1.01     1.01  clipTEX0_RGBA0
  2.16     10.79     0.33                             gl_x86_cliptest_points4
  1.96     11.09     0.30    13508    22.21    24.06  rs_gt
  1.70     11.35     0.26    33841     7.68    41.93  gl_update_state
  1.44     11.57     0.22    28559     7.70    10.77  gl_build_full_precalc_pipeline


Using the new warp code with q3test, running direct:


Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  us/call  us/call  name    
 13.64      2.79     2.79  3292969     0.85     0.96  triangle_smooth_texture
  6.75      4.17     1.38   346188     3.99    11.39  viewclip_polygon_4
  6.16      5.43     1.26   397756     3.17     4.63  rs_wgt
  5.23      6.50     1.07    84881    12.61    96.41  indexed_render_tris
  4.74      7.47     0.97   847532     1.14     1.14  clipTEX0_RGBA0
  2.64      8.01     0.54                             gl_x86_cliptest_points4
  2.59      8.54     0.53    84881     6.24     6.24  trans_2_GLfloat_4f_raw
  2.59      9.07     0.53     2025   261.73   261.73  image_to_texture
  2.25      9.53     0.46    97612     4.71    19.64  mga_setup_DD_pointers
  2.20      9.98     0.45                             gl_3dnow_transform_points3_general_raw
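
The flat profile is long, and often only the first few entries are of
interest.  As a sketch, the function below prints the '% time' and name
columns for every entry at or above a given percentage.  The name
'top_functions' and the threshold argument are assumptions made for
illustration.  It relies on the layout shown in the listings above: the
percentage is the first field and the function name is the last, which
also holds for entries without call counts such as
gl_x86_cliptest_points4.  It further assumes, as in GNU gprof output, that
a blank line follows the flat profile table.

```shell
# Hypothetical helper: list functions whose self time is at least
# $1 percent, reading gprof flat-profile text on standard input.
top_functions() {
    awk -v min="$1" '
        # A blank line after the table ends the flat profile; stop there
        # so that call-graph rows are never mistaken for profile rows.
        NF == 0 && seen { exit }
        # Data rows start with a decimal number; header rows do not.
        $1 ~ /^[0-9]+\.[0-9]+$/ {
            seen = 1
            if ($1 + 0 >= min) print $1, $NF
        }
    '
}
```

For example, to list everything that accounts for at least 5% of the run:

	  $ top_functions 5 < profile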



Summary, assuming the profiled run was made in ~/q3test and the glx tree is in ~/work/glx:

 cd ~/work/glx/servGL 
 rm -f glx_lowpc gmon.out  
 ln -s ~/q3test/glx_lowpc . 
 ln -s ~/q3test/gmon.out . 
 make profile && more profile




Keith Whitwell
August 1999
