Building a 3D Engine in Perl, Part 4

Building a 3D Engine in Perl, Part 4
by Geoff Broadwell |

dprofpp is Your (Obtuse) Friend

With benchmark mode enabled, the engine runs under dprofpp. The first step is to collect the profile data:

dprofpp -Q -p step065

-p step065 tells dprofpp to profile the program named step065, and -Q tells it to quit after collecting the data. dprofpp ran step065, collected the profile data, and stored it in a specially formatted text file named tmon.out in the current directory.

To turn the profile data into human-readable output, I used dprofpp without any arguments. It crunched the collected data for a while and finally produced this:

$ dprofpp

Exporter::Heavy::heavy_export_to_level has 4 unstacked calls in outer

Exporter::export_to_level has -4 unstacked calls in outer

Exporter::export has -12 unstacked calls in outer

Exporter::Heavy::heavy_export has 12 unstacked calls in outer

Total Elapsed Time = 4.838377 Seconds

  User+System Time = 1.498377 Seconds

Exclusive Times

%Time ExclSec CumulS #Calls sec/call Csec/c  Name

 88.1   1.320  1.320      1   1.3200 1.3200  SDL::SetVideoMode

 38.1   0.571  0.774    294   0.0019 0.0026  main::draw_quad_face

 16.0   0.240  0.341      8   0.0300 0.0426  SDL::OpenGL::BEGIN

 13.0   0.195  0.195  64722   0.0000 0.0000  SDL::OpenGL::Vertex

 11.3   0.170  0.170      1   0.1700 0.1700  DynaLoader::dl_load_file

 9.34   0.140  0.020     12   0.0116 0.0017  Exporter::export

 6.67   0.100  0.100   1001   0.0001 0.0001  SDL::in

 4.00   0.060  0.060      1   0.0600 0.0600  SDL::Init

 3.34   0.050  0.847      8   0.0062 0.1059  main::BEGIN

 2.00   0.030  0.040      5   0.0060 0.0080  SDL::Event::BEGIN

 1.80   0.027  0.801     49   0.0005 0.0163  main::draw_cube

 1.47   0.022  0.022   2947   0.0000 0.0000  SDL::OpenGL::End

 1.33   0.020  0.020      1   0.0200 0.0200  warnings::BEGIN

 1.33   0.020  0.020     16   0.0012 0.0012  Exporter::as_heavy

 1.33   0.020  0.209      5   0.0040 0.0418  SDL::BEGIN

There are several problems with this output. The numbers are clearly silly (88 percent of its time spent in SDL::SetVideoMode?), the statistics for the various BEGIN blocks are inconsequential to the task and in the way, and the error messages at the top are rather disconcerting. To fix these issues, dprofpp has the -g option, which tells dprofpp to only display statistics for a particular routine and its descendants:

$ dprofpp -g main::main_loop

Total Elapsed Time = 4.952042 Seconds

  User+System Time = 0.812051 Seconds

Exclusive Times

%Time ExclSec CumulS #Calls sec/call Csec/c  Name

 70.3   0.571  0.774    294   0.0019 0.0026  main::draw_quad_face

 24.0   0.195  0.195  64722   0.0000 0.0000  SDL::OpenGL::Vertex

 3.32   0.027  0.801     49   0.0005 0.0163  main::draw_cube

 2.71   0.022  0.022   2947   0.0000 0.0000  SDL::OpenGL::End

 1.23   0.010  0.010     49   0.0002 0.0002  SDL::OpenGL::Rotate

 1.11   0.009  0.009      7   0.0013 0.0013  main::prep_frame

 1.11   0.009  0.009     70   0.0001 0.0001  SDL::OpenGL::Color

 0.25   0.002  0.002   2947   0.0000 0.0000  SDL::OpenGL::Begin

 0.00       - -0.000      1        -      -  main::action_quit

 0.00       - -0.000      2        -      -  SDL::EventType

 0.00       - -0.000      2        -      -  SDL::Event::type

 0.00       - -0.000      7        -      -  SDL::GetTicks

 0.00       - -0.000      7        -      -  SDL::OpenGL::Clear

 0.00       - -0.000      7        -      -  SDL::OpenGL::GL_NORMALIZE

 0.00       - -0.000      7        -      -  SDL::OpenGL::GL_SPOT_EXPONENT

You may have noticed that I specified main::main_loop instead of just main_loop. dprofpp always uses fully qualified names and will give empty results if you use main_loop without the main:: package qualifier.

In this exclusive times view, the percentages in the first column and the row order depend only on the runtime of each routine, without respect to its children. Using just this view, I might have tried to optimize draw_quad_face somehow, as it appears to be the most expensive routine by a large margin. That's not the best approach, however, as an inclusive view (-I) shows:

$ dprofpp -I -g main::main_loop

Total Elapsed Time = 4.952042 Seconds

  User+System Time = 0.812051 Seconds

Inclusive Times

%Time ExclSec CumulS #Calls sec/call Csec/c  Name

 100.       -  0.814      7        - 0.1163  main::do_frame

 99.9       -  0.812      1        - 0.8121  main::main_loop

 99.7       -  0.810      7        - 0.1158  main::draw_view

 99.2       -  0.806      7        - 0.1151  main::draw_frame

 98.6   0.027  0.801     49   0.0005 0.0163  main::draw_cube

 95.3   0.571  0.774    294   0.0019 0.0026  main::draw_quad_face

 24.0   0.195  0.195  64722   0.0000 0.0000  SDL::OpenGL::Vertex

 2.71   0.022  0.022   2947   0.0000 0.0000  SDL::OpenGL::End

 1.23   0.010  0.010     49   0.0002 0.0002  SDL::OpenGL::Rotate

 1.11   0.009  0.009     70   0.0001 0.0001  SDL::OpenGL::Color

 1.11   0.009  0.009      7   0.0013 0.0013  main::prep_frame

 0.25   0.002  0.002   2947   0.0000 0.0000  SDL::OpenGL::Begin

 0.00       - -0.000      1        -      -  main::action_quit

 0.00       - -0.000      2        -      -  SDL::EventType

 0.00       - -0.000      2        -      -  SDL::Event::type

In this view, draw_quad_face looks even worse, because the first column now includes the time taken by all of the OpenGL calls inside of it, including tens of thousands of glVertex calls. It seems that I should do something to speed it up, but at this point it's not entirely clear how to simplify it or reduce the number of OpenGL calls it makes (other than reducing the subdivision level of each face, which would reduce rendering quality).

Actually, there's a better option. The real problem is that draw_cube dominates the execution time, and draw_quad_face dominates that. How about not calling draw_cube (and therefore draw_quad_face) at all during normal rendering? It seems extremely wasteful to have to tell OpenGL how to render a cube face dozens of times each frame. If only there were a way to tell OpenGL to remember the cube definition once, and just refer to that definition each time the engine needs to draw it.

Prev  [1] [2] [3] [4] [5] [6] [7] [8] Next

Close    To Top
  • Prev Article-Programming:
  • Next Article-Programming:
  • Now: Tutorial for Web and Software Design > Programming > Perl > Programming Content
    Photoshop Tutorial
     

    Special Effect

      3D Effect
      Photoshop Articles
    Programming Tutorial
     

    C/C++ Tutorial

      Visual Basic
      C# Tutorial
    Database Tutorial
     

    MySQL Tutorial

      MS SQL Tutorial
      Oracle Tutorial
    Geek Tutorial
     

    Blogging Tutorial

      RSS Tutorial
      Podcasting Tutorial
    Graphic Design Tutorial
      Coreldraw Tutorial
      Illustrator Tutorial
      3D Tutorials
    Webmaster Articles
     

    Domain Service

      Web Hosting
      Site Promotion
    Java Tutorial/ Articles
     

    Java Servlets

      JavaEE Tutorial
     

    JavaBeans Tutorial

    XML Tutorial/ Articles
     

    XML Style

      AJAX Tutorial
      XML Mobile
    Flash Tutorial/ Articles
     

    Flash Video

      Action Script
      Flash Articles
    OS Tutorial/ Articles
      Linux Tutorial
      Symbian Tutorial
      MacOS Tutorial
    Personal Tech
      Hardware Tutorial
      Software Tutorial
      Online Auction