Building a 3D Engine in Perl, Part 4

by Geoff Broadwell

Too Fast

There's yet another problem, and this one requires a change to the frame rate calculations. The frame rate shown in the above screenshots is always either 333 or 500, never anything else. On this system, the frames take between two and three milliseconds to render, but because SDL can only provide one-millisecond resolution, the time delta for a single frame appears to be exactly 0.002 second or exactly 0.003 second. 1/0.002 = 500 and 1/0.003 ≈ 333, so the display is a blur, flashing back and forth between the two possible values.

To get a more representative (and easier-to-read) value, the code must average frame rate over a number of frames. Doing this will allow the total measured time to be long enough to drown out the resolution deficiency of SDL's clock.
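To see the effect in isolation, here is a minimal sketch that needs neither SDL nor the engine. It assumes, purely for illustration, that every frame really takes 2.4 ms, and it imitates a millisecond clock by truncating the true time; the variable names are mine, not part of the engine.

```perl
use strict;
use warnings;

# Assumed for illustration: every frame really takes 2400 microseconds (2.4 ms).
my $true_frame_us = 2400;
my $frames        = 100;

my %per_frame_fps;    # how often each single-frame FPS value shows up
my $prev_ms = 0;

for my $frame (1 .. $frames) {
    # A millisecond clock truncates the true time to whole milliseconds.
    my $now_ms   = int($frame * $true_frame_us / 1000);
    my $delta_ms = $now_ms - $prev_ms;          # always 2 or 3 here
    $per_frame_fps{ int(1000 / $delta_ms) }++;
    $prev_ms = $now_ms;
}

# Averaging over the whole run divides by a much longer interval instead.
my @readings = sort { $a <=> $b } keys %per_frame_fps;
my $avg_fps  = int($frames * 1000 / $prev_ms);

print "single-frame readings: @readings\n";
print "averaged over $frames frames: $avg_fps\n";
```

The single-frame readings are only ever 333 and 500, while the average over 100 frames comes out at 416, which matches the assumed 2.4 ms frame time.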

The first thing I needed was a routine to initialize the frame rate data to carry over multiple frames:

sub init_fps
{
    my $self = shift;

    $self->{stats}{fps}{cur_fps}    = 0;
    $self->{stats}{fps}{last_frame} = 0;
    $self->{stats}{fps}{last_time}  = $self->{world}{time};
}

The new stats structure in the engine object will hold any statistics that the engine gathers about itself. To calculate FPS, the engine needs to remember the last frame for which it took a timestamp, as well as the timestamp for that frame. Because the engine calculates the frame rate only every few frames, it also saves the last calculated FPS value so that it can render it as needed. The init_fps call, as usual, goes at the end of init:

$self->init_fps;

The new update_fps routine now calculates the frame rate:

sub update_fps
{
    my $self      = shift;

    my $frame     = $self->{state}{frame};
    my $time      = $self->{world}{time};

    my $d_frames  = $frame - $self->{stats}{fps}{last_frame};
    my $d_time    = $time  - $self->{stats}{fps}{last_time};
    $d_time     ||= 0.001;

    if ($d_time >= .2) {
        $self->{stats}{fps}{last_frame} = $frame;
        $self->{stats}{fps}{last_time}  = $time;
        $self->{stats}{fps}{cur_fps}    = int($d_frames / $d_time);
    }
}

update_fps starts by gathering the current frame number and timestamp, and calculating the deltas from the saved values. Again, $d_time must default to 0.001 second to avoid possible divide-by-zero errors later on.

The if statement checks to see if enough time has gone by to result in a reasonably accurate frame rate calculation. If so, it sets the last frame number and timestamp to the current values and the current frame rate to $d_frames / $d_time.
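The windowing behavior is easy to try outside the engine. The sketch below is a test harness of my own, not engine code: it uses a bare hash in place of the engine object, copies the update_fps logic into a plain function, and feeds it made-up timestamps in which frames alternate between 2 ms and 3 ms.

```perl
use strict;
use warnings;

# Stand-in for the engine object with only the fields update_fps reads and writes.
my $engine = {
    state => { frame => 0 },
    world => { time  => 0 },
    stats => { fps => { cur_fps => 0, last_frame => 0, last_time => 0 } },
};

# Same logic as update_fps above, written as a plain function.
sub update_fps {
    my $self = shift;
    my $fps  = $self->{stats}{fps};

    my $d_frames = $self->{state}{frame} - $fps->{last_frame};
    my $d_time   = $self->{world}{time}  - $fps->{last_time};
    $d_time    ||= 0.001;

    if ($d_time >= .2) {
        $fps->{last_frame} = $self->{state}{frame};
        $fps->{last_time}  = $self->{world}{time};
        $fps->{cur_fps}    = int($d_frames / $d_time);
    }
}

# Simulated run: frames alternate between 2 ms and 3 ms on a millisecond clock.
my @readings;    # one entry per recalculation
my $now_ms = 0;
for my $frame (1 .. 400) {
    $now_ms += $frame % 2 ? 2 : 3;
    $engine->{state}{frame} = $frame;
    $engine->{world}{time}  = $now_ms / 1000;

    my $last = $engine->{stats}{fps}{last_frame};
    update_fps($engine);
    push @readings, $engine->{stats}{fps}{cur_fps}
        if $engine->{stats}{fps}{last_frame} != $last;
}

print "recalculated ", scalar(@readings), " times: @readings\n";
```

Over one simulated second the value is recalculated only a handful of times, and every reading lands close to 400, the true average for two frames every 5 ms, instead of jumping between 333 and 500.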

The update_fps call must occur early in the main_loop, but after the engine has determined the new frame number and timestamp. main_loop now looks like this:

sub main_loop
{
    my $self = shift;

    while (not $self->{state}{done}) {
        $self->{state}{frame}++;
        $self->update_time;
        $self->update_fps;
        $self->do_events;
        $self->update_view;
        $self->do_frame;
    }
}

The final change needed to enable the new, more accurate display is in draw_fps: the $d_time lookup goes away, and the $fps calculation turns into a simple retrieval of the current value from the stats structure:

my $fps  = $self->{stats}{fps}{cur_fps};

The more accurate calculation now makes it easy to see the difference between the frame rate for a simple view (Figure 7):

Figure 7. Frame rate for a simple view

and the frame rate for a more complex view (Figure 8).

Figure 8. Frame rate for a complex view

Is the New Display a Bottleneck?

The last thing to do is to check that the shiny new frame rate display is not itself a major bottleneck. The easiest way to do that is to turn benchmark mode back on in init_conf:

    benchmark => 1,

After doing that, I ran the engine under dprofpp again, and then analyzed the results, just as I had earlier:

$ dprofpp -Q -p step075

Done.
$ dprofpp -I -g main::main_loop
Total Elapsed Time = 3.943764 Seconds
  User+System Time = 1.063773 Seconds
Inclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 100.       -  1.064      1        - 1.0638  main::main_loop
 94.6   0.006  1.007    384   0.0000 0.0026  main::do_frame
 85.2   0.019  0.907    384   0.0000 0.0024  main::draw_frame
 50.7   0.205  0.540    384   0.0005 0.0014  main::draw_view
 16.8   0.073  0.179    384   0.0002 0.0005  main::draw_fps
 15.4   0.095  0.164    384   0.0002 0.0004  main::set_projection_2d
 11.6   0.045  0.124    384   0.0001 0.0003  main::draw_axes
 10.9   0.116  0.116   2688   0.0000 0.0000  SDL::OpenGL::CallList
 8.74   0.013  0.093    384   0.0000 0.0002  main::end_frame
 7.52   0.003  0.080    384   0.0000 0.0002  SDL::App::sync
 7.24   0.077  0.077    384   0.0002 0.0002  SDL::GLSwapBuffers
 4.89   0.052  0.052   3072   0.0000 0.0000  SDL::OpenGL::PopMatrix
 4.70   0.023  0.050    384   0.0001 0.0001  main::update_view
 3.67   0.039  0.039   3456   0.0000 0.0000  SDL::OpenGL::GL_LIGHTING
 3.48   0.037  0.037    384   0.0001 0.0001  SDL::OpenGL::Begin

As it currently stands, draw_view takes half of the run time of main_loop, and the combination of set_projection_2d and draw_fps takes about a third of the main_loop time together. Is that good or bad news?

draw_view is so quick now because I've just optimized it. Now that it's running so fast again, I can afford to add more features and perhaps make a more complex scene, either of which will make draw_view take a larger percentage of the time again. Also, set_projection_2d is necessary for any in-window statistics, debugging, or HUD (heads-up display) anyway, so the time spent there will not go to waste.

That leaves draw_fps, taking about one sixth of main_loop's run time. That's perhaps a bit larger than I'd like, but not large enough to warrant additional effort yet. I'll save my energy for the next set of features.
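The shares quoted above can be rechecked from the CumulS column of the report. This is only a back-of-the-envelope sketch: the numbers are copied by hand from the profile, the share helper is a hypothetical convenience, and adding the two cumulative times assumes that draw_fps and set_projection_2d are called side by side and not one from inside the other.

```perl
use strict;
use warnings;

# Cumulative seconds (CumulS column) copied from the dprofpp report.
my %cumul = (
    main_loop         => 1.064,
    draw_view         => 0.540,
    draw_fps          => 0.179,
    set_projection_2d => 0.164,
);

# Percentage of main_loop's cumulative time spent in the named routine.
sub share {
    my $name = shift;
    return 100 * $cumul{$name} / $cumul{main_loop};
}

my $view_share    = share('draw_view');
my $fps_share     = share('draw_fps');
my $overlay_share = $fps_share + share('set_projection_2d');

printf "draw_view:                    %.1f%%\n", $view_share;
printf "draw_fps:                     %.1f%%\n", $fps_share;
printf "draw_fps + set_projection_2d: %.1f%%\n", $overlay_share;
```

The results come out at roughly a half, a sixth, and a third of main_loop respectively, in line with the %Time column.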

Conclusion

During this article, I covered several concepts relating to engine performance: adding a benchmark mode; profiling with dprofpp; using display lists to optimize slow, repetitive rendering tasks; and using display lists, bitmapped fonts, and averaging to produce a smooth frame rate display. I also added a stub for a triggered events subsystem, which I'll come back to in a future article.

With these performance improvements, the engine is ready for the next new feature, textured surfaces, which will be the main topic for the next article.

Until then, enjoy yourself and have fun hacking!
