Building a 3D Engine in Perl, Part 4
by Geoff Broadwell
Too Fast
There's yet another problem; this time, one that will require a change to the
frame rate calculations. The frame rate shown in the above screenshots is
either 333 or 500, but nothing else. On this system, the frames take between two
and three milliseconds to render, but because SDL can only provide one-millisecond
resolution, the time delta for a single frame will appear to be
exactly 0.002 or exactly 0.003 seconds. Since 1/0.002 = 500 and 1/0.003 = 333,
the display is a blur, flashing back and forth between the two possible
values.
To get a more representative (and easier-to-read) value, the code must
average the frame rate over a number of frames. Doing this will allow the total
measured time to be long enough to drown out the resolution deficiency of SDL's
clock.
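The effect is easy to reproduce outside the engine. The following is only a sketch under stated assumptions: every frame is assumed to take exactly 2.4 milliseconds (a made-up figure inside the two-to-three millisecond range mentioned above), and the clock is simulated by truncating to whole milliseconds. None of the variable names come from the engine.

```perl
use strict;
use warnings;

# Assumption for illustration: each frame really takes 2400 microseconds,
# but the simulated clock reports only whole milliseconds.
my $true_frame_us = 2400;
my $frames        = 1000;

my %seen;            # distinct per-frame FPS readings
my $last_ms = 0;
for my $frame (1 .. $frames) {
    my $now_ms   = int($frame * $true_frame_us / 1000);   # truncated clock
    my $delta_ms = $now_ms - $last_ms;                     # always 2 or 3
    $seen{ int(1000 / $delta_ms) }++;
    $last_ms = $now_ms;
}

# Averaging over the whole run gets close to the true rate (about 416.7).
my $avg_fps = int($frames * 1000 / $last_ms);

print "per-frame readings: @{[ sort { $a <=> $b } keys %seen ]}\n";
print "averaged reading:   $avg_fps\n";
```

With these assumed numbers, the per-frame readings are only ever 333 and 500, while the averaged reading is 416.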
The first thing I needed was a routine to initialize the frame rate data to
carry over multiple frames:
sub init_fps
{
    my $self = shift;

    $self->{stats}{fps}{cur_fps}    = 0;
    $self->{stats}{fps}{last_frame} = 0;
    $self->{stats}{fps}{last_time}  = $self->{world}{time};
}
The new stats structure in the engine object will hold any
statistics that the engine gathers about itself. To calculate FPS, the engine
needs to remember the last frame for which it took a timestamp, as well as the
timestamp for that frame. Because the engine calculates the frame rate only
every few frames, it also saves the last calculated FPS value so that it can
render it as needed. The init_fps call, as usual, goes at the end
of init:
$self->init_fps;
The new update_fps routine now calculates the frame rate:
sub update_fps
{
    my $self = shift;

    my $frame    = $self->{state}{frame};
    my $time     = $self->{world}{time};

    my $d_frames = $frame - $self->{stats}{fps}{last_frame};
    my $d_time   = $time  - $self->{stats}{fps}{last_time};
    $d_time    ||= 0.001;

    if ($d_time >= .2) {
        $self->{stats}{fps}{last_frame} = $frame;
        $self->{stats}{fps}{last_time}  = $time;
        $self->{stats}{fps}{cur_fps}    = int($d_frames / $d_time);
    }
}
update_fps starts by gathering the current frame number and
timestamp, and calculating the deltas from the saved values. Again,
$d_time defaults to 0.001 second to guard against possible divide-by-zero errors later on.
The if statement checks to see if enough time (at least 0.2 second) has gone by to
result in a reasonably accurate frame rate calculation. If so, it sets the last
frame number and timestamp to the current values and the current frame rate to
$d_frames / $d_time, truncated to an integer.
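To see the windowing behave on its own, here is a stand-alone sketch of the same calculation, written as a closure rather than as an engine method. The names make_fps_counter and $min_window are hypothetical and exist only for this illustration, and the 2.4 millisecond frame time is again an assumed figure. The zero guard on $d_time is left out here because the division only runs once the delta has reached the window length.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the engine: returns a closure that
# remembers the last sampled frame and timestamp between calls.
sub make_fps_counter {
    my ($min_window) = @_;
    my ($last_frame, $last_time, $cur_fps) = (0, 0, 0);

    return sub {
        my ($frame, $time) = @_;
        my $d_frames = $frame - $last_frame;
        my $d_time   = $time  - $last_time;

        # Recalculate only once the window is long enough to be accurate.
        if ($d_time >= $min_window) {
            ($last_frame, $last_time) = ($frame, $time);
            $cur_fps = int($d_frames / $d_time);
        }
        return $cur_fps;
    };
}

my $fps = make_fps_counter(0.2);

# Simulated timestamps in seconds with millisecond resolution; each
# frame is assumed to take 2.4 ms.
my @readings;
for my $frame (1 .. 500) {
    my $time = int($frame * 2400 / 1000) / 1000;
    push @readings, $fps->($frame, $time);
}

my %distinct = map { $_ => 1 } grep { $_ > 0 } @readings;
print "distinct readings: @{[ sort { $a <=> $b } keys %distinct ]}\n";
```

The counter reports 0 until the first window has elapsed, and after that every reading should sit close to the true rate of roughly 417 frames per second instead of jumping between 333 and 500.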
The update_fps call must occur early in the
main_loop, but after the engine has determined the new frame
number and timestamp. main_loop now looks like this:
sub main_loop
{
    my $self = shift;

    while (not $self->{state}{done}) {
        $self->{state}{frame}++;
        $self->update_time;
        $self->update_fps;
        $self->do_events;
        $self->update_view;
        $self->do_frame;
    }
}
The final change needed to enable the new, more accurate display is in
draw_fps; the $d_time lookup goes away and the
$fps calculation turns into a simple retrieval of the current
value from the stats structure:
my $fps = $self->{stats}{fps}{cur_fps};
The more accurate calculation now makes it easy to see the difference
between the frame rate for a simple view (Figure 7):

Figure 7. Frame rate for a simple view
and the frame rate for a more complex view (Figure 8).

Figure 8. Frame rate for a complex view
Is the New Display a Bottleneck?
The last thing to do is to check that the shiny new frame rate display is
not itself a major bottleneck. The easiest way to do that is to turn benchmark
mode back on in init_conf:
benchmark => 1,
After doing that, I ran the engine under dprofpp again, and
then analyzed the results, just as I had earlier:
$ dprofpp -Q -p step075
Done.
$ dprofpp -I -g main::main_loop
Total Elapsed Time = 3.943764 Seconds
  User+System Time = 1.063773 Seconds
Inclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 100.       -  1.064      1        - 1.0638  main::main_loop
 94.6   0.006  1.007    384   0.0000 0.0026  main::do_frame
 85.2   0.019  0.907    384   0.0000 0.0024  main::draw_frame
 50.7   0.205  0.540    384   0.0005 0.0014  main::draw_view
 16.8   0.073  0.179    384   0.0002 0.0005  main::draw_fps
 15.4   0.095  0.164    384   0.0002 0.0004  main::set_projection_2d
 11.6   0.045  0.124    384   0.0001 0.0003  main::draw_axes
 10.9   0.116  0.116   2688   0.0000 0.0000  SDL::OpenGL::CallList
 8.74   0.013  0.093    384   0.0000 0.0002  main::end_frame
 7.52   0.003  0.080    384   0.0000 0.0002  SDL::App::sync
 7.24   0.077  0.077    384   0.0002 0.0002  SDL::GLSwapBuffers
 4.89   0.052  0.052   3072   0.0000 0.0000  SDL::OpenGL::PopMatrix
 4.70   0.023  0.050    384   0.0001 0.0001  main::update_view
 3.67   0.039  0.039   3456   0.0000 0.0000  SDL::OpenGL::GL_LIGHTING
 3.48   0.037  0.037    384   0.0001 0.0001  SDL::OpenGL::Begin
As it currently stands, draw_view takes half of the run time of
main_loop, and set_projection_2d and draw_fps together
take about a third of it.
Is that good or bad news?
draw_view is so quick now because I've just optimized it. Now
that it's running so fast again, I can afford to add more features and perhaps
make a more complex scene, either of which will make draw_view
take a larger percentage of the time again. Also,
set_projection_2d is necessary for any in-window statistics,
debugging, or HUD (heads-up display) anyway, so the time spent there will not
go to waste.
That leaves draw_fps, taking about one sixth of
main_loop's run time. That's perhaps a bit larger than I'd like,
but not large enough to warrant additional effort yet. I'll save my energy for
the next set of features.
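The proportions quoted above can be checked against the CumulS column of the profile. This is just a small arithmetic sketch: the numbers are copied from the dprofpp output shown earlier, the share helper is a hypothetical name used only here, and because the profiler's figures are rounded, the last digit may differ slightly from its %Time column.

```perl
use strict;
use warnings;

# CumulS values (seconds) copied from the dprofpp report above.
my %cumul = (
    main_loop         => 1.064,
    draw_view         => 0.540,
    draw_fps          => 0.179,
    set_projection_2d => 0.164,
);

# Hypothetical helper: combined share of main_loop's inclusive time,
# as a percentage, for the named routines.
sub share {
    my @names = @_;
    my $total = 0;
    $total += $cumul{$_} for @names;
    return 100 * $total / $cumul{main_loop};
}

printf "draw_view:                    %.1f%%\n", share('draw_view');
printf "draw_fps:                     %.1f%%\n", share('draw_fps');
printf "draw_fps + set_projection_2d: %.1f%%\n",
    share('draw_fps', 'set_projection_2d');
```

The three results come out at roughly one half, one sixth, and one third of main_loop's time, matching the figures used in the discussion.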
Conclusion
In this article, I covered several concepts relating to engine
performance: adding a benchmark mode; profiling with dprofpp;
using display lists to optimize slow, repetitive rendering tasks; and using
display lists, bitmapped fonts, and averaging to produce a smooth frame rate
display. I also added a stub for a triggered events subsystem, which I'll come
back to in a future article.
With these performance improvements, the engine is ready for the next new
feature, textured surfaces, which will be the main topic for the next
article.
Until then, enjoy yourself and have fun hacking!