I was able to go a long way without optimizing the Matrix and Point classes. Since most of my games and prototypes were relatively simple or made limited use of points and matrices, it didn’t matter too much. That changed with Red Ice, which was using 5 physics steps per frame and computing tons of vectors and collision responses.
Optimizing the Drawing
The problems became evident in the profiler. First was the classic of “draw less”. Because Red Ice is running at 1024×768 even clearing the entire canvas can get costly, not to mention copying over the ice and blood canvases to the main one. In order to break through this bottleneck I separated out the rink/background and blood canvases into layers that stay below the main canvas. Since the rink never redraws and since the blood only adds and erases strokes in a semi-permanent way, this saves copying a ton of image data to the main canvas, which in turn saves a ton of time. I still clear the entire main canvas every frame, but implementing selective clearing is a big project and would have limited performance gains (we still need to clear big chunks of the canvas most updates with all the stuff moving around).
Optimizing Joystick Input
Each controller passes across 10 buttons and 6 axes. Each single piece slows things down.
The first performance improvement I made was to use a single integer for all the buttons, holding each one in a bit. This gave an immediate 25% speedup, confirming that going from an array of 10 booleans to 1 integer (reducing the number of items passed across) speeds things up. Next, since I wasn’t using the additional 4 axes I decided to only pass 2 axes as a temporary fix. This also gave a decent speedup, especially at the 6 controller mark, because each additional controller was additional data.
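As a rough illustration of the button packing described above, here is a minimal JavaScript sketch. The helper names packButtons and isPressed are hypothetical, and the post does not show the real input code, so treat this as one plausible shape rather than the actual implementation.

```javascript
// Hypothetical helpers: the original post does not show its input code.
// Pack an array of button states into one integer, one bit per button.
function packButtons(buttons) {
  let bits = 0;
  for (let i = 0; i < buttons.length; i++) {
    if (buttons[i]) {
      bits |= 1 << i;
    }
  }
  return bits;
}

// Read a single button back out of the packed integer.
function isPressed(bits, index) {
  return (bits & (1 << index)) !== 0;
}

// Ten buttons, with buttons 0, 3 and 9 held down.
const pressed = [true, false, false, true, false, false, false, false, false, true];
const packed = packButtons(pressed);
```

With this shape only one number per controller has to be passed across instead of ten booleans, and the receiving side checks individual buttons with isPressed(packed, index).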
Optimizing the Points
This finally exposed the remaining problems: the Point class and garbage collection were the next bottleneck. Due to all the physics updates and point computations it was creating tons of new Point objects, and each additional point operation created more because the operations were non-destructive. The good news is that I have a full test suite for the Point class, so I can refactor and optimize without any fear. This was quite valuable when testing crazy new ideas to see if they would improve performance without breaking everything. I figured that, since I was creating so many points, adding optional destructive methods and using them in the right places could reduce new point creation and GC load quite a bit.
I would have liked to use ! as a method suffix like in Ruby, because it is a known convention for destructive methods. The closest I could come would be point["add!"](otherPoint), which was too brutal on the mind and eyes to make it into common usage. If CoffeeScript could auto-compile point.add!(otherPoint) into the index-operator notation like it does with a.new then it would be okay, but until then the ! suffix is out.
JavaScript does allow the $ symbol in variable and method names so, by necessity, I have chosen to use $ as the glyph of destruction, which has its own poetry in a way. point.add$(otherPoint) is not bad, but not my first choice.
Now, armed with new destructive methods, I set about looking for places in the physics and collisions where I could slip them in to prevent the creation of unwanted/unused points. Then a funny thing happened: the majority of the places I looked needed the operators to be non-destructive, and it was difficult to see exactly where a destructive method could be added without unwanted side-effects, except in a few simple situations such as point = point.add(delta) => point.add$(delta).
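To make the trade-off concrete, here is a small JavaScript sketch of a non-destructive add next to a destructive add$. The makePoint factory is a simplified stand-in invented for this example, not the actual Point class. It shows the one safe rewrite mentioned above and the kind of shared-reference side effect that makes most other rewrites risky.

```javascript
// Simplified stand-in for the Point class; only add and add$ are sketched.
function makePoint(x, y) {
  return {
    x: x,
    y: y,
    // Non-destructive: allocates and returns a new point.
    add(other) {
      return makePoint(this.x + other.x, this.y + other.y);
    },
    // Destructive: mutates this point and returns it, allocating nothing.
    add$(other) {
      this.x += other.x;
      this.y += other.y;
      return this;
    }
  };
}

const delta = makePoint(1, 1);

// Safe: position would have been reassigned anyway, so updating in place
// gives the same result as position = position.add(delta).
const position = makePoint(0, 0);
position.add$(delta);

// Unsafe: two names share one point, so spawn moves along with player.
const spawn = makePoint(5, 5);
const player = spawn;
player.add$(delta);
```

After this runs, spawn has moved to (6, 6) even though only player was updated, which is exactly the unwanted side effect that a non-destructive add avoids.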
While I was cleaning up the Point code I was thinking about a conversation I had on GitHub about the performance benefits from using prototype inheritance rather than object augmentation. This sounded like a good idea in this case, as Points are primitive objects with their x and y properties on the outside and all of their methods using this everywhere. The one sticking point was that I could not abide having to stick new in front of the point constructor in ten thousand places in my existing and future code. If only there were some way to get the advantages of prototype performance, without pushing the syntactic hassles onto the people using the class.
The good news is that there is a way to set an object’s prototype. All you need to do is to set the __proto__ property. So now my Point constructor looks like this:
Point = (x, y) ->
  __proto__: Point::
  x: x || 0
  y: y || 0
You know you’ve been programming in the browser too long when __proto__: Point:: becomes a thing of beauty.
All the instance methods are defined below as follows:
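Point:: =
  # add's body was lost from this listing; copy-then-delegate is assumed.
  add: (first, second) ->
    Point(this.x, this.y).add$(first, second)
  add$: (first, second) ->
    if second?
      this.x += first
      this.y += second
    else
      this.x += first.x
      this.y += first.y
    this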
This gives all the performance benefits of using prototypes rather than making an anonymous function for every method, as well as the additional side benefit that developers can extend Point.prototype with additional methods for use in their own projects if they want to. Another advantage is that the syntax remains unchanged, with no need to use the new operator, and all the tests still pass.
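For readers who think in plain JavaScript rather than CoffeeScript, the same pattern can be sketched roughly as follows. This is an illustrative reduction, not the full class: add handles only the point-argument case, and scale is a made-up method used to show extension.

```javascript
// Plain JavaScript rendering of the constructor pattern described above.
function Point(x, y) {
  // Returning an object literal means callers never need `new`.
  // `__proto__` in a literal sets the prototype; it is not an own property.
  return {
    __proto__: Point.prototype,
    x: x || 0,
    y: y || 0
  };
}

// Methods live once on the prototype and are shared by every point.
Point.prototype.add = function (other) {
  return Point(this.x + other.x, this.y + other.y);
};

const a = Point(1, 2);
const b = Point(3, 4);
const sum = a.add(b);

// Hypothetical extension: methods added later are visible on existing points.
Point.prototype.scale = function (k) {
  return Point(this.x * k, this.y * k);
};
const doubled = sum.scale(2);
```

Here a.add and b.add are the same function object, so creating a point allocates one small object and no per-instance closures.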
The best news is that this provides a 90% reduction in the time that the code spends constructing and garbage collecting points, and was simple enough to pull into the Matrix class with a two-line change as well. For primitive objects like Points, Matrixes, Arrays, Numbers, and the like I wholeheartedly recommend this approach. For complex objects that require mixins, private variables, and instance variables I don’t think it will be possible, because each object actually does need its own functions that are in the correct closure scope.
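The limitation for objects with private variables can be seen in a tiny JavaScript sketch. The makeCounter name and shape are hypothetical, chosen only to show why closure-scoped state forces each object to carry its own functions.

```javascript
// Hypothetical counter object with a private variable held in a closure.
function makeCounter() {
  let count = 0; // private: only reachable through the functions below
  return {
    increment() {
      count += 1;
      return count;
    },
    value() {
      return count;
    }
  };
}

const first = makeCounter();
const second = makeCounter();
first.increment();
first.increment();
second.increment();

// Each counter has its own function objects bound to its own `count`,
// so there is nothing a shared prototype method could close over.
const sharesFunctions = first.increment === second.increment;
```

Because increment must see one particular count, a single shared prototype method cannot replace it without exposing that state as an ordinary property.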
Another interesting thing is that this last optimization voided the assumption behind my previous one about destructive operators. I assumed that because creating points was expensive it would be worthwhile to go to extra lengths to avoid creating them unnecessarily. With the prototypesque construction, the cost of point creation and garbage collection dropped so much that it’s not worth trying to squeeze out the now slight gains, except in the hottest inner loops. I’ll still keep the destructive methods around for situations where points actually want to be updated in place, like p = p.norm(speed) => p.norm$(speed), but I won’t be quick to “optimize” by defaulting to destructive methods and then spend hours debugging issues that come up because two objects are actually sharing the same point reference.
Points are cheap now, use them freely!