I was able to go a long way without optimizing the Matrix and Point classes. Most of my games and prototypes were relatively simple or made limited use of points and matrices, so it didn’t matter much. That changed with Red Ice, which runs 5 physics steps per frame and computes tons of vectors and collision responses.
Optimizing the Drawing
The problems became evident in the profiler. First was the classic “draw less”. Because Red Ice runs at 1024×768, even clearing the entire canvas can get costly, not to mention copying the ice and blood canvases over to the main one. To break through this bottleneck I separated the rink/background and blood canvases into layers that stay below the main canvas. Since the rink never redraws, and since the blood only adds and erases strokes in a semi-permanent way, this saves copying a ton of image data to the main canvas, which in turn saves a ton of time. I still clear the entire main canvas every frame, but implementing selective clearing is a big project and would have limited performance gains (we would still need to clear big chunks of the canvas on most updates with all the stuff moving around).
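Here is a minimal sketch of the layering idea in plain JavaScript. The createLayers and clearMain helpers, the layer names, and the doc and container parameters are assumptions for illustration, not Red Ice's actual code; passing the document in simply makes the helper easy to try outside a browser.

```javascript
// Hypothetical helper: stacks one <canvas> per layer name inside a container.
// Later names sit on top of earlier ones.
function createLayers(doc, container, names, width, height) {
  const layers = {};
  names.forEach((name, index) => {
    const canvas = doc.createElement("canvas");
    canvas.width = width;
    canvas.height = height;
    canvas.style.position = "absolute";
    canvas.style.left = "0";
    canvas.style.top = "0";
    canvas.style.zIndex = String(index);
    container.appendChild(canvas);
    layers[name] = canvas;
  });
  return layers;
}

// Per-frame work only touches the top layer; the rink and blood
// layers keep their pixels and are never copied.
function clearMain(layers) {
  const ctx = layers.main.getContext("2d");
  ctx.clearRect(0, 0, layers.main.width, layers.main.height);
}

// In a browser this might be used as:
//   const layers = createLayers(document, document.body,
//                               ["rink", "blood", "main"], 1024, 768);
//   clearMain(layers); // once per frame
```

The browser composites the stacked canvases itself, so the static layers cost nothing per frame in script.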
Optimizing Joystick Input
Now that we are drawing less, the next big bottleneck came from joystick input. This sounds pretty ridiculous to anyone coming from a non-web background, but considering that using 6 Xbox 360 controllers as input into an HTML5 game was pretty much unheard of, it’s not inconceivable that the first time it was done might not be optimal. The core of the issue was that every single piece of data transferred from the native extension into JavaScript slows things down. Even something as simple as an array of true/false values for buttons. Even an array of six integers for six axes. Even an object with two properties, buttons and axes. Each single piece slows things down.
The first performance improvement I made was to use a single integer for all the buttons, holding each one in a bit. This gave an immediate 25% speedup, confirming that going from an array of 10 booleans to 1 integer (reducing the number of items passed across) speeds things up. Next, since I wasn’t using the additional 4 axes I decided to only pass 2 axes as a temporary fix. This also gave a decent speedup, especially at the 6 controller mark, because each additional controller was additional data.
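To make the bit-packing concrete, here is a small JavaScript sketch. In the game the packing happened on the native side; packButtons below only stands in for that so the example is self-contained, and both function names are made up for illustration.

```javascript
// Pack an array of booleans into one integer, one bit per button.
// (Stand-in for what the native extension would do before handing
// the value to JavaScript.)
function packButtons(buttons) {
  let mask = 0;
  for (let i = 0; i < buttons.length; i++) {
    if (buttons[i]) mask |= 1 << i;
  }
  return mask;
}

// Read a single button back out of the packed integer.
function isPressed(mask, index) {
  return (mask & (1 << index)) !== 0;
}

// Ten buttons, with buttons 0 and 2 held down.
const mask = packButtons([true, false, true, false, false,
                          false, false, false, false, false]);
```

One integer crosses the boundary instead of ten booleans, and the game code checks isPressed(mask, n) wherever it used to index the array.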
At this point I was stumped for a while, but then I realized that if I could pass a single JSON string across that would only be a single item of data, no matter how many joysticks or axes were active. I assumed that browsers were pretty good at parsing JSON, since that is most of what the JavaScript interpreter does on all web pages, and after a day of struggling getting C++ to spit out JSON it was legit and joystick input was no longer an issue.
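The JavaScript side of that idea could look roughly like the sketch below. The payload shape (one entry per controller, buttons as a bitmask, two axes) and the readJoysticks name are assumptions; the real extension's format may well differ.

```javascript
// Assumed payload: one entry per controller, buttons packed into an
// integer, axes as numbers. This string is sample data, not real output.
const samplePayload =
  '[{"buttons":5,"axes":[0.25,-1]},{"buttons":0,"axes":[0,0]}]';

// Only one string crosses the native/JS boundary, however many
// controllers are connected; JSON.parse does the rest.
function readJoysticks(payload) {
  return JSON.parse(payload).map((state, index) => ({
    index,
    buttons: state.buttons,
    x: state.axes[0],
    y: state.axes[1],
  }));
}

const joysticks = readJoysticks(samplePayload);
```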
Optimizing the Points
This finally exposed the remaining problems: the Point class and garbage collection were the next bottleneck. Due to all the physics updates and point computations, the game was creating tons of new Point objects, and each additional point operation created more because the operations were non-destructive. The good news is that I have a full test suite for the Point class, so I can refactor and optimize without any fear. This was quite valuable when testing crazy new ideas to see if they would improve performance without breaking everything. I figured that since I was creating so many points, adding optional destructive methods and using them in the right places could reduce new point creation, and with it the GC load, quite a bit.
I wish that JavaScript or CoffeeScript would allow for a shortcut to use ! as a method suffix like in Ruby, because it is a known convention for destructive methods. The closest I could come would be point["add!"](otherPoint), which was too brutal on the mind and eyes to make it into common usage. If CoffeeScript could auto-compile point.add!(otherPoint) into the index-operator notation like it does with a.class and a.new then it would be okay, but until then the ! suffix is out.
JavaScript does allow for the $ symbol in variable and method names so, by necessity, I have chosen to use $ as the glyph of destruction, which has its own poetry in a way. point.add$(otherPoint) is not bad, but not my first choice.
Now, armed with new destructive methods, I set about looking for places in the physics and collisions where I could slip them in to prevent the creation of unwanted/unused points. Then a funny thing happened: the majority of the places I looked needed the operators to be non-destructive, and it was difficult to see exactly where a destructive method could be added without unwanted side-effects, except in a few simple situations such as point = point.add(delta) => point.add$(delta).
While I was cleaning up the Point code I was thinking about a conversation I had on GitHub about the performance benefits from using prototype inheritance rather than object augmentation. This sounded like a good idea in this case, as Points are primitive objects with their x and y properties on the outside and all of their methods using this everywhere. The one sticking point was that I could not abide having to stick new in front of the point constructor in ten thousand places in my existing and future code. If only there was some way to get the advantages of prototype performance, without pushing the syntactic hassles onto the people using the class.
The good news is that there is a way to set an object’s prototype. All you need to do is set the __proto__ property. So now my Point constructor looks like this:
    Point = (x, y) ->
      __proto__: Point::
      x: x || 0
      y: y || 0
You know you’ve been programming in the browser too long when __proto__: Point:: becomes a thing of beauty.
All the instance methods are defined below as follows:
    Point:: =
      copy: ->
        Point(this.x, this.y)
      add: (first, second) ->
        this.copy().add$(first, second)
      add$: (first, second) ->
        if second?
          this.x += first
          this.y += second
        else
          this.x += first.x
          this.y += first.y
        this
      ...
This gives all the performance benefits of using prototypes rather than making an anonymous function for every method, as well as the additional side benefit that developers can extend Point.prototype with additional methods for use in their own projects if they want to. Another advantage is that the syntax remains unchanged, no need to use the new operator, and all the tests still pass.
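For readers who do not use CoffeeScript, roughly the same pattern can be sketched in plain JavaScript. This is an approximation of the code above, not a line-for-line compile of it; it relies on the __proto__ key in an object literal setting the new object's prototype.

```javascript
// Factory that needs no `new`: the __proto__ key in the literal
// makes Point.prototype the prototype of the returned object.
function Point(x, y) {
  return { __proto__: Point.prototype, x: x || 0, y: y || 0 };
}

// Methods live once on the prototype instead of being recreated
// as fresh closures for every point.
Point.prototype = {
  copy() {
    return Point(this.x, this.y);
  },
  // Non-destructive: works on a copy.
  add(first, second) {
    return this.copy().add$(first, second);
  },
  // Destructive: updates this point in place.
  add$(first, second) {
    if (second != null) {
      this.x += first;
      this.y += second;
    } else {
      this.x += first.x;
      this.y += first.y;
    }
    return this;
  },
};

const a = Point(1, 2);
const b = a.add(3, 4); // a is unchanged, b is a new point
```

Every point shares the same add and add$ function objects, so creating a point only allocates the small object holding x and y.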
The best news is that this provides a 90% reduction in the time the code spends constructing and garbage collecting points, and it was simple enough to pull into the Matrix class with a two line change as well. For primitive objects like Points, Matrices, Arrays, Numbers, and the like I wholeheartedly recommend this approach. For complex objects that require mixins, private variables, and instance variables I don’t think it will be possible, because each object actually does need its own functions that are in the correct closure scope.
Another interesting thing is that this last optimization voided the assumption behind my previous one about destructive operators. I assumed that because creating points was expensive it would be worthwhile to go to extra lengths to avoid creating them unnecessarily. With the prototypesque construction, the cost of point creation and garbage collection dropped so much that it’s not worth trying to squeeze out the now slight gains, except in the hottest inner loops. I’ll still keep the destructive methods around for situations where points actually want to be updated in place, like p = p.norm(speed) => p.norm$(speed), but I won’t be quick to “optimize” by defaulting to destructive methods and then spending hours debugging issues that come up because two objects are actually sharing the same point reference.
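The shared-reference bug is easy to reproduce. In this JavaScript sketch, add and add$ are simplified free-function stand-ins for the Point methods, and the spawn and player names are hypothetical.

```javascript
// Non-destructive: returns a fresh object and leaves p alone.
const add = (p, d) => ({ x: p.x + d.x, y: p.y + d.y });

// Destructive: mutates p in place and returns it.
const add$ = (p, d) => {
  p.x += d.x;
  p.y += d.y;
  return p;
};

const step = { x: 5, y: 0 };

// Bug: the player holds the very same object as the spawn point,
// so moving the player in place silently moves the spawn point too.
const spawnA = { x: 100, y: 100 };
const playerA = { position: spawnA };
add$(playerA.position, step);

// Safe: the non-destructive version gives the player its own point.
const spawnB = { x: 100, y: 100 };
const playerB = { position: spawnB };
playerB.position = add(playerB.position, step);
```

After this runs, spawnA has drifted along with playerA, while spawnB is still where it started.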
Points are cheap now, use them freely!