Friday, August 10, 2012

Thoughts on XNA's SpriteBatch

I've been doing some work with OpenGL lately, and I found myself wishing for something as convenient as XNA's SpriteBatch object. I wrote one for myself, but recently I noticed that Shawn Hargreaves has released a C++ version of SpriteBatch, so I thought it would be interesting to compare our implementations. For those not familiar with XNA Game Studio, SpriteBatch encapsulates the setup and shaders necessary for blitting from a texture to the screen with some optional effects. It's probably the best system out there for making your own Super Nintendo knockoff game. Using SpriteBatch is as simple as starting a batch, drawing your sprites, then ending the batch. It also has options for immediate drawing vs. queuing and sorting for efficiency.

There are a few different draw calls in SpriteBatch, all of which are various combinations of the following parameters:

  • Source Texture
  • Source Rectangle (for sprite sheets)
  • Sprite Effects (mirroring options for source image)
  • Destination (either point or rectangle)
  • Depth (for layering effects)
  • Rotation
  • Origin of Rotation
  • Scale
  • Color (more of a tint really)

Unless the SpriteSortMode is Immediate, each draw call is queued for later sorting before rendering. The intermediate data format, as well as the shaders, are key to the overall efficiency. One extreme would be to cache all the potential parameters as is, whether or not they're needed, filling in the blanks with defaults. This leads to excess work, such as calculating sin(0) for the non-rotation cases. However, on a platform where all the parameters are just passed on to geometry shaders, the GPU won't notice the extra work. The other extreme would be to calculate a 2x2 texture coordinate matrix and a 4x4 transformation matrix that encapsulate all the parameters in the form the GPU needs. This does some work on the CPU that could be done in a vertex shader, has a larger memory footprint, and, if you don't have instanced drawing capability, incurs extra API overhead. However, it doesn't use any features not present on fixed-function hardware.
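To make the first extreme concrete, here is a minimal sketch of what a "cache everything" queue entry could look like. The struct, its field names, and the queueSprite helper are all hypothetical; they are not the layout used by XNA or by Shawn's C++ version.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only.
struct Vec2 { float x, y; };
struct Rect { float x, y, w, h; };

enum class SpriteEffects : std::uint8_t { None = 0, FlipHorizontally = 1, FlipVertically = 2 };

// One queued draw call with every parameter stored as given.
// Anything the caller did not supply keeps its default.
struct QueuedSprite {
    std::uint32_t texture  = 0;                        // texture handle
    Rect          source   = {0.0f, 0.0f, 0.0f, 0.0f}; // w == 0 is taken to mean "whole texture"
    Vec2          position = {0.0f, 0.0f};
    Vec2          origin   = {0.0f, 0.0f};
    Vec2          scale    = {1.0f, 1.0f};
    float         rotation = 0.0f;                     // radians; the renderer still evaluates sin(0)
    float         depth    = 0.0f;
    std::uint32_t color    = 0xFFFFFFFFu;              // opaque white, i.e. no tint
    SpriteEffects effects  = SpriteEffects::None;
};

// The simplest draw call: texture and position, defaults for everything else.
inline void queueSprite(std::vector<QueuedSprite>& queue, std::uint32_t texture, Vec2 position) {
    QueuedSprite s;
    s.texture = texture;
    s.position = position;
    queue.push_back(s);
}
```

The draw call does no math at all here; every cost is deferred to whatever consumes the queue.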

XNA's Implementation
I was surprised to see in the official source that Shawn used the cache-all-parameters approach. I suppose this has some advantages considering the cross-platform nature of the API. The draw calls just queue up a sprite with little to no computation. All the heavy lifting is in the RenderSprite function, which calculates everything on the CPU and then writes it to an indexed vertex array. The vertex array is drawn using one big draw call.
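As a rough illustration of that kind of render step (not the official code), the sketch below expands one queued sprite into four vertices and six indices on the CPU. The SpriteInfo and Vertex types and the renderSprite name are assumptions, and the origin is taken to be in destination pixels to keep the math short.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical vertex and sprite records for illustration only.
struct Vertex { float x, y, z; std::uint32_t color; float u, v; };

struct SpriteInfo {
    float destX, destY, destW, destH; // destination position and size in pixels
    float originX, originY;           // rotation origin, in destination pixels (assumption)
    float rotation;                   // radians
    float depth;
    std::uint32_t color;
    float u0, v0, u1, v1;             // texture coordinates of the source rectangle
};

// Appends one quad (4 vertices, 6 indices) for the sprite.
inline void renderSprite(const SpriteInfo& s,
                         std::vector<Vertex>& vertices,
                         std::vector<std::uint16_t>& indices) {
    const float c  = std::cos(s.rotation);
    const float sn = std::sin(s.rotation);
    const std::uint16_t base = static_cast<std::uint16_t>(vertices.size());

    // Unit-square corners: top-left, top-right, bottom-left, bottom-right.
    static const float corner[4][2] = {{0.0f, 0.0f}, {1.0f, 0.0f}, {0.0f, 1.0f}, {1.0f, 1.0f}};
    for (int i = 0; i < 4; ++i) {
        // Corner position relative to the rotation origin.
        const float lx = corner[i][0] * s.destW - s.originX;
        const float ly = corner[i][1] * s.destH - s.originY;
        Vertex v;
        v.x = s.destX + lx * c - ly * sn; // rotate, then translate
        v.y = s.destY + lx * sn + ly * c;
        v.z = s.depth;
        v.color = s.color;
        v.u = corner[i][0] > 0.0f ? s.u1 : s.u0;
        v.v = corner[i][1] > 0.0f ? s.v1 : s.v0;
        vertices.push_back(v);
    }

    // Two triangles sharing the top-right/bottom-left diagonal.
    static const std::uint16_t quad[6] = {0, 1, 2, 1, 3, 2};
    for (std::uint16_t k : quad) {
        indices.push_back(static_cast<std::uint16_t>(base + k));
    }
}
```

Calling this once per queued sprite fills two arrays that can then be submitted in a single indexed draw call.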

My Implementation
I probably went a bit overboard on this, considering the complexity of my (many) draw calls and the simplicity of my render function. I store color and depth as is, I store the texture coordinates in homogeneous form in two vec2, and I store the destination/rotation/origin/scale as two vec3 (the first two columns of a 3x3 transform matrix, where the third column is just [0,0,1]). Overall, it saves just a single float of intermediate storage (not as good as I was hoping for). Just like the official implementation, I do all my math on the CPU. However, I do most of it in the draw call and have a simple render function. Unlike the official implementation, I render into an unindexed vertex array and render one tri-strip per sprite. Fortunately OpenGL doesn't have the API overhead that DirectX does, because doing it this way saves 36 floats of memory per sprite by not having a huge VertexPositionColorTexture array.
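One way to read the two-vec3 storage is sketched below, assuming a row-vector convention where a point [x, y, 1] is multiplied by the 3x3 matrix, so each stored column yields one output coordinate via a dot product. The type and function names are hypothetical and this is not the actual code from my batch.

```cpp
#include <cmath>

// Hypothetical types for illustration only.
struct Vec3 { float x, y, z; };

// First two columns of a 3x3 affine matrix; the third column is implicitly [0, 0, 1].
struct Transform2D { Vec3 col0, col1; };

// Folds position, origin, scale and rotation into one transform:
//   p' = position + Rotate(rotation) * Scale(scale) * (p - origin)
inline Transform2D makeTransform(float posX, float posY,
                                 float originX, float originY,
                                 float scaleX, float scaleY,
                                 float rotation) {
    const float c = std::cos(rotation);
    const float s = std::sin(rotation);
    const float a = c * scaleX;   // contribution of x to x'
    const float b = -s * scaleY;  // contribution of y to x'
    const float d = s * scaleX;   // contribution of x to y'
    const float e = c * scaleY;   // contribution of y to y'
    Transform2D t;
    t.col0 = {a, b, posX - a * originX - b * originY};
    t.col1 = {d, e, posY - d * originX - e * originY};
    return t;
}

// Applies the transform to a point: each output coordinate is dot([x, y, 1], column).
inline void transformPoint(const Transform2D& t, float x, float y, float& outX, float& outY) {
    outX = t.col0.x * x + t.col0.y * y + t.col0.z;
    outY = t.col1.x * x + t.col1.y * y + t.col1.z;
}
```

With this layout the render function only needs two dot products per corner, which is why the draw call ends up doing most of the work.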

Potential API Improvements
One big change I made to my API is that I added versions that accept a Sprite object. The Sprite object encapsulates the texture pointer, source rectangle, and origin. These parameters are typically passed in groups, so it makes sense to collect them together. I store pre-scaled texture coordinates in it, which saves a little computation. It also allows the specification of an origin point even when not rotating. This is useful for things like locating sprites by a point between the character's feet.
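A minimal sketch of such a Sprite object follows, assuming the texture is identified by an integer handle and that "pre-scaled" means the source rectangle is converted from pixels to 0..1 texture coordinates once, at creation time. The field names and the makeSprite helper are illustrative, not my actual API.

```cpp
// Hypothetical Sprite record for illustration only.
struct Sprite {
    unsigned texture;        // texture handle
    float u0, v0, u1, v1;    // source rectangle, already divided by the texture size
    float width, height;     // source size in pixels, used for the destination quad
    float originX, originY;  // anchor point in source pixels, e.g. between the feet
};

// Builds a Sprite from a pixel-space source rectangle on a sprite sheet.
inline Sprite makeSprite(unsigned texture, int textureWidth, int textureHeight,
                         int srcX, int srcY, int srcW, int srcH,
                         float originX, float originY) {
    const float invW = 1.0f / static_cast<float>(textureWidth);
    const float invH = 1.0f / static_cast<float>(textureHeight);
    Sprite s;
    s.texture = texture;
    s.u0 = static_cast<float>(srcX) * invW;
    s.v0 = static_cast<float>(srcY) * invH;
    s.u1 = static_cast<float>(srcX + srcW) * invW;
    s.v1 = static_cast<float>(srcY + srcH) * invH;
    s.width = static_cast<float>(srcW);
    s.height = static_cast<float>(srcH);
    s.originX = originX;
    s.originY = originY;
    return s;
}
```

For a 32x32 character frame, an origin of (16, 32) would place the anchor at the bottom center, so the position passed to a draw call is where the character stands.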

Something I found myself wishing for was for every function to have color and depth as the last two parameters, with default values of Color.White and 0.0f respectively. These parameters don't require any extra calculations, and they are useful no matter what other combinations of parameters are present.
Another thing I added is a draw call that accepts SpriteEffects but no rotation. I noticed this missing function when I was implementing a SNES-style tile map. I had to rotate every tile by 0 degrees every time.
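Both ideas can be sketched with C++ default arguments. The SketchBatch class below is hypothetical and only records what was requested; it shows the overload shapes, not a real renderer.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only.
struct Vec2 { float x, y; };
struct Color { std::uint8_t r, g, b, a; };
constexpr Color kWhite{255, 255, 255, 255};

enum class SpriteEffects { None, FlipHorizontally, FlipVertically };

// What each draw call stores in the queue.
struct DrawRecord {
    unsigned texture;
    Vec2 position;
    float rotation;
    SpriteEffects effects;
    Color color;
    float depth;
};

class SketchBatch {
public:
    // Color and depth trail every overload and default to white and 0.0f.
    void draw(unsigned texture, Vec2 position,
              Color color = kWhite, float depth = 0.0f) {
        queue_.push_back({texture, position, 0.0f, SpriteEffects::None, color, depth});
    }

    // Mirroring without rotation: no need to pass a rotation of 0.
    void draw(unsigned texture, Vec2 position, SpriteEffects effects,
              Color color = kWhite, float depth = 0.0f) {
        queue_.push_back({texture, position, 0.0f, effects, color, depth});
    }

    const std::vector<DrawRecord>& queue() const { return queue_; }

private:
    std::vector<DrawRecord> queue_;
};
```

A tile map loop could then call batch.draw(tileTexture, tilePosition, SpriteEffects::FlipHorizontally) for mirrored tiles and leave the tint and depth alone.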

Overall, I was surprised how similar my implementation was to the official version, especially considering how much work it was. I'd like to thank Shawn for releasing his source code. I always love to peek into the stuff going on behind the scenes of an API.