After the first optimization, which proved rather successful, I continued with my second idea: precalculating some often-used values. I started by finding, for every board position, all the moves from it that don't end up outside the board boundaries. These used to be calculated over and over again during the search, so my hypothesis was that computing them only once and then using the cached values would save time.

However, because I didn't think things through before jumping into the coding, I actually managed to prove another of my optimization hypotheses at the same time, although only after taking the scenic route, so to speak.

I first made the change hastily, without paying much attention to the data structures I was using. I created a map from each board position to the valid moves from it, using the first objects that came to mind and ending up with the Java collection HashMap<Position, Set<Move>>.
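
To make this concrete, here is a minimal sketch of what that first, hasty version of the cache could look like. Position, Move and Board are the types from my code, but the fields and helper methods used below (board.height, board.width, board.contains, move.applyTo) are just assumptions for the sake of illustration.

    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    // Precalculate the valid moves for every board position into a map.
    static Map<Position, Set<Move>> precalculateValidMoves(Board board) {
        Map<Position, Set<Move>> validMoves = new HashMap<>();
        for (int row = 0; row < board.height; row++) {
            for (int col = 0; col < board.width; col++) {
                Set<Move> moves = new HashSet<>();
                for (Move move : Move.values()) {
                    // keep only moves that stay inside the board boundaries
                    if (board.contains(move.applyTo(row, col))) {
                        moves.add(move);
                    }
                }
                validMoves.put(new Position(row, col), moves);
            }
        }
        return validMoves;
    }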

The result wasn't quite what I had expected: the change only slowed the algorithm down, and quite drastically at that, with a test run on a 6×6 board taking over a minute longer than before the change.

I was actually quite shocked by this. But then I started thinking about how the original code went through the possible moves for each position: the move directions are an enum Move, and the search simply iterated over the array returned by a Move.values() call. Even though I now used precalculated values for the possible moves, those values were fetched from a cache that had to look up the correct selection of moves from a Map, using a Position object as the key, and then iterate over a Set when trying each of them out.
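
Put side by side, the hot loop of the search went from a straight pass over an array to a hash lookup plus a Set iteration on every single node of the search tree. Roughly like this, with tryMove standing in for whatever the search actually does with each candidate move:

    // Before: iterate over the enum's values() array.
    for (Move move : Move.values()) {
        tryMove(position, move);
    }

    // After the hasty change: a hashCode/equals lookup into the Map, then an
    // iteration over the resulting Set, repeated at every node of the search.
    for (Move move : validMoves.get(position)) {
        tryMove(position, move);
    }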

To get rid of this slowness I had introduced, I decided to jump ahead on my list of optimization ideas to the part "Using optimized data structures". I didn't go through the whole codebase, but changed only the part I had just added, i.e. the cache of precalculated values. Instead of a Map I chose a simple two-dimensional array of Move objects, that is, an array of move arrays. The precalculation step now stores each position's valid moves in an outer array whose fixed size matches the number of cells on the board, using the position's flattened coordinates row * board.width + col as the index.
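
Sketched with the same assumed helpers as above, the precalculation now looks roughly like this, and fetching the candidate moves during the search becomes a plain array access:

    import java.util.ArrayList;
    import java.util.List;

    // Precalculate the valid moves into an array of move arrays, indexed by
    // the flattened position row * board.width + col.
    static Move[][] precalculateValidMoves(Board board) {
        Move[][] validMoves = new Move[board.height * board.width][];
        for (int row = 0; row < board.height; row++) {
            for (int col = 0; col < board.width; col++) {
                List<Move> moves = new ArrayList<>();
                for (Move move : Move.values()) {
                    if (board.contains(move.applyTo(row, col))) {
                        moves.add(move);
                    }
                }
                validMoves[row * board.width + col] = moves.toArray(new Move[0]);
            }
        }
        return validMoves;
    }

    // In the search:
    // for (Move move : validMoves[row * board.width + col]) { ... }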

This made a massive difference. Instead of slowing down the whole algorithm, the cache of valid moves improved the speed more than I had dared to expect. While the running time for a 6×6 board had improved to 3 minutes and 44 seconds after the first optimization, a test run with this new change finished in 2 minutes and 25 seconds.

So, even though these were just single example runs, and a proper set of test runs would be needed to say anything about average times, for these particular tests the improvement was 35.25% over the previous run and 90.71% over the original naive implementation. Going back to my original hypotheses, I'd say this confirms that both precalculating values and using the right data structures for the critical data can indeed be quite important for performance.

I have to say I'm quite enjoying this optimization exercise so far.