Reanimator Ltd

High-performance coding by Eddie Edwards

Finding the Exact Pseudoinverse

27 Jan 2014, 11:21 UTC

So here's a problem posed by my brother, Cake.

Suppose you have a game board where each square is a different colour. There are 8 colours (say), including white. If you click on a square it cycles through the colours. But, other squares also cycle through their colours at the same time. Each square connects to some set of other squares so that clicking on that square cycles the colours in itself, and in its set of connected squares. (In a simple case, each square connects to the squares "before" and "after" it; in more complex cases the connections can be arbitrary.)

Given an initial board colouring, what is the minimum sequence of mouse clicks which turns the entire game board white?

[Image: Mondrian artwork]
A Mondrian can be a game board!

Maths to the Rescue!

So let's start by making some observations.

Firstly, we're working in modular arithmetic. With 8 colours, clicking a square 9 times is the same as clicking it once, because the colours cycle around.

Secondly, the system is linear - if you click twice on square A and then twice on square B, the final board position is the same as if you had clicked twice on square B and then twice on square A. The sequence of clicks doesn't matter, just which squares were clicked how many times.

We can represent the gameboard as a vector - each element holds a colour index from 0-7 (for 8 colours), with 0 being white. The final state of the gameboard must be (0,0,...,0) but the initial state can be anything. For a gameboard with 6 squares and simple before/after connections, we get a matrix problem like this:

(? ? ? ? ? ?)(1 1 0 0 0 1) + (i0 i1 i2 i3 i4 i5) = (0 0 0 0 0 0) [MOD 8]
             (1 1 1 0 0 0)
             (0 1 1 1 0 0)
             (0 0 1 1 1 0)
             (0 0 0 1 1 1)
             (1 0 0 0 1 1)

aM + b = 0

Here, b=(i0,...,i5) is the initial board position, a=(?,...,?) is the number of clicks we should do on each square to solve the board, and 0=(0,...,0) is the final board position. [MOD 8] means that each element in that final vector could be 0,8,16,24,etc. and not exactly 0, since the colours cycle, i.e. we're working modulo 8.
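To make this concrete, here's a minimal sketch of the model in Javascript (the game's eventual target language). The names ringMatrix and applyClicks are mine, not from the original code, and it assumes the simple before/after connections:

const N = 8; // number of colours; 0 is white

// connection matrix for the simple before/after case on n squares:
// row i has 1s at positions i-1, i and i+1 (wrapping around)
function ringMatrix(n) {
  const M = [];
  for (let i = 0; i < n; i++) {
    const row = new Array(n).fill(0);
    row[(i + n - 1) % n] = 1; // the square "before"
    row[i] = 1;               // the clicked square itself
    row[(i + 1) % n] = 1;     // the square "after"
    M.push(row);
  }
  return M;
}

// apply a[i] clicks to each square i, updating board b in place;
// this computes b <- b + a*M (mod N)
function applyClicks(M, a, b) {
  for (let i = 0; i < a.length; i++)
    for (let j = 0; j < b.length; j++)
      b[j] = (b[j] + a[i] * M[i][j]) % N;
}

Solving the board means finding an a for which applyClicks takes b to all zeroes - which is exactly aM + b = 0 [MOD 8].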

What Didn't Work

First I tried doing all work modulo N. This is attractive because it means all numbers can be stored in a byte; matrices are small; and arithmetic is very fast (especially using tables to perform things like a + b mod N, for small N).
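One obvious shape for such tables is a pair of N x N lookup arrays (a sketch, reusing N from the sketch above):

const addTab = [], mulTab = [];
for (let a = 0; a < N; a++) {
  addTab[a] = []; mulTab[a] = [];
  for (let b = 0; b < N; b++) {
    addTab[a][b] = (a + b) % N; // a + b mod N
    mulTab[a][b] = (a * b) % N; // a * b mod N
  }
}
// e.g. addTab[3][6] === 1 and mulTab[3][6] === 2, working mod 8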

But then I tried Gram-Schmidt orthogonalization (not orthonormalization, because square roots don't exist in either rational or modular arithmetic), and it doesn't work modulo N: it assumes that if two vectors have a zero dot product, they are linearly independent. That is not the case:

(1 1 1) . (1 1 1) == 0 (MOD 3)
(1 1 1) is not linearly independent of itself!

So then I tried working in rational arithmetic, inside Lisp, using an orthogonal decomposition. Since this involves dot products (sums of many rational terms), the rationals grew rapidly in size. 176-bit integers are needed to solve a 100x100 matrix and it takes about 20 seconds; a 1000x1000 matrix is out of the question (perhaps you could compute one over lunch).

Then I found myself thinking that to reduce the size of the rationals, it's better to do something like Gaussian elimination than Gram-Schmidt. Gaussian elimination only involves one divide per step, so the rationals grow slowly: I could solve a 100x100 matrix using only 12-bit integers.

But it turns out that whereas Gram-Schmidt doesn't work in modular arithmetic, Gaussian elimination does, so I kept the same algorithm and just replaced rationals with bytes modulo N. That works. Not only can it solve a 100x100 matrix in 0.02 seconds (a 1000x speedup!), I can now solve a 1000x1000 matrix in only 20 seconds. Plus, the code is simple enough that it could in principle be ported to the target language for the game, which is Javascript.
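The one operation that changes is division: modulo N, dividing by a pivot means multiplying by its modular inverse. A brute-force sketch (fine for small N; note that when N is composite, like 8, only values coprime to N have inverses, so those are the only usable pivots):

function inv(a) {
  // modular inverse of a mod N by brute force; returns 0 when no
  // inverse exists (mod 8 the invertible values are 1, 3, 5 and 7)
  a = ((a % N) + N) % N;
  for (let x = 1; x < N; x++)
    if ((a * x) % N === 1) return x;
  return 0;
}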

But Gaussian Elimination Doesn't Work with Singular Matrices!

Ah!

Here's how to make it work, so that it finds the pseudo-inverse of any matrix M.

Recall that in Gaussian elimination, you work downwards as follows:

(1 2 3)
(4 5 6)
(7 8 9)

R1 <- R1 - 4*R0
R2 <- R2 - 7*R0

(1  2   3)
(0 -3  -6)
(0 -6 -12)

R1 <- R1 / -3

(1  2   3)
(0  1   2)
(0 -6 -12)

R2 <- R2 + 6*R1

(1 2 3)
(0 1 2)
(0 0 0)

The process fails at this point, but it fails in an interesting way, because there are guarantees about the form the matrix now takes.

What we can do now is transpose the matrix and do the process again:

(1 2 3)    (1 0 0)
(0 1 2) -> (2 1 0)
(0 0 0)    (3 2 0)

R1 <- R1 - 2*R0
R2 <- R2 - 3*R0

(1 0 0)
(0 1 0)
(0 2 0)

R2 <- R2 - 2*R1

(1 0 0)
(0 1 0)
(0 0 0)

We end up with a matrix with the following properties:

  • It is diagonal
  • All entries are either 1 or 0
  • The first R entries are 1, where R is the rank of the original matrix
  • The last (N-R) entries are therefore 0
  • This matrix is its own pseudoinverse (see later)

The matrix has a very simple operation on vectors:

(x y z) * (1 0 0) = (x y 0)
          (0 1 0)
          (0 0 0)

(1 0 0) * (x) = (x)
(0 1 0)   (y) = (y)
(0 0 0)   (z) = (0)
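In code there's no need to materialize this matrix at all: multiplying a vector by it is just masking the entries past the rank. A one-line sketch (applyW is my name):

const applyW = (v, rank) => v.map((x, j) => (j < rank ? x : 0));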

The Process in More Detail

The Gaussian Elimination procedure as described can already fail, but there's an easy fix which is called "row pivoting". Suppose you get this far with a GE:

(1 x x x x x)
(0 1 x x x x)
(0 0 0 x x x)
(0 0 0 x x x)
(0 0 0 x x x)
(0 0 1 x x x)

The next step is supposed to be scaling R2 so that R2[2] == 1, but that's not possible since R2[2] == 0 (it would involve a divide-by-zero). So we "pivot", which means finding some non-zero element in this column and swapping rows. In the case here we'd swap R2 and R5, giving this matrix:

(1 x x x x x)
(0 1 x x x x)
(0 0 1 x x x)
(0 0 0 x x x)
(0 0 0 x x x)
(0 0 0 x x x)

... and now we can proceed with the algorithm.

In the modified algorithm we go further than this:

  1. We find a row pivot if possible
  2. Otherwise, we find a column pivot
  3. Otherwise, we find a row&column pivot

In other words, in the marked region of the matrix, we find *any* non-zero element, and perform a row and/or column swap to move it into position:

(1 x x x x x)
(0 1 x x x x)
(0 0 0 X X X)
(0 0 X X X X) <- find any non-zero in an X
(0 0 X X X X)
(0 0 X X X X)
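Here's a sketch of that combined search (findPivot is my name; inv is from the earlier sketch). One caveat the diagram glosses over: when N is composite, like 8, "non-zero" really needs to be "invertible mod N", since a pivot like 2 can never be scaled to 1 modulo 8:

function findPivot(W, k, n) {
  // scan the trailing submatrix for any entry invertible mod N
  for (let i = k; i < n; i++)
    for (let j = k; j < n; j++)
      if (inv(W[i][j])) return { i, j };
  return null; // nothing usable left in the marked region
}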

If we succeed, we can continue with GE as before. If we fail, we know we have a matrix that looks like this:

(1 x x x x x)
(0 1 x x x x)
(0 0 0 0 0 0) <- rows of all 0s
(0 0 0 0 0 0)
(0 0 0 0 0 0)
(0 0 0 0 0 0)

This is perfect; now we transpose and run GE again, and get our lovely simple output matrix.

(1 0 0 0 0 0)
(0 1 0 0 0 0)
(0 0 0 0 0 0)
(0 0 0 0 0 0)
(0 0 0 0 0 0)
(0 0 0 0 0 0)
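Jumping ahead a little, here is a sketch of the whole reduction in Javascript (my reconstruction, not the original Lisp; it reuses N, inv and findPivot from the earlier sketches). Rather than literally transposing and re-running, it folds the second pass in as column operations. It also mirrors every row operation into a matrix R and every column operation into a matrix C, so that R*M*C = W; these match the R and C of the next section, where M = RI * W * CI with RI = R^-1 and CI = C^-1:

function identity(n) {
  return Array.from({ length: n }, (_, i) =>
    Array.from({ length: n }, (_, j) => (i === j ? 1 : 0)));
}

// reduce M to W = diag(1,...,1,0,...,0), returning {W, R, C, rank}
// with R*M*C = W, all entries mod N
function decompose(M) {
  const n = M.length;
  const W = M.map(row => row.map(x => ((x % N) + N) % N));
  const R = identity(n), C = identity(n);
  let rank = 0;
  for (let k = 0; k < n; k++) {
    const p = findPivot(W, k, n);
    if (!p) break; // only zero (or non-invertible) entries left
    // swap rows k,p.i (mirrored in R) and columns k,p.j (mirrored in C)
    [W[k], W[p.i]] = [W[p.i], W[k]];
    [R[k], R[p.i]] = [R[p.i], R[k]];
    for (let i = 0; i < n; i++) {
      [W[i][k], W[i][p.j]] = [W[i][p.j], W[i][k]];
      [C[i][k], C[i][p.j]] = [C[i][p.j], C[i][k]];
    }
    // scale row k so the pivot becomes 1
    const s = inv(W[k][k]);
    for (let j = 0; j < n; j++) {
      W[k][j] = (W[k][j] * s) % N;
      R[k][j] = (R[k][j] * s) % N;
    }
    // clear the rest of column k with row operations...
    for (let i = 0; i < n; i++) {
      if (i === k || W[i][k] === 0) continue;
      const f = N - W[i][k]; // i.e. subtract W[i][k] times row k
      for (let j = 0; j < n; j++) {
        W[i][j] = (W[i][j] + f * W[k][j]) % N;
        R[i][j] = (R[i][j] + f * R[k][j]) % N;
      }
    }
    // ...and the rest of row k with column operations (this is the
    // "transpose and run GE again" pass, done in place)
    for (let j = 0; j < n; j++) {
      if (j === k || W[k][j] === 0) continue;
      const f = N - W[k][j];
      for (let i = 0; i < n; i++) {
        W[i][j] = (W[i][j] + f * W[i][k]) % N;
        C[i][j] = (C[i][j] + f * C[i][k]) % N;
      }
    }
    rank++;
  }
  return { W, R, C, rank };
}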

Solving the System

The above just shows how to take a matrix into a nice form. You can follow Matrix Decomposition For Dummies and generate a proper matrix decomposition like

M = RI * W * CI

where RI, CI are invertible matrices and W is a "nearly-identity" matrix. We can now solve the equation:

aM + b = 0
=> a*RI*W*CI + b = 0
=> a*RI*W + b*C = 0

OK, so we know that whatever a*RI is, a*RI*W has zeroes in its last (N-R) entries. So b*C must have zeroes in its last (N-R) entries too, otherwise b is not a possible solution. A possible process to ensure this is:

  1. Calculate B = b*C
  2. Zero its last few entries to get B'
  3. Set b' = B'*CI

This is equivalent to fixing the colours on the board so the board can be solved.

Now that we know we can solve the board, we can continue moving matrices across from a to b:

a*RI + b*C*W = 0

As before, b*C*W has zeroes in its last (N-R) entries, so we can pick the particular solution in which a*RI has zeroes there too.

a + b*C*W*R = 0

Assuming b can be solved, this gives a solution a.
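In code, the whole chain is short (a sketch; solveBoard and mulVec are my names, and it reuses decompose and applyW from earlier). Rather than repairing an unsolvable b as in steps 1-3 above, this version simply rejects it:

// row vector times matrix, mod N
function mulVec(v, A) {
  const n = A[0].length;
  return Array.from({ length: n }, (_, j) => {
    let s = 0;
    for (let i = 0; i < v.length; i++) s += v[i] * A[i][j];
    return s % N;
  });
}

// returns the click vector a with aM + b = 0, or null if unsolvable
function solveBoard(M, b) {
  const { R, C, rank } = decompose(M);
  const n = M.length;
  const bC = mulVec(b, C);
  for (let j = rank; j < n; j++)
    if (bC[j] !== 0) return null; // b*C is non-zero past the rank
  // a = -b*C*W*R; multiplying by W is just the rank mask
  const a = mulVec(applyW(bC, rank), R);
  return a.map(x => (N - x) % N); // negate mod N
}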

Unfortunately, this solution to the game may be very inefficient in terms of actual mouse clicks. The algorithm doesn't really care how many clicks you make. If there's only one solution (M was invertible) that's as good as it gets. But if there are multiple solutions we really want the one that takes the fewest mouse clicks. So let's explore all the possible solutions to the board.

The Left-Null Space of M

Since M is singular, there are vectors obeying

xM = 0

Any such vector x is in the left-null space of M. You can add any such vector to the solution a and get a new solution:

aM + b = 0
aM + xM + b = 0
(a+x)M + b = 0

To find these vectors x:

xM = 0
x*RI*W*CI = 0
x*RI*W = 0
x*RI has zeroes in its first R entries; its last (N-R) entries can be anything

As an example, if W is given here:

W = (1 0 0 0)
    (0 1 0 0)
    (0 0 0 0)
    (0 0 0 0)

Then x*RI must be of the form (0 0 z w) in order for x*RI*W = 0 to hold. The vector x is therefore (0 0 z w) * R.

So, we can take the vectors (0 0 1 0) and (0 0 0 1) and multiply them by R, to get basis vectors for the left-null space of M. If the vectors are x0 and x1, then the full solution to the board is:

(a + c0x0 + c1x1)M + b = 0

where c0, c1 are any values (in modulo arithmetic, they take values 0,...,N-1).

In the Mondrian problem, we can enumerate all possible solutions by enumerating all possible vectors like (0 0 z w). For each one, we form the corresponding solution and count how many clicks it takes. This allows us to optimize for "number of clicks" and get the optimal solution for the player. And that, dear reader, is Cake's game solved. If there is a very large null space, enumerating all possible solutions will take a long time ... but then there are so many solutions it may not be an interesting game anyway. So we reject boards that have a null space of rank higher than 2 or 3, say.
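A brute-force sketch of that enumeration (fewestClicks is my name; it reuses decompose and solveBoard from above). The null-space basis vectors are simply the last (N-R) rows of R, which is the "(0 0 1 0) and (0 0 0 1) times R" construction:

function fewestClicks(M, b) {
  const { R, rank } = decompose(M);
  const n = M.length;
  const a0 = solveBoard(M, b);
  if (!a0) return null;        // board can't be solved
  const basis = R.slice(rank); // left-null-space basis of M
  let best = a0, bestCost = a0.reduce((s, x) => s + x, 0);
  for (let code = 1; code < Math.pow(N, basis.length); code++) {
    // decode 'code' into one coefficient c in 0..N-1 per basis vector
    const a = a0.slice();
    for (let t = 0, c = code; t < basis.length; t++, c = Math.floor(c / N))
      for (let j = 0; j < n; j++)
        a[j] = (a[j] + (c % N) * basis[t][j]) % N;
    const cost = a.reduce((s, x) => s + x, 0); // total mouse clicks
    if (cost < bestCost) { best = a; bestCost = cost; }
  }
  return best;
}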

Before we go, let's abstract this all a little more.

Proof that C*W*R is in fact the pseudo-inverse of M

The pseudo-inverse of M, M^P, is a matrix with the following properties:

  1. M * M^P * M = M
  2. M^P * M * M^P = M^P
  3. (M^P)^P = M

First, let's see that W is its own pseudo-inverse:

  1. W * W * W = W
  2. W * W * W = W
  3. W = W

Now, let's just substitute our decomposition, writing Ic = CI*C = I and Ir = R*RI = I:

1. (RI * W * CI) * (C * W * R) * (RI * W * CI) =
   RI * W * Ic * W * Ir * W * CI =
   RI * W * W * W * CI =
   RI * W * CI (as required)

2. (C * W * R) * (RI * W * CI) * (C * W * R) =
   C * W * Ir * W * Ic * W * R =
   C * W * W * W * R =
   C * W * R (as required)

For part 3, note that the pseudo-inverse of C * W * R can be calculated by decomposing that matrix again:

C*W*R = R1I * W1 * C1I

Since W1 clearly has to be equal to W, it follows that R1I = C and C1I = R, i.e. C1 = RI and R1 = CI. So the pseudo-inverse of C*W*R is C1 * W1 * R1 = RI * W * CI = M.

Thus, we can find the exact pseudo-inverse of a matrix, in either modular or rational arithmetic, using this extended Gaussian elimination algorithm. (What's more, this method deals with smaller rational numbers than other methods, such as orthogonalization.) Since "finding the exact pseudoinverse" is the title of this article, we're done. QED.
