Theory of the Thumb: December 2018

I ran across an interesting math problem here. I've taken the solution posted by Mark Hennings and filled it out with more details.

Problem

Consider the binary look-and-say sequence:

\[ 1, 11, 101, 111011, 11110101, 100110111011, \ldots \]

Each term of the sequence can be generated from the previous term by reading off the digits, always saying how many times the same digit appears consecutively (but converting this number to binary). For example, the fourth term is "three 1s, one 0, two 1s". Three is 11 in binary and two is 10. So the fifth term is 11 1, 1 0, 10 1, or 11110101.

Let \(f(n)\) denote the number of times the string "100" appears in the \(n^\text{th}\) term. Then \(f(8) = 2\), for example, because the \(8^\text{th}\) term equals 111100111010110100110111011.

What is \(\displaystyle\lim_{n\to\infty}\frac{f(n)}{f(n-1)}\)?

Step 1. Express \(f(n)\) as a matrix equation

The terms in this sequence can all be expressed as collections of 10 specific strings. The table below details the strings, and what each string contributes to the next term. This transformation can also be expressed by the matrix \(M\). If \(v(n)\) is a column vector representing the number of times each string occurs in the \(n^\text{th}\) term of the sequence, then \(v(n+1) = M v(n)\).

\[ \begin{array}{ccc} \hline \mathrm{Label} & \mathrm{String} & \mathrm{Becomes} \\ \hline A & 1 & B \\ B & 11 & CA\\ C & 10 & E \\ D & 110 & CD \\ E & 1110 & F \\ F & 11110 & GD \\ G & 100 & I \\ H & 1100 & CH \\ I & 11100 & J \\ J & 111100 & GH \\ \hline \end{array} \qquad M = \left(\begin{array}{cccccccccc} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ \end{array}\right) \]

It follows that:

\[f(n) = \left( \begin{array}{cccccccccc} 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{array}\right) M^{n-1} v(1)\]

Step 2. Simplify the expression for \(f(n)\)

In order to easily compute powers of \(M\), we can put it in Jordan normal form. First, the characteristic polynomial of \(M\) is \( \chi_M(t) = t^4(t-1)^2(t+1)(t^3 - t^2 - 1) \). From here we can calculate the Jordan normal form \(J\) and transition matrix \(P\). The numbers \(x\), \(y\), and \(z\) are the roots of \(t^3-t^2-1\).

\[ J = \left(\begin{array}{cccccccccc} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & x & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & y & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & z \end{array}\right) \] \[ P = \left(\begin{array}{cccccccccc} 0 & 0 & 0 & 0 & 0 & 1 & 3 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & -3 & 0 & 0 & 0 \\ 0 & 0 & -1 & 1 & 0 & 0 & 4 & x^2 & y^2 & z^2 \\ 1 & 0 & 0 & 0 & 1 & 0 & -2 & x^2 & y^2 & z^2 \\ 0 & -1 & 1 & 0 & 0 & 0 & -4 & x & y & z \\ -1 & 1 & 0 & 0 & 0 & 0 & 4 & 1 & 1 & 1 \\ 0 & 0 & 1 & -1 & 0 & 0 & -2 & 1 & 1 & 1 \\ -1 & 0 & 0 & -1 & -1 & -1 & 1 & 1 & 1 & 1 \\ 0 & 1 & -1 & 0 & 0 & 0 & 2 & 1/x & 1/y & 1/z \\ 1 & -1 & 0 & 1 & 0 & 0 & -2 & 1/x^2 & 1/y^2 & 1/z^2 \end{array}\right) \]

Substitute \(M=P J P^{-1}\) into the equation for \(f(n)\). Since \(P^{-1}\) is multiplied by \(v(1)\), we only need to find its first column, which we'll call \(p\). We can find it by using Gaussian elimination on the equation \(P p = e_1\), where \(e_1\) is the first column of the identity matrix. It will be necessary to make extensive use of identities like the ones below, which hold for any root \(r\) or any three distinct roots \(a\), \(b\), \(c\) of \(t^3-t^2-1\).

We can almost get away with using dummy variables for the entries of \(p\), which would save a lot of work, but I think it's necessary for later to show that \(p_8 \neq 0\). Besides, it's kind of neat to have a more explicit expression for \(f\).

\[ p = \left(\begin{array}{c} 0 \\ 0 \\ 0 \\ -1 \\ 0 \\ 1/2 \\ 1/6 \\ \frac{(x-1)(y+1)(z+1)}{3(x-y)(x-z)} \\ \frac{(x+1)(y-1)(z+1)}{3(y-x)(y-z)} \\ \frac{(x+1)(y+1)(z-1)}{3(z-x)(z-y)} \end{array}\right) \qquad\qquad\begin{array}{c} \hline \mathrm{Identities} \\ \hline r^3-r^2=1 \\ r-1=1/r^2 \\ abc=1 \\ a+b+c=1 \\ ab=c^2-c \\ a^2b^2=c-1 \\ \hline \end{array} \]

Multiplying everything out, and again making use of the identities, we find:

\begin{multline*} f(n) = -\frac{1}{2} + \frac{1}{6} (-1)^{n} + p_8 x^{n+2} + p_9 y^{n+2} + p_{10} z^{n+2}\qquad\text{if}\quad n > 1 \end{multline*}

Step 3. Find the limit

Now, find the value of \(\lim_{n\to\infty} \frac{f(n)}{f(n-1)}\). Let \(g(n)\) consist of all terms of \(f(n)\) that do not contain \(x\), assuming WLOG that \(x\) is the real root. Since the powers of \(y\) and \(z\) converge to 0, and \(-\frac{1}{2} + \frac{1}{6} (-1)^{n}\) is bounded, \(g(n)\) can be killed off by dividing by \(x^{n+1}\). Note that \(p_8 \neq 0\).

\[ \lim_{n\to\infty} \frac{f(n)}{f(n-1)} = \lim_{n\to\infty}\frac {-\frac{1}{2} + \frac{1}{6} (-1)^{n} + p_8 x^{n+2} + p_9 y^{n+2} + p_{10} z^{n+2}} {-\frac{1}{2} + \frac{1}{6} (-1)^{n-1} + p_8 x^{n+1} + p_9 y^{n+1} + p_{10} z^{n+1}} \] \[ = \lim_{n\to\infty}\cfrac {\cfrac{g(n)}{x^{n+1}} + p_8 x} {\cfrac{g(n-1)}{x^{n+1}} + p_8} = x \approx \boxed{1.46557} \]

Congratulations! If you would like to celebrate with a video, this is John Conway, known for the Game of Life, talking about a similar problem relating to the decimal version of the look-and-say sequence.

Here is a blog post written by Nathaniel Johnston, where he finds the rate of growth and the ratio of ones to zeros for the binary look-and-say sequence. He mentions that "it follows from standard theory of linear homogeneous recurrence relations that we can ... read off all of the long-term behaviour of the binary look-and-say sequence from the eigenvalues and eigenvectors of [the matrix]."

The closed-form expression for the \(n^\text{th}\) Fibonacci number is \(F_n = (\varphi{}^n - \psi{}^n) / \sqrt{5} \), and \(\lim_{n\to\infty} \frac{F_{n+1}}{F_n} = \varphi\). If you write the recurrence in matrix form, \(\varphi\) and \(\psi\) are the eigenvalues.

My computer can calculate values for \(f\) much more quickly using the expression found above, compared to generating the terms of the sequence and searching for occurrences of the substring "100". It takes more than 20 seconds to calculate the values up to \(n = 40\) the slow way, but less than a millisecond the fast way. In this case, math is more powerful than programming! Here's the code I used to take these measurements:

Theory of the Thumb

Sunday, December 30, 2018

The Binary Look-and-Say Sequence

Problem

Step 1. Express \(f(n)\) as a matrix equation

Step 2. Simplify the expression for \(f(n)\)

Step 3. Find the limit