Monday, December 31, 2018

Two Recurrences for Counting Rooted Trees

A rooted tree is a tree with a special node called the root. The number of rooted trees with \(n\) nodes is given by the sequence A000081 in the OEIS. They provide the following recurrence relation: \[ a_{n+1} = \frac{1}{n} \sum_{k=1}^n \left( \sum_{d \mid k} d \cdot a_d \right) a_{n-k+1} \] I came up with a different one. I was surprised that they can produce the same sequence despite appearing so different. \[ a_{n+1} = \sum_{\left( p_1^{r_1} \, \ldots \, p_\ell^{r_\ell} \right) \, \vdash \, n} \left( \prod_{i=1}^{\ell} \left(\!\!\binom{a_{p_i}}{r_i}\!\!\right) \right) \]

The sum is over all partitions of \(n\), and \( \left( p_1^{r_1} \ldots p_\ell^{r_\ell} \right) \) is the frequency representation of a partition. \( \left(\!\binom{n}{k}\!\right) \) is the multiset coefficient. It represents the number of ways to choose \(k\) elements out of \(n\) allowing repetition, and is equal to \( \binom{n+k-1}{k} \).

The image below shows how the second formula can be derived with an example for \(a_{25}\). Every node that is directly attached to the root is the root of a subtree, and the subtrees are arranged in a partition. Changing the partition changes the tree. You can switch out subtrees for another subtree with the same number of nodes, to get a new tree without changing the partition.

Using the program below I have verified that both recurrence relations are correct up to at least the \(30^\text{th}\) term.

Sunday, December 30, 2018

The Binary Look-and-Say Sequence

I ran across an interesting math problem here. I've taken the solution posted by Mark Hennings and filled it out with more details.

Problem

Consider the binary look-and-say sequence:

\[ 1, 11, 101, 111011, 11110101, 100110111011, \ldots \]

Each term of the sequence can be generated from the previous term by reading off the digits, always saying how many times the same digit appears consecutively (but converting this number to binary). For example, the fourth term is "three 1s, one 0, two 1s". Three is 11 in binary and two is 10. So the fifth term is 11 1, 1 0, 10 1, or 11110101.

Let \(f(n)\) denote the number of times the string "100" appears in the \(n^\text{th}\) term. Then \(f(8) = 2\), for example, because the \(8^\text{th}\) term equals 111100111010110100110111011.

What is \(\displaystyle\lim_{n\to\infty}\frac{f(n)}{f(n-1)}\)?

Step 1. Express \(f(n)\) as a matrix equation

The terms in this sequence can all be expressed as collections of 10 specific strings. The table below details the strings, and what each string contributes to the next term. This transformation can also be expressed by the matrix \(M\). If \(v(n)\) is a column vector representing the number of times each string occurs in the \(n^\text{th}\) term of the sequence, then \(v(n+1) = M v(n)\).

\[ \begin{array}{ccc} \hline \mathrm{Label} & \mathrm{String} & \mathrm{Becomes} \\ \hline A & 1 & B \\ B & 11 & CA\\ C & 10 & E \\ D & 110 & CD \\ E & 1110 & F \\ F & 11110 & GD \\ G & 100 & I \\ H & 1100 & CH \\ I & 11100 & J \\ J & 111100 & GH \\ \hline \end{array} \qquad M = \left(\begin{array}{cccccccccc} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ \end{array}\right) \]

It follows that:

\[f(n) = \left( \begin{array}{cccccccccc} 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{array}\right) M^{n-1} v(1)\]

Step 2. Simplify the expression for \(f(n)\)

In order to easily compute powers of \(M\), we can put it in Jordan normal form. First, the characteristic polynomial of \(M\) is \( \chi_M(t) = t^4(t-1)^2(t+1)(t^3 - t^2 - 1) \). From here we can calculate the Jordan normal form \(J\) and transition matrix \(P\). The numbers \(x\), \(y\), and \(z\) are the roots of \(t^3-t^2-1\).

\[ J = \left(\begin{array}{cccccccccc} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & x & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & y & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & z \end{array}\right) \] \[ P = \left(\begin{array}{cccccccccc} 0 & 0 & 0 & 0 & 0 & 1 & 3 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & -3 & 0 & 0 & 0 \\ 0 & 0 & -1 & 1 & 0 & 0 & 4 & x^2 & y^2 & z^2 \\ 1 & 0 & 0 & 0 & 1 & 0 & -2 & x^2 & y^2 & z^2 \\ 0 & -1 & 1 & 0 & 0 & 0 & -4 & x & y & z \\ -1 & 1 & 0 & 0 & 0 & 0 & 4 & 1 & 1 & 1 \\ 0 & 0 & 1 & -1 & 0 & 0 & -2 & 1 & 1 & 1 \\ -1 & 0 & 0 & -1 & -1 & -1 & 1 & 1 & 1 & 1 \\ 0 & 1 & -1 & 0 & 0 & 0 & 2 & 1/x & 1/y & 1/z \\ 1 & -1 & 0 & 1 & 0 & 0 & -2 & 1/x^2 & 1/y^2 & 1/z^2 \end{array}\right) \]

Substitute \(M=P J P^{-1}\) into the equation for \(f(n)\). Since \(P^{-1}\) is multiplied by \(v(1)\), we only need to find its first column, which we'll call \(p\). We can find it by using Gaussian elimination on the equation \(P p = e_1\), where \(e_1\) is the first column of the identity matrix. It will be necessary to make extensive use of identities like the ones below, which hold for any root \(r\) or any three distinct roots \(a\), \(b\), \(c\) of \(t^3-t^2-1\).

We can almost get away with using dummy variables for the entries of \(p\), which would save a lot of work, but I think it's necessary for later to show that \(p_8 \neq 0\). Besides, it's kind of neat to have a more explicit expression for \(f\).

\[ p = \left(\begin{array}{c} 0 \\ 0 \\ 0 \\ -1 \\ 0 \\ 1/2 \\ 1/6 \\ \frac{(x-1)(y+1)(z+1)}{3(x-y)(x-z)} \\ \frac{(x+1)(y-1)(z+1)}{3(y-x)(y-z)} \\ \frac{(x+1)(y+1)(z-1)}{3(z-x)(z-y)} \end{array}\right) \qquad\qquad\begin{array}{c} \hline \mathrm{Identities} \\ \hline r^3-r^2=1 \\ r-1=1/r^2 \\ abc=1 \\ a+b+c=1 \\ ab=c^2-c \\ a^2b^2=c-1 \\ \hline \end{array} \]

Multiplying everything out, and again making use of the identities, we find:

\begin{multline*} f(n) = -\frac{1}{2} + \frac{1}{6} (-1)^{n} + p_8 x^{n+2} + p_9 y^{n+2} + p_{10} z^{n+2}\qquad\text{if}\quad n > 1 \end{multline*}

Step 3. Find the limit

Now, find the value of \(\lim_{n\to\infty} \frac{f(n)}{f(n-1)}\). Let \(g(n)\) consist of all terms of \(f(n)\) that do not contain \(x\), assuming WLOG that \(x\) is the real root. Since the powers of \(y\) and \(z\) converge to 0, and \(-\frac{1}{2} + \frac{1}{6} (-1)^{n}\) is bounded, \(g(n)\) can be killed off by dividing by \(x^{n+1}\). Note that \(p_8 \neq 0\).

\[ \lim_{n\to\infty} \frac{f(n)}{f(n-1)} = \lim_{n\to\infty}\frac {-\frac{1}{2} + \frac{1}{6} (-1)^{n} + p_8 x^{n+2} + p_9 y^{n+2} + p_{10} z^{n+2}} {-\frac{1}{2} + \frac{1}{6} (-1)^{n-1} + p_8 x^{n+1} + p_9 y^{n+1} + p_{10} z^{n+1}} \] \[ = \lim_{n\to\infty}\cfrac {\cfrac{g(n)}{x^{n+1}} + p_8 x} {\cfrac{g(n-1)}{x^{n+1}} + p_8} = x \approx \boxed{1.46557} \]

Congratulations! If you would like to celebrate with a video, this is John Conway, known for the Game of Life, talking about a similar problem relating to the decimal version of the look-and-say sequence.

Here is a blog post written by Nathaniel Johnston, where he finds the rate of growth and the ratio of ones to zeros for the binary look-and-say sequence. He mentions that "it follows from standard theory of linear homogeneous recurrence relations that we can ... read off all of the long-term behaviour of the binary look-and-say sequence from the eigenvalues and eigenvectors of [the matrix]."

The closed-form expression for the \(n^\text{th}\) Fibonacci number is \(F_n = (\varphi{}^n - \psi{}^n) / \sqrt{5} \), and \(\lim_{n\to\infty} \frac{F_{n+1}}{F_n} = \varphi\). If you write the recurrence in matrix form, \(\varphi\) and \(\psi\) are the eigenvalues.

My computer can calculate values for \(f\) much more quickly using the expression found above, compared to generating the terms of the sequence and searching for occurrences of the substring "100". It takes more than 20 seconds to calculate the values up to \(n = 40\) the slow way, but less than a millisecond the fast way. In this case, math is more powerful than programming! Here's the code I used to take these measurements: