Book Proofs - A blog for mathematical riddles, puzzles, and elegant proofs

Letter Boxed

This week’s Fiddler is about the popular NY Times puzzle Letter Boxed.

You must connect letters together around a square to spell out words (they don’t have to be actual English words!). However, from any given letter, the next letter cannot be on the same side of the square. How many distinct valid sequences are there that include each letter exactly once?

My solution:

A related problem: Counting Carlitz words

This problem is related to a well-known combinatorial problem involving counting special arrangements of symbols.

Given an ordered list of letters (a word) such as “MISSISSIPPI”, one can ask: How many distinct rearrangements of these letters have no two adjacent letters the same? Such a rearrangement is known as a Carlitz word, named after the American mathematician Leonard Carlitz. There is a stunning formula for counting them, which is explained in detail in the paper “Counting words with Laguerre polynomials” by Jair Taylor.

The result is as follows:

Theorem
The number of Carlitz words on an alphabet of $m$ symbols with the $i^\text{th}$ symbol repeated $k_i$ times is given by the formula:
\[
N = \int_0^\infty e^{-t} \prod_{j=1}^m l_{k_j}(t)\,\mathrm{d}t
\]where the function in the integrand is the polynomial defined as:
\[
l_k(t) = \sum_{i=0}^k (-1)^{k+i}\binom{k-1}{k-i}\frac{t^i}{i!}
\]

The first several such functions are given by:
\begin{align*}
l_1(t)&= t \\
l_2(t)&= \frac{t^2}{2}-t \\
l_3(t)&= \frac{t^3}{6}-t^2+t \\
l_4(t)&= \frac{t^4}{24}-\frac{t^3}{2}+\frac{3 t^2}{2}-t \\
l_5(t)&= \frac{t^5}{120}-\frac{t^4}{6}+t^3-2 t^2+t
\end{align*}Therefore, to answer the MISSISSIPPI question, we see that the word is made up of: (M, PP, IIII, SSSS). Applying the formula:
\begin{align*}
N &= \int_0^\infty e^{-t} l_1(t) l_2(t) l_4(t)^2\,\mathrm{d}t \\
&= \int_0^\infty e^{-t} \, t \left( \frac{t^2}{2}-t\right) \left(\frac{t^4}{24}-\frac{t^3}{2}+\frac{3 t^2}{2}-t\right)^2\,\mathrm{d}t \\
&= 2016
\end{align*}NOTE: An easy way to evaluate the integral above is to expand the polynomial part and use the fact that $\int_0^\infty e^{-t}t^n\,\mathrm{d}t=n!$. This can also convince you that the integrals always evaluate to integers!
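To make the note above concrete, here is a minimal Julia sketch that expands the product of the $l_k(t)$ polynomials with exact rational arithmetic and integrates term by term using $\int_0^\infty e^{-t}t^d\,\mathrm{d}t=d!$. The function names (`l_coeffs`, `polymul`, `carlitz_count`) are hypothetical helpers introduced only for this illustration.

```julia
# Coefficients of l_k(t), lowest degree first: entry i+1 is the coefficient of t^i.
function l_coeffs(k::Int)
    c = zeros(Rational{BigInt}, k + 1)
    for i in 1:k   # the i = 0 term vanishes because binomial(k-1, k) = 0
        c[i + 1] = (-1)^(k + i) * binomial(big(k - 1), k - i) // factorial(big(i))
    end
    return c
end

# Product of two polynomials given as coefficient vectors.
function polymul(a, b)
    c = zeros(Rational{BigInt}, length(a) + length(b) - 1)
    for i in eachindex(a), j in eachindex(b)
        c[i + j - 1] += a[i] * b[j]
    end
    return c
end

# Number of Carlitz words for the letter multiplicities `ks`,
# integrating term by term with ∫ e^{-t} t^d dt = d!.
function carlitz_count(ks)
    poly = Rational{BigInt}[1]
    for k in ks
        poly = polymul(poly, l_coeffs(k))
    end
    total = sum(poly[d + 1] * factorial(big(d)) for d in 0:(length(poly) - 1))
    return numerator(total)   # the total is an integer, so the denominator is 1
end

println(carlitz_count([1, 2, 4, 4]))   # MISSISSIPPI: M, PP, IIII, SSSS
```

Small cases are easy to check by hand: `carlitz_count([2, 2])` counts the two words ABAB and BABA.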

Counting Letter Boxed words

Back to Letter Boxed, we have a shape with $n=4$ sides, and each side has $m$ distinct symbols. We want to find the total number of words that use every symbol exactly once and never use two symbols from the same side in a row.

This is related to the problem of counting Carlitz words; we start by counting the number of Carlitz words where the symbols on a given side are treated as identical. For example, if there are 3 letters per side as in the picture at the top of the page, we count Carlitz words for “111222333444”. Each different Carlitz word provides a template; we then need to assign the letters from each side to the corresponding number in the template. For example, the 1’s encode ABC, and there are 3! ways of ordering those. Then the 2’s encode DEF, and there are 3! ways of ordering those, and so on. Ultimately, we multiply our number of Carlitz words by $(m!)^n$.

Therefore, a formula for the total number of legal Letter Boxed words played on a shape with $n$ sides and $m$ letters per side is:

$\displaystyle
F(n,m) = (m!)^n \int_0^\infty e^{-t}\, l_m(t)^n\,\mathrm{d}t
$

with $l_m(t)$ defined above.

Here are some values for $F(n,m)$ for $n$ sides and $m$ symbols per side.
\[
\begin{array}{c|ccccc}
n\backslash m & 1 & 2 & 3 & 4 & 5 \\ \hline
1& 1 & 0 & 0 & 0 & 0 \\
2& 2 & 8 & 72 & 1152 & 28800 \\
3& 6 & 240 & 37584 & 15095808 & 12420864000 \\
4& 24 & 13824 & 53529984 & 751480602624 & 27917203599360000 \\
5& 120 & 1263360 & 152458744320 & 93995798935633920 & 197726965332480000000000 \\
\end{array}
\]Some observations:

In the first row, we only have one side. So if that side contains more than one letter, it’s impossible to make any words since we would be forced to use the same side for consecutive letters. This is why $F(1,m)=1$ if $m=1$ and $0$ otherwise.
In the first column, we only have one symbol per side. Therefore, consecutive letters can never come from the same side, every ordering of the $n$ letters is valid, and we see that $F(n,1)=n!$.

For standard Letter Boxed ($n=4$), we can read off the fourth row:

Two letters per side: $F(4,2) = 13824$.
Three letters per side: $F(4,3) = 53529984$.
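As an independent check that does not rely on the integral formula, the table entries can be reproduced by direct backtracking over the sides. This is only a sketch under my own naming (`count_sequences!` and `letter_boxed_count` are hypothetical helpers): at each step it picks a side different from the previous one and multiplies by the number of unused letters on that side.

```julia
# Count sequences that use every letter exactly once and never take two
# consecutive letters from the same side. `remaining[s]` is the number of
# unused letters on side s; `prev` is the side of the previous letter (0 = none).
function count_sequences!(remaining::Vector{Int}, prev::Int)
    all(iszero, remaining) && return 1
    total = 0
    for s in eachindex(remaining)
        if s != prev && remaining[s] > 0
            choices = remaining[s]          # any unused letter on side s
            remaining[s] -= 1
            total += choices * count_sequences!(remaining, s)
            remaining[s] += 1               # undo the move (backtrack)
        end
    end
    return total
end

# n sides with m letters each.
letter_boxed_count(n, m) = count_sequences!(fill(m, n), 0)

println(letter_boxed_count(4, 2))
println(letter_boxed_count(4, 3))
```

The recursion only branches on which side is used next, so its cost grows with the number of Carlitz templates rather than with the full count $F(n,m)$. The results are accumulated in 64-bit integers, so the largest table entries would need `BigInt`.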

Can you hop to the lily pad?

This week’s Fiddler is about hopping back and forth.

You are a frog in a pond with an infinite number of lily pads in a line, marked “1,” “2,” “3,” etc. You are currently on pad 2, and your goal is to make it to pad 1. From any given pad, there are specific probabilities that you’ll jump to another pad: Whenever you are on pad $k$, you will hop to pad $k-1$ with probability $1/k$, and you will hop to pad $k+1$ with probability $(k-1)/k$.

What is the probability that you will ultimately make it to pad 1?

My solution:

The general finite case

We will first solve the most general form of this problem: arbitrary probabilities and an arbitrary starting point! Suppose there are $n$ pads numbered $1,2,\dots,n$. Let $a_k$ be the probability that we will make it to pad 1 starting from pad $k$. Further assume that when we’re at pad $k$, we transition to $k-1$ with probability $p_k$ and to $k+1$ with probability $q_k$. We have $p_k+q_k=1$ so these are the only two possibilities. This leads to the recurrence relation:
\begin{align*}
a_1 &= 1 \\
a_k &= p_k a_{k-1} + q_k a_{k+1}\qquad\text{for }k=2,3,\dots,n-1\\
a_n &= 0
\end{align*} Start by rewriting the recurrence relation using the fact that $p_k+q_k=1$:
\begin{align*}
a_1 &= 1 \\
p_k (a_{k}-a_{k-1}) &= q_k(a_{k+1}-a_k) \qquad \text{for }k=2,3,\dots,n-1\\
a_n &= 0
\end{align*}Iterating the recurrence starting from $k-1$, we obtain
\begin{align*}
(a_k-a_{k-1}) &= \frac{p_{k-1}}{q_{k-1}}(a_{k-1}-a_{k-2}) \\
&= \frac{p_{k-1}}{q_{k-1}} \cdot \frac{p_{k-2}}{q_{k-2}} (a_{k-2}-a_{k-3}) \\
&\;\;\vdots \\
&= (a_2-a_1)\prod_{i=2}^{k-1} \frac{p_{i}}{q_{i}}
\end{align*}Now sum from $k=2$ to $k=n$ and the left-hand side will telescope.
\[
a_n-a_1 = \sum_{k=2}^n (a_k-a_{k-1}) = (a_2-a_1)\sum_{k=2}^n\prod_{i=2}^{k-1} \frac{p_{i}}{q_{i}}
\]Letting $a_1=1$ and $a_n=0$, we can solve for $a_2$ and obtain:
\[
a_2 = 1-\frac{1}{\sum_{k=2}^n \prod_{i=2}^{k-1} \frac{p_i}{q_i}}
\]Now instead of summing up to $k=n$ in the telescoping sum, sum to $k=m$ to find a general term, and obtain:
\[
a_m-a_1 = (a_2-a_1)\sum_{k=2}^m\prod_{i=2}^{k-1} \frac{p_{i}}{q_{i}}
\]Substituting the values we found for $a_1$ and $a_2$, we obtain the following general formula for any $a_m$:

$\displaystyle
a_m = 1-\frac{\sum_{k=2}^m \prod_{i=2}^{k-1} \frac{p_i}{q_i}}{\sum_{k=2}^n \prod_{i=2}^{k-1} \frac{p_i}{q_i}},\qquad \text{for }m=1,2,\dots,n
$

The formula even works at the boundaries. When $m=1$, the numerator is zero (empty sum), and we recover $a_1=1$. When $m=n$, the numerator matches the denominator and we recover $a_n=0$.
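The boxed formula translates almost line by line into code. The following Julia sketch evaluates it with exact rationals; `reach_probabilities` is a hypothetical name, and the hop-down probability is passed in as a function `p(k)` (with $q_k = 1-p_k$ implied), which is my own interface choice rather than anything from the derivation.

```julia
# a[m] = probability of reaching pad 1 from pad m on pads 1..n (n >= 2),
# with a[1] = 1 and a[n] = 0. `p(k)` is the probability of hopping from k to k-1.
function reach_probabilities(p::Function, n::Int)
    # w[k] = prod_{i=2}^{k-1} p_i / q_i for k = 2..n (empty product = 1 at k = 2)
    w = ones(Rational{BigInt}, n)
    for k in 3:n
        pk = Rational{BigInt}(p(k - 1))
        w[k] = w[k - 1] * pk / (1 - pk)
    end
    partial = cumsum(w[2:n])            # partial[m-1] = sum_{k=2}^{m} w[k]
    a = ones(Rational{BigInt}, n)       # a[1] = 1 (empty numerator)
    for m in 2:n
        a[m] = 1 - partial[m - 1] / partial[end]
    end
    return a
end

# Fair coin on five pads: the probabilities fall off linearly.
println(reach_probabilities(k -> 1 // 2, 5))
```

For the fair coin, every ratio $p_i/q_i$ equals $1$, so the formula reduces to $a_m = 1-\frac{m-1}{n-1}$, which is what the example prints.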

This problem can also be viewed as that of finding the absorption probabilities of a Markov chain with absorbing states at pads $1$ and $n$. However, this approach leads to a formula that is more difficult to simplify, involving the inverse of an $n\times n$ matrix, which is why I opted for the approach above.

The general infinite case

We now consider the case where there are infinitely many lily pads. The recurrence relation now looks like:
\begin{align*}
a_1 &= 1 \\
a_k &= p_k a_{k-1} + q_k a_{k+1}\qquad\text{for }k=2,3,\dots
\end{align*}Processes governed by such recurrence relations are known as one-dimensional random walks.

This case is a bit trickier than the finite case. We are dealing with a second-order difference equation, so two boundary conditions are required to determine a unique solution. This was no problem in the finite case we previously solved, because we had the boundary conditions $a_1=1$ and $a_n=0$. But in this infinite case, we only have the condition $a_1=1$. This means there will be infinitely many possible solutions, each with a different value at infinity, $a_\infty = \lim_{k\to\infty}a_k$. In order to have a unique solution, we must use the correct “boundary condition at infinity”.

To illustrate this fact, observe that we can satisfy the recurrence relation by setting $a_k=1$ for all $k$. This scenario has the boundary condition $a_\infty=1$ and is known as gambler’s ruin (the pads are viewed as amounts of money, and each turn we bet and either gain more money or lose money). It’s called “gambler’s ruin” because we always eventually go broke (return to pad 1). One example of gambler’s ruin is the case $p_k=q_k=\frac{1}{2}$; i.e., we flip a fair coin at every pad to see if we move forward or backward.

In cases where $q_k \gt p_k$ (we are likelier to move to a larger pad than to a smaller one), there is a non-zero probability that we will never return to pad 1. This case is characterized by the property that the farther we get from pad 1, the less likely we are to return. In other words, our boundary condition at infinity should be $a_\infty = 0$.

To solve this case, we can use the exact same approach as in the finite case, except instead of summing up to $k=n$ and using $a_n=0$ to find $a_2$, we sum up to $k=\infty$ and use $a_\infty=0$ to find $a_2$. This results in the similar-looking formula:

$\displaystyle
a_m = 1-\frac{\sum_{k=2}^m \prod_{i=2}^{k-1} \frac{p_i}{q_i}}{\sum_{k=2}^{\infty} \prod_{i=2}^{k-1} \frac{p_i}{q_i}},\qquad \text{for }m=1,2,\dots
$

Note: To obtain the set of all possible solutions to the recurrence relation, we need only take affine combinations of the two fundamental solutions (the all-ones gambler’s ruin solution and the solution above). In other words, the general solution is given by:
\[
\begin{bmatrix}
a_1^\text{gen} \\ a_2^\text{gen} \\ \vdots
\end{bmatrix}
=
\alpha
\begin{bmatrix}
1 \\ 1 \\ \vdots
\end{bmatrix}
+ (1-\alpha)
\begin{bmatrix}
a_1 \\ a_2 \\ \vdots
\end{bmatrix}
\]where $a_1,a_2,\dots$ is the solution we derived above. This general solution satisfies $\lim_{k\to\infty}a_k^\text{gen} = \alpha$, so based on how we choose $\alpha\in\mathbb{R}$, we can achieve any desired boundary condition at infinity.

Solving our special case

In our special case of interest, we have $p_i=\frac{1}{i}$ and $q_i=\frac{i-1}{i}$. Substituting into our formula for the finite case, we obtain:
\begin{align*}
a_m &= 1-\frac{\sum_{k=2}^m \prod_{i=2}^{k-1} \frac{1}{i-1}}{\sum_{k=2}^n \prod_{i=2}^{k-1} \frac{1}{i-1}}
= 1-\frac{\sum_{k=2}^m \frac{1}{(k-2)!}}{\sum_{k=2}^n \frac{1}{(k-2)!}}
= 1-\frac{\sum_{k=0}^{m-2} \frac{1}{k!}}{\sum_{k=0}^{n-2} \frac{1}{k!}}
\end{align*}We can also write this as a single fraction:
\[
a_m = \frac{\frac{1}{(m-1)!} + \cdots + \frac{1}{(n-2)!}}{\frac{1}{0!}+\frac{1}{1!}+\frac{1}{2!}+\cdots+\frac{1}{(n-2)!}}\qquad\text{for }m=1,2,\dots,n
\]

Example 1: If we have $n=4$ pads and we start on pad $m=2$, then the probability we eventually end up on pad 1 is given by:
\[
a_2 = \frac{\frac{1}{1!}+\frac{1}{2!}}{\frac{1}{0!}+\frac{1}{1!}+\frac{1}{2!}}
=\frac{3}{5} = 60\%.
\]

Example 2: If we have $n=\infty$, we recognize the sum in the denominator as the infinite series representation for Euler’s number $e$, and the sum in the numerator is one less. Therefore,
\[
a_2 = \frac{\frac{1}{1!}+\frac{1}{2!}+\cdots}{\frac{1}{0!}+\frac{1}{1!}+\frac{1}{2!}+\cdots}
=\frac{e-1}{e} \approx 63.212\%.
\]

If instead of starting on pad 2 we start on a larger-numbered pad, the numerator will continue to lose terms in its sum and gradually degrade to zero the farther we start.
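The value $\frac{e-1}{e}$ for the infinite pond can be sanity-checked by simulation. The sketch below is a plain Monte Carlo estimate, not part of the derivation. It assumes that a frog reaching pad `cap` never comes back, which is a truncation I am introducing (the exact return probability from pad 40 is astronomically small), and the names `reaches_pad_one` and `estimate_return_probability` are hypothetical.

```julia
using Random

# Simulate one frog starting on `start`; return true if it reaches pad 1.
# Assumption: a frog that reaches `cap` is treated as never returning.
function reaches_pad_one(rng::AbstractRNG, start::Int; cap::Int = 40)
    k = start
    while 1 < k < cap
        # hop down with probability 1/k, otherwise hop up
        k += rand(rng) < 1 / k ? -1 : 1
    end
    return k == 1
end

# Fraction of simulated frogs starting on pad 2 that make it to pad 1.
function estimate_return_probability(; trials::Int = 200_000, seed::Int = 2025)
    rng = MersenneTwister(seed)
    return count(_ -> reaches_pad_one(rng, 2), 1:trials) / trials
end

println("simulated: ", estimate_return_probability())
println("exact:     ", 1 - exp(-1))
```

With 200,000 trials the standard error is roughly $0.001$, so agreement to two decimal places is what one should expect.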

2025 puzzle

This week’s Fiddler is about the number 2025, in celebration of (almost) New Year’s!

First puzzle: What is the greatest number of distinct primes that add up to 2025?

Second puzzle: How can you assign a set of 20 distinct prime numbers to the 20 vertices of a dodecahedron, so that the numbers on the five vertices of each face add up to 2025?

My solution:

I solved these problems by modeling them as integer linear programs. There are sophisticated solvers that can be brought to bear on such problems, such as Gurobi. But to do this, the problem must be put in the correct form. Namely,

The decision variables can be real numbers, integers, or binary numbers.
The constraints must be linear equalities or inequalities of the decision variables.
The objective we are trying to minimize or maximize must also be a linear function of the decision variables.

It may not be immediately obvious how to do this for the problems above, since they involve prime numbers… Here is how you do it:

First puzzle

Let $p$ be the list of prime numbers up to 2025. It turns out there are $n=306$ of them. We will treat $p$ as a column vector:
\[
p = \begin{bmatrix}2 \\ 3 \\ 5 \\ \vdots \\ 2017 \end{bmatrix}
\]Our decision variable will be $x \in \{0,1\}^n$, a vector of length $n$ that selects which primes we will use. In other words:
\[
x_k = \begin{cases}1 & \text{if we include prime }p_k \\
0 &\text{otherwise}
\end{cases}
\]The problem we want to solve is simply:
\begin{align*}
\underset{x \in \{0,1\}^n}{\text{maximize}} \qquad & \sum_{k=1}^n x_k \\
\text{subject to:} \qquad & \sum_{k=1}^n p_k x_k = 2025
\end{align*}

It turns out this problem is an example of a Knapsack Problem, which is one of the simplest integer linear programs.

I coded this up in Julia using the Gurobi solver. Here is my code:

using JuMP
using Primes
using Gurobi
using LinearAlgebra

# list of the primes we will use
p = primes(2025)
n = length(p)

# Create a model with Gurobi as the optimizer
m = Model(Gurobi.Optimizer)
set_optimizer_attribute(m, "OutputFlag", 0)

# Define decision variables
@variable(m, x[1:n], Bin)  # x selects which primes to use

# Maximize the number of primes used
@objective(m, Max, sum(x)) 

# Sum of primes equals 2025
@constraint(m, dot(p,x) == 2025)   

# Solve the model
optimize!(m)

# Check the status of the solution
if termination_status(m) == MOI.OPTIMAL
    println("Optimal solution found using ", round(Int, objective_value(m)), " primes")
    println(p[value.(x) .> 0.5])  # threshold guards against values like 0.9999999
else
    println("No optimal solution found. Status: ", termination_status(m))
end

The code executed in 0.17 seconds and found an optimal solution, which uses 32 primes:
\begin{multline*}
2 + 3 + 5 + 7 + 11 + 13 + 17 + 19 + 23 + 29 + 31 + 37 + 41\\
+ 43 + 47 + 53 + 59 + 61 + 67 + 71 + 73 + 79 + 83 + 89\\
+ 97 + 101 + 103 + 107 + 109 + 173 + 181 + 191 = 2025
\end{multline*}
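For readers without a Gurobi license, the same optimum can be cross-checked with a classic 0/1 knapsack dynamic program. This is a different technique from the integer program above, sketched here in plain Julia with a hand-rolled sieve; `primes_upto` and `max_distinct_primes` are hypothetical names of my own.

```julia
# Sieve of Eratosthenes: all primes <= limit.
function primes_upto(limit::Int)
    is_prime = trues(limit)
    is_prime[1] = false
    for i in 2:isqrt(limit)
        if is_prime[i]
            for j in (i * i):i:limit
                is_prime[j] = false
            end
        end
    end
    return findall(is_prime)
end

# best[s + 1] = largest number of distinct primes summing to s (-1 if impossible).
function max_distinct_primes(target::Int)
    best = fill(-1, target + 1)
    best[1] = 0
    for p in primes_upto(target)
        for s in target:-1:p           # descending, so each prime is used at most once
            if best[s - p + 1] >= 0
                best[s + 1] = max(best[s + 1], best[s - p + 1] + 1)
            end
        end
    end
    return best[target + 1]
end

println(max_distinct_primes(2025))
```

The dynamic program only reports the maximum count; recovering the actual set of primes would require storing back-pointers, which I have left out to keep the sketch short.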

Second puzzle

This problem is more complex than the first one because we must not only select which primes to use, but also which vertices to assign them to. The first step is to label the vertices. We will use the following labeling:

Some labels are repeated because they correspond to vertices that are actually the same once the dodecahedron is folded back together.

To solve this problem, we again define $p$ to be a list of primes. This time, it’s not clear how many we will need, so we will let $p$ be the first $N$ primes, and adjust $N$ later as needed.

Next, we should identify which vertices belong to each of the faces. To this effect, we define the binary matrix $F\in \{0,1\}^{12\times 20}$:
\[
F_{ij} = \begin{cases} 1 & \text{if face $i$ uses vertex $j$} \\
0 & \text{otherwise}
\end{cases}
\]Finally, our decision variable is a binary selection matrix $X\in \{0,1\}^{20\times N}$:
\[
X_{jk} = \begin{cases} 1 & \text{if vertex $j$ is assigned to prime $p_k$} \\
0 & \text{otherwise}
\end{cases}
\]The optimization problem we want to solve is:
\begin{align*}
\underset{X \in \{0,1\}^{20\times N}}{\text{minimize}} \qquad & 0 \\
\text{subject to:}\qquad & \sum_{k=1}^N X_{jk} = 1 && \text{for }j=1,\dots,20 \\
& \sum_{j=1}^{20} X_{jk} \leq 1 && \text{for }k=1,\dots,N \\
& \sum_{j=1}^{20} \sum_{k=1}^N F_{ij}X_{jk}p_k = 2025 && \text{for }i=1,\dots,12
\end{align*}Some explanation is warranted:

The objective is to “minimize zero” since we simply want to find any feasible assignment of primes. So the objective we use does not matter.
The first constraint says that each of the $20$ vertices must be assigned to exactly one prime number.
The second constraint says that each of the $N$ primes can be assigned to at most one vertex.
The final constraint says that the primes assigned to the vertices that form each of the 12 faces must sum to $2025$.

Using more compact matrix notation, we can write the optimization problem more succinctly as:
\begin{align*}
\underset{X}{\text{minimize}} \qquad & 0 \\
\text{subject to:}\qquad & X \mathbf{1} = \mathbf{1} \\
& X^\mathsf{T} \mathbf{1} \leq \mathbf{1} \\
& F X p = 2025 \cdot \mathbf{1}
\end{align*}This problem is far more difficult to solve than the first one, since there are many more variables and constraints. As a comparison:

The first problem had $n$ variables and a single constraint. We picked primes up to 2025, which led to $n=306$, but we could have chosen a much smaller $n$.
The second problem has $20N$ variables and $N+32$ constraints. In this case, we need primes up to at least 405 (2025 divided by 5), which leads to $N\geq 80$, but it turns out a much larger $N$ is needed to find a feasible solution. I ended up picking primes up to 1000, which led to $N=168$.

Here is my code:

using JuMP
using Primes
using Gurobi

# list of the primes we will use
p = primes(1000)
N = length(p)

# Create a model with Gurobi as the optimizer
m = Model(Gurobi.Optimizer)
set_optimizer_attribute(m, "OutputFlag", 0)

# Define decision variables
# X[i,j] = 1 if vertex i is assigned to prime j
@variable(m, X[1:20,1:N], Bin)

# Sum of primes in each face equals 2025. 12x20 matrix selecting the faces
F = [ 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
      0 1 1 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0
      0 0 1 1 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0
      0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
      0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0
      0 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 0 0 0 0
      0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 0 0 0 1
      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1
      1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1
      1 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0
      0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 1 1 0 0
      0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 1 1 0 0 0 ]
@constraint(m, X * ones(N) .== 1)     # assign exactly one prime to each vertex
@constraint(m, X' * ones(20) .<= 1)   # use each prime at most once
@constraint(m, F * X * p .== 2025)    # the primes on each face sum to 2025   

# Solve the model
optimize!(m)

# Check the status of the solution
if termination_status(m) == MOI.OPTIMAL
    println("Optimal solution found")
    Xval = round.(Int, value.(X))  # round to guard against solver tolerance
    for i in 1:20
        for j in 1:N
            if Xval[i,j] == 1
                println( "(", i, ",", p[j], ")" )
            end
        end
    end
else
    println("No optimal solution found. Status: ", termination_status(m))
end

In the code above, $N=168$ and a solution was found in about 30 seconds. Interestingly, there is a trade-off with runtime:

picking a smaller $N$ means there are fewer variables and constraints, so the code should run faster.
picking a smaller $N$ also means there are fewer feasible solutions (since we have fewer primes at our disposal), so solutions could take longer to find.

Case in point: picking primes less than $900$ ($N=154$) still finds a solution, but takes several minutes to do so!
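Because a single typo in the face matrix would silently make the model infeasible, it can be worth checking $F$ before handing it to the solver. The sketch below is my own sanity check, not part of the original model: every face should have five vertices, every vertex should lie on three faces, and two distinct faces should share either an edge (two vertices) or nothing. It also records a consequence of the constraints: summing all 12 face equations counts each vertex three times, so the 20 chosen primes must add up to $12\cdot 2025/3 = 8100$.

```julia
using LinearAlgebra

# Face-vertex incidence matrix, copied from the model above.
F = [ 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
      0 1 1 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0
      0 0 1 1 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0
      0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
      0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0
      0 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 0 0 0 0
      0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 0 0 0 1
      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1
      1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1
      1 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0
      0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 1 1 0 0
      0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 1 1 0 0 0 ]

faces_ok    = all(sum(F, dims = 2) .== 5)   # every face has five vertices
vertices_ok = all(sum(F, dims = 1) .== 3)   # every vertex lies on three faces

# Entry (i, j) of F * F' counts the vertices shared by faces i and j.
shared = F * F'
pairs_ok = all(shared[i, j] in (0, 2) for i in 1:12, j in 1:12 if i != j)

# Summing all 12 face constraints counts each vertex three times.
prime_total = 12 * 2025 ÷ 3

println((faces_ok, vertices_ok, pairs_ok, prime_total))
```

The total of 8100 could also be added to the model as a redundant constraint; whether that helps or hurts the solver is something I have not tested.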

Here is one solution found by my code:

Another way to display the solution is via a Schlegel diagram:

Note: This puzzle is similar to a previous Riddler puzzle. In that other puzzle, I used a more logical/number-theoretic approach rather than a purely computational one.

Particles in a box

This week’s Fiddler is an optimization problem about fitting particles in a box.

You have three particles inside a unit square that all repel one another. The energy between each pair of particles is $1/r$, where $r$ is the distance between them. To be clear, the particles can be anywhere inside the square or on its perimeter. The total energy of the system is the sum of the pairwise energies among the particles. What is the minimum energy for $9$ particles, and what arrangement of the particles produces it?

My solution:
[Show Solution]

Halloween Puzzle

This week’s Fiddler is about Halloween candy!

You are presented with a bag of treats, which contains $n \geq 3$ peanut butter cups and some unknown quantity of candy corn kernels (with any amount being equally likely). You reach into the bag $k$ times, with $3 \leq k \leq n$, and pull out a candy at random. Each time, it’s a peanut butter cup! How many candy corn kernels do you expect to be in the bag?

My solution:

Define the following random variables:

$m$: number of candy corn kernels in the bag
$K$: whether we draw $k$ peanut butter cups in a row (true/false)

We are asked to calculate the expected number of corn kernels in the bag given that we draw $k$ peanut butter cups in a row. In other words, we want to find $\mathbb{E}(m\mid K)$.

An easier quantity to calculate is the probability that we draw $k$ peanut butter cups in a row given that there are $m$ candy corn kernels in the bag. This is given by:
\begin{align*}
\mathbb{P}(K\mid m)
&= \frac{n}{n+m}\cdot \frac{n-1}{n+m-1}\cdots\frac{n-k+1}{n+m-k+1} \\
&= \frac{n!(n+m-k)!}{(n-k)!(n+m)!} \\
&= \binom{n}{k}\binom{n+m}{k}^{-1}
\end{align*}Using Bayes’ rule, we can express the desired expectation in terms of the conditional probability above:
\begin{align*}
\mathbb{E}(m\mid K)
&= \sum_{m=0}^\infty m\cdot \mathbb{P}(m\mid K)
= \sum_{m=0}^\infty m\cdot \frac{\mathbb{P}(K \mid m)\mathbb{P}(m)}{\mathbb{P}(K)}
\end{align*} Now substitute: $\mathbb{P}(K) = \sum_{m=0}^\infty \mathbb{P}(K\mid m)\mathbb{P}(m)$ and obtain:
\begin{align*}
\mathbb{E}(m\mid K)
&= \frac{\sum_{m=0}^\infty m\cdot \mathbb{P}(K \mid m)\mathbb{P}(m)}{\sum_{m=0}^\infty \mathbb{P}(K \mid m)\mathbb{P}(m)} \\
&= \frac{\sum_{m=0}^\infty m\cdot \mathbb{P}(K \mid m)}{\sum_{m=0}^\infty \mathbb{P}(K \mid m)}
\end{align*}where in the last step, we used the fact that $\mathbb{P}(m)$ is the same for all $m$ so we canceled it from the numerator and denominator. Substituting our expression for $\mathbb{P}(K\mid m)$, we obtain:
\begin{align*}
\mathbb{E}(m\mid K)
&= \frac{\sum_{m=0}^\infty m\binom{n}{k}\binom{n+m}{k}^{-1}}{\sum_{m=0}^\infty \binom{n}{k}\binom{n+m}{k}^{-1}}
= \frac{\sum_{m=0}^\infty m\binom{n+m}{k}^{-1}}{\sum_{m=0}^\infty \binom{n+m}{k}^{-1}}
\end{align*}
To simplify this, define the following quantity:
\[
f(k,a,b) := \sum_{i=a}^b \binom{i}{k}^{-1}
\]and observe that we have:
\[
\mathbb{E}(m\mid K) = \frac{\sum_{j=1}^\infty f(k,n+j,\infty)}{f(k,n,\infty)}
\]It turns out we can evaluate $f$, and that it has the nice closed-form expression:
\[
f(k,a,b) = \frac{k}{k-1}\left( \binom{a-1}{k-1}^{-1}-\binom{b}{k-1}^{-1}\right)
\]This formula can be proved by induction and is related to something called the German tank problem.
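The closed form for $f$ is easy to test exactly on a grid of small values. One way to see why it holds is the telescoping identity $\binom{i}{k}^{-1} = \frac{k}{k-1}\Bigl(\binom{i-1}{k-1}^{-1}-\binom{i}{k-1}^{-1}\Bigr)$, which I am adding here as a side remark. The Julia sketch below compares both sides with rational arithmetic; `f_direct` and `f_closed` are hypothetical names.

```julia
# Left side: direct sum of reciprocal binomial coefficients.
f_direct(k, a, b) = sum(1 // binomial(big(i), k) for i in a:b)

# Right side: the closed form quoted above.
f_closed(k, a, b) =
    (k // (k - 1)) * (1 // binomial(big(a - 1), k - 1) - 1 // binomial(big(b), k - 1))

# Compare the two on a small grid with 2 <= k <= a <= b.
identity_holds = all(f_direct(k, a, b) == f_closed(k, a, b)
                     for k in 2:6 for a in k:12 for b in a:30)
println(identity_holds)
```

The grid is restricted to $a \geq k \geq 2$ so that every binomial coefficient involved is non-zero.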

With this formula in hand, we can take the limit $b\to\infty$ and see that:
\[
f(k,a,\infty) = \frac{k}{k-1}\binom{a-1}{k-1}^{-1}
\]Substituting into our expression for the desired expectation, we obtain:
\begin{align*}
\mathbb{E}(m\mid K)
&= \frac{\sum_{j=1}^\infty f(k,n+j,\infty)}{f(k,n,\infty)} \\
&= \frac{\sum_{j=1}^\infty \frac{k}{k-1}\binom{n+j-1}{k-1}^{-1}}{\frac{k}{k-1}\binom{n-1}{k-1}^{-1}} \\
&= \binom{n-1}{k-1}\sum_{j=1}^\infty \binom{n+j-1}{k-1}^{-1} \\
&= \binom{n-1}{k-1} f(k-1,n,\infty) \\
&= \binom{n-1}{k-1} \frac{k-1}{k-2} \binom{n-1}{k-2}^{-1} \\
&= \frac{(n-1)!}{(n-k)!(k-1)!} \frac{k-1}{k-2} \frac{(n-k+1)!(k-2)!}{(n-1)!} \\
&= \frac{n-k+1}{k-2}
\end{align*}

Therefore, our final answer is:

$\displaystyle
\mathbb{E}(m\mid K) = \frac{n-k+1}{k-2}
$
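The final answer can be checked numerically by truncating the two infinite sums in the Bayes ratio. The sketch below is only a numerical illustration under stated assumptions: draws are without replacement (as in the product formula for $\mathbb{P}(K\mid m)$), the cutoff `M` is an arbitrary choice of mine, and `likelihood` and `expected_kernels` are hypothetical names.

```julia
# P(K | m): probability of drawing k peanut butter cups in a row (without
# replacement) from a bag with n cups and m candy corn kernels.
likelihood(n, k, m) = prod((n - i) / (n + m - i) for i in 0:(k - 1))

# Posterior mean of m under a flat prior, truncating the sums at m = M.
function expected_kernels(n, k; M = 200_000)
    num = 0.0
    den = 0.0
    for m in 0:M
        w = likelihood(n, k, m)
        num += m * w
        den += w
    end
    return num / den
end

println(expected_kernels(10, 5), "  vs  ", (10 - 5 + 1) / (5 - 2))
```

The terms of the numerator decay like $m^{1-k}$, so the truncated sum converges quickly for $k\geq 4$ but only slowly for $k=3$, where a much larger cutoff would be needed for comparable accuracy.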

I undoubtedly found the most complicated possible solution for this problem… So if somebody can show me a more elegant solution (perhaps a counting argument?) then I would be much obliged!

Round, round, get a round

This week’s Fiddler is about rounding!

Let $\text{round}(x)$ be the value of $x$ rounded to the nearest integer. Suppose $x_1,\dots,x_n$ are independent uniformly distributed random variables in $[0,1]$. Find the probability that
\[
\text{round}(x_1+\cdots+x_n) = \text{round}(x_1)+\cdots+\text{round}(x_n)
\]

My solution:

Let’s call the probability we seek $p(n)$. The values of the $x_i$ determine which sums are possible, so let’s consider cases based on the value of $\text{round}(x_1)+\cdots+\text{round}(x_n)$. Suppose that $k$ of the variables are in the interval $[\tfrac{1}{2},1]$ and the remaining $n-k$ variables are in $[0,\tfrac{1}{2}]$. The probability of this occurring is $\tfrac{1}{2^n}\binom{n}{k}$. In this case,
\[
\sum_{i=1}^n \text{round}(x_i) = k
\]Note: We can ignore the issue of whether $\tfrac{1}{2}$ rounds to $0$ or $1$ since the probability of $\tfrac{1}{2}$ occurring is zero. Relabeling the variables if necessary so that the first $k$ are the ones in $[\tfrac{1}{2},1]$, define a set of rescaled variables $y_i$ in $[0,1]$ as follows:
\[
y_i = \begin{cases}
2x_i-1 & \text{if }i=1,\dots,k\\
2x_i & \text{if }i=k+1,\dots,n
\end{cases}
\]Now let’s calculate the probability that the sum of the $x_i$ also rounds to $k$ and express it in terms of the $y_i$:
\begin{align*}
\mathrm{Prob}\left( \text{round}\biggl(\sum_{i=1}^n x_i\biggr) = k\right)
&= \mathrm{Prob}\left( k-\tfrac{1}{2} \leq \sum_{i=1}^n x_i \leq k+\tfrac{1}{2} \right) \\
&= \mathrm{Prob}\left( k-1 \leq \sum_{i=1}^n y_i \leq k+1 \right)
\end{align*}The sum of the $y_i$, which is a sum of $n$ independent uniform random variables on $[0,1]$, follows the so-called Irwin-Hall distribution, whose CDF is given by:
\[
F(x) = \text{Prob}\bigl( y_1+\cdots+y_n \leq x \bigr) = \frac{1}{n!}\sum_{m=0}^{\lfloor x \rfloor} (-1)^m \binom{n}{m}(x-m)^n
\]Therefore, the probability we seek is $F(k+1)-F(k-1)$, or:
\[
\frac{1}{n!}\sum_{m=0}^{k+1} (-1)^m \binom{n}{m}(k+1-m)^n
-\frac{1}{n!}\sum_{m=0}^{k-1} (-1)^m \binom{n}{m}(k-1-m)^n
\]Rearranging terms a bit, we obtain:
\[
\frac{(-1)^{n+k}}{n!} \binom{n}{k}+
\frac{1}{n!}\sum_{m=0}^{k} (-1)^m \binom{n}{m}\bigl( (k+1-m)^n-(k-1-m)^n\bigr)
\]By writing out the terms carefully, we can group like $n^\text{th}$ powers and we obtain a simpler equivalent expression:
\[
\frac{1}{n!}\sum_{m=0}^k(-1)^m\left[ \binom{n+1}{m}-\binom{n+1}{m-1}\right](k+1-m)^n
\]Remember this was the probability that the two roundings are the same when $k$ of the $x_i$’s round to $1$. Multiplying each such probability by its prior probability and summing, we get the total probability that the two roundings are the same:
\[
p(n) = \frac{1}{2^n\,n!}\sum_{k=0}^n \binom{n}{k}\sum_{m=0}^k(-1)^m\left[ \binom{n+1}{m}-\binom{n+1}{m-1}\right](k+1-m)^n
\]This can be further simplified by grouping $n^\text{th}$ powers once again and observing that all odd terms cancel out. After quite a bit of algebra, we obtain the simplest expression possible, which is:

$\displaystyle
p(n) = \frac{1}{2^n\, n!}\sum _{k=0}^{\left\lfloor \frac{n}{2}\right\rfloor } (-1)^k \binom{n+1}{k} (n+1-2k)^n
$

We can evaluate this for different values of $n$ and we obtain:
\[
\begin{array}{c|cccccccccc}
n & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10\\ \hline
p(n) & 1 & \frac{3}{4} & \frac{2}{3} & \frac{115}{192} & \frac{11}{20} & \frac{5887}{11520} & \frac{151}{315} & \frac{259723}{573440} & \frac{15619}{36288} & \frac{381773117}{928972800}
\end{array}
\]
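Since the simplification from the double sum to the single sum involves "quite a bit of algebra", it is reassuring to confirm that the two expressions agree. The Julia sketch below evaluates both with exact rational arithmetic; `p_simple` and `p_double` are hypothetical names, and the local helper `binom` encodes the convention that a binomial coefficient with a negative lower index is zero.

```julia
# Simplified single-sum formula.
function p_simple(n::Int)
    s = sum((-1)^k * binomial(big(n + 1), k) * big(n + 1 - 2k)^n for k in 0:(n ÷ 2))
    return s // (big(2)^n * factorial(big(n)))
end

# Double-sum formula obtained before the final simplification.
function p_double(n::Int)
    # binomial coefficient with the convention that a negative lower index gives 0
    binom(a, b) = b < 0 ? big(0) : binomial(big(a), b)
    s = big(0)
    for k in 0:n, m in 0:k
        bracket = binom(n + 1, m) - binom(n + 1, m - 1)
        s += binomial(big(n), k) * (-1)^m * bracket * big(k + 1 - m)^n
    end
    return s // (big(2)^n * factorial(big(n)))
end

for n in 1:10
    println(n, "  ", p_simple(n), "  ", p_simple(n) == p_double(n))
end
```

The first column of output should reproduce the fractions in the table, and the last column reports whether the two formulas agree for that $n$.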

The sequence $2^n \, n!\, p(n)$ is actually a well-known sequence. It is A261398 in OEIS (the On-Line Encyclopedia of Integer Sequences), and according to OEIS it has no known closed-form formula and the one above is the simplest one available. Good enough for me!

Visualization in 2D and beyond

We can visualize the probabilities for the case $n=2$ by coloring in the set of points $(x_1,x_2)$ for which the sum of the rounded values equals the rounded value of the sum. This yields:

We can check the shaded region is $\frac{3}{4}$ of the total area, as anticipated. We can do the same for $n=3$. This time, we plot the set of points $(x_1,x_2,x_3)$ and the probability is the shaded volume:

Again, we can check the volume is $\frac{2}{3}$. Unfortunately, we can’t visualize the case $n=4$, but we can try… This should be a 4D hypercube. In the same way that each slice of a 3D cube is a square, each slice of a 4D hypercube is a 3D cube. Here is an animation showing what happens as we vary $x_4 \in [0,1]$ and we plot the cube resulting from the valid choices of $(x_1,x_2,x_3)$ as a function of $x_4$. The abrupt change occurs when $x_4=\tfrac{1}{2}$ as this causes the sum of rounded values to jump by $1$.

So now, if you can imagine, the probability we’re looking for is the “hyper-volume” of this 4D shape, which is given by integrating the volume above as we vary $x_4$, and as it turns out,
\[
p(4) = \int_{0}^1 \text{Volume}(x_4)\,\mathrm{d}x_4 = \frac{115}{192}.
\]

Approximation

Recall that the solution we found above is given by
\[
p(n) = \sum_{k=0}^n \frac{1}{2^n}\binom{n}{k}\bigl( F(k+1)-F(k-1) \bigr)
\]where $F(x)$ is the CDF of the Irwin-Hall distribution. The formula we found for this is quite messy, so let’s see if we can approximate it.

First, the Irwin-Hall distribution can be well-approximated by a normal distribution. This makes sense, since the sum of a large number of independent, identically distributed random variables tends to a normal distribution by the central limit theorem. In this case, the limiting distribution has mean $\mu=\frac{n}{2}$ and variance $\sigma^2=\frac{n}{12}$. Therefore, we can approximate:
\begin{align*}
F(k+1)-F(k-1) &\approx 2F'(k) \\
&\approx \frac{2}{\sqrt{2\pi \sigma^2}} \exp\biggl( -\frac{(k-\mu)^2}{2\sigma^2} \biggr) \\
&= \frac{1}{\sqrt{\pi n/24}}\exp\biggl( -\frac{(k-\tfrac{n}{2})^2}{n/6} \biggr)
\end{align*}Next, the weighted binomial sum is actually a binomial distribution, which we can also approximate using a normal distribution. In general, we have $B(n,p) \approx \mathcal{N}(\mu,\sigma^2)$, with $\mu=np$ and $\sigma^2=np(1-p)$. Therefore, if $g$ is any function, we can write the binomial sum as an expectation:
\begin{align*}
\sum_{k=0}^n \frac{1}{2^n}\binom{n}{k} g(k)
&= \mathbb{E}_{k \sim B(n,\tfrac{1}{2})}\bigl( g(k) \bigr) \\
&\approx \mathbb{E}_{k \sim \mathcal{N}(\tfrac{n}{2},\tfrac{n}{4})}\bigl( g(k) \bigr) \\
&=\frac{1}{\sqrt{\pi n/2}} \int_{-\infty}^\infty \exp\biggl(-\frac{(k-\tfrac{n}{2})^2}{n/2} \biggr)g(k)\,\mathrm{d}k
\end{align*}Setting $g(k)=F(k+1)-F(k-1)$ and substituting our previous approximation, we find:
\begin{align*}
p(n) &\approx \frac{1}{\sqrt{\pi n/2}}\frac{1}{\sqrt{\pi n/24}}\int_{-\infty}^\infty \exp\biggl( -\frac{(k-\tfrac{n}{2})^2}{n/6} \biggr)\exp\biggl(-\frac{(k-\tfrac{n}{2})^2}{n/2} \biggr)\,\mathrm{d}k \\
&= \frac{\sqrt{48}}{\pi n} \int_{-\infty}^\infty \exp\biggl(-\frac{(k-\tfrac{n}{2})^2}{n/8} \biggr) \,\mathrm{d}k \\
&= \frac{\sqrt{48}}{\pi n} \sqrt{\pi n/8}
\end{align*}Therefore, we have:

$\displaystyle
p(n) \approx \sqrt{\frac{6}{\pi n}}
$
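A quick simulation gives a feel for how the approximation behaves at small and moderate $n$. The sketch below is a straightforward Monte Carlo estimate of the rounding probability, printed next to $\sqrt{6/(\pi n)}$. The function names `simulate_p` and `approx_p`, the trial count, and the seed are all choices of mine for illustration.

```julia
using Random

# Fraction of trials in which rounding the sum equals summing the rounded values.
function simulate_p(n::Int; trials::Int = 200_000, seed::Int = 1)
    rng = MersenneTwister(seed)
    hits = 0
    for _ in 1:trials
        x = rand(rng, n)                                   # n uniform samples in [0, 1)
        hits += round(Int, sum(x)) == sum(round.(Int, x))  # Bool adds as 0 or 1
    end
    return hits / trials
end

approx_p(n) = sqrt(6 / (pi * n))

for n in (2, 3, 10, 50)
    println(n, "  simulated ", simulate_p(n), "  approximation ", approx_p(n))
end
```

For $n=2$ and $n=3$ the simulated values should land near the exact answers $\frac{3}{4}$ and $\frac{2}{3}$. The approximation is only expected to become accurate as $n$ grows.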

To check our approximation, we can plot it together with the true values:

Not bad! So in conclusion, we have an exact formula and an asymptotically exact approximation for the probability that
\[
\text{round}(x_1+\cdots+x_n) = \text{round}(x_1)+\cdots+\text{round}(x_n)
\]They are given by:

$\displaystyle
p(n) = \frac{1}{2^n\, n!}\sum _{k=0}^{\left\lfloor \frac{n}{2}\right\rfloor } (-1)^k \binom{n+1}{k} (n+1-2k)^n
\approx \sqrt{\frac{6}{\pi n}}
$

Tiling a Tilted Square

This week’s Fiddler is a challenging counting problem.

Consider the following array of 25 squares:

You are filling the array with rectangles by repeating the following two steps:

Select one of the 12 squares along the outer perimeter that has not yet been selected as part of a rectangle.
Form the largest rectangle you can that includes the square you just selected and other squares that are not yet part of any such rectangle.

You repeat these steps until every square along the perimeter has been selected. Here are two final states you might encounter:

How many distinct final states are possible? (Note: States that are rotations or reflections of each other should be counted as distinct.)

My solution:

We will solve the problem in three steps, where each time we build up from the previous step and add layers of complexity.

Step 1: one quarter

First, consider the original problem, but with one quarter of the shape:

If we ask about tiling this shape using the procedure outlined in the problem statement, then we can write a recursion in the general case. To see why, start by picking one of the edge squares. This divides the shape into two similar but smaller shapes. For example, if there are $n=6$ edge squares and we pick the third one from the top, we obtain:

We are left with the cases $n=2$ and $n=3$, which we can further subdivide. If we call $c_n$ the number of tilings of this shape, we therefore have the recursion:
\[
c_0=1,\quad\text{and}\quad c_{n+1} = \sum_{k=0}^n c_k c_{n-k}\quad\text{for }n\geq 0
\]This is the well-known recurrence relation for the Catalan numbers. The first few Catalan numbers are (starting from $n=0$):
\[
\{c_n\} = \{1, 1, 2, 5, 14, 42, 132, 429, 1430,\dots\}
\]and a general formula is given by:
\[
c_n = \frac{1}{n+1}\binom{2n}{n}
\]

Step 2: one half

Now, consider the original problem, but with one half of the shape:

If we ask about tiling this shape, we can form a recurrence relation like we did in Step 1. This time, when we pick an edge square, we draw one large rectangle and we are left with a half-shape (up top) and identical quarter-shapes (on either side). For example, if there are $n=6$ edge squares and we pick the fourth from the top, we obtain:

If we call $d_n$ the number of tilings of this shape, we therefore have the recursion:
\[
d_0=1,\quad\text{and}\quad
d_{n+1} = \sum_{k=0}^n c_{n-k}^2 d_k\quad\text{for }n\geq 0
\]The first few numbers in this sequence are (starting from $n=0$):
\[
\{d_n\} = \{1, 1, 2, 7, 38, 274, 2350, 22531, 233292,\dots\}
\]

Step 3: the whole thing

Now consider the original problem. You can probably guess the pattern by now… By picking one of the edge squares, we subdivide the problem into four half-shapes in identical pairs (north-south and east-west). If we call $e_n$ the number of tilings of the whole shape, we therefore have the formula:
\[
e_0=1,\qquad\text{and}\quad
e_{n+1} = \sum_{k=0}^n d_{n-k}^2 d_k^2\quad\text{for }n\geq 0
\]The first few numbers in this sequence are (starting from $n=0$):
\[
\{e_n\} = \{1, 1, 2, 9, 106, 3002, 153432, 11209105, 1027079042,\dots\}
\]
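
The three recursions are easy to run side by side. The sketch below is a direct transcription of the formulas for $c_n$, $d_n$, and $e_n$ into Python (the function name `tiling_counts` is my own); it reproduces the sequences listed in the three steps.

```python
from math import comb


def tiling_counts(n_max: int):
    """Return the lists (c, d, e), each indexed by n = 0..n_max."""
    c, d, e = [1], [1], [1]
    for n in range(n_max):
        # Step 1: quarter-shape (Catalan recurrence).
        c.append(sum(c[k] * c[n - k] for k in range(n + 1)))
        # Step 2: half-shape, built from two quarter-shapes and a smaller half-shape.
        d.append(sum(c[n - k] ** 2 * d[k] for k in range(n + 1)))
        # Step 3: whole shape, built from two pairs of half-shapes.
        e.append(sum(d[n - k] ** 2 * d[k] ** 2 for k in range(n + 1)))
    return c, d, e


c, d, e = tiling_counts(8)
print("c:", c)
print("d:", d)
print("e:", e)

# The closed form for the Catalan numbers agrees with the recurrence.
assert all(c[n] == comb(2 * n, n) // (n + 1) for n in range(len(c)))
```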

Unfortunately, none of this is particularly satisfying since we do not have a closed-form solution for $e_n$. Let’s try to find one…

Attempt at a closed-form solution

A good place to start is to see how the closed-form solution for the Catalan numbers is derived. One way is to use generating functions. The idea is that we define an infinite polynomial (a power series) where the coefficients are the sequence we care about. For Catalan numbers, we have:
\[
C(x) = \sum_{n=0}^\infty c_n x^n
\]Now notice that the recurrence relation for Catalan numbers is a convolution, which we can obtain by squaring $C(x)$:
\begin{align*}
C(x)^2 &= \sum_{m=0}^\infty \sum_{k=0}^\infty c_m c_k x^{m+k} \\
&= \sum_{n=0}^\infty \left(\sum_{k=0}^n c_k c_{n-k} \right)x^n \\
&= \sum_{n=0}^\infty c_{n+1} x^n \\
&= \frac{1}{x}\left( C(x)-1\right)
\end{align*}Solving for $C(x)$, we obtain:
\[
C(x) = \frac{1-\sqrt{1-4x}}{2x}
\](The other root of the quadratic can be excluded since it does not satisfy $C(0)=c_0=1$.) From here, we can perform a series expansion via the binomial theorem and extract the coefficient of $x^n$, which yields the formula $c_n = \frac{1}{n+1}\binom{2n}{n}$.

We can use a similar argument to obtain a generating function for $d_n$. To this effect, define:
\[
D(x) = \sum_{n=0}^\infty d_n x^n
\]Now, we can write:
\begin{align*}
D(x) &= 1 + \sum_{n=0}^\infty d_{n+1} x^{n+1} \\
&= 1 + \sum_{n=0}^\infty \sum_{k=0}^n c_{n-k}^2 d_k x^{n+1} \\
&= 1 + \sum_{k=0}^\infty \sum_{n=k}^\infty c_{n-k}^2 d_k x^{n+1} \\
&= 1 + \sum_{k=0}^\infty \sum_{n=0}^\infty c_{n}^2 d_k x^{n+k+1} \\
&= 1 + \sum_{k=0}^\infty d_k x^k \sum_{n=0}^\infty c_n^2 x^{n+1} \\
&= 1 + D(x)\sum_{n=0}^\infty c_n^2 x^{n+1}
\end{align*}Therefore, we can express the generating function for $d_n$ in terms of the generating function of squared Catalan numbers:
\[
\hat C(x) := \sum_{n=0}^\infty c_n^2 x^{n+1}\qquad\text{and}\qquad
D(x) = \frac{1}{1-\hat C(x)}
\]According to Mathematica, the series involving squared Catalan numbers can be evaluated in terms of a hypergeometric function. Namely:
\[
\hat C(x) =
\frac{1}{4} \bigl(\, _2F_1(-\tfrac{1}{2},-\tfrac{1}{2};1;16 x)-1\bigr)
\]Unfortunately, this isn’t particularly helpful as there does not appear to be any way to obtain a formula for the coefficient of $x^n$ in the series for $D(x)$.
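
Both identities can at least be checked term by term with truncated power series. The sketch below is only a consistency check, not a route to a closed form: it compares the Taylor coefficients of the hypergeometric expression with $c_{n-1}^2$, and confirms that $D(x)\bigl(1-\hat C(x)\bigr)=1$ up to the truncation order. The truncation order `N` and the helper name `hyp_coeff` are arbitrary choices of mine.

```python
from fractions import Fraction
from math import comb, factorial

N = 12  # truncation order (arbitrary for this sketch)

catalan = [comb(2 * n, n) // (n + 1) for n in range(N)]

# Coefficients of C_hat(x) = sum_n c_n^2 x^(n+1), indexed by the power of x.
c_hat = [0] + [c * c for c in catalan]


def hyp_coeff(n: int) -> Fraction:
    """Coefficient of x^n in (1/4) * (2F1(-1/2, -1/2; 1; 16x) - 1)."""
    if n == 0:
        return Fraction(0)  # the constant term of 2F1 cancels against the -1
    poch = Fraction(1)  # rising factorial (-1/2)_n
    for j in range(n):
        poch *= Fraction(-1, 2) + j
    return (poch / factorial(n)) ** 2 * Fraction(16 ** n, 4)


# d_n from the Step 2 recurrence.
d = [1]
for n in range(N):
    d.append(sum(catalan[n - k] ** 2 * d[k] for k in range(n + 1)))

# Truncated product D(x) * (1 - C_hat(x)); it should equal 1 + 0x + 0x^2 + ...
one_minus_c_hat = [1] + [-a for a in c_hat[1:]]
product = [
    sum(d[k] * one_minus_c_hat[n - k] for k in range(n + 1))
    for n in range(N + 1)
]

print(all(hyp_coeff(n) == c_hat[n] for n in range(N + 1)))
print(product)
```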

Continuing in this fashion, we can also define $E(x)$ as the generating function for $e_n$, which we can express in terms of the square of the generating function for $d_n^2$. Namely:
\[
E(x) = \sum_{n=0}^\infty e_n x^n,\qquad
\hat D(x) = \sum_{n=0}^\infty d_n^2 x^{n+1},\qquad
E(x) = 1 + \tfrac{1}{x}\hat D(x)^2
\]But again, this doesn’t really seem helpful, as we can’t evaluate $D(x)$ in closed form, much less $\hat D(x)$.

If anybody else can make progress on this problem I would love to hear your approach!

Asymptotics

It was pointed out by commenter MarkS that $c_n^2/d_n$ and $c_n^4/e_n$ appear to tend to finite limits as $n\to\infty$. Here is what we get when we plot these quantities up to $n=2000$:

Is there a way we can find the exact values of these limits? Maybe! We’ll make use of the following fact. If the sequences $a_n$ and $b_n$ have corresponding generating functions $A(x)$ and $B(x)$, and these have radius of convergence $R$, then we can write:
\[
\lim_{n\to\infty}\frac{a_n}{b_n} = \lim_{x\to R^-} \frac{A(x)}{B(x)}
\]This works so long as $A(x)$ and $B(x)$ go to $\infty$ as $x\to R^-$, because any finite truncation of the series will be dominated by its last term (largest power of $x$), so the ratio of truncated series just behaves like the ratio of its last terms, which is what we care about (apologies for the hand-waving; hopefully this makes sense!).

Applying this idea, we would like to write:
\[
\lim_{n\to\infty}\frac{c_n^2}{d_n} = \lim_{x\to R^-}\frac{\hat C(x)}{x D(x)} = \lim_{x\to R^-} \frac{1}{x} \hat C(x)\bigl(1-\hat C(x)\bigr)
\]But there is just one problem… We already established that
\[
\hat C(x) = \sum_{n=0}^\infty c_n^2 x^{n+1} =
\frac{1}{4} \bigl(\, _2F_1(-\tfrac{1}{2},-\tfrac{1}{2};1;16 x)-1\bigr)
\]and as it turns out, the series for this function has a radius of convergence of $R=\frac{1}{16}$. If you’re curious as to why, remember that this is a series where the coefficients are squared Catalan numbers. Catalan numbers have a well-known property that
\[
c_n \sim \frac{4^n}{n^{3/2}\sqrt{\pi}},\qquad\text{i.e.,}\quad
\lim_{n\to\infty}\frac{c_n}{\left(\frac{4^n}{n^{3/2}\sqrt{\pi}}\right)} = 1
\]Therefore, $c_n^2 \sim 16^n/(\pi n^3)$, so a necessary condition for the series $\hat C(x)$ to converge is that $|x|\leq \frac{1}{16}$ (the general term of the series must tend to zero).

So what’s the problem? Here is a 3D plot of $\hat C(z)$ for complex $z$ (vertical axis is magnitude, color-coded by argument) and plotted for $|z|\leq \frac{1}{16}$.

As we can see, nothing is going to infinity near the boundary. How is it possible? It turns out the radius of convergence is $\frac{1}{16}$ because there is a different kind of discontinuity at this point; the argument (rather than the magnitude) becomes discontinuous. You can see it more clearly when the plot is zoomed out to $|z|\leq 1$ (notice the color discontinuity along the positive real axis).

So how do we deal with this problem? We need to transform the series so that it goes to infinity as $x\to\frac{1}{16}$. One way to do this is to take derivatives. This will yield a series with terms like $n a_n x^{n-1}$ and $n b_n x^{n-1}$, and the ratio remains unchanged! Here is a plot of the first derivative, $\hat C'(z)$:

This is better, as now the function appears non-differentiable near $z=\frac{1}{16}$, but it still doesn’t go to infinity. Let’s differentiate again! Here is a plot of $\hat C''(z)$:

Now we’re in business. So the limit we’re looking for is:
\[
\lim_{n\to\infty}\frac{c_n^2}{d_n}
= \lim_{x\to\frac{1}{16}} \frac{\frac{d^2}{dx^2}\Bigl(\frac{1}{x}\hat C(x)\Bigr)}{\frac{d^2}{dx^2}\Bigl(\frac{1}{1-\hat C(x)}\Bigr)}=\left(5-\frac{4}{\pi}\right)^2\approx 13.88874349
\]I used Mathematica in the last step, which can analytically differentiate and evaluate hypergeometric functions. In conclusion, we have the asymptotic formula:

$\displaystyle
d_n \sim \frac{c_n^2}{\left( 5-\tfrac{4}{\pi}\right)^2}
\sim \frac{16^n}{\pi\left( 5-\tfrac{4}{\pi}\right)^2 n^3}
$
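
The limit can also be probed numerically, without any hypergeometric machinery. The sketch below computes $c_n^2/d_n$ exactly from the recurrence (Python integers are unbounded) and prints it next to $\left(5-\tfrac{4}{\pi}\right)^2$. The function name `ratio` and the sample values of $n$ are my own choices, and the convergence is slow, so expect agreement to only a couple of digits at these sizes.

```python
from math import comb, pi


def ratio(n: int) -> float:
    """Return c_n^2 / d_n, with d_n built from the Step 2 recurrence."""
    c_sq = [(comb(2 * m, m) // (m + 1)) ** 2 for m in range(n + 1)]
    d = [1]
    for m in range(n):
        d.append(sum(c_sq[m - k] * d[k] for k in range(m + 1)))
    # Exact big-integer ratio, converted to a float only at the very end.
    return c_sq[n] / d[n]


limit = (5 - 4 / pi) ** 2
for n in (10, 100, 300):
    print(f"n={n:4d}  c_n^2/d_n={ratio(n):.6f}  limit={limit:.6f}")
```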

Now as for $e_n$, I’m stuck again. In principle we could use the same technique, which would require evaluating:
\[
\lim_{n\to\infty}\frac{c_n^4}{e_n}
= \lim_{x\to R^-} \frac{\frac{d^k}{dx^k}\sum_{n=0}^\infty c_n^4 x^n}{\frac{d^k}{dx^k}E(x)}
= \lim_{x\to R^-} \frac{\frac{d^k}{dx^k}\sum_{n=0}^\infty c_n^4 x^n}{\frac{d^k}{dx^k}\left(1+\frac{1}{x}\hat D(x)^2\right)}
\]where we differentiate as many times as necessary to ensure the series diverges as $x\to R^-$. Mathematica is able to evaluate the numerator (it’s hypergeometric functions again, but a different kind this time), so I was able to determine that $R = \frac{1}{256}$ and the smallest $k$ we can use is $5$ (yikes!). But the sticking point is the denominator. Although we found an asymptotic expansion for $d_n$, we don’t have an actual formula. Without this, we can’t evaluate the general term of $\hat D(x)^2$, which depends on all the $d_n$.

Again, if anybody has ideas, let me know in the comments!

When is a triangle like a circle?

This week’s Fiddler is about a generalized notion of “radius”.

For a circle with radius $r$, its area is $\pi r^2$ and its circumference is $2\pi r$. If you take the derivative of the area formula with respect to $r$, you get the circumference formula! Let’s define the term “differential radius.” The differential radius $r$ of a shape with area $A$ and perimeter $P$ (both functions of $r$) has the property that $dA/dr = P$. (Note that $A$ always scales with $r^2$ and $P$ always scales with $r$.)

For example, consider a square with side length $s$. Its differential radius is $r = s/2$. The square’s area is $s^2$, or $4r^2$, and its perimeter is $4s$, or $8r$. Sure enough, $dA/dr = d(4r^2)/dr = 8r = P$. What is the differential radius of an equilateral triangle with side length $s$?

Extra credit:
What is the differential radius of a rectangle with sides of length $a$ and $b$?

My solution:

In general, if $A$ scales with $r^2$ and $P$ scales with $r$ and $P$ is the derivative of $A$, then it must be the case that:
\[
A = \alpha r^2\qquad\text{and}\qquad P = 2\alpha r
\]for some constant of proportionality $\alpha$. If we have a certain shape and we know formulas for its area and perimeter, then we can solve for $\alpha$ and $r$ in terms of $A$ and $P$ and we obtain:
\[
\alpha = \frac{P^2}{4A}\qquad\text{and}\qquad r = \frac{2A}{P}
\]
We can verify this formula for a square of sidelength $s$. Here, $P=4s$ and $A=s^2$. Substituting into the above formulas, we obtain $\alpha = 4$ and $r = \tfrac{s}{2}$, just as the problem stated.

In general, we can solve the case for a regular polygon with $n$ sides of length $s$. The perimeter is $P=ns$ and the area is $A = \frac{ns^2}{4\tan(\pi/n)}$. Therefore, we obtain:
\[
\alpha = n\tan\bigl(\tfrac{\pi}{n}\bigr),\qquad\text{and}\qquad r = \frac{s}{2}\cot\bigl(\tfrac{\pi}{n}\bigr)
\]So, in particular, for $n=3$, we get $r=\frac{\sqrt{3}}{6}s$. For regular polygons, the differential radius is the apothem (the length of the segment joining the center of the polygon to the midpoint of one of its sides). It is also the “inradius” (radius of the inscribed circle). Note that based on the general formula for $r$, we have:
\[
A = \frac{P\cdot r}{2}
\]In the case of the regular polygon, this means the area is one half of the perimeter times the differential radius (apothem). This makes sense, since we can unfold a regular polygon and compute its area using the length of the bases (the perimeter) and the height (apothem). See below for a visual demonstration:

And here is an animated version:

Extra credit

For the case of a rectangle of side lengths $a$ and $b$, the perimeter is $P=2(a+b)$ and the area is $A=ab$. Applying the formula above, we obtain:
\[
\alpha = \frac{(a+b)^2}{ab}\qquad\text{and}\qquad r = \frac{ab}{a+b}
\]The differential radius is half of the harmonic mean of $a$ and $b$. It also satisfies the nice formula:
\[
\frac{1}{r} = \frac{1}{a} + \frac{1}{b}
\]which is incidentally how you add resistors in parallel. We can construct the differential radius geometrically by drawing a line connecting opposite corners, and then another line at 45 degrees to the other corner and marking the intersection. The result is illustrated in the image below:

The triangles on the right are what you get when you “unfold” the rectangle on the left. The area of the rectangle, $ab$, is simply the base of the triangles (perimeter of rectangle), times the height ($r$, or the differential radius), times one half.

This works because triangles AFP and PGC are similar. It follows that:
\begin{align*}
\frac{AF}{FP} &= \frac{AD-DF}{FP} = \frac{b-r}{r},\quad\text{and}\\[2mm]
\frac{PG}{GC} &= \frac{PG}{DC-DG} = \frac{r}{a-r}
\end{align*}Equating them, we obtain:
\[
\frac{r}{a-r}=\frac{b-r}{r}
\]Solving this equation for $r$, we obtain $r = \frac{ab}{a+b}$, as required.
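
The formulas $r = 2A/P$ and $\alpha = P^2/(4A)$ are simple enough to check numerically for the shapes discussed here. The following is a small sketch; the function names and the sample side lengths are illustrative choices of mine.

```python
from math import isclose, pi, sqrt, tan


def differential_radius(area: float, perimeter: float) -> float:
    """r = 2A / P, from A = alpha r^2 and P = 2 alpha r."""
    return 2 * area / perimeter


def alpha(area: float, perimeter: float) -> float:
    """alpha = P^2 / (4A)."""
    return perimeter ** 2 / (4 * area)


def regular_polygon(n: int, s: float):
    """Area and perimeter of a regular n-gon with side length s."""
    return n * s ** 2 / (4 * tan(pi / n)), n * s


# Square with side 2: r = s/2 = 1 and alpha = 4.
print(differential_radius(*regular_polygon(4, 2.0)), alpha(*regular_polygon(4, 2.0)))

# Equilateral triangle with side 1: r = sqrt(3)/6.
print(differential_radius(*regular_polygon(3, 1.0)), sqrt(3) / 6)

# Rectangle with sides a and b: 1/r = 1/a + 1/b.
a, b = 3.0, 6.0
r_rect = differential_radius(a * b, 2 * (a + b))
print(r_rect, isclose(1 / r_rect, 1 / a + 1 / b))
```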

Here is an animation of the rectangle unfolding into four triangles with equal height and whose bases sum to the perimeter of the rectangle!

Tiling squares

This week’s Fiddler is about tiling a square with smaller squares.

Suppose you have infinitely many 3-by-3 cm tiles and infinitely many 5-by-5 cm tiles. You want to use some of these tiles to precisely cover a square whose side length is a whole number of centimeters. Tiles may not overlap, and they must completely cover the larger square, without jutting beyond its borders. What is the smallest side length this larger square can have, such that it can be precisely covered using at least one 3-by-3 tile and at least one 5-by-5 tile?

Extra credit:
This time, you have an infinite supply of square tiles for each odd whole number side length (as measured in centimeters) greater than 1 cm. In other words, you have infinitely many 3-by-3 cm tiles, infinitely many 5-by-5 cm tiles, infinitely many 7-by-7 cm tiles, and so on. You want to use one or more of these tiles to precisely cover a square whose side length is $N$ cm, where $N$ is an integer. Once again, tiles may not overlap, and they must completely cover the larger square without jutting beyond its borders. What is the largest integer N for which this task is not possible?

My solution:

This problem is about tiling squares with smaller squares, which is a famous problem known as “squaring the square”. There is a nice Wikipedia article about it, and also a website I found with a comprehensive database of different tilings. There are certain properties of interest when it comes to squaring the square:

simple tiling: it contains no sub-tilings
perfect tiling: all sub-squares are of different size
symmetric: there are certain rotational or reflection symmetries present in the final shape.

However, none of this is going to help us, since we don’t care about these properties for the purpose of this problem!

In our case, we can repeat tiles (not perfect), we can have sub-tilings (not simple), and we do not enforce any special symmetry. Moreover, we are restricted in the kinds of tiles we can use. To tackle this problem, I formulated the problem as a mixed integer program using a column enumeration approach. This is the same approach I used to solve a similar past Riddler problem from 2017, so if you’re interested in how I did the modeling, be sure to read that previous post!

First problem: minimum size

I re-used the code from the 2017 post linked above and made two changes:

restricted the available tiles to 3×3 and 5×5 only.
added a constraint that requires using at least one of each type of tile.

I looped through possible total sizes starting from 4×4 and stopped once the integer program found a tiling. The smallest size is $N=18$ and one possible tiling is shown below.

The code made short work of this problem, testing all $N\leq 18$ in a couple of seconds.

Second problem: maximum size

The second problem is a bit trickier, since we are asked for the largest square that cannot be tiled. So we’ll need to do a bit of deductive work before we turn to computation. If there exists a tiling of an $N\times N$ square using only odd-sized square tiles, I will say that $N$ is “tileable”. Now some observations:

1. Any odd $N\geq 3$ is tileable (use a single large tile of size $N\times N$).
2. If $M$ is tileable, then $N = k M$ is tileable for any positive integer $k$. We can simply start with the tiling for the $M\times M$ square, and then repeat it in a $k\times k$ pattern to obtain the $N\times N$ tiling.
3. By items 1 and 2, if $N$ is not tileable, it must be a power of $2$.
4. By item 2, if $N = 2^k$ is tileable, so is any larger power of $2$.

In other words, it suffices to find the smallest tileable power of $2$, and all larger $N$ will also be tileable. Using the code again, this time allowing for tiles of all odd sizes, I was able to verify that $2$, $4$, $8$, and $16$ are not tileable, but $32$ is tileable!!! Here is one possible tiling, which uses tiles of size $3$, $5$, $7$, and $11$.

Therefore, we conclude that $N=16$ is the largest integer for which an $N\times N$ square cannot be tiled using smaller squares with odd sidelengths greater than $1$.
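
For readers without an integer programming solver, the notion of “tileable” can be explored on small cases with plain backtracking. To be clear, this is not the mixed integer program described above: it is a swapped-in brute-force sketch (the function `tileable` and its `sizes` parameter are my own), and I would only expect it to be practical for small $N$. It relies on the fact that the first empty cell in row-major order must be the top-left corner of whichever tile covers it.

```python
def tileable(n: int, sizes=None) -> bool:
    """True if an n-by-n square can be tiled by squares with the given side lengths.

    By default the allowed sizes are all odd numbers from 3 up to n.
    """
    if sizes is None:
        sizes = range(3, n + 1, 2)
    sizes = sorted((s for s in sizes if s <= n), reverse=True)
    filled = [[False] * n for _ in range(n)]

    def first_empty():
        for r in range(n):
            for c in range(n):
                if not filled[r][c]:
                    return r, c
        return None

    def fits(r, c, s):
        if r + s > n or c + s > n:
            return False
        return all(
            not filled[i][j] for i in range(r, r + s) for j in range(c, c + s)
        )

    def place(r, c, s, value):
        for i in range(r, r + s):
            for j in range(c, c + s):
                filled[i][j] = value

    def solve():
        cell = first_empty()
        if cell is None:
            return True  # every cell is covered
        r, c = cell
        for s in sizes:
            if fits(r, c, s):
                place(r, c, s, True)
                if solve():
                    return True
                place(r, c, s, False)  # undo and try the next size
        return False

    return solve()


for n in range(2, 13):
    print(n, tileable(n))
```

On these small cases the output matches the observations in the list: odd $N\geq 3$ and multiples of tileable numbers come out tileable, while $2$, $4$, and $8$ do not.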

Computational note: I used the Gurobi solver to solve this problem, and it took about 16 seconds on my laptop. I also tried Mosek, but it was slower, coming in at about 40 seconds.

A contradiction

How is it possible that the smallest tileable $N$ is $18$ but the largest non-tileable $N$ is 16?

These are actually different problems! For the first problem, we were only allowed to use tiles of size 3×3 and 5×5. In this case, 18 is the smallest tileable number, but not all subsequent numbers are tileable! It turns out 19, 22, 23, 26, 28, 29 (and more…) are not tileable. In the second problem, we had more tiles at our disposal, so naturally there must be fewer non-tileable $N$’s in this case.

Showcase Showdown

This week’s Fiddler is based on “Showcase Showdown” on the game show “The Price is Right”.

Suppose we have some number of players. Player A is the first to spin a giant wheel, which spits out a real number chosen randomly and uniformly between 0 and 1. All spins are independent of each other. After spinning, A can either stick with the number they just got or spin the wheel one more time. If they spin again, their assigned number is the sum of the two spins, as long as that sum is less than or equal to 1. If the sum exceeds 1, A is immediately declared a loser.

After A is done spinning (whether once or twice), B steps up to the wheel. Like A, they can choose to spin once or twice. If they spin twice and the sum exceeds 1, they are similarly declared the loser. This continues until all players are done. Whoever has the greater value (that does not exceed 1) is declared the winner.

Assuming all players play the game optimally, what are player A’s chances of winning?

My solution:

We’ll assume there are $n$ players in total. Define the following:
\[
W_k(y) = \left\{ \begin{aligned}
&\text{Probability that the $k^\text{th}$}\\
&\text{player from the end will win}\\
&\text{assuming the highest number}\\
&\text{obtained so far is $y$.}
\end{aligned}
\right\}
\]In other words, $W_1$ is the probability the last player wins, $W_2$ is the probability the second-last player wins, etc. This is the quantity we need to solve for. It will also be helpful to define the probability that the last $k$ players all lose:
\[
L_k(y) = \left\{ \begin{aligned}
&\text{Probability that the last $k$}\\
&\text{players will be declared}\\
&\text{losers assuming the highest}\\
&\text{number obtained so far is $y$.}
\end{aligned}
\right\}
\]Let’s work our way backwards, starting from the last player. Suppose the number to beat is $y$. Since there is only one player left, $W_1(y) + L_1(y) = 1$. Let’s say they obtain $x$ on their first spin. Clearly, if $x > y$, they will stop (and win the game). Otherwise, they will try their chances with a second spin. Say they obtain $z$. They will win if they obtain a sum $x+z$ that is larger than $y$ but not larger than $1$. Therefore,
\begin{align*}
W_1(y) &= \int_y^1\,\mathrm{d}x + \int_0^y \int_{y-x}^{1-x}\, \mathrm{d}z\, \mathrm{d}x = 1-y^2
\end{align*}Consequently, $L_1(y)=y^2$. This makes sense. If $y$ (the number to beat) approaches $1$, then the probability of winning approaches zero because it becomes increasingly difficult to beat it.
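
The formula $W_1(y)=1-y^2$ is easy to check by simulation. Here is a minimal Monte Carlo sketch of the last player's strategy as described above (stop if the first spin beats $y$, otherwise spin again). The function name, the number of trials, and the fixed seed are arbitrary choices of mine.

```python
import random


def simulate_last_player(y: float, trials: int = 200_000, seed: int = 0) -> float:
    """Estimate the last player's win probability when the number to beat is y."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        total = rng.random()
        if total <= y:
            # The first spin does not beat y, so the player must spin again.
            total += rng.random()
        if y < total <= 1:
            wins += 1
    return wins / trials


for y in (0.2, 0.5, 0.8):
    print(f"y={y}: simulated={simulate_last_player(y):.4f}  formula={1 - y * y:.4f}")
```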

Now consider the case with $k$ players remaining. Again, the number to beat is $y$. We will assume a threshold strategy: the player will spin a second time if their first spin is less than $a$, where $a$ is the threshold value. Clearly, if their first spin is less than $y$, they must spin again, otherwise they are guaranteed to lose. So $y \leq a \leq 1$. There are several cases to consider. Suppose the first spin is $x$ and the second spin, if needed, is $z$.

If $y < a < x$, then $W_k(y) = L_{k-1}(x)$.
If $y < x < a$, then if $x+z > 1$, we lose. Else, $W_k(y) = L_{k-1}(x+z)$.
If $x < y < a$, then if $x+z > 1$ or $x+z < y$, we lose. Otherwise, $W_k(y) = L_{k-1}(x+z)$.

Expressing as integrals, the probability of winning with threshold $a$ is:
\begin{align*}
W_k(y,a) &=
\int_a^1 L_{k-1}(x)\,\mathrm{d}x
+ \int_y^a \int_{0}^{1-x} L_{k-1}(x+z)\,\mathrm{d}z\, \mathrm{d}x
+ \int_0^y \int_{y-x}^{1-x} L_{k-1}(x+z)\,\mathrm{d}z\, \mathrm{d}x\\
&= \int_a^1 L_{k-1}(x)\,\mathrm{d}x
+ \int_y^a \int_{x}^{1} L_{k-1}(v)\,\mathrm{d}v\, \mathrm{d}x
+ \int_0^y \int_{y}^{1} L_{k-1}(v)\,\mathrm{d}v\, \mathrm{d}x\\
&= \int_a^1 L_{k-1}(x)\,\mathrm{d}x
+ \int_y^a \int_{x}^{1} L_{k-1}(v)\,\mathrm{d}v\, \mathrm{d}x
+ y \int_{y}^{1} L_{k-1}(v)\,\mathrm{d}v
\end{align*}This player will pick $a \in [y,1]$ to maximize this probability of winning. In other words:
\[
W_k(y) = \underset{a \in [y,1]}{\text{maximize}}\; W_k(y,a)
\]The maximum will occur when the derivative with respect to $a$ is zero, or at $a=y$. Therefore, we have:
\[
a_k^\star(y) = \begin{cases}
\text{soln. of }L_{k-1}(a) = \int_{a}^1 L_{k-1}(v)\,\mathrm{d}v & \text{if it satisfies }a > y \\
y & \text{otherwise}
\end{cases}
\]Next, we can recursively compute $L_k$. In order for all players to lose, the current player must lose, and so must all others. This happens when the first spin is below the threshold, the second spin either pushes the total above $1$ or leaves it below $y$, and all subsequent players lose. In other words,
\begin{align*}
L_k(y) &= \int_y^{a_k^\star(y)}
\int_{1-x}^1 L_{k-1}(y)\,\mathrm{d}z\,\mathrm{d}x
+
\int_0^{y}\left(
\int_0^{y-x} L_{k-1}(y)\,\mathrm{d}z +
\int_{1-x}^1 L_{k-1}(y)\,\mathrm{d}z\right)\mathrm{d}x \\
&= \left( \int_y^{a_k^\star(y)}
\int_{1-x}^1 \mathrm{d}z\,\mathrm{d}x
+
\int_0^{y}\left(
\int_0^{y-x} \mathrm{d}z +
\int_{1-x}^1 \mathrm{d}z\right)\mathrm{d}x\right)L_{k-1}(y) \\
&= \left( \int_y^{a_k^\star(y)}
x\,\mathrm{d}x
+
\int_0^{y} y\, \mathrm{d}x\right)L_{k-1}(y) \\
&= \tfrac{1}{2}\left(a_k^\star(y)^2 + y^2\right) L_{k-1}(y)
\end{align*}The reason this can simplify so much is that the integrands are evaluated at $y$ because when the present player loses, the “best spin” does not change for subsequent players!

Two players

Applying the recursive formulas above, we find:
\begin{align*}
W_1(y) &= 1-y^2 \\
L_1(y) &= y^2 \\
W_2(y,a) &= \tfrac{1}{12}\left(4+4 a-4 a^3-a^4-3 y^4\right)
\end{align*}Things are starting to get messy… It turns out the optimal $a$ is given by:
\[
a_2^\star(y) = \max\left\{y, 2 \cos \left(\tfrac{2 \pi }{9}\right)-1\right\} \approx \max\{y,0.532089\}
\]Finding $W_2(y)$ is even messier, but thankfully, in this case, we can just let $y=0$ because the second last player is also the first player! This gives us a probability of winning of approximately:
\[
W_2(0) = \tfrac{1}{12} \left(9+4 \sin (\tfrac{\pi }{18})+2 \cos (\tfrac{\pi }{9})-8 \cos (\tfrac{2 \pi }{9})\right) \approx 0.453802
\]So the probability that the first (of two) players wins is about 45.38%. Here is a plot of $W_2(0,a)$ that shows how the probability of the first player winning varies as a function of their threshold. The function is maximized at $a^\star_2(0)$.
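
For completeness, here is a sketch of how the closed-form threshold can be recovered by hand from the expression for $W_2(y,a)$. The substitutions $a=t-1$ and $t=2\cos\theta$ are one possible route and not necessarily how the value was originally obtained. Setting the derivative with respect to $a$ to zero gives a cubic:
\begin{align*}
\frac{\partial}{\partial a} W_2(y,a) = \tfrac{1}{3}\left(1-3a^2-a^3\right) = 0
\quad&\Longrightarrow\quad a^3+3a^2-1 = 0 \\
\text{substituting } a = t-1
\quad&\Longrightarrow\quad t^3-3t+1 = 0 \\
\text{substituting } t = 2\cos\theta
\quad&\Longrightarrow\quad 2\cos(3\theta)+1 = 0
\end{align*}The solutions are $\theta\in\{\tfrac{2\pi}{9},\tfrac{4\pi}{9},\tfrac{8\pi}{9}\}$, and only $\theta=\tfrac{2\pi}{9}$ gives a root in $[0,1]$, namely $a = 2\cos\left(\tfrac{2\pi}{9}\right)-1\approx 0.532089$. The same cubic also follows from the general condition $L_1(a)=\int_a^1 L_1(v)\,\mathrm{d}v$, which reads $a^2 = \tfrac{1}{3}(1-a^3)$.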

Three players

Continuing our computations, we have:
\begin{align*}
a_2^\star(y) &\approx \max\{y,0.532089\} \\
L_2(y) &= \tfrac{1}{2}\left( y^2 + a_2^\star(y)^2 \right)y^2 \\
%W_3(y,a) &= -\tfrac{1}{24}\left(a^4+4 a^3-4 a+3 y^4-4\right) a_2^\star(y)^2\\
%&\hspace{3cm}-\tfrac{1}{60}\left(a^6+6 a^5-6 a+5 y^6-6\right)
W_3(y,a) &= \text{(messy)}
\end{align*} At this point, I abandoned any hope of analytic solutions… Optimizing numerically, I obtained:
\begin{align*}
a_3^\star(y) &\approx \max\{y,0.648655\} \\
W_3(0) &\approx 0.305227
\end{align*}So the probability that the first (of three) players wins is about 30.52%.

Similar to the previous section, here is a plot of $W_3(0,a)$ that shows how the probability of the first player winning varies as a function of their threshold. It’s a lot harder to win when there are three players!

More players

In principle, we can keep turning the crank and generating solutions for more players. The nice thing about this problem is that the optimal strategy for a given player does not depend on the total number of players; it only depends on how many players there are after the current player.

Here is a table of the thresholds I was able to calculate:

$k$	$a_k^\star(y)$ (threshold)	$W_k(0)$ (probability of winning)
$1$	$\max\{y,0\}$	$1.000000$
$2$	$\max\{y,0.532089\}$	$0.453802$
$3$	$\max\{y,0.648655\}$	$0.305227$
$4$	$\max\{y,0.711449\}$	$0.231060$
$5$	$\max\{y,0.752249\}$	$0.186229$

Here is how you read the table: Suppose you are the $k^\text{th}$ player from the end, say $k=3$. When your turn comes, let $y$ be the largest value obtained by any player before you. Pick whichever is larger, $y$ or $0.648655$; this becomes your threshold. Take your first spin. If you obtain something larger than the threshold, you’re currently winning, so pass your turn. Otherwise, spin again and hope for the best. The last column shows the probability of this player winning assuming they go first.

As $k$ gets larger, so does the threshold; the more players will spin after you, the more aggressive you have to be with your thresholding strategy. Also, notice that the probability of winning when you go first is a little less than $1/k$; the first player is always at a disadvantage, so we can expect their probability of winning to be below average.
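
The table can be reproduced with a short numerical sketch. Two simplifications below are my own assumptions rather than part of the derivation above. First, since the thresholds increase with $k$, the recursion for $L_k$ gives $L_{k-1}(v)=v^{2k-2}$ for $v$ at or above the previous threshold, so the constant part of $a_k^\star$ solves $a^{2k-1}+(2k-1)\,a^{2k-2}=1$. Second, swapping the order of integration in $W_k(0,a)$ turns the double integral into $\int_0^1 \min(v,a)\,L_{k-1}(v)\,\mathrm{d}v$. All function names are hypothetical.

```python
def threshold(k: int, tol: float = 1e-12) -> float:
    """Constant part of a_k^*: root of a^(2k-1) + (2k-1) a^(2k-2) = 1 in [0, 1].

    Assumes the thresholds increase with k (see the lead-in).
    """
    if k == 1:
        return 0.0
    m = 2 * k - 1
    lo, hi = 0.0, 1.0
    while hi - lo > tol:  # bisection; the left-hand side is increasing in a
        mid = (lo + hi) / 2
        if mid ** m + m * mid ** (m - 1) < 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def lose_prob(j: int, y: float, th) -> float:
    """L_j(y) = prod_{i=1..j} (max(y, a_i)^2 + y^2) / 2, with L_0 = 1."""
    result = 1.0
    for i in range(1, j + 1):
        a = max(y, th[i])
        result *= (a * a + y * y) / 2
    return result


def simpson(f, a: float, b: float, n: int = 20_000) -> float:
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += f(a + i * h) * (4 if i % 2 else 2)
    return s * h / 3


def win_prob_first(k: int) -> float:
    """W_k(0): probability that the first of k players wins."""
    th = [None] + [threshold(i) for i in range(1, k + 1)]  # 1-indexed
    a = th[k]

    def lose(v):
        return lose_prob(k - 1, v, th)

    # int_a^1 L(x) dx  +  int_0^1 min(v, a) L(v) dv  (order of integration swapped)
    return simpson(lose, a, 1.0) + simpson(lambda v: min(v, a) * lose(v), 0.0, 1.0)


for k in range(1, 6):
    print(f"k={k}  threshold={threshold(k):.6f}  W_k(0)={win_prob_first(k):.6f}")
```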
