A shower thought turned into a beautiful Collatz visualization

I recently went on a nice long SCUBA diving trip with my wife and daughters. Lots of diving implies lots of showers, and lots of showers means lots of shower-thoughts! [1] An especially interesting one I had turned into a nice way to visualize some aspects of the Collatz Conjecture.

The Collatz Conjecture defines a very simple function on all positive integers as follows:

If you can prove that repeated applications of this function, starting from any positive integer, will eventually reach 1 then you have proved the Collatz Conjecture and won a million dollars and plenty of fame and glory! [2]

Individual inputs are trivial to verify. For example, if we start with 6, we get:


But the real work is proving that this happens for all positive integers. So far, no one has been able to do that.

I'm not going to solve the Collatz Conjecture. Somehow I'm turning 50 this year, and my PhD was more than two decades ago (and in a branch of math that's unlikely to impact the conjecture). But I love that it's just there, waiting and inspiring people. I hope to see it solved in my lifetime.

Anyway, my shower-thought was that it would be nice to visualize this repeated application of the Collatz function for many positive integers all at once, rather than just one at a time. To do that, I thought, why not keep track of the sequence of branches taken for each input and then since there are only two branches, why not treat them as binary digits?! [3] We could use those binary digit sequences to make a fraction from each input, by summing 2-nbn for each bit bn in the sequence, making it really easy to graph them and perhaps more easily see what the Collatz process is doing.

After a little more thinking I decided this approach would be a little too naive. The output of the Collatz function at each step seems like it would be biased towards producing even numbers, making the bit-sequences more full of one binary digit than the other, and maybe obscuring any interesting features we could otherwise see. I decided I would fix that by immediately dividing by 2 at the end of the 3n + 1 step (since we know 3n + 1 will be even if n is odd). It turns out this idea is well-known, as the "shortcut" Collatz function.

With that tweak, here's a simple JavaScript implementation of this idea:

function collatzBits(n) { // Generate a sequence of bits recording the branches taken by the Collatz process
    let bits = [];
    while (n !== 1) {
        if (n % 2 === 0) {
            bits.push(1);
            n = n / 2;
        } else {
            bits.push(0);
            n = (3 * n + 1) / 2;
        }
    }
    return bits;
}

function bitsToFraction(bits) { // Convert a sequence of bits to a fraction by summing 2^-i * b_i
    let fraction = 0;
    for (let i = 0; i < bits.length; i++) {
        fraction += bits[i] / Math.pow(2, i + 1);
    }
    return fraction;
}

Here's an interactive version you can play with. Try putting in a few numbers to see how the Collatz process gets encoded into binary fractions.

Choose an input n =
The corresponding sequence of bits, interpreted as a binary fraction, is 0.02
which is 0 as a decimal.

Something fun to think about: How long of a bit-sequence can you construct? Can you make an arbitrarily long one, by choosing the right starting number?


Of course, now that we have a mapping from positive integers to fractions, we can also plot them.

Here's a plot of the Collatz fractions for the first N positive integers. You can adjust N to see more points. Try some big values of N to get a sense of the distribution of the Collatz fractions. N =

The points look quite uniformly distributed to me. If I squint, then maybe I can see some structure, but it's hard to describe and I could be imagining it.

After making this plot I remembered a nice trick I read about as a teenager in James Gleick's fantastic book "Chaos" (I think the idea might have been attributed to Feigenbaum). The trick is that, when you have a sequence of numbers that seem random, you should try treating subsequent pairs of numbers as coordinates in a 2D plot. For our sequence of "Collatz fractions", fn, we would plot the points (fn, fn+1) for n = 1, 2, 3, ... N.

I did that, and was so surprised by the result I first thought I must have made a mistake in my code. But I hadn't, the patterns are real. They look almost like some kind of alien "writing" to me, and there's so much beautiful self-similarity in them. To dig deeper into the structure, I added a way to color points that match simple javascript rules. Here's an interactive version of that, with a few of my colorizing rules. You can add more of your own too!

Here's the plot for the first N integers, where N = If you zoom far enough to see gaps between points, increase N (but the plot will be slower!)

I find this incredibly fun to play with, zooming deep in and adding new color rules and seeing how they change the plot. Depending on what device you're viewing this on, you should be able to pinch or scroll to zoom in and out, and drag it around to see different parts of it.

I had already done a quick literature search to see if anyone else had done something similar, but hadn't found anything so far. I wrote down some ideas I thought might possibly turn into a proof of the self-similarity, but it didn't look trivial to me. So, before really getting to work on that, I decided to try again with the literature search. This time I used ChatGPT's new "Deep Research" feature. It thought about it for a long time, doing a bunch of searches, and eventually replied with a list of papers it thought might be promising. Most of them actually weren't, but one of them was an exact match. This 2019 paper by French mathematician Olivier Rozier contains a plot that looks exactly the same as mine! It was really fun to see it again in a different context, and Rozier does prove some self-similarity results. Rozier's paper also cites a 2007 paper by Yukihiro Hashimoto that has the same plot again. Neither Rozier's nor Hashimoto's plot is constructed the same way as mine, even though they look the same. Both of these papers build the plot starting with 2-adic numbers and only later map those to fractions. I would guess the 2-adic approach probably makes their proofs nicer, but jumping straight to fractions is likely to be easier if you've never seen p-adic numbers before (or if, like me, it's been long enough that you've forgotten most of what you learned about them!).

If you find something interesting in the plot — a nice pattern, a structure, something weird — please let me know! I'd love for this to spark your imagination and maybe even lead to some new discoveries. I think this plot is beautiful, and I hope you enjoy it too.


Thanks to Tatiana Moorier and Emmett Shear for reading drafts of this.


[1] I've been telling people for years if businesses want employees to have better ideas, they should have more showers in their offices. So far everyone seems to think I'm joking. I'm not.

[2] A reasonable question at this point is "Why do mathematicians care about proving the Collatz Conjecture?" One answer is just "because it's there!" But there are more serious answers too. Here's one: After the proof of Fermat's Last Theorem in 1995, the Collatz Conjecture is arguably the best example of a problem that's very easy to state, and so far completely impossible to prove. Often when mathematicians have encountered problems like this in the past, they've had to invent new kinds of "mathematical machinary" to solve them. In high school we learn some basic kinds of machinary -- proof by induction, proof by contradiction, and so on. Later, in college, we might learn Cantor's diagonal argument, or the epsilon-delta approach to limits. It seems likely, since we still haven't solved the Collatz Conjecture, that we will need some new kind of machinary to solve it. I think it's the anticipation of the discovery of that new machinary that keeps mathematicians interested in the Collatz Conjecture.

[3] I've seen other people keep track of the branches and use them to draw some really pretty pictures, e.g. turning left or right at each step depending on which branch was taken. The problem with that is you can't use it to see the branching structure for huge numbers of inputs all at the same time.