Published: January 31, 2024

Project Euler

Problem 14 - Longest Collatz Sequence

Problem

This problem comes from Project Euler 14

Problem The following iterative sequence is defined for the set of positive integers:

nn2 (n is even)n \rarr \dfrac{n}{2} \text{ (n is even)} n3n+1 (n is odd)n \rarr 3n+1 \text{ (n is odd)}

Using the rule above and starting with 13, we generate the following sequence:

13402010516842113 \rarr 40 \rarr 20 \rarr 10 \rarr 5 \rarr 16 \rarr 8 \rarr 4 \rarr 2 \rarr 1

It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.

Which starting number, under one million, produces the longest chain?

NOTE: Once the chain starts the terms are allowed to go above one million.

Solution

The solution is pretty straight-forward based on the problem description, but we can make an optimization.

# Project Euler: Problem 14
# Longest Collatz sequence

max_starting_number = 0
max_chain_length = 0

for starting_number in range(1, 1000000):
    chain_length = 0
    n = starting_number

    while n != 1:
        chain_length += 1

        if (n % 2 == 0):
            n //= 2

        else:
            n = 3*n + 1

    if (chain_length > max_chain_length):
        max_starting_number = starting_number
        max_chain_length = chain_length

print(max_starting_number, max_chain_length)

The above code works, but it’s slow. This is because we recalculate past chains in the process. The program works sequentially, starting at 1 and going to 1 000 000. Let’s say we’re currently at 40, so the program divides 40 by 2, which results in 20. But since we’re working sequentially, the chain length for 20 has already been calculated, so why do it again?

The optimization is to store past chain lengths. During the process, if nn drops below the starting number, that means we have already calculated nn , so we can just grab the previous calculation and add it to the current chain length. Then we store the chain length for our starting number.

Using the previous example

  1. Starting number is 40
  2. Divide 40 by 2 resulting in nn = 20. (Chain length has increased by 1)
  3. nn < 40, so grab stored_chain_length[20] = 8 and add it to current chain length
  4. Store chain length for starting number using stored_chain_length.append()

Lastly in the code below, I initialize stored_chain_length = [0, 1]. This is because stored_chain_length[0] doesn’t exist so it’s set to zero, and stored_chain_length[1] is never calculated as n=1n = 1 marks the end of a Collatz sequence. This also allows us to perform stored_chain_length[n] otherwise the index would be off by two.

Code

# Project Euler: Problem 14
# Longest chain chain
# Optimized

max_starting_number = 0
max_chain_length = 0
stored_chain_length = [0, 1]

for starting_number in range(1, 1000000):
    chain_length = 0
    n = starting_number

    while n != 1:
        chain_length += 1

        if (n % 2 == 0):
            n //= 2

        else:
            n = 3*n + 1

        if (n <= starting_number):
            chain_length += stored_chain_length[n]
            stored_chain_length.append(chain_length)
            break

    if (chain_length > max_chain_length):
        max_starting_number = starting_number
        max_chain_length = chain_length

print(max_starting_number, max_chain_length)