The Approaches page discusses three ways to approach this exercise, with very different performance. Adding some variations in coding detail, we will discuss five implementations.

Cubic-time code

We need to find sets of three variables meeting some criteria, so the most naive approach is to scan over the variables in nested loops, with a test for the criteria in the innermost loop.
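
As a minimal sketch, assuming the criterion is the classic Pythagorean condition a * a + b * b == c * c with a < b < c (the exercise's actual test may differ), the cubic approach looks something like this:

    def triplets_cubic(n):
        """Naive O(n**3) scan: three nested loops over a, b and c."""
        results = []
        for a in range(1, n):
            for b in range(a + 1, n):
                for c in range(b + 1, n):
                    # Both conditions are tested in the innermost loop.
                    if a + b + c == n and a * a + b * b == c * c:
                        results.append((a, b, c))
        return results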

This is simple and clear but very slow: the run time increases in proportion to n ** 3. When tested, n = 1_000 took about 8 seconds to complete, and we can extrapolate that n = 100_000 would take nearly 3 months.
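Scaling the measured 8 seconds by the cube of the 100-fold increase in n gives roughly 8 * 100 ** 3 = 8_000_000 seconds, a little over 90 days.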

This is impractical!

Quadratic-time code

For cubic, the loops were nested 3-deep. Not coincidentally, the run time scaled as the third power of n.

As we know that a + b + c == n, it is fairly obvious that we only need to scan through two variables. The third one can be calculated, for example as c = n - a - b.
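
Under the same assumed criterion, a quadratic sketch loops over only a and b, deriving c from the sum constraint:

    def triplets_quadratic(n):
        """O(n**2) scan: loop over a and b, derive c from a + b + c == n."""
        results = []
        for a in range(1, n):
            for b in range(a + 1, n):
                c = n - a - b  # the third value is forced by the sum constraint
                if c > b and a * a + b * b == c * c:
                    results.append((a, b, c))
        return results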

Nesting loops 2-deep means, in this case, that the run time is:

  • Always faster than cubic.
  • Proportional to n ** 2, so this is called quadratic-time.

In practice, this means that an n = 30_000 test ran in a few seconds. However, n = 100_000 would take several minutes per repetition (and we want to average over several runs), so this was excluded from performance testing.

Tightening up the loops' upper and lower bounds can achieve a small speedup. Results are shown below as quad_loose and quad_tight, respectively.
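
One possible tightening, sketched below under the same assumptions (the bounds actually used by quad_tight may differ): because the triplet is ordered, a can be at most n // 3 and b at most (n - a) // 2.

    def triplets_quad_tight(n):
        """Still O(n**2), but with loop bounds derived from a < b < c."""
        results = []
        for a in range(1, n // 3 + 1):                # a < b < c and a + b + c == n
            for b in range(a + 1, (n - a) // 2 + 1):  # so a <= n // 3 and b <= (n - a) // 2
                c = n - a - b
                if a * a + b * b == c * c:
                    results.append((a, b, c))
        return results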

However, the difference is only 3- to 4-fold and scaling is still quadratic, so large-n testing is still impractical.

As a general principle: if run time varies as a * n ** x, this sort of coding trickery only changes the proportionality constant a.

The exponent x is very much more important, and only a better algorithm can change it.

Linear-time code

The approaches discussed above have a strictly programming focus. To take the next step, we need to look at the problem as mathematicians — or as programmers who read what real mathematicians have already published.

The Approaches page discusses two implementations with code that looks very different but uses a similar underlying algorithm. These are shown in the analyses as linear_loop and linear_comp (the latter using generator expressions and list comprehensions). Performance is similarly impressive in both cases.
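
A sketch of one such linear algorithm, again assuming the Pythagorean criterion: substituting c = n - a - b into a * a + b * b == c * c and rearranging gives b = n * (n - 2 * a) / (2 * (n - a)), so a single loop over a suffices, keeping only the values for which b comes out as a whole number larger than a.

    def triplets_linear(n):
        """O(n) scan: for each a, solve the two constraints algebraically for b."""
        results = []
        for a in range(1, n // 3 + 1):
            numerator = n * (n - 2 * a)
            denominator = 2 * (n - a)
            if numerator % denominator == 0:  # b must be an integer
                b = numerator // denominator
                c = n - a - b
                if a < b < c:
                    results.append((a, b, c))
        return results

The linear_comp variant packs the same search into generator expressions and list comprehensions; the underlying algorithm is unchanged.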

Measured timings

The five code implementations were benchmarked, using appropriate values for the upper limit of n and the number of runs to average over, to keep the total testing time reasonable.

Numerical results, as run times in seconds, are tabulated below.

n            12       30       100      300      1_000    3_000    10_000   30_000   100_000
cubic        1.2e-05  1.8e-04  7.1e-03  1.9e-01  7.6e+00
quad_loose   4.7e-06  2.5e-05  2.7e-04  2.3e-03  2.7e-02  2.5e-01  2.8e+00  2.5e+01
quad_tight   1.1e-06  5.2e-06  5.9e-05  5.9e-04  7.3e-03  7.0e-02  7.9e-01  7.3e+00
linear_loop  4.4e-07  5.7e-07  1.1e-06  3.4e-06  1.1e-05  3.1e-05  1.0e-04  3.1e-04  1.2e-03
linear_comp  5.3e-07  1.1e-06  2.8e-06  9.0e-06  2.8e-05  8.3e-05  2.8e-04  8.1e-04  3.1e-03

Note the missing values, which also affect the graphical representation. Also, note the logarithmic y-axis in the graph output when running the Benchmark.py script. These run times vary over more than 7 orders of magnitude!

Testing algorithmic complexity

We have discussed these solutions as cubic, quadratic or linear. Do the experimental data support this?

For a power law relationship, we have a run time t given by t = a * n ** x, where a is a proportionality constant and x is the power.

Taking logs of both sides gives log(t) = x * log(n) + log(a).

Plots of log(t) against log(n) will therefore be straight lines with slope equal to the power x; you can produce them by running the Benchmark.py code.

Encouragingly, these are all straight lines for larger values of n, as we expected.
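
For illustration, such a slope can be recovered with a least-squares fit to the logged values. This standalone snippet applies numpy.polyfit to the large-n portion of the linear_loop row from the table above; it is not code taken from Benchmark.py:

    import numpy as np

    # Large-n timings for linear_loop, copied from the table above (seconds).
    n_values = np.array([1_000, 3_000, 10_000, 30_000, 100_000])
    run_times = np.array([1.1e-05, 3.1e-05, 1.0e-04, 3.1e-04, 1.2e-03])

    # Fit log(t) = x * log(n) + log(a); the first coefficient is the power x.
    slope, intercept = np.polyfit(np.log(n_values), np.log(run_times), 1)
    print(f"fitted power x = {slope:.2f}")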

Linear least-squares fits to each line gave these slope values:

method       slope
cubic        3.01
quad_loose   2.00
quad_tight   2.03
linear_loop  0.90
linear_comp  0.96

Cubic, quadratic and linear algorithms have theoretical slopes of 3, 2, and 1 respectively, so the experimental values are pretty close.

Algorithmic complexity is an analysis of the high-n limit, so including all the points (done for simplicity) is perhaps inappropriate. For the linear-time solutions, there is noticeable curvature at the low end, where some fixed overhead made a non-negligible contribution to run time. Removing these points and fitting only the linear portion would give a slope closer to 1.0.
