A Polynomial Growth Puzzle
Correcting an erratum in our quantum algorithms textbook
|
|
Cropped from source |
Paul Bachmann was the first person to use -notation. This was on page 401 of volume 2 of his mammoth four-part text Analytic Number Theory, which was published in Germany in 1894. We are unsure, however, whether he defined it correctly.
Today we admit that we got something wrong about -notation in an exercise in our recent textbook, and we ask: what is the best way to fix it?
Bachmann was simply plugging in Leonhard Euler’s estimate of the harmonic sum in
where . He stated, “we find that
if we use the expression to represent a quantity whose order in regard to
does not overstep the order of
; whether it really has components of order
inside it is left undetermined in the above derivation.”
The big clearly stands for order—Ordnung in German—but where does he define Ordnung? According to the search at the University of Michigan Historical Math Collection page for Bachmann (click the third item, which is mis-labeled), the word Ordnung appears on eleven previous pages of volume 2 in several senses. However, the closest I find to a definition is a place on pages 355—356 where he discusses a formula by Adrien-Marie Legendre:
If one says that a quantity
is of order
when for unboundedly growing
the ratio of
to
for
is unboundedly large, whereas for
it is unboundedly small, then one can show that Legendre’s expression is not precise enough up to quantities of the order
inclusive, rather that among all functions of the form
this is only the function
.
I will not claim that my translation has enhanced the clarity but I don’t think it has diminished it much either—I think it is -of the actual clarity. Edmund Landau, who also introduced little-
and did much to rigorize and popularize both notations, only stated that he found the notation in Bachmann’s book, not the definition.
Erratum Erratum
My dictionary defines erratum as an error in writing or printing, but Wikipedia defines it as a correction of such an error. The latter is clearly meant when one publishes an erratum, while an errata page means to indicate both the errors and the fixes. My subtitle “fixing an erratum…” may seem the former—a tony way of saying “fixing an error”—but it’s not. We need to fix the first erratum on our own errata page.
The original error appears in the exercises to Chapter 2 of our recent textbook, Quantum Algorithms via Linear Algebra. In the book the exercise reads,
Show that a function
is bounded by a polynomial in
, written
, if and only if there is a constant
such that for all sufficiently large
,
.
When I wrote the exercise I had in mind this proof of the direction: Pretend
is a real function with that property and consider the integral of
from
to
. We can rewrite this as the integral from
to
of
. The assumption
bounds the latter by
times the integral of
from
to
. Iterate this
times in all until
is less than the fixed and finite value
underlying the phrase “sufficiently large
.” This gives the following bound on the integral:
where is the maximum value of
on
. Jumping back to
being originally defined on integers avoids any potential nastiness about
, and the bound on the integral clearly applies to
. I figured the
direction was immediate since
with
.
I had intended that time functions are monotone, that is
, as holds automatically if
means the worst-case running time on inputs of length at most
. I received one e-mail showing this false when
could decrease, and thought adding the word “monotone” fixed the problem. However it does not.
The Counterexample
A clever counterexample was sent to us last weekend by Marcelo Arenas and Pedro Bahamondes Walters of the Pontifical Catholic University of Chile in Santiago. They defined a function that stays bounded by but alternates between behaving linearly and suddenly jumping up to meet the parabola again:
We can convey in words the essence of the rigorous proof they provided, technically defining a slightly different function : Consider any point
on the parabola. Make a line go northeast at 45
through the points
for
up to some
. The line ends at the point
; put
. Now continue the line on a steeper slope back up to meet the parabola at the point
The slope of the segment from to
is the ratio
, and it is
This shows that the ratio is nowhere near staying bounded by a constant
, either as
or
is made as large as desired, even though
. So the
direction is wrong.
How to Fix and Keep the Motivation?
The motivation that we want to convey is that a polynomial bound is really a linear bound as the size of the data doubles. Dick recalls Bob Sedgewick emphasizing this point in his teaching. An old post discussed Bob’s idea of determining running times of real polynomial algorithms empirically by sampling cases
where
and fitting
.
We could try to fix things by using in place of
. The bad function
above is excluded because it is not
for any
. Now the forward direction holds because:
However, now we have lost the converse direction, because still allows
to wobble enough that it is not Theta-of any one polynomial. We could try to demand that
be asymptotic to
for some fixed constant
(that is, so that
as
), but then we lose the forward direction again.
Besides, once we try to be more specific about the asymptotics, we lose the simple motivation a-la Sedgewick. What is the simplest way to preserve it?
Restrictions
Dick recalled a post by Terence Tao on Mikhail Gromov’s theorem characterizing finitely-generated groups of polynomial growth as those having nilpotent subgroups of finite index. The growth is of neighborhoods of radius
in the Cayley graphs of the groups with generators
. Tao describes a new proof of Gromov’s theorem by Bruce Kleiner, and for simplicity uses the stronger condition
for some constant
and all
. He writes (emphasis his):
In general, polynomial growth does not obviously imply bounded doubling at all scales, but there is a simple pigeonhole argument that gives bounded doubling on most scales, and this turns out to be enough to run the argument below. But in order not to deal with the (minor) technicalities arising from exceptional scales in which bounded doubling fails, I will assume bounded doubling at all scales.
Unfortunately this insight is specific to the geometry of the Cayley graphs and does not carry over simply to growth rates—functions of the form above can be made to violate bounded doubling for many long intervals. We wonder instead about restricting
to some “natural” class of time functions. Donald Knuth’s seminal 1976 paper on asymptotic notation reminds us of this theorem by Godfrey Hardy:
Theorem 1 If
and
are any functions built up recursively from the ordinary arithmetic operations and the exp and log functions, we have exactly one of the three relations
,
for some
, or
![]()
The equivalence in our exercise seems to hold for Hardy’s class of functions—at least the mechanism of the counterexample is excluded. Is the forward direction now easy to prove?
Even so, Hardy’s class might be felt too restrictive to stand for running times of all natural algorithms. For one reason, it excludes the function, which figures in the analysis of algorithms for numerous problems mentioned here. For another, we can readily imagine that natural algorithms could exhibit the kind of stepwise behavior of the counterexample
above, when exceptional inputs requiring extra attention are sparsely distributed.
Open Problems
Is there a simple and natural extra condition that makes bounded doubling equivalent to polynomial growth?
More generally, how does one express something that is “morally true” but technically false?
[missing constant C before Tao quote, earlier f “could decrease”, Marcelo Arena -> Arenas]



Corrigendum …
Missing the C in |B(2r)| < |B(r)| in the beginning of the section titled "Restrictions"
Ah, thanks!
“I received one e-mail showing this false when {f} has zero or negative values”: isn’t the negative “bad case” discarded in the statement of the exercise, since
takes non-negative values?
Just post this on http://www.mathoverflow.net
How about assuming the function is everywhere convex or concave?
Tough I guess “staircase like” functions are not too uncommon, for example $l \lfloor 2^n \rfloor$.
If you don’t assume that f can be bounded up to a constant, then you can never expect to prove a bound like f(2n) < cf(n). Because if f(n+1) < cf(n) isn't ruled out, then f(2n) < cf(n) cannot be ruled out either of course. So my proposal would be something like:
If there exists a constant c_0 such that f(2n) < c_0f(n) for all (large enough) n, then there exist constants c_1 and k such that f(n) < c_1n^k for all (large enough) n. Furthermore, the reverse implication holds provided we can also lower bound f(n) by c_2n^k for some constant c_2.
Thanks—also to Thomas—adding your condition(s) may fix some things I was trying to do.