Scaling and Fame
Scaling the pandemic is different from scaling the US budget
Sydney Morning Herald interview source
Terence Tao is now “properly” famous. He was cited earlier this month in the NYT science section for help in explaining large numbers. Numbers such as the US federal budget.
Today we discuss caveats on such explanations, after a riff on the popular explanation of mathematics.
Regarding that, let us forget Tao’s work on primes in progressions with Ben Green, forget the Erdős discrepancy problem, and forget his almost-resolution of the Collatz conjecture. Forget it all. Better than a headline, he got his name embedded into the NYT article’s URL. This was for something much less deep that he wrote in 2009—as a blogger.
En passant, we mention that Ken has an event tomorrow (Wed. 6/30) at 3:30 ET. It is a webinar hosted by Marc Rotenberg, who heads the Washington-based Center for AI and Digital Policy (CAIDP), on “Chess and AI: The Role of Transparency.” Registration is free at this link. One aspect of transparency in Ken’s work is that he writes about his model’s methodology here—as a blogger.
Rescaling the Budget
Tao was referenced for a post he wrote in May 2009 when Barack Obama was working on his first budget as President. The ratio of $100 million to $3 that he used scaled the budget income to about $75,000.
With Joe Biden engaged in budget deliberations, Aiyana Green and Steven Strogatz wrote the NYT article on explaining the US federal budget. Green is a student at Cornell: she just completed her junior year in the Department of Policy Analysis and Management.
They updated Tao’s post to scale the income to $100,000. Besides being a round number, this is close to the 2009 figure of $75,000 after adjusting for estimated inflation since then. Here is the NYT graphic of the numbers.
A Scaling Caveat
It is attractive to apply this scaling trick elsewhere, even to grim subjects like the coronavirus pandemic. But there we find an element that does not scale.
Suppose we use the same figure of 100,000 to scale down the world’s population. Besides its famous pandemic numbers pages, Worldometer also keeps a running estimate of the total world population, now nearing 7.9 billion. Scaling down means multiplying every other human number by 0.000012697445274.
We can think of 100,000 people as a city that is not a metropolis. Scaling down the current pandemic figures, we get:
- 182,000,000 total cases become 2,309.
- 11,496,147 active cases become 145.
- 4 million total deaths (the numbers are approaching that millstone as we write) become 50 deaths.
- 80,346 currently listed in critical or serious condition become exactly one.
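The arithmetic behind this list can be sketched in a few lines of Python. This is only an illustration: the names `SCALE`, `world_figures`, and `scale_down` are ours, the factor is the one quoted above, and the inputs are the rounded figures from the list.

```python
# Scale factor quoted in the text: 100,000 divided by Worldometer's
# running world-population estimate at the time of writing.
SCALE = 0.000012697445274

# World pandemic figures as listed above (deaths use the rounded 4 million).
world_figures = {
    "total cases": 182_000_000,
    "active cases": 11_496_147,
    "total deaths": 4_000_000,
    "critical or serious": 80_346,
}


def scale_down(figures, factor=SCALE):
    """Return each figure multiplied by the scale factor."""
    return {name: count * factor for name, count in figures.items()}


city = scale_down(world_figures)
for name, value in city.items():
    print(f"{name}: {value:,.1f}")
```

Because the inputs are rounded, the results can differ by a unit or two from the numbers in the list.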
These numbers are not at all unusual for our size of city if one considers all kinds of illness and mortality. The scaling trick may seem to have reduced the scope of the pandemic, as opposed to statements such as 182 million being over half the US population. Yet in terms of the raw numbers it preserves the proportions.
What the scaling doesn’t preserve is the proportion of relations. The number of possible binary relations (person $x$ knows person $y$) is quadratic in the number $n$ of people. Suppose the number of pairs who know each other is $\epsilon n^2$ where $\epsilon$ is a small but fixed constant. (Note: we will redo this with something more reasonable below.) If we then scale $n$ down to $m$, the situation becomes:
- If we estimate the relatedness of our city, we get $\epsilon m^2$.
- But if we scaled down the number of $x$-knows-$y$ relations directly we would get $\epsilon n^2 \cdot \frac{m}{n} = \epsilon n m$, which is substantially bigger by a factor of $n/m$.
Thus what the scaling really underestimates is the impact of how people are affected by their loved ones being among the 182 million (or the worse numbers). The underestimation logic applies to any form $n^c$ where $c > 1$. This is not an issue with the budget because dollar bills don’t feel relatedness, at least not so much, even absent a line-item veto. Now we will make values of $c$ approaching $2$ more reasonable.
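To get a feel for the size of the gap, here is a rough numerical sketch. It assumes relations grow like a constant times a power of the population. The names `n` (world population), `m` (city size), `eps`, and `c`, and the value of `eps`, are illustrative assumptions of ours, not measured quantities.

```python
def relations(people, eps, c=2.0):
    """Modeled number of who-knows-who relations among `people` persons."""
    return eps * people ** c


def underestimate_factor(n, m, eps=1e-6, c=2.0):
    """Ratio of directly scaled relations to the scaled city's own estimate."""
    city_estimate = relations(m, eps, c)            # eps * m^c
    direct_scaling = relations(n, eps, c) * m / n   # eps * n^c * (m/n)
    return direct_scaling / city_estimate           # equals (n/m)^(c-1)


n = 7_900_000_000  # roughly the world population quoted above
m = 100_000        # our scaled-down city

for c in (1.0, 1.5, 2.0):
    print(c, underestimate_factor(n, m, c=c))
```

With exponent 1 the two estimates agree, which is the budget case. Any exponent above 1 makes the directly scaled count larger, and the constant `eps` cancels out of the ratio.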
Small World: Tao and Strogatz Again
Let’s consider people $x$ and $y$ who have $3$ degrees of separation in the graph of who-knows-who. Then there are $u$ and $v$ who know each other such that $x$ knows $u$ and $v$ knows $y$. If something strikes $u$ and $v$, then $x$ and $y$ will feel a deep connection by that impact. The feeling is amplified if $x$ and $y$ have multiple pairs $u$ and $v$ that make a path. This quantifies the relation that Tao calls “awareness” in his “Lecture Notes 3 FOR 254A” course notes (pages 51–57 overall). A fact way more basic than what Tao is actually talking about in those notes is the following:
The sum over pairs $(x,y)$ of the number of pairs $(u,v)$ between them who would be affected equals the sum over edges $(u,v)$ of the number of pairs $(x,y)$ they can be struck by: both are equal to the number of paths of length $3$ in the graph.
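The double-counting fact can be checked on a toy graph. The sketch below is ours: the five-person graph and the function names are made up for illustration. It counts ordered paths through four distinct people, so each undirected path is counted once in each direction, first by grouping on the endpoints and then by grouping on the middle edge.

```python
from itertools import permutations

# Hypothetical who-knows-who graph: each person maps to the set of people they know.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}


def paths_by_endpoints(adj):
    """Sum over ordered endpoint pairs of the number of middle edges joining them."""
    total = 0
    for first, last in permutations(adj, 2):
        for p in adj[first]:          # first knows p
            for q in adj[last]:       # q knows last
                if q in adj[p] and p != last and q != first:
                    total += 1
    return total


def paths_by_middle_edge(adj):
    """Sum over ordered middle edges of the number of endpoint pairs around them."""
    total = 0
    for p in adj:
        for q in adj[p]:              # p and q know each other
            for first in adj[p]:
                for last in adj[q]:
                    if first != q and last != p and first != last:
                        total += 1
    return total


print(paths_by_endpoints(adj), paths_by_middle_edge(adj))
```

Both functions visit the same set of four-person paths, only grouped differently, which is why the two totals agree on any graph without self-loops.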
That number of paths is what we say is reasonable to model by a function $n^c$ of the number $n$ of people with $c > 1$. This is the simplest way of approaching why we feel that treating the pandemic like the US budget underestimates its human effect. One can still rebut that other kinds of illness and death have the same scaling properties, but ultimately we are talking about the excess caused by the pandemic: the effect on top of everything else.
We have not tried to make this analysis become rigorous using more-realistic models of human networks. Perhaps our readers can point us to such analysis. But one inkling of why we expect our point to be borne out comes from a key conclusion of the famous 1998 paper of Strogatz with Duncan Watts on ‘small-world’ networks: The phase transition from a lattice network with large average distances to a small-world network with small distances takes place in a range where small clusters cannot recognize it happening locally.
Thus, if our scaling carried the intuitive picture of an isolated city, it would miss the expanding sphere of relations. At the opposite extreme would be taking the union of Monaco and central Venice, which sum to 100,000 people who fan out mightily. There is also the argument that while small-world networks are held tight by “weak ties,” the shared knowledge of misery is something that tends most to strengthen ties. And of course, our point about relations extends to many other activities impacted by the pandemic.
Open Problems
What should govern the appropriateness of scaling down?
It should be noted that if we scale down the US to 100,000 people, the numbers are appreciably higher:
- 34.5 million total cases become 10,366.
- 4,928,564 active cases become 1,480.
- 620,000 total deaths become 186 deaths.
- 3,833 currently listed in critical or serious condition still become exactly one.





I often wish there was some snappier phrasing to the observation, “some things aren’t scale invariant”.
I wonder whether the failure-of-scaling argument has also been used in understanding the reach of social media and its mis/informative characteristics.