> Tao used this weighting technique to prove that almost all Collatz starting values — 99% or more — eventually reach a value that is quite close to 1. This allowed him to draw conclusions along the lines of 99% of starting values greater than 1 quadrillion eventually reach a value below 200.
Isn't this equivalent to saying that "99% of starting values greater than 1 quadrillion eventually reach 1"?
I mean, once you reach a value below 200 you will continue on down to 1. And not just below 200: below any limit up to which the conjecture has been experimentally verified (currently around 10^20).
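As a quick sanity check on that first point, here is a Python sketch (function name and step cap are my own) confirming that every starting value below 200 reaches 1, so "drops below 200" and "reaches 1" really are interchangeable:

    def reaches_one(n, max_steps=10_000):
        # Iterate the Collatz map from n; True if we hit 1 within the budget.
        for _ in range(max_steps):
            if n == 1:
                return True
            n = 3 * n + 1 if n % 2 else n // 2
        return False

    # Every starting value 1..199 reaches 1, so dropping below 200
    # is as good as reaching 1.
    assert all(reaches_one(n) for n in range(1, 200))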
If Colmin(N) is the lowest value reached by iterating the process from initial value N, Tao proved that
Colmin(N) < f(N)
for arbitrary f, provided f tends to infinity, for almost all N. Here, "almost all" means something like "exceptions(N)/N tends to 0 as N grows", where exceptions(N) is the count of values up to N that do not obey the inequality [0]
So it's an asymptotic result. The first million integers could all be exceptions - but eventually the proportion of exceptions dies out.
The ability to pick arbitrary f is very powerful. Pick the slowest-growing function you can think of, e.g. the tenfold-iterated logarithm. The inequality says that, for all but a negligible fraction of integers N, Colmin(N) stays below that function of N.
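To get a feel for just how slowly, here's a quick Python sketch of a k-fold iterated (natural) logarithm, bailing out once the value drops to 1 or below so we never take the log of a non-positive number:

    import math

    def iterated_log(x, k=10):
        # Apply log up to k times, stopping early once x <= 1.
        for _ in range(k):
            if x <= 1:
                break
            x = math.log(x)
        return x

    # Even absurdly large inputs collapse after a handful of logs:
    for exp in (10, 100, 10_000):
        print(f"10^{exp} -> {iterated_log(10 ** exp):.4f}")

Even at 10^10000, the tenfold-iterated log has already collapsed below 1.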
[0] Nitpick: I'm describing the natural density, but Tao needs the logarithmic density, where each exception n is weighted by 1/n.
If non-computable functions are on the table, define f(n) to be the smallest k such that BB-k >= n, where BB-n is the n-th Busy Beaver number: https://en.wikipedia.org/wiki/Busy_beaver
Edit: looks like @tromp managed to reply before I added the bit about BB-n. :-)
Non-computable functions are indeed on the table, as long as they tend to infinity in the limit... at the cost that figuring out when you start to hit that limit is undoubtedly itself non-computable.
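BB itself is non-computable, so you can't actually evaluate that f, but the "inverse of a fast-growing function" idea is easy to illustrate with a computable stand-in. Here's a Python sketch using a power tower in place of BB (names and the choice of stand-in are mine):

    def tower(k, base=2):
        # base^base^...^base, k levels deep: a computable stand-in for BB.
        result = 1
        for _ in range(k):
            result = base ** result
        return result

    def inv_tower(n):
        # Smallest k with tower(k) >= n: an extremely slow-growing f.
        k = 0
        while tower(k) < n:
            k += 1
        return k

    # tower(5) = 2^65536 already dwarfs anything physical, so inv_tower
    # never exceeds 5 for any number you could plausibly write down.
    print([inv_tower(n) for n in (1, 3, 5, 20, 70_000, 10 ** 100)])
    # -> [0, 2, 3, 4, 5, 5]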
Actually it's a fair nitpick because logarithmic density 0 does not imply natural density 0 - so the claim in the article is misleading in more than one way.
For example, the set of numbers with a perfect-square number of digits has log density 0 but no natural density (liminf 0 and limsup 0.9). I believe it holds in the other direction though - natural density 0 implies log density 0.
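To see that example concretely, here's a Python sketch that works block-by-block: d-digit numbers contribute 9·10^(d-1) to the plain count and roughly ln(10) to the 1/n-weighted count, so both densities among numbers with at most D digits can be approximated without enumerating anything:

    import math

    def densities(D):
        # Set: numbers whose digit count is a perfect square.
        squares = {k * k for k in range(1, math.isqrt(D) + 1)}
        in_nat = nat_total = 0      # plain counts
        in_log = log_total = 0.0    # 1/n-weighted, ~ln(10) per digit block
        for d in range(1, D + 1):
            block = 9 * 10 ** (d - 1)
            nat_total += block
            log_total += math.log(10)
            if d in squares:
                in_nat += block
                in_log += math.log(10)
        return in_nat / nat_total, in_log / log_total

    # Natural density oscillates (near 0.9 when D itself is a square,
    # near 0 well before the next square); log density ~ sqrt(D)/D -> 0.
    for D in (100, 120, 10_000):
        nat, logd = densities(D)
        print(f"D={D:>5}: natural ~ {nat:.4f}, logarithmic ~ {logd:.4f}")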
Yes, I believe the key phrase was "along the lines of", and Tao's result doesn't actually prove any particular constant bound, which would indeed be a much stronger result.
I don't like the phrasing the article used. The domain of the Collatz function is the set of positive natural numbers, which is countably infinite. Unless I'm misunderstanding the quote, it doesn't make sense to refer to percentages of infinite quantities.
In the formal sense of the term, "almost always" means that a property applies to all elements of an infinite set, with the exceptions comprising a subset of smaller cardinality than the original set.
I usually see this term applied in the sense of measure theory, where the sets under study are uncountably infinite (or greater). Given that the naturals are countably infinite, the only way I can interpret this statement is that Tao has proven that all but finitely many elements have an orbit terminating in 1 under the Collatz map.
Is that correct? If not, would someone mind clarifying the use of percentages here?
The natural density of a set is the limit as N goes to infinity of the proportion of the numbers up to N which are in the set.
For instance, the natural density of the primes, the squares and the cubes is 0, and the natural density of the even numbers is 1/2. But all these sets are infinite, so they have the same cardinality.
Thank you, but your last sentence is not really correct. The concept of cardinality captures different sizes of infinity. This is why I mentioned countable and uncountable infinities upfront: a countably infinite set and an uncountably infinite set do not have the same cardinality, but both are infinite.
I think your basic mistake here is thinking that sizes of infinite sets have to be measured by cardinality. Yes, that's one way, but usually not the most useful way; it's a very crude measure. Its only real advantage is the fact that it doesn't need any context to work. In most cases, though, more context-dependent measures of size are more appropriate.
For instance, typically, "almost everywhere" doesn't mean "except on a subset of smaller cardinality", it means "on a subset of measure 0".
It's easy to see why you might get these confused. In many of the typical cases of measure theory -- let's say R^n with Lebesgue measure for concreteness -- you're looking at a set of cardinality 2^(aleph_0), and any countable subset will have measure 0. (Indeed, any set of intermediate cardinality will also have measure 0, if such a thing exists, although famously the question of whether it does cannot be resolved from the usual axioms.)
But you can also have subsets of cardinality 2^(aleph_0) which nonetheless have measure 0. E.g., in R, the Cantor set has measure 0, although its cardinality is equal to that of R itself. If a certain statement were true except on the Cantor set, we'd still say it was true "almost everywhere".
(And all this is assuming we're using "almost everywhere" in the measure sense. Sometimes it's instead used to mean "except on a meagre set", which is a different notion; but that's not the usual meaning, so someone using it that way would hopefully say what they mean explicitly.)
In the case of the natural numbers, we frequently measure sizes of subsets by their natural density. The natural density of a subset S of the natural numbers is simply the limit of |S∩{1,...,n}|/n as n goes to infinity (this limit is not guaranteed to exist, of course). So when talking about the natural numbers, if we say something holds "almost everywhere", typically this means it holds except on a set of natural density 0.
(Remember, math terms are heavily overloaded!)
I hear some people may use "almost everywhere", when talking about the natural numbers, to mean "except on a finite set", but I'd consider this use confusing; if that's what you mean, just say that.
Hope that clears things up.
Edit: It turns out that Tao is actually using the logarithmic density here, not the natural density. Oy. The importance of actually reading the paper, I guess...
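To make the natural-density definition above concrete, here's a brute-force Python sketch (names are mine) that computes |S∩{1,...,n}|/n for a couple of sets:

    import math

    def density_up_to(pred, n):
        # |S ∩ {1..n}| / n; its limit (when it exists) is the natural density.
        return sum(1 for k in range(1, n + 1) if pred(k)) / n

    for n in (10 ** 3, 10 ** 5, 10 ** 6):
        evens = density_up_to(lambda k: k % 2 == 0, n)
        sqs = density_up_to(lambda k: math.isqrt(k) ** 2 == k, n)
        print(f"n={n:>7}: evens ~ {evens:.4f}, squares ~ {sqs:.6f}")

The evens sit at exactly 0.5, while the squares' proportion (about 1/sqrt(n)) visibly drains toward 0.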
From Tao’s blog post, it sounds like the result is more in terms of things like: for almost all n < N, the minimum value of n’s Collatz orbit is bounded by extremely slow-growing functions of N (e.g. log log log log N).
Speculation only, but the method of proof might, in its key step, only bound values down to ~200. This is a more meaningful result to present if your reader knows that 200 is as good as 1, because it loses nothing and also highlights something about the bound produced.