I'll take a stab at defining the author's intuition of "doesn’t cheat", in the sense that the algorithm is always "making real progress". I'll define it as: every intermediate value generated in the computation is eventually consumed. If a function, say, computes Ackermann(5,5), throws away the result, and then starts the actual sorting process, then the Ackermann(5,5) is unused and would be considered cheating. In a sense we can draw a data-flow diagram of the computation, and every node has at least one outgoing edge, except perhaps the final output.
Problem is, we can define a function mix(a, b) which intermingles its two arguments and produces a new value dependent on both, for example sha256(a.concat(b)).
Then we can take any randomized algorithm and, for its random values, use mix(real_rng.random(), Ackermann(5, 5)).
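Something like this, as a rough Haskell sketch of the loophole (a toy mixing function stands in for sha256, and the Ackermann arguments are shrunk so the demo actually finishes; all names here are just for illustration):

```haskell
import Data.Bits (xor)
import System.Random (randomIO)  -- from the "random" package

-- Naive Ackermann; ack 5 5 is astronomically expensive. Any provably
-- terminating but enormous computation would play the same role.
ack :: Integer -> Integer -> Integer
ack 0 n = n + 1
ack m 0 = ack (m - 1) 1
ack m n = ack (m - 1) (ack m (n - 1))

-- Toy stand-in for sha256(a ++ b): the output depends on both inputs,
-- so the expensive value counts as "consumed" in the data-flow sense.
mix :: Integer -> Integer -> Integer
mix a b = (a * 2654435761 + 1) `xor` b

-- A "random" value laundered through the expensive computation
-- (ack 3 3 here rather than ack 5 5, so the demo actually returns).
slowRandom :: IO Integer
slowRandom = do
  r <- randomIO
  pure (mix r (ack 3 3))
```

The Ackermann result flows into every random draw and hence into the output, so nothing is "unused" by the data-flow definition, yet the work it represents is pure waste.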
Are there competitions for this? If not, I think there should be: designing the worst possible, but provably terminating, finite, and non-cheating (in the author's sense, which I think is a good definition) versions of algorithms. The proof has to be provided by the author, of course.
There's the mathematical competition to define the largest precisely defined number, i.e. "largest number defined by humans so far, plus one" is not ok, but "largest finite number of cells visited by Turing machines of 7 states on an initially zeroed tape" is ok. It's obviously about finding new tricks rather than rehashing the existing ones.
It seems like the tricks from this competition would be reusable for the slowest algorithm competition.
Iterate once with each of the other worst algorithms, compare the resulting lists, consider the matching moved locations to be sorted, and choose that as the input for the next iteration.
Wouldn't that necessarily be slower than all the other algorithms, also demonstrably be sorting, and, if each of the other algorithms' termination is bounded, have a bounded termination time?
It might not terminate for all lists, in theory; you could have a watchdog to cure that, I think.
Like GCP computing pi this year, they can brag about the immense amounts of hardware they throw into it but still take a year to sort a 5-element list.
When wondering what the point is in trying to find bad sort algorithms, I shed a few tears thinking about a few StackOverflow questions in its early days, where a programming beginner would ask for basic algorithm implementations and a few mischievous answerers would suggest things like bogosort.
Would a Haskell implementation of the second step (recursively call badsort and pick the first) actually exhaust all permutations of permutations before picking the first item, or would this have O(n) (or at least less than O(n!)) complexity due to lazy evaluation?
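For reference, here is roughly the shape of badSort I have in mind (a hedged reconstruction from the discussion, not necessarily the article's exact code):

```haskell
import Data.List (insert, permutations)

-- badSort 0 is an insertion sort built with a fold over `insert`;
-- badSort k sorts the list of all permutations with badSort (k - 1)
-- and takes the first element, i.e. the lexicographically smallest
-- permutation, which is the sorted list.
badSort :: Ord a => Integer -> [a] -> [a]
badSort 0 xs = foldr insert [] xs
badSort k xs = head (badSort (k - 1) (permutations xs))
```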
The foldl in `badSort 0` has to evaluate its entire argument list. As far as I can see, this behavior propagates through to all functions shown in the article since they all call `badSort 0` eventually.
It's foldr, not foldl. That's what makes it consume absurd stack space, creating a stack of thunks of length n-factorial-iterated-k-times (including the nested towers of foldrs inside each element) before it can use any laziness to short-circuit the actual comparisons applied by `insert`.
It creates the whole monolithic expression with all the parentheses in pessimal positions, forcing all the comparisons in the wrong order (back to front, aka least-significant-bits first), which makes it impossible to use the short-circuiting of lexicographical comparisons.
I think that is equivalent to selection sort:
You put all the numbers into the scheduler, which periodically checks all scheduled threads and selects the one whose timer has elapsed, i.e. it sequentially selects the scheduled tasks in order of increasing timer duration.
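Assuming the algorithm in question is sleep sort, a minimal sketch might look like this (illustrative only; assumes small, non-negative Ints):

```haskell
import Control.Concurrent
import Control.Monad (forM_, replicateM_)

-- Each element gets a thread that wakes after a delay proportional to
-- its value, so the scheduler hands the elements back in order of
-- increasing timer duration, effectively doing the selection.
sleepSort :: [Int] -> IO [Int]
sleepSort xs = do
  out  <- newMVar []
  done <- newEmptyMVar
  forM_ xs $ \x -> forkIO $ do
    threadDelay (x * 20000)          -- wake after x * 20 ms
    modifyMVar_ out (pure . (x :))   -- prepend on wake-up
    putMVar done ()
  replicateM_ (length xs) (takeMVar done)
  reverse <$> readMVar out           -- earliest wake-ups end up at the tail
```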
That has an expected running time of N! loops. Far from horrible unless you make the 'is sorted' or 'randomize' calls horrible.
The great thing about this algorithm is that the same piece of code does all these things. There isn't a separate bodge for checking whether something is sorted.
Your expected number of rolls until a 6 doesn't decrease with each roll. The same is true for this algorithm: if you run it and it has an expected runtime of 10 days, you still have an expected 10 days left after 5 days of failure.
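If you want to convince yourself, here's a quick simulation of that memorylessness (illustrative code, nothing from the article):

```haskell
import System.Random (randomRIO)
import Control.Monad (replicateM)

-- Average number of rolls until a 6, and average number of *remaining*
-- rolls given that the first three rolls have already failed: both come
-- out around 6.
rollsUntilSix :: IO Int
rollsUntilSix = go 1
  where
    go n = do
      r <- randomRIO (1 :: Int, 6)
      if r == 6 then pure n else go (n + 1)

main :: IO ()
main = do
  samples <- replicateM 100000 rollsUntilSix
  let mean ys = fromIntegral (sum ys) / fromIntegral (length ys) :: Double
      remainingAfter3 = [n - 3 | n <- samples, n > 3]
  print (mean samples, mean remainingAfter3)
```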
The worstsort in the post has a much worse expected runtime than the sort you described. You could make the bound not guaranteed by randomly deciding whether to throw out the result and start over at the end.
If I roll a die, the chances are 1 in 6, yes, but every throw is another bite of the cherry. If I throw the die 10 times, the chance of at least one of those throws being a 6 is greater than 1 in 6. The same goes for this 'sorting' algorithm.
Yes, you could add a random element to any sorting algorithm to make it slower. Though I suppose you could argue that with my algorithm, the primary purpose of the randomness wasn't to slow it down; without it, the algorithm would totally break!
I'm not sure it even matters, though. My primary point was that a random algorithm can be worse than a merely slow one. With Worstsort, you can be fairly certain that the heat death of the universe will interrupt you. With my algorithm you don't know: it might be instant, it might never finish.
More concretely, do you want a game that runs at a steady 30fps, or a game that swings between 5fps and 30000fps with an average of 60fps?
Gambler's fallacy is me rolling a die 10 times and then expecting the 11th roll to be anything other than a 1 in 6 chance.
That isn't what I'm saying. I'm saying if you roll a die 10 times, the chance of any one of those rolls being 6 is greater than 1 in 6. It is statistically likely if you roll a die 10 times, that you will roll a 6 at least once.
I said "kind of", because there isn't a direct link no.
But my chances still increase. If I roll that die once, I know my chances of rolling a 6 are poor; if I roll it 10 times, the odds are better; a million times, almost certain.
If my chances of success are steadily increasing, why is that not progress? (especially for an algorithm based solely on chance).
Your chances aren't steadily increasing. The chance you finish within a week is higher than the chance you finish within a day, but your chance is no higher after a day than before it.
Consider an alternative algorithm based solely on chance. Pick an element uniformly at random and if it's larger than the element after it, swap them. This algorithm has progress, as the array becomes monotonically more sorted over time, decreasing the expected amount of time until the algorithm is done. This is what it looks like to make progress. Your algorithm does not have that feature.
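A sketch of that alternative, to make the contrast concrete (illustrative only):

```haskell
import System.Random (randomRIO)

-- Pick a random adjacent pair and swap it if it's out of order. Every
-- step either leaves the list as sorted as before or strictly more
-- sorted, so the expected remaining time shrinks as the algorithm runs.
randomSwapSort :: Ord a => [a] -> IO [a]
randomSwapSort xs
  | isSorted xs = pure xs
  | otherwise = do
      i <- randomRIO (0, length xs - 2)   -- index of the left element of the pair
      randomSwapSort (swapIfOutOfOrder i xs)
  where
    isSorted ys = and (zipWith (<=) ys (drop 1 ys))
    swapIfOutOfOrder i ys =
      let (pre, a : b : post) = splitAt i ys
      in if a > b then pre ++ b : a : post else ys
```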
I never said you couldn't eventually win. We're talking about whether you make progress leading up to the win. You could win at any time, but you never get closer to winning. If you had a progress bar based on how long left until you win, it wouldn't move until the moment you won.
For real-time applications, worst case can be much more important than the average case. However, not all computation is real-time. Sometimes the average rate is more important than the worst-case.
If we're talking about practical applications like a video game, I'd still say that an algorithm that won't finish within my lifetime is strictly worse than one that might.
Lol.
If we're talking practical applications, then it doesn't really matter!
Agreed that sometimes the average rate is fine. That just legitimises my disagreement over the 'worst' sort: there is no worst sort.
Ultimately it's just down to semantics. The author stroked his beard and thought about the absolute slowest sort, and called it the worst. I stroked my (metaphorical) beard and found a different kind of worst. Both should be avoided in practical applications where possible.
His is cleverer, yes; mine is stupid. That's probably the most relevant metric, and probably why no one submitted a sort that returns the wrong answer, which I would claim could also be 'worst'.
> Lol. If we're talking practical applications, then it doesn't really matter!
Then why did you bring up fps in video games?
> there is no worst sort.
Of course there is no worst sort. But some sorts are worse than others. The disagreement here is that your sorting algorithm is clearly better than Worstsort.
If all randomized algorithms with unbounded runtime are worse than Worstsort, then flipping a coin until you get heads and then running a fast sort algorithm would be worse than Worstsort. If you instead take a more reasonable approach, like comparing expected runtime, Worstsort is clearly worse than bogosort.
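Concretely, that coin-flip sort would be something like this (sketch only, with Data.List.sort standing in for "a fast sort algorithm"):

```haskell
import Data.List (sort)
import System.Random (randomIO)

-- Keep flipping a fair coin until it comes up heads, then run an
-- ordinary fast sort. The worst case is unbounded, but the expected
-- number of extra flips before sorting is only 2.
coinSort :: Ord a => [a] -> IO [a]
coinSort xs = do
  heads <- randomIO :: IO Bool
  if heads then pure (sort xs) else coinSort xs
```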
I disagree. The scientifically interesting thing is that we can have such exploding complexity, while at the same time every step of the algorithm still actively works towards a solution.