Bogo-bogosort (dangermouse.net)
64 points by luu on Dec 28, 2013 | 32 comments



This seems completely pointless. The elegance of bogosort is that it's an extremely simple algorithm, with a simple description of "randomize until it's sorted". Bogobogosort is complicated for no apparent reason. It's trying to be cute and clever, but there's no rationale for why additional complexity is being added.

Bogobogosort seems in the end to be no more worthwhile than sleepybogosort, where you must sleep() in between every operation.
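
For reference, a minimal sketch of plain bogosort in Python (the helper names are mine); sleepybogosort would just wrap a sleep() around each of these operations:

  #!/usr/bin/python
  # Minimal bogosort sketch: "randomize until it's sorted".
  from random import shuffle

  def is_sorted(a):
    return all(a[i] <= a[i+1] for i in xrange(len(a)-1))

  def bogosort(a):
    while not is_sorted(a):
      shuffle(a)
    return a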


If all you have is a hammer, everything looks like a nail. It's for this reason that computer scientists created bogobogosort (and sleepybogosort, as you mentioned). For example, bogobogosort tests memory cards more efficiently than a smaller-space algorithm like bogosort: its recursive nature acts as a space multiplier and fully fills the memory cards (be they DIMMs or otherwise). Don't be so quick to judge an algorithm just because it's new.

Also, I'm working on a Bogo programming language with randomized compilation; if anyone is interested in contributing, let me know.


I think half the point of this is the straight-faced, detailed examination of exactly how terrible this proposed sort is.

Or at least that's the part that made me start giggling...


No, since the complexity of bogosort with sleep is still O(n!), while the algorithm from the OP has a higher bound.


Depends on how you do the sleeps.


This got me thinking about a more generalized approach to this sort of thing.

Given a task where you can check completion somehow, you can then solve it using the following procedure:

1. Generate a random program. (This could be a Turing machine, bytecode, C source, whatever.)

2. Execute the program on the input for n steps, where n is an incrementing counter.

3. Execute the given check on the output. If the check passes, return the output.

4. Increment n and go back to 1.

The restriction on n steps avoids running afoul of the Halting Problem.

The check has to be a little more rigorous than the one used in Bogosort. It can't just check to see if the output is in order, because the output may contain completely different numbers, since the program could be doing anything. You'd have to check not only for order but also that the output is in fact a permutation of the input.

I wonder what the running time of this algorithm is. I wonder if it can even be calculated.
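
To make that concrete, here is a toy sketch. The "programs" are just random lists of (i, j) swap instructions, which is my own stand-in for "Turing machine, bytecode, C source, whatever", and all the names are made up; the point is only the generate / run-for-n-steps / check loop:

  #!/usr/bin/python
  # Toy generate-and-test loop: random swap "programs", run for at
  # most n steps, then a rigorous check (sorted AND a permutation).
  import random

  def passes_check(output, original):
    # Comparing against sorted(original) covers both conditions.
    return output == sorted(original)

  def random_swap_program(size, length):
    return [(random.randrange(size), random.randrange(size))
            for _ in xrange(length)]

  def generate_and_test(data):
    n = 1
    while True:
      program = random_swap_program(len(data), n)   # 1. random program
      output = list(data)
      for i, j in program[:n]:                      # 2. run for n steps
        output[i], output[j] = output[j], output[i]
      if passes_check(output, data):                # 3. check the output
        return output
      n += 1                                        # 4. increment n and retry

  if __name__ == '__main__':
    a = range(5)
    random.shuffle(a)
    print a, generate_and_test(a)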


(Assuming the programs are of length n, generated using random ASCII characters 0-127, and that the number n starts out at 1, and the next time around is 2, then 3, and so forth.)

We are splitting the programs generated this way into two piles.

The first pile contains all the programs that don't compile, plus all the programs that compile but don't rely on the size of the array we want to sort. Only a very small number of programs compile at all. As n grows larger, the number of programs that compile grows smaller, since the chance of a random-character error grows factorially. As n goes to infinity, the ratio of programs that don't compile to programs that do diverges; in other words, the number of programs that don't compile tends to infinity while the fraction that do compile tends to zero. Therefore we can assume every program in the first pile is a program that doesn't compile.

The second pile contains the programs that compile and actually rely on the size of the array. The complexity of these programs depends on their algorithm; we assume the worst case, where every algorithm is of complexity O(n!).

But errors in the programs from the second pile also grow factorially, so the number of programs in the second pile compared to the first also approaches zero as n goes to infinity.

The complexity of programs in the first pile is constant, since they don't rely on the input (where the input is the size of the array); the constant is the number of steps n.

As n tends to infinity, the number of second-pile programs tends to zero and their complexity becomes negligible. Therefore the total complexity becomes either O(1) or O(inf).

If you choose to stop n at some finite number, then the complexity of the program is O(1).

If you choose never to stop, letting n become infinite, then the algorithm never ends.

Feel free to comment; please refer to specific parts of the solution.

edit: The complexity of a program that fails to compile, or that compiles but doesn't rely on the size of the array, is O(1), since that specific program always executes in the same time; it takes no variable input that would change with the size of the array.


> If you choose n to be a non-infinite number, then the complexity of the program is O(1) If you choose n to be infinite then the complexity is O(inf)

I'm not sure I understand this, or you've understood me. The number n starts out at 1, and the next time around is 2, then 3, and so forth.


My explanation is not easy to understand (which doesn't mean it is very clear (or good)). Your best bet here would probably be to study big-O notation (plus the math of function limits) and think through the solution for yourself (there definitely is one), as it can quickly become confusing as to what you are really measuring.


I understand big-O notation just fine. I asked a specific question about your post, and nothing that indicated I misunderstood the principle in general. You spoke of "choosing n" to be either infinite or non-infinite, which doesn't make sense to me given the algorithm I proposed.


I didn't explain that (it was edited); of course you start with n at 1, 2, 3, and so forth. But you can decide when to stop, and so I made two cases.

Anyway; you gave me a good exercise to practice on. Thanks.


We both came up with nearly the same idea at nearly the same time (within one minute) - what are the odds of that?

My solution is slightly different - to avoid the Halting Problem, you can use a distributed approach. You have a very large (but finite) number of processors that you pass your programs to.


"what are the odds of that?"

I'm not sure, maybe we should build an unbelievably inefficient program to calculate them.


But is it really inefficient if it is efficient at being inefficient?


Why not just pass it to an infinite number of processors and use a nondeterministic Turing machine?

But with an NDTM, even NP-complete problems can be solved within polynomial time.


My favourite is still Intelligent Design Sort:

http://www.dangermouse.net/esoteric/intelligentdesignsort.ht...

O(1) run time!


Genius :)


My favorite line: "as anyone who knows anything at all about computer science knows, recursion is always good and cool."


I snort in fake outrage that nobody has seen fit to reference the definitive work on this topic: http://ivanych.net/doc/PessimalAlgorithmsAndSimplexityAnalys... (Check out section 5 in particular for today's topic.)


Here is a "distributed" sorting algorithm that requires even less thinking - you don't even have to write a shuffling algorithm!

1) Start with an empty string.

2) Increment the string in such a way that will eventually generate all strings.

3) Give the string (and your unsorted array) to a processor that attempts to run the string as a program and go back to step 2.

When a processor is given a program, if the output contains all the elements of A and the output is sorted, stop everything! You've sorted the array.

It requires O(n^m) processors, where n is the number of characters possible in a program, and m is the string length of the code that finds the solution. Lots of the processors will be stuck in infinite loops.
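
A sketch of just the enumeration and acceptance-check parts, in the same style as the snippets below; the alphabet, the names, and the stubbed-out dispatch() are my own assumptions, since actually farming each string out to its own processor (and surviving the infinite loops) is the whole joke:

  #!/usr/bin/python
  # Enumerate '', then all 1-character strings, then all 2-character
  # strings, ... so every possible program text eventually appears.
  import itertools, string

  ALPHABET = string.printable      # stand-in for "all characters"

  def all_strings():
    length = 0
    while True:
      for chars in itertools.product(ALPHABET, repeat=length):
        yield ''.join(chars)
      length += 1

  def is_winner(output, array):
    # Stop everything: the output has all the elements of the array
    # and is sorted.
    return output == sorted(array)

  def dispatch(program_text, array):
    # Hand the string to one of the O(n^m) processors, which tries to
    # run it as a program; many will loop forever.  Stubbed out here.
    pass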


I have a soft spot in my head for turdsort, which attempts to optimize over bogosort, but ends up being no better.

  #!/usr/bin/python
  from random import shuffle

  def turdsort(a):
    def turds(a):
      # Count adjacent out-of-order pairs ("turds").
      n = 0
      for i in xrange(len(a)-1):
        if a[i] > a[i+1]:
          n += 1
      return n

    count = 1
    n = len(a)
    t = turds(a)
    while n > 0:
      # Shuffle until the turd count drops below the current bound,
      # then tighten the bound; a is sorted once it hits zero.
      while t >= n:
        shuffle(a)
        t = turds(a)
        count += 1
      n = t
    return count

  if __name__ == '__main__':
    a = range(10)
    print a
    print turdsort(a)
    shuffle(a)
    print a
    print turdsort(a)


There is also the slightly less pointless ordersort, which works only on permutations. The complexity of ordersort is given by the Landau function. It is more efficient than bogosort, however.

  #!/usr/bin/python

  def gcd(a,b):
    while b:
      a, b = b, a % b
    return a

  def order(a):
    """compute the order of the permutation a (lcm of its cycle lengths)"""
    lcm = 1
    for i in range(len(a)):
      j = a[i]
      while j > i:
        j = a[j]
      if j == i:        # i is a cycle leader
        j = a[j]        # get next element of cycle
        cyc = 1         # the cycle has length at least 1
        while j != i:   # the cycle hasn't closed
          cyc += 1
          j = a[j]
        lcm = (lcm/gcd(lcm,cyc))*cyc
    return lcm

  def ordersort(a):
    ord = order(a)
    print ord
    b = range(len(a))
    while ord > 0:
      b = map((lambda i: a[i]), b)
      ord -= 1
    return b

  if __name__ == '__main__':
    from random import shuffle
    a = range(50);
    shuffle(a)
    print a, ordersort(a)


My addition to the algorithm: every time you check whether the array is sorted and it isn't, start from the very beginning, throwing away all progress made so far.

I doubt that, with this change, even n == 6 would finish in any normal time period.


Bogosort does that already.

I ran it on the array [0,1,2,3,4,5,6] and it took between 300 million and 1 billion operations (by my marginally accurate counter) and between 6 and 20 minutes, but mine was in Python, not C.


Welcome to HN.

I was referring to the OP's algorithm, which was bogo-bogosort.

At first I thought my change would not increase the complexity, but I think it does depend on the size of the input: at every recursion there is a chance to fail and start completely over, and the number of recursions depends on the input size. Therefore it adds to the complexity.
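
Roughly what I mean, as a sketch: the recursive "is it sorted?" check below is only a loose paraphrase of the linked bogo-bogosort description (so treat its details as approximate); the change is that any failed check raises an exception that unwinds every recursion level and restarts from a fresh shuffle:

  #!/usr/bin/python
  # Sketch: a recursive check where any failure throws away all
  # progress and restarts from the very beginning.
  from random import shuffle

  class StartOver(Exception):
    """A failed check unwinds every recursion level."""

  def check_sorted(a):
    # Recursively check the first n-1 elements, then require the last
    # element to top them; on failure, restart everything instead of
    # reshuffling locally (which is where this differs from the original).
    if len(a) > 1:
      check_sorted(a[:-1])
      if a[-1] < max(a[:-1]):
        raise StartOver()

  def restarting_sort(a):
    attempts = 0
    while True:
      attempts += 1
      try:
        check_sorted(a)
        return attempts
      except StartOver:
        shuffle(a)       # throw away all progress made so far

  if __name__ == '__main__':
    a = range(6)
    shuffle(a)
    print restarting_sort(a), a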


You say it will take on average n! attempts to find a sorted list randomly. This is false. It will take on average n!/2 attempts.


O(n!/2) == O(n!); the whole paragraph is obviously referring to big-O notation.

edit: (I have edited the post to respond to your comment, trying to clarify what the OP meant; I in no way tried to make your comment "look silly".)


Yes, but that's not what he says. He says "The loop will repeat on average n! times". That's not true. He's not referring to O notation there; he brings that in later. Here he's talking about the actual number of times a loop will run on average.

Edit: you've edited your paragraph to make my reply look silly. The whole paragraph is not in O notation. In fact - the exact opposite. In the last sentence of that paragraph he shows what the expression with constant factors looks like before converting it into O notation, and it's wrong! "The product (n-1)n! is O(n × n!)." Bzzzzzzt! Wrong! He should say "The product (n-1)(n!/2) is O(n × n!)."


True, but that's still O(n!), so the overall complexity doesn't change.


The mean would certainly be n! (note that it is possible, and even likely, that you will generate the same ordering multiple times by random chance).

The median, which I don't think you're referring to, is something else (I imagine it would be substantially less than the mean).

(See http://en.wikipedia.org/wiki/Geometric_distribution)
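
A throwaway Monte Carlo (names mine) makes the distinction visible: for n = 5 the sample mean of "shuffles until sorted" should land near 5! = 120, while the sample median should land near 120 * ln 2, i.e. around 83.

  #!/usr/bin/python
  # Estimate the mean and median number of shuffles needed to sort a
  # 5-element list at random (the attempt count is geometrically
  # distributed with success probability 1/5!).
  from random import shuffle

  def attempts_until_sorted(n):
    a = range(n)
    target = range(n)
    count = 0
    while True:
      shuffle(a)
      count += 1
      if a == target:
        return count

  if __name__ == '__main__':
    trials = sorted(attempts_until_sorted(5) for _ in xrange(2000))
    print 'mean:  ', sum(trials) / float(len(trials))   # near 120
    print 'median:', trials[len(trials) // 2]           # near 83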


Isn't this how evolution works?


No.

Evolution doesn't involve resets - the hint is in the name.



