This seems completely pointless. The elegance of bogosort is that it's an extremely simple algorithm, with a simple description of "randomize until it's sorted". Bogobogosort is complicated for no apparent reason. It's trying to be cute and clever, but there's no rationale for why additional complexity is being added.
Bogobogosort seems in the end to be no more worthwhile than sleepybogosort, where you must sleep() in between every operation.
If all you have is a hammer, everything looks like a nail. It's for this reason that computer scientists created bogobogosort (and sleepybogosort, as you mentioned). For example, bogobogosort tests memory cards more efficiently than a smaller-space algorithm like bogosort. In this case, the recursive nature of bogobogosort acts as space multiplier and fully fills the memory cards (be it DIMM or others). Don't be so quick to judge an algorithm just because it's new.
Also, I'm working on a Bogo programming language with randomized compilation; if anyone is interested in contributing, let me know.
This got me thinking about a more generalized approach to this sort of thing.
Given a task where you can check completion somehow, you can then solve it using the following procedure:
1. Generate a random program. (This could be a Turing machine, bytecode, C source, whatever.)
2. Execute the program on the input for n steps, where n is an incrementing counter.
3. Execute the given check on the output. If the check passes, return the output.
4. Increment n and go back to 1.
The restriction on n steps avoids running afoul of the Halting Problem.
The check has to be a little more rigorous than the one used in Bogosort. It can't just check to see if the output is in order, because the output may contain completely different numbers, since the program could be doing anything. You'd have to check not only for order but also that the output is in fact a permutation of the input.
I wonder what the running time of this algorithm is. I wonder if it can even be calculated.
(Assuming the programs are of length n generated using random ascii characters 0-127, and the number n starts out at 1, and the next time around is 2, then 3, and so forth.)
We are splitting the programs generated this way into two piles.
First pile are all the programs that don't compile and all the programs that compile and don't rely on size of the array that we want to sort. A very small number of programs compile . As n grows larger the number of programs that compile grows smaller as the chance of a random character error grows larger factorially. As n goes to infinity the ratio of programs that don't compile to programs that do, becomes divergent; in other words: the number of programs that don't compile tends to infinity and programs that do compile tends to zero. Therefore we can assume every program in first pile is a program that doesn't compile.
The second pile are the programs that compile and actually rely on the size of the array. The complexity of these programs is dependent on their algorithm. We are assuming the worst case where every algorithm is of complexity O(n!).
But errors in the programs from second pile also grows factorially. So the number of programs in second pile compared to first also approaches zero as n goes to infinity.
Complexity of programs in the first pile, since they don't rely on the input( where input is the size of the array ), is constant. Where the constant is the number n( steps ).
As n tends to infinity the number of the seconds programs tends to zero and their complexity becomes negligible. Therefore the total complexity becomes either O(1) or O(inf).
If you choose the n to stop at a finite number, then the complexity of the program is O(1)
If you choose n never stop, to become infinite then the algorithm never ends.
Feel free to comment, please refer to specific parts of the solution.
edit: Complexity of programs that fails to compile, or compile and don't rely on the size of the array is O(1), since that specific program always executes in the same time as it takes no variable input that would change regarding to size of the array.
My explanation is not to be understood easily( that doesn't mean it is very clear( or good ) ). Your best bet here would probably be to study big-o notation(+math, function limits) and think of the solution for yourself( there definitely is one ), as it can become quickly confusing as to what are you measuring really.
I understand big-O notation just fine. I asked a specific question about your post, and nothing that indicated I misunderstood the principle in general. You spoke of "choosing n" to be either infinite or non-infinite, which doesn't make sense to me given the algorithm I proposed.
We both came up with nearly the same idea at nearly the same time (within one minute) - what are the odds of that?
My solution is slightly different - to avoid the Halting Problem, you can use a distributed approach. You have a very large (but finite) number of processors that you pass your programs to.
Here is a "distributed" sorting algorithm that requires even less thinking - you don't even have to write a shuffling algorithm!
1) Start with an empty string.
2) Increment the string in such a way that will eventually generate all strings.
3) Give the string (and your unsorted array) to a processor that attempts to run the string as a program and go back to step 2.
When a processor is given a program, if the output contains all the elements of A and the output is sorted, stop everything! You've sorted the array.
It requires O(n^m) processors, where n is the number of characters possible in a program, and m is the string length of the code that finds the solution. Lots of the processors will be stuck in infinite loops.
I have a soft spot in my head for turdsort, which attempts to optimize over bogosort, but ends up being no better.
#!/usr/bin/python
from random import shuffle
def turdsort(a):
def turds(a):
n = 0
for i in xrange(len(a)-1):
if a[i] > a[i+1]:
n += 1
return n
count = 1
n = len(a)
t = turds(a)
while n > 0:
while t >= n:
shuffle(a)
t = turds(a)
count += 1
n = t
return count
if __name__ == '__main__':
a = range(10)
print a
print turdsort(a)
shuffle(a)
print a
print turdsort(a)
There is also the slightly less pointless ordersort, which works only on permutations. The complexity of ordersort is given by the Landau function. It is more efficient than bogosort, however.
#!/usr/bin/python
def gcd(a,b):
while b:
a, b = b, a % b
return a
def order(a):
"""compute the order of the permutation a"""
lcm = 1
for i in range(len(a)):
j = a[i]
while j > i:
j = a[j]
if j == i: # i is a cycle leader
j = a[j] # get next element of cycle
cyc = 1; # the cycle has length at least 1
while j != i: # the cycle hasn't closed
cyc += 1
j = a[j]
lcm = (lcm/gcd(lcm,cyc))*cyc
return lcm
def ordersort(a):
ord = order(a)
print ord
b = range(len(a))
while ord > 0:
b = map((lambda i: a[i]), b)
ord -= 1
return b
if __name__ == '__main__':
from random import shuffle
a = range(50);
shuffle(a)
print a, ordersort(a)
My addition to the algorithm: Every time your check if arrays is sorted and it isn't, start from the very beginning throwing away all progress made so far.
I doubt this way n == 6 would finish in some normal time period.
I ran it on an the array [0,1,2,3,4,5,6] and it took between 3 hundred million and 1 billion operations (by my marginally accurate counter) and between 6 and 20 minutes, but mine was in python, not C.
I was referring to OPs algorithm which was bogo-bogosort.
At first I though my change would not increase the complexity, but I think it is dependent on size of input since for every recursion there is a chance to fail and start completely over, where the number of recursion is dependent on input size. Therefore it adds to complexity.
Yes, but that's not what he says. He says "The loop will repeat on average n! times". That's not true. He's not referring to O notation there, he brings that in later. Here he's talking about the imperative number of times a loop will run on average.
Edit: you've edited your paragraph to make my reply look silly. The whole paragraph is not in O notation. In fact - the exact opposite. In the last sentence of that paragraph he shows what the expression with constant factors looks like before converting it into O notation, and it's wrong! "The product (n-1)n! is O(n × n!)." Bzzzzzzt! Wrong! He should say "The product (n-1)(n!/2) is O(n × n!)."
Bogobogosort seems in the end to be no more worthwhile than sleepybogosort, where you must sleep() in between every operation.