Wednesday, May 28, 2014

Travelling Salesman

Random Path Finding: Travelling Salesman

I was reading this bit of article about quantum computing where there is a commercial quality version being sold for $10 million dollars. Of course, quantum computing promises a lot of performance gain, but in this case, D-WAVE, is specific for optimisation algorithm.

In brief, optimisation algorithm can be described as "hill climbing" algorithm. Basically, you're looking for the highest point in the area. So, as long as there is higher ground, you keep going up until you reach the highest point. The problem is that you will encounter local hills, that is the highest in the area, but not the highest overall. So, you have to watch out for "false peak".

Another problem is when you want to find a path through the area, and you want to find the highest overall path, or maybe the lowest. When the hills and valleys represent productivity and cost, then the path you have represents the highest productivity and the lowest cost.

This is all very desirable in computing. The payoff is immense and immediate. So, efficient path finding algorithm is extremely desirable in computing industry. An example of this is the travelling salesman problem where he needs to find the least distance to travel in order to work his route. This is applicable not just to travel, but also in manufacturing where the least distance a tool traverse, be it cutter, welder, or packer, is desirable.

There are two main approaches in order to solve this problem. I will call this the "forest" and the "tree" approaches. This is just a simplification of the process. There is another approach that I will describe later, but for now, let's focus on these two methods.

The forest method looks at the overall picture of the area and try to find the absolute best path possible. The simple, traditional way to do it, is to just do permutation of all possible path, and decide which one is the best. It has the good point of finding optimal solution. The bad point is that it is extremely costly in terms of computing power. Permutation has factorial cost.

The tree method looks at immediate surrounding and simply choose the best solution that is immediately present. The good point is that it is extremely cheap to do. The bad point is that it is susceptible to false peaks, dead ends, and other traps.

So, most people who are serious in solving the travelling salesman problem would do a compromise. They take a section of the local area and solve that optimally using the forest method, and by connecting several sections, they solve the whole area using the tree method. It works, and works very well. It is not the optimum method, but it is a good enough solution.

Apparently, that's not what the quantum computing promises. Due to the extremely efficient computation power available to the quantum computer, they instead solve it by "random" method. That is, they simply lay a path haphazardly, then they jiggle the path to see if a better path presents itself. Repeat that many, many times and eventually, you'll come up with a pretty good solution.

What the quantum computer can do, however, is to just lay down thousands of path all at once, and therefore be able to find the optimum path extremely quickly. There is a problem in reading the quantum result, but we'll ignore that for now. So, what it boils down to, is simply trial and error method. Is that wise?

The proponent of the method points out that such trial and error method is desirable where the problem is so difficult that it is not solvable. I concede the point of the solution, but I am not convinced that the path finding problem is "difficult".

So, I will point out another high respected "random stab" method that has since fallen out of favor: Neural Network. The field of neural network was very interesting with the proponent promises fantastic dreams. Yet, all it boils down to is a few mathematical functions. So, you have one gigantic mathematical equation, and what you do is twiddle the parameters until the error rate is near zero. There is a very sophisticated algorithm that you use in order to determine the parameters, but it looks random to me.

There is another alternative parameter determination method. That is the "Bayesian" method: Fire together, wire together. This is the tree method, where as the former was the forest method. As you can surmise, the Bayesian method is very quick and effective. Will that effect run into false hill problem? So far that hasn't happened because the problem isn't about path, but of pattern recognition. So, the domain of the problem is limited, and it turns out that Bayesian algorithm is perfect for such thing. However, path finding is beyond Bayesian algorithm to solve.

So, this brings me to another idea: Bogo sort. For those who doesn't know Bogo sort, it's also called Clown sort. What you do, basically, is to throw out all the elements up in the air, and if they fall down sorted, you're done. Obviously, that will take an extremely long time to complete, and there is no guarantee that it will finish the job. This is the forest method.

So, if that's the forest method, what about  the tree method? Turns out, it's bubble sort, where you simply compare the elements that are next to each other. This is also slow, because you're comparing one to another repeatedly. It's still faster than Bogo sort, at O(N^2), but slow when compared to other method.

And here's the interesting thing: I found out that if you combine these two slowest sorting algorithm, you will speed up the execution speed appreciably. It's still not the greatest, but it's a good step up. Here's the algorithm:

Random Sort:
1. For each element on list
1.1. Randomly compare this with other elements, swap if out of order
2. Repeat until sorted.

Bubble Sort:
1. For each element on list
1.1. Compare this element with the next one, swap if out of order.
2. Repeat until sorted.

Random Bubble Sort:
1. For each element on list
1.1. Randomly compare this with other elements, swap if out of order
1.2. Compare this element with the next one, swap if out of order.
2. Repeat until sorted.

There you have synergy where the whole is greater than the sum of its part. By combining the two approaches, the problem of randomness is eliminated by fastidious bubble sort, and the problem of slowness is eliminated by large random jumps. Combined together, it represents a large cost reduction available. There is a better solution than this, and that is comb sort. But that's topic for another time. Let's focus on this random solution for now.

Going back to the original travelling salesman problem, what does this have to do with path finding? It turns out that at its heart, the travelling salesman problem can be looked at as a sorting problem. If you take the distance between two points, and you compare them, then you can see that a comparison sort can yield a rather quick solution. A bubble sort TSP will yield interesting result, indeed.

The problem with bubble sort TSP is easy to see: Being the tree solution, it falls foul to the false peak problem. A random sort TSP, however, will not fall so easily. A permutation (sort) TSP will yield optimum solution, but as we can see, extremely expensive solution.

So, a "random stab" method of TSP, in effect is inefficient because it lacks that self-organization factor that sorting does. However, that problem is alleviated by using Random Bubble sort on TSP:

1. Create a lookup table for distance travelled between two points.
2. Do Random Bubble Sort, using the lookup table for comparison.
3. Repeat until time runs out or no improvement to the solution.

The key point here is "random". You always want to jiggle the points repeatedly to see if reversing the order of visiting two towns will yield better result. Eventually, a path will present itself that will consists of points that are next to each other. It will automatically allow backtrack, too. Will it be optimum? Maybe not, but it will be good enough.

Obviously, there is room for improvement. The obvious one is to lay down all the points in order when comparing distant points containing various points enroute. Another improvement is to restrict points consideration to smaller and smaller distance. But that is for further research and not for this session.

Another question would be if another sorting algorithm will work just as well. The answer to that is no. The more efficient sorting algorithm will not work because it lacks the "random jiggle" method. We want a random algorithm that constantly scans for better solutions.

And yet, there will come a time where the solution is fixed, and it will not change. Will that be optimum? Chances are, it won't. However, it will be close enough and in quick enough time. Whenever you are curious, simply run the program multiple times and see if the solution changes.

And there you have it. A random bubble sort may not be desirable in sorting problem domain. However, as it turns out, it's extremely desirable in path finding domain. It's all about mating the tools to the problem. That is all.