Neat Algorithms - Harmony Search
Here I'll try to demonstrate a neat optimization algorithm based on the principles of improvising jazz musicians by applying it to solve Sudoku puzzles. Update Sept 28th 2015: Turns out this algorithm is ballyhoo and I don't like it any more; use something else. Kind of a fun idea though. See Dennis Weyland's analysis.
Harmony Search (often abbreviated HS) is a metaheuristic optimization algorithm pioneered by Dr Zong Woo Geem. Metaheuristic algorithms like harmony search attempt to find the optimal input to some objective measure of quality, or in other words, find the "best" solution to a given problem. Harmony search has been successfully applied to a vast array of such problems, such as the Travelling Salesman problem, water network design, and actual algorithmic music generation.
See the algorithm in action solving a Sudoku puzzle. The grid below is a snapshot of the interactive demo mid-search, so some conflicting cells are still present:
| 1 | 5 | 4 | 3 | 1 | 6 | 8 | 9 | 7 |
| 6 | 6 | 3 | 9 | 8 | 5 | 1 | 2 | 4 |
| 1 | 9 | 8 | 4 | 2 | 7 | 6 | 5 | 3 |
| 9 | 8 | 1 | 7 | 5 | 3 | 2 | 5 | 6 |
| 2 | 3 | 6 | 7 | 5 | 8 | 4 | 1 | 5 |
| 5 | 4 | 7 | 2 | 6 | 1 | 9 | 3 | 8 |
| 4 | 7 | 5 | 6 | 9 | 2 | 3 | 8 | 1 |
| 3 | 1 | 9 | 5 | 7 | 8 | 4 | 4 | 2 |
| 8 | 2 | 2 | 1 | 3 | 4 | 5 | 7 | 9 |
Heuristics and Fitness
Harmony search, as well as the above-mentioned algorithms, is useful for solving a very wide class of problems. Below we'll apply it to one problem in particular, but first let's examine the role of a metaheuristic algorithm.
The central idea is that when trying to solve some given optimization problem, you have some set of input variables that can be evaluated for their quality, and you want to know what inputs produce the best quality. Metaheuristic algorithms try to find this global optimum using some strategy which is better than brute force. For problems where it is hard to decipher why changing an input changes the quality (and thus the optimal solution isn't very obvious), these algorithms are extremely useful. Harmony search and its siblings in this category do not guarantee that the globally optimal solution will be found, but often they do find it, and they are often much more efficient than an exhaustive brute force search of all input combinations.
A Basic Example
Say I have a killer exam tomorrow, and I have a function which represents what mark I'll get depending on how much time I spend studying and how much time I spend sleeping. For the sake of the example, we'll say that I can spend a maximum of 10 hours doing either activity, and any time I don't spend doing either activity will be filled by normal day-to-day activities. The problem is I'll get burned out if I study too much, but I won't pass if I don't study enough. I could also be groggy during the exam if I sleep too much, or weary and slow if I don't sleep enough. How do I balance the time before the exam appropriately, given that I have this magical function which predicts the future?
Above is a heat map showing where the best marks are earned. You'll find the hours spent studying on the X axis and the hours spent sleeping on the Y axis, and the mark achieved encoded in the color at that point on the map. A white color represents 100%, and a black color represents a failing grade. You can see that around the edges of the map the colors are darker, indicating a poorer mark. There also appears to be a hotspot right in the middle at about 5 hours spent studying and 8 hours spent sleeping. This is easy for us to see because the data is laid out in such a way that we can see the whole problem space at once, and see the exact range of marks earned just by looking at the colors. We humans can identify a pattern of increasing marks by watching the colors get closer to white as the inputs approach that sweet spot.
The task of an optimization algorithm is to do exactly what we do with our eyes on the heat map. It can also search non-differentiable functions, or functions about which few assumptions can be made. Also note that this exam example is a tad silly, because every input combination is represented in that heat map, so we could write a brute force program to just try them all and find the max pretty easily and quickly. To make it even worse, the source code for the relatively simple and continuous quality function is also in this page, so just by applying some first year calculus we can find the global maximum. For computationally complex functions of many more variables, or non-differentiable functions, these brute force and calculus approaches aren't feasible, and we are left to find a better strategy.
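To make the brute force point concrete, here's a sketch in JavaScript of scanning the whole grid of input combinations. The examMark function here is a made-up stand-in (a simple bump peaking at 5 hours of study and 8 of sleep, matching the hotspot described above); the page's real quality function may differ.

```javascript
// Hypothetical mark function: a Gaussian bump peaking at 5 hours of
// study and 8 hours of sleep. This is a stand-in for illustration,
// not the actual function used by the interactive demo.
function examMark(study, sleep) {
  return 100 * Math.exp(-((study - 5) ** 2 / 8 + (sleep - 8) ** 2 / 8));
}

// Brute force: evaluate every combination on a 0.1 hour grid and
// remember the best mark seen. Feasible only because there are
// just two bounded inputs.
function bruteForceBest() {
  let best = { study: 0, sleep: 0, mark: -Infinity };
  for (let study = 0; study <= 10; study += 0.1) {
    for (let sleep = 0; sleep <= 10; sleep += 0.1) {
      const mark = examMark(study, sleep);
      if (mark > best.mark) best = { study, sleep, mark };
    }
  }
  return best;
}

console.log(bruteForceBest()); // best mark lands near study = 5, sleep = 8
```

With only ~10,000 grid points this finishes instantly; the approach falls apart as soon as the input count grows.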
Enter Harmony Search
Harmony search is one such strategy for finding an optimal set of inputs to an often complicated quality function, among others like random search, simulated annealing, genetic algorithms, and tabu search. It works by imitating the activity of musicians while improvising. The choice of which note to play next while playing as part of a trio or quartet is something which takes years to learn to do effectively, because it's hard to know what notes your accompaniment might play, and it's hard to know what notes might sound good or great in tandem with the others. Musicians can be seen as trying to play some set of notes simultaneously to produce a fantastic harmony, although this is a somewhat naive take on the whole thing, so don't let me ruin the magic for you.
Each musician in the ensemble is often faced with the problem of picking the next note. To do so they can reference their knowledge of the notes in the key they are playing in (what notes sound good in the context of the song), as well as the notes they've played previously (what notes sound good in the recent context). The notes they played recently most likely sounded alright, so often these are a good choice. Also, it can be wise to pick a particular note that the audience might expect and adjust the pitch of it away from the expected note to create an artistic effect and a new, potentially better, harmony.
These decisions that said bland hypothetical musician makes correspond exactly to how harmony search works. Harmony search seeks an optimal combination of inputs, just as a musician seeks a fantastic harmony. Harmony search generates "harmonies" of inputs which it then evaluates for quality, and iterates this process until it finds the best one possible. The quality of a musical harmony is analogous to the quality of a particular solution, so you might say that harmony search is trying to achieve a fantastic combination of inputs, or that musicians are trying to optimize the note selection problem using their own heuristics.
Each input to the problem is seen as a different instrument in an ensemble, and each potential note one of those instruments could play corresponds to a potential value one of the inputs to the function might adopt. The musical harmony of notes is modeled as a programmatic harmony of values. We evaluate the former's quality using our ears, and the latter's using a quality function describing the problem.
Improvising New Solutions
Harmony search continues to use the musician metaphor to iteratively improve its solution. Each iteration of the algorithm generates a new harmony to evaluate for quality. Using the note selection strategies mentioned previously, notes for each instrument, or values for each input, are selected. These inputs are fed to the quality function to determine the harmony's quality. A note is selected for each instrument either by randomly picking from the range of playable notes or by picking from the set of recently played ones, and occasionally the chosen note's pitch is adjusted up or down.
Getting better
Each iteration a new harmony is generated, its quality is calculated, and if it makes the cut it's "included" in the musician's memory. This way, iteration by iteration, old, poor quality harmonies are kicked out and replaced by better ones. The average quality of the set of harmonies in this memory as a whole gradually increases as these new harmonies replace poor ones. The brilliance of the algorithm comes from this: the new harmonies that are generated, which you may recall often reference notes from the memory, start to use notes belonging to known high-quality harmonies. Thus, the newly generated harmonies use good notes, and often have higher qualities because of it. This process repeats: increasing the quality of the individual harmonies generated increases the average quality of the memory, which in turn increases the quality of the individual harmonies generated, and so on and so forth. At some point (it is hoped), the algorithm generates a harmony which meets the "fantastic" quality hoped for.
That's it! Harmony search isn't too complicated, but it's a neat algorithm inspired by some everyday natural phenomena. Read on for the code and an example application.
The Code
First, let's more formally define some terms.
- Instrument: One of the inputs to the quality function.
- Note: One of the possible values of an input.
- Harmony: A combination of each instrument playing a particular note, or in reality the set of chosen inputs for each argument to the quality function.
- Quality: A quantitative measure of a harmony's desirability, how close or far it is from the fantastic harmony, or optimal solution.
- Harmony Memory: The collection of good harmonies, stored in memory for examination by the harmony generation algorithm.
- Harmony Memory Consideration: The process of generating a new harmony using random notes from the harmony memory.
- Pitch Adjustment: The process of moving a particular instrument's note up or down.
Pseudocode for the actual algorithm:
- Initialize the Parameters for Problem and Algorithm.
- Initialize the Harmony Memory (HM).
- Improvise a New Harmony.
- Update the Harmony Memory if the new harmony is better than the worst harmony in the memory.
- Check the stopping criterion, and if we can continue, go back to 3.
The Parts
The algorithm, once applied to a problem, is composed of 3 main parts:
- The harmony generator, which spits out new, potentially good harmonies based on the contents of the harmony memory and the set of possible notes.
- The quality function, which evaluates a particular harmony for its quality.
- The search algorithm, which moves harmonies in and out of the memory based on their quality.
I chose to encapsulate the generator and the search algorithm into a HarmonySearch class, and to make the whole thing reusable by making a Harmony class which in a particular problem would be extended to implement the quality function.
Next, we'll define the formal parameters for the algorithm:
- Harmony Memory Consideration Rate or HMCR: the probability that when generating notes for a new harmony, a note from the harmony memory will be picked, instead of just picking a random one out of the possible notes.
- Pitch Adjustment Rate or PAR: the probability of randomly shifting a note up or down once it has been chosen.
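To see how these two parameters drive the improvisation step, here's a sketch in JavaScript (the post's actual implementation is in CoffeeScript; all the names here are illustrative, not the real code):

```javascript
// Improvise one new harmony. For each instrument, either reuse a note
// from a random harmony in memory (probability hmcr) or pick any
// playable note at random, then occasionally nudge the chosen note to
// a neighbouring one (probability par).
function improvise(harmonyMemory, possibleNotes, hmcr, par) {
  const newHarmony = [];
  for (let i = 0; i < possibleNotes.length; i++) {
    let note;
    if (Math.random() < hmcr) {
      // Harmony memory consideration: reuse a note a good harmony played.
      const remembered =
        harmonyMemory[Math.floor(Math.random() * harmonyMemory.length)];
      note = remembered[i];
    } else {
      // Random selection from the instrument's playable notes.
      note = possibleNotes[i][Math.floor(Math.random() * possibleNotes[i].length)];
    }
    if (Math.random() < par) {
      // Pitch adjustment: shift to an adjacent playable note, if any.
      const idx = possibleNotes[i].indexOf(note);
      const shifted = idx + (Math.random() < 0.5 ? -1 : 1);
      if (shifted >= 0 && shifted < possibleNotes[i].length) {
        note = possibleNotes[i][shifted];
      }
    }
    newHarmony.push(note);
  }
  return newHarmony;
}
```

Every note in the result is either remembered, random, or a pitch-adjusted neighbour of one of those, which is exactly the three musician strategies described earlier.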
The Core
Below is the core of the search algorithm, which generates new harmonies and moves them into the harmony memory if they are better than the current worst.
```coffeescript
class HarmonySearch
  search: (callback) ->
    @running = true
    # Seed the harmony memory with random harmonies. The memory size
    # option name is an assumption: this line was garbled in the source.
    @harmonyMemory = for i in [1..@options.memorySize]
      this.getRandomHarmony()
    @tries = 0
    [worstQuality, worstIndex] = this._getWorst()
    [bestQuality, bestIndex] = this._getBest()

    while true
      @tries++
      break if @tries > @options.maxTries || bestQuality >= @options.targetQuality

      harmony = this.getNextHarmony()
      if harmony.quality() > worstQuality
        # Swap the new harmony in for the current worst one.
        @harmonyMemory.push(harmony)
        @harmonyMemory.splice(worstIndex, 1)
        [worstQuality, worstIndex] = this._getWorst()
        if harmony.quality() > bestQuality
          [bestQuality, bestIndex] = this._getBest()

    return @harmonyMemory[bestIndex]
```
Exam Mark Example
Consider the exam mark problem shown above. We're trying to find the global optimum to the exam mark equation. To model this in harmony search, we ask how many instruments there are, what notes each of them can play, and how to determine the quality of the harmony produced.
In this case, the Exam.mark equation is the one we are trying to optimize. We model its input arguments as notes, and use harmonies composed of different combinations of times. There are two instruments, one for each argument to the function, and each instrument can "play" any number between 0 and 10, which are the bounds as outlined in the problem. A harmony's quality is the mark achieved when the time is spent in its particular allotment, which we model as the evaluation of the Exam.mark function on the two notes.
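Putting the pieces together, here's a compact, self-contained sketch of harmony search applied to this two-instrument problem, in JavaScript. The examMark function is a made-up stand-in peaking at 5 hours of study and 8 of sleep, and the parameter names are illustrative, not the demo's actual code:

```javascript
// Stand-in quality function: a Gaussian bump at study = 5, sleep = 8.
function examMark(study, sleep) {
  return 100 * Math.exp(-((study - 5) ** 2 / 8 + (sleep - 8) ** 2 / 8));
}

// A minimal harmony search: two instruments, each playing a real
// number in [0, 10]. Parameter names and defaults are illustrative.
function harmonySearch({ memorySize = 10, maxTries = 5000, hmcr = 0.9, par = 0.3 } = {}) {
  const rand = () => Math.random() * 10;
  const quality = (h) => examMark(h[0], h[1]);
  // Seed the harmony memory with random harmonies.
  const memory = Array.from({ length: memorySize }, () => [rand(), rand()]);

  for (let t = 0; t < maxTries; t++) {
    // Improvise a new harmony: memory consideration, random selection,
    // and small pitch adjustments, one note per instrument.
    const harmony = [0, 1].map((i) => {
      let note = Math.random() < hmcr
        ? memory[Math.floor(Math.random() * memorySize)][i]
        : rand();
      if (Math.random() < par) {
        note = Math.min(10, Math.max(0, note + (Math.random() - 0.5)));
      }
      return note;
    });

    // Replace the worst harmony in memory if the new one beats it.
    let worst = 0;
    for (let j = 1; j < memorySize; j++) {
      if (quality(memory[j]) < quality(memory[worst])) worst = j;
    }
    if (quality(harmony) > quality(memory[worst])) memory[worst] = harmony;
  }

  // Return the best harmony found.
  return memory.reduce((a, b) => (quality(a) >= quality(b) ? a : b));
}
```

Run it and the best harmony should land close to the (5, 8) hotspot, mirroring what the ring visualization shows as the memory converges.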
The ring visualization shows the harmony memory. Each wedge represents one harmony, colored by quality (darker = lower quality, more purple = higher). Green border = best, red border = worst. As the search runs, poor harmonies are replaced by better ones, and the ring grows increasingly purple as the memory converges on the optimal solution.
Sudoku Example
Harmony search can be applied to more complex problems than simple functions like the above. Sudoku is a specific case of the graph coloring problem, one of Karp's 21 NP-complete problems. In other words, it's very time consuming to brute force the solution to a sudoku by just trying random numbers and seeing if they work. There are excellent algorithms, often faster than harmony search or any of its metaheuristic brethren, which solve sudokus using intelligent, problem-aware methods and guess only when needed.
These "smart" solvers are, I'm sure, the algorithms employed by real Sudoku software, but they rely on intimate knowledge of the Sudoku solving process and an understanding of the techniques used. We have to encode our knowledge of how to solve sudokus into a program, implementing the guessing feature, the backtracking, and all the methods for eliminating possibilities for a particular cell. Instead of developing an algorithm like this, we can use a search method to find us a solution, as long as we have a heuristic to tell us the quality of a given solution. By solving sudokus this way, we don't need to concern ourselves with finding a general method or exploring edge cases and algorithmic nuances; we lift the burden of understanding the relationships between the input variables from our own shoulders and let the search algorithm explore those relationships itself.
Hopefully you can see the advantage of using a search algorithm for problems where the smart, human written implementation is hard or impossible to create. If we have some measure of quality for a solution, and thus a way to tell when a solution is optimal, we can let the search algorithm, well, search.
The Sudoku Model
Let's solve a particular Sudoku puzzle using harmony search. First, let us identify what the notes of a harmony are, and after, how to calculate the quality of one.
First off, notice that for any solution to be considered as such, each cell must have a value. Some of the values are given by the puzzle, and some must be decided by us. We seek a choice for each cell such that there are no conflicts, or in other words, the optimal solution to a sudoku is one which has all the cells filled in and breaks no rules.
We model the value of each one of the unknown cells as one note in a harmony, with the note's value being an integer between 1 and 9. The harmony is the chord struck when we insert each of these choices into the puzzle, and the quality of the solution is how close to a valid filled-in puzzle this solution is. The order in which the array of notes is entered into the puzzle doesn't really matter; as long as it is consistent, the algorithm will work just the same. The number of instruments is the count of unsolved cells.
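As a sketch of this mapping, assuming 0 marks an unknown cell (my convention for illustration, not necessarily the demo's):

```javascript
// Insert a harmony's notes into the puzzle, filling the unknown (zero)
// cells in reading order. The order is arbitrary; it just has to stay
// consistent from harmony to harmony.
function applyNotes(puzzle, notes) {
  const grid = puzzle.map((row) => row.slice()); // don't mutate the puzzle
  let next = 0;
  for (let r = 0; r < 9; r++) {
    for (let c = 0; c < 9; c++) {
      if (grid[r][c] === 0) grid[r][c] = notes[next++];
    }
  }
  return grid;
}
```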
Next, we decide how to evaluate the quality of a given solution. The optimal solution is the global minimum of Q, where Q is the sum of the absolute differences between the actual sums of each row, column, and box and 45 (the sum of 1 through 9). A correct solution for a sudoku would have Q = 0. As noted in the original paper, it's important to see that the sum of a row may be 45 even though the numbers in it are not exactly the set from 1 to 9. However, if this case occurs in one row, then the sum for the columns passing through the row, or the sum for one of the boxes containing part of the row, won't be 45, moving the final value of Q away from 0. The only way to get every row, column, and box sum to 45 is to have precisely the set from 1 to 9 in each container.
In summary, the notes for a harmony are the set of values for the unknown cells, and the quality of the harmony is the evaluation of the function Q on the generated sudoku puzzle. With these two decisions made, we can now use harmony search to find a solution (if one exists) to a given sudoku puzzle.
In the sudoku grid above, white cells are clues given by the puzzle, green cells are correctly placed values, and red cells violate row, column, or block constraints. A quality score of 0 would indicate a perfect solution with zero violations; the score of 135 shown here means the search still has a way to go.
Discussion
I tried quite hard to achieve results similar to Geem's, but I was downright stumped. Geem managed to solve the default sudoku in "285 improvisations", which to me is just absurdly low. It takes my implementation anywhere from 5000 to 50000 improvisations to find a valid solution, which is an awful lot more than 285. So I think I either made a serious mistake when implementing, a serious mistake when interpreting Geem's results, or discovered some academic fraud. I trust the inventor of the algorithm to be better at implementing it than I am, so I am pretty sure I made a blunder at some point or another.
Update Sept 28th 2015: Turns out that it may have indeed been academic fraud! Dennis Weyland has published some results which match mine and contradict those in the original paper concerning Harmony Search's efficiency and novelty. A sad jazz trombone to you, Mr Geem. See them here: http://www.dennisweyland.net/blog/?p=12, and thanks Dennis!
The puzzle in question has 41 unsolved cells, giving a search space with 9^41 different solutions. That number has 40 digits. It's big. It's big enough that finding a solution after only 285 tries is really, really impressive. In an attempt to get my numbers down to at least the same order of magnitude, I tried precomputing the possible choices for each cell instead of letting each one be any number from 1 to 9. This is silly because it shows we don't need to use HS to solve this problem at all: the algorithm which determines the possible choices for each cell is one we could use to just solve the puzzle. If we can get the possible choices for a cell using some algorithm, we can just pick one choice, see if the solution works, and if not, pick the next choice, and repeat. We are implementing only the first step of the smart solving algorithm in order to make the dumb one just a tad smarter. If it's possible for us to come up with an algorithm which can solve a sudoku deterministically instead of using a heuristic to search, we should most probably take the former approach.
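That precomputation is just the first step of a constraint-aware solver: for each unknown cell, collect the digits not already used in its row, column, or box. A sketch, again assuming 0 marks an unknown cell:

```javascript
// Possible values for the cell at (row, col): any digit 1-9 not already
// present in that cell's row, column, or 3x3 box. 0 marks an unknown
// cell, so it never conflicts with a candidate digit.
function candidates(grid, row, col) {
  const used = new Set();
  for (let i = 0; i < 9; i++) {
    used.add(grid[row][i]); // same row
    used.add(grid[i][col]); // same column
  }
  const br = Math.floor(row / 3) * 3;
  const bc = Math.floor(col / 3) * 3;
  for (let r = br; r < br + 3; r++)
    for (let c = bc; c < bc + 3; c++) used.add(grid[r][c]); // same box

  const result = [];
  for (let v = 1; v <= 9; v++) if (!used.has(v)) result.push(v);
  return result;
}
```

Restricting each instrument's playable notes to these candidate sets shrinks the search space, but, as argued above, at that point you've written most of a real solver anyway.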
In any case, adding in this precomputation step got the numbers down as expected because it drastically reduces the size of the search space, but still nowhere close to Geem's. I don't know why this is the case, and I've spent an obscene amount of time trying to figure it out, but alas, I have been unable to. If you can figure it out by looking at the code, or just based on my (perhaps incorrect) description of the algorithm, do tell me so I can put this to rest.
Lastly, the above issues demonstrate that sudoku isn't really that good an example for metaheuristic algorithms. We know that there are more efficient algorithms which solve sudokus in an itty bitty tiny amount of time, and unfortunately such a solver isn't really that far removed from the quality heuristic we already have to write for HS. I also included no real strategy for dealing with unsolvable sudokus, which is a whole other class of problem. Shame on me for not dealing with these, but with this class of algorithm in particular it's extraordinarily difficult. When using HS, there is no way to know whether a solution exists until all possible harmonies have been tried, and that brute force search is exactly what we're trying to avoid by using a heuristic search. If our tries count reaches some user-defined ceiling, which is the stopping condition used in these demos, we won't know if a solution wasn't found because it doesn't exist, or because we just haven't waited long enough. Since it's so hard to know, we ask the algorithm to stop once it's tried 10000000 harmonies and assume that the solution doesn't exist, even though the optimal solution could be the 10000001st harmony tried.
Conclusion
Hopefully this has been an exciting journey through the world of metaheuristic optimization algorithms, and you learned a thing or two. I sure did. The takeaways are:
- Metaheuristic optimization algorithms are useful for finding the optimal solution to some function which describes its arguments' "quality" or "fitness".
- Harmony search is a neat example of these algorithms which attempts to optimize a solution based on the principles of jazz musicians.
- HS and company are applicable to a very wide range of problems, including solving puzzles like sudoku.
- Sudoku however isn't really a good testbed for these algorithms, because it's easy enough to write a solving algorithm, and you have to write most of that algorithm to apply HS to sudoku anyway.
Thanks for reading!
References
- Geem, Z.W.: Harmony Search Algorithm for Solving Sudoku. Knowledge-Based Intelligent Information and Engineering Systems. http://dx.doi.org/10.1007/978-3-540-74819-9_46
Thanks
Thanks to Mo and Tomas for helping edit. Thanks to Dr Geem for creating and publishing so much about the algorithm.