Consider the task of computing all permutations of a given string. For example, the string rum has six permutations: rum, rmu, urm, umr, mur, mru. The string drum has 24 permutations.

Apparently, this is a common interview question. There is a recursive solution—you can find it on StackOverflow.

Recursion is great for interview questions, but does it really perform in a multicore world? You can always replace recursion with a work queue. In each processing step, grab the head of the queue. Simplify the state, which usually results in more than one state to examine further. Add those to the tail of the queue. Or, if the work that you got is a final result, yield it as the answer.

Then wrap that in a loop. As long as the queue is not empty, keep processing. The process steps could run on separate cores, speeding up the computation.

We won't do the “separate cores” part in this exercise. Simply implement a processing step for computing permutations. A WorkInProgress object has a partial permutation and some letters that haven't yet been used. For example, if partialPermutation is ru, and unusedLetters is dm, then take the d and stick it in every possible place:

dru rdu rud

For each of these, add a new WorkInProgress object to the queue, with partialPermutation being one of these strings and unusedLetters being m.