Thursday, July 26, 2007

Korn Shell

So I was looking at the Facebook programming puzzles the other day. For those of you who haven't stumbled upon them, the puzzles are meant to be completed by would-be applicants in one or more of the suggested languages (those used by Facebook, according another article). In most puzzles, additional solutions in other languages are encouraged, and in some puzzles partial or probabilistic solutions are also accepted.

One problem I thought was interesting was the Korn Shell problem. Go ahead and read it now, otherwise this post is pretty pointless :-P


When I first read the puzzle, I don't think I quite understood it, so here's a quick example. Say we're working in a six letter alphabet (A-F). The keyboard could've been swapped in many ways, but I'll give 2 of them:

Keyboard 1


A->F
B->E
C->B
D->D
E->C
F->A


Keyboard 2

A->F
B->E
C->D
D->C
E->B
F->A


A -> F means that typing A on the keyboard yields F on the screen, as you would expect. So if we take the word "FACE", we get these iterations











Keyboard 1Keyboard 2
AFBCAFDB
FAEBFACE
AFCE
FABC
AFEB
FACE

So the first keyboard yields many more iterations than the second, with a total of 6. In fact, in an alphabet of 6, the maximum number of iterations you can have for any word is 6, but I'm getting ahead of myself.

The first thing you may notice is that a keyboard is really just a permutation. One thing you might quickly notice after that is, for any given keyboard, repeats of a character are irrelevant. Finally, even thought the characters of a word matter in a given keyboard, there is always another keyboard that will take the same number of steps for a transformed version of the word. For instance, since there are keyboards that transform "FACE" in six and two steps, there are also keyboards that transform "AFCE" and "DBCA" in six and two steps. Thus, for any given word, the only determining factor is the number of unique characters.

Now let's look at the keyboards more carefully. In mathematics, the permutation group Sn is the... group of all permutations of n elements. These elements are often written as disjoint cycles. For instance, (a b c)(d e) is a permutation with two disjoint cycles. The first cycle imposes that a -> b, b -> c, and c -> a, and the second that d -> e and e -> d. If there are other elements are . So, our keyboards can be written as (A F)(B E C) and (A F)(B E)(C D).

For any k-cycle, it takes k iterations to get back to the starting point. Of course, we also restart after 2k iterations, or 3k, or j*k for that matter. So, if we have a permutation made up of m cycles, of sizes k1 .. km, then the permutation will take L = lcm(k1, .., km) iterations to restart, where lcm is the least common multiple.

So, our keyboard problem is strictly a number theory problem! For a given number of unique characters C, we need to find m, and k1, ..., km > 0 so that LC is maximized, subject to some constraints. Specifically, we can't have more cycles than we have unique letters, but multiple letters can belong to the same cycle, so m <= C. Also, if our keyboard only has K letters, k1 + .. + km <= K.

(Equivalently, we could specify that k1, .., km >= 0, and that m = C. The previous way seems more intuitive, but this way might be more tractable).

For small values of K, this problem is pretty easy. Like I said before (and you can verify this pretty easily), for K = 6, L4 = 6. As a matter of fact, LC = 6 for all valid C. (C = 1 can use a full 6-cycle, and C > 1 can use 2-cycle and a 3-cycle). Of course, the Korn Shell problem is for K = 26. I've yet to come up with any insightful (read: non-brute force) way of computing it, though.