Monday, April 11, 2016

A binary week

Last week was reasonably calm, TopCoder SRM 687 being its highlight (problems, resultsproblem discussion). Once again, tourist didn't give others many chances, but matthew99a has managed to shine in a slightly different way — by getting into the top 5 for the third SRM in a row. Congratulations to both!

Now I want to come back to the problem I mentioned in my previous summary. However, since I've published it just a few minutes earlier, I'd like to mention once again that the problem is super interesting and that I encourage you to try solving it yourself before reading the solution!

The problem was to interactively guess a hidden number between 1 and n (which was up to 109) spending at most log2(n+1)+0.1 steps on average (notice the lack of ceil() around the log). One guess was asking "is the hidden number less than x?" for any value of x. The guessing was repeated 10000 times with possibly different (and possibly the same) hidden numbers to compute the average.

Suppose k=ceil(log2n). Then the standard binary search requires k-1 steps for 2k-n hidden numbers, and k steps for the remaining n-(2k-n)=2n-2k hidden numbers. On average, this fits under the log2n+0.1 limit (which by itself doesn't seem to be an obvious mathematical statement, but it feels right and can be verified empirically, and that's probably how the +0.1 part was chosen by the problemsetter); however, the hidden numbers are not chosen randomly, so we might not get the average treatment.

On the other hand, there are many possible binary search trees with such structure (2k-n leaves on level k-1, and 2n-2leaves on level k), That leads to the following solution framework: we need a family of such binary search trees such that every hidden number is equally likely to be on level k across the entire family (and thus equally likely to be on level k-1, too). Our strategy will then be to pick a tree from the family at random, and then do binary search according to it, giving us the required average number of steps irrespective of what the hidden numbers are!

How does one find such a family? Well, after formulating so clearly what we need to find, the actual construction is not that difficult. It's quite clear that the only restriction we have on the tree is that leaves on level k have to come in consecutive pairs; modulo that restriction, we can construct any possible binary search tree. More precisely, let t=n-2k-1 be the number of those consecutive pairs of leaves on level k.

One tree in our family will have numbers (1, 2), (3, 4), ..., (2t-1, 2t) as level-k leaves. Another tree will have (3, 4), (5, 6), ..., (2t+1, 2t+2). We can continue like this, taking the leaf numbers modulo n until we make a full cycle, and it's easy to see that the resulting family has each leaf at level k the same number of times (the example for n=6 is on the left).

This approach requires n to be even, as otherwise at some point we'd need to group the first and last numbers into one consecutive pair, and we can't do that. Here's when the "+1" part in the problem statement comes into play: since we're allowed log2(n+1)+0.1 guesses on average, and not log2n+0.1, we can afford to just solve the problem for n+1 when n is odd.

Thanks for reading, and check back next week!

No comments:

Post a Comment