Monday, April 14, 2014

This week in competitive programming

This week was so busy that I didn't have time to solve programming contests, so here's just a quick summary of their results.

TopCoder SRM 616 took place on Thursday (problems, results, top 5 on the left).

The first round of TopCoder Open 2014, Round 1A, took place on Saturday (problems, results, top 5 on the left).

However, top 250 by rating who have participated in SRMs in 2014 were automatically advanced to Round 2, and there was a special Parallel Round where those could compete (same problems as Round 1A, top 5 on the left). Surprisingly, the top fives of both rounds are comparable.

Google Code Jam 2014 Qualification Round ran for 27 hours between Friday and Saturday (problems, results, top 5 on the left).

And finally, the 13th round of the Open Cup took place on Sunday (results, top 5 on the left).

Thanks for reading, and check back next week for a hopefully more informative report!

Saturday, April 12, 2014

Google Code Jam 2014 - just 10 hours left in the Qualification Round!

The qualification round for Google Code Jam 2014 has started, and will run for about 10 more hours (until 6am on Sunday Moscow time). If you haven't already, you should take part. I'm involved with preparing problems for this competition like last year, and hope you will like them :)

Note that the time when you submit your solution doesn't matter in the qualification round, so those who started before you have no unfair advantage. You just have to submit before the round ends.

Also note that you must not discuss the problems and solutions with others before the round ends. Online programming competitions are based in a large part on the honesty of their participants, so don't spoil it for everyone!

Monday, April 7, 2014

This week in competitive programming

TopCoder SRM 615 took place on Friday (problems, results, my screencast, top 5 on the left). The medium problem asked a somewhat classical but still interesting question: given an undirected graph with at most 50 vertices, at most 50 edges, and relatively small integer edge lengths (up to 10000), is there a path from vertex 1 to vertex 2 that is exactly T long (where T can be very big, up to 1018)? The path can of course pass through the same vertex or edge many times.

This question turned out to be really tough - it was only solved correctly by 17 contestants, compared to more than 100 contestants solving the hard problem which was supposed to be more difficult. I guess the point value for this problem was set so low because it's standard, so the judges assumed people would know how to solve it?..

Open Cup has returned on Sunday with its 12th stage (results, top 5 on the left). This round had wider participation than usual with strong teams from the United States and China in the standings. 39 teams that will participate in the ACM ICPC World Finals in June were participating, and here's the link to standings filtered to only those teams.

And finally, Codeforces wrapped up the weekend with their Round 240 (problems, results, top 5 on the left). I didn't solve the round myself, so I don't have anything to say about the problems. The round's winner was decided on challenges, which are always a big gamble in the Codeforces format where you have to choose between spending time on challenges early in the contest (while the points for each unsolved problem are ticking down) and solving the problems first (while the easy challenges are taken away by others). Personally, I think that challenging early makes sense only if you're looking for a concrete mistake that you can spot in several seconds, so that you can gain a lot if the mistake turns out to be widespread, or give up quickly if people don't seem to make that mistake.

Thanks for reading, and see you next week!

Sunday, March 30, 2014

This week in competitive programming

TopCoder SRM 614 took place on Saturday (problems, results, top 5 on the left). The most interesting things happened with the hard problem. It went like this: you are given a NxM rectangular field wrapped like a torus - to the right of the last column is the first column and to the top of the last row is the first row. You start in cell (0, 0) and do one random step per second - you go one step to the right or one step to the top, each with probability 50%. What is the expected number of steps before reaching the given cell (X, Y) the first time? N and M are up to 100.

If you're logged into TopCoder, you can view the fastest solution for this problem, which required just 9 minutes to write (and could've probably been done even faster). It solves a natural system of equations: if we introduce a variable for the expected number of steps to reach the goal from each cell, we get a very simple set of equations: for each non-final cell, its variable is one plus half of sum of the variables of the two cells we can move to. This system has 10000 equations and 10000 variables, but each equation has just 3 variables in it. The code in question solves the system iteratively: start with all zeroes, then 7000 times update the value of each variable according to the corresponding equation using the current value of the other two variables.

There are different ways to organize such iteration. One way is to use "generations": we produce NxM new values based on NxM current values. Such approach is essentially equivalent to a DP which computes the expected value assuming we'll need at most 1, 2, ... steps to reach the goal. And that's why such approach doesn't work here - we'd need much more than 7000 steps to get a good approximation of the answer.

But the solution in question does something more tricky: it has just one set of NxM values, and updates them in-place, going in the decreasing-row then decreasing-column order. Such order means that when processing subsequent values, we'll be more likely to use the values that were already updated during the current iteration, and thus potentially "consider" more than 7000 steps: instead of processing just one step per one NxM iteration, we're processing something like O(M) horizontal steps and O(N) vertical steps, since we're tracing an entire part of the path that does not wrap around the torus. This part is a bit hand-wavy, but I hope it makes some sense (alternatively, please explain why it doesn't!). This speeds up the convergence about 50-100 times, and several hundred thousand steps  is already enough to get a good enough approximation.

During the round, the admins have apologized that such simple iterative solution worked, saying it was unintentional. But I think there's nothing to apologize for - this is actually a clever solution that relies on a tricky insight about the right order of processing of the cells, so kudos to everyone who came up with it!

Codeforces Round 239 took place on Sunday (problems, results, top 5 on the left). My solution for problem C relied on the fact that falling factorials, or falling factorial powers, adhere to the same binomial law as normal powers. More precisely, let xn = x*(x-1)*...*(x-n+1). Then (x+y)n = sum(C(n, k)*xk*yn-k). See this Wikipedia article and its outgoing links for more details. This and related identities allow to operate on polynomials expressed as a sum of falling factorial powers instead of the usual sum of powers in exactly the same way, and in this problem using such polynomials allowed to speed up the computations 100 times!

Thanks for reading, and see you next week!

P.S. I'd like my posts to be written in good English. But I'm afraid people with really good English don't think they are - please don't hesitate to point out grammar and other errors! I want to improve :)

Sunday, March 23, 2014

This week in competitive programming

The working week was free of programming contests - but come Friday evening, the competition was in full swing with TopCoder SRM 613 (problems, results, my screencast, top 5 on the left). The screencast is full of frustration with not being able to squeeze the 500 inside the time limit (my solution was about 2-3x too slow, although I believe the asymptotic complexity was similar to the correct solutions), and then the excitement of getting the 900 working with just 1 minute to go in the coding phase.

The 500 problem went like this: how many ordered n-tuples of numbers exist with each number between a and b, and all numbers in the n-tuple relatively prime as a set? a, b and n are up to 109, b-a is up to 105. There is, in some sense, only one possible approach: we have to use the inclusion-exclusion principle: first count all n-tuples, then subtract those where all numbers are divisible by 2, and so on. But there are several possible implementations of this approach, some faster than others :)

Codeforces carried the competitive spirit on with Codeforces Round 238 on Saturday evening (problems, results, top 5 on the left). al13n has won after a long break - in fact, his last win was one year minus one day ago :) As a result, he has jumped back into the top 10 by rating - congratulations! Komaki, on the other hand, showed some great consistency by getting into the top 5 both on TopCoder and on Codeforces.

And finally, the Open Cup returned on Sunday with the 11th round of this season (results, top 5 on the left). With about 3 months left before the World Finals in Ekaterinburg, it's most natural to view the OpenCup as the best available show of teams' strength. Not much has changed on this front, though: among the teams qualified for the World Finals, SPb SU 4 holds the lead, followed by Moscow SU Tapirs.

The Open Cup also gave rise to a nice strategy tip from my teammate Pavel Mavrin:

Thanks for reading, and see you next week!

Monday, March 17, 2014

This week in competitive programming

This week had two usual suspects: a TopCoder round and a Codeforces round. TopCoder SRM 612 (problems, results, top 5 on the left) featured a somewhat unexpected 450-pointer: you were given a set of cells on the infinite grid, and were asked to find another set of cells that has the same number of cells in each row and in each column as the original set, but has the smallest possible intersection with the original set. I'm calling it unexpected because the only solution I'm aware of involves min-cost-max-flow, and that seems pretty tough for 450 points. Is some simple approach possible here?

Codeforces Round 236 (problems, results, top 5 on the left) had the following problem as the hardest: you are given two trees, blue and red, with the same set of vertices. Then, we remove one edge from the blue tree, separating all vertices into two parts. There are some red edges connecting the parts - we remove all of them on the second step, separating the red tree into several parts. Now, there are some blue edges connecting the separate red parts - we remove all of them on the third step, and so on. You need to output which edges will be removed on each step, given the two trees and the first blue edge to remove. Trees have up to 200000 vertices.

The author's solution and both accepted solutions involve interval trees and require O(NlogN) memory. So here's my challenge: can you come up with a solution that doesn't use any complicated data structures and uses O(N) memory?

Thanks for reading, and see you next week!

Monday, March 10, 2014

This week in competitive programming

We start by discussing what was left from last week - the 10th Open Cup round (problems, results, top 5 on the left). The announcement of its results was delayed because of issues in the following problem (problem B in the statements): you are given the n integer lengths, and need to construct any convex polygon with the given side lengths. This is a nice, if standard, problem. The most common approach is to build a polygon that is inscribed in a circle, and find the radius of such circle using binary search (if all sides covered more than 2π, increase the radius, otherwise decrease it; handle the case where one side covers more than π carefully). The devil is in the details: as there were up to 105 sides of length up to 105, the radius of the circle had to be more than 109 which led to awful precision issues given that a precision of 10-5 was required in the output, and the jury checker program was not free of such issues, too. I think this is a common mistake that contest authors do: they set the problem's limits so that their solution barely runs in time/space/precision. Instead, when setting the limits, one should set them by taking into account both the correct solutions and possible wrong solutions. In this problem, I don't see any wrong solution of polynomial complexity, so having just 100 sides of length up to 100 would make the problem much better because the precision issues would go away and one would still need to come up with the binary search and properly handle the corner cases.

This week had just one contest: TopCoder SRM 611 (problems, results, top 5 on the left). I've skipped it, so I won't go into detail about its problems.

Instead, let's talk about last week's medium problem that I mentioned in my previous post: you are given several tasks, i-th task consuming ai energy when started and returning bi energy when completed (but energy is not allowed to become negative while performing a task), bi < ai. What is the maximum number of tasks you can perform given your starting energy?

Such problems are often solved by considering the following natural question: consider the final order in which we are solving the tasks. In order for the solution to be optimal, swapping two adjacent tasks in that order should not make it better (for some meaning of "better"). So let's see what happens when we swap two adjacent tasks in that order.

Let's say the first task consumes a1 energy and returns b1 energy, and the second task consumes a2 energy and returns b2 energy. For the first task to be doable, we need at least a1 energy, and the amount of energy will decrease by a1-b1. For the second task to be doable after the first task, we need to have at least a2+a1-b1 energy in the beginning (since a1-b1 was already spent on first task). If we do them in the reverse order, then by the same argument we see that we need to have at least a1+a2-b2 energy in the beginning. The a1+apart is the same in both constraints, so the only difference is -b1 vs -b2. We can see that if b1>=b2, then -b1<=-b2 and thus a2+a1-b1<=a1+a2-b2, and doing the second task after the first task has lower or equal requirement on the amount of starting energy than doing them in the reverse order, so it makes no sense to do them in the reverse order. That mean that we can sort the tasks in non-increasing order of bi, breaking ties arbitrarily, and then only consider solving them in the sorted order.

Having made this observation, the rest is relatively easy: we can do Dynamic Progamming using (number of tasks solved, current amount of energy) as the state.

This simple and somewhat obvious idea - to consider what happens when we swap two tasks - is often crucial to solving this kind of problem. Sometimes it leads to a solution outright, and sometimes it can give a clue to what the next step is.

Thanks for reading, and see you next week! Also, here's a sunny Moscow landscape I see - spring is here!