Sunday, April 9, 2017

A dominator week

First of all, please note that there's still a bit more than 3 hours left in the Google Code Jam qualification. You can still hop on the Code Jam train!

The last week was of the extremely busy type. First off, Codeforces Round 407 took place on Wednesday (problems, results, top 5 on the left, analysis). -XraY- and Um_nik stood out from the pack by solving all problems, and slightly better speed has earned -XraY- his first - but definitely not his last - victory of the week. Congratulations!

A Swiss-resident-only Helvetic Coding Contest 2017 took place on Saturday (results, top 5 on the left). Just like last year, the problems will be used for an online mirror some time later, so I will not spoil you the fun of solving them.

AtCoder Grand Contest 012 took place at the same time (problems, results, top 5 on the left, analysis). -XraY-'s second (but still not his last) victory was more clear-cut than the first one, as he managed to solve five problems fifteen minutes faster than everybody else, and was the only one at the top without any penalty minutes. Well done!

A few hours later TopCoder Open 2017 has taken off with its Round 1A (problems, results, top 5 on the left). With the top 250 active coders receiving a bye into Round 2, the contest has once again given semi-retired veterans a chance to shine, and it seems that ACRush did not forget how to win TopCoder rounds :) Congratulations!

Sunday took off with Open Cup 2016-17 Grand Prix of Tatarstan (problemsresults, top 5 on the left), where -XraY- has earned his third and final victory of the week, this time together with his team - amazing!

Problem E was of the kind that I prefer to give on my own contests, as it makes preparing the testcases very easy: with just two numbers in the input :) On a more serious side, here's what it was about: you are given two numbers n and k. You need to choose the smallest amount of pairs (a,b) such that 1<=a,b<=k for each pair, and such that any sequence of n not necessarily distinct numbers, each between 1 and k, will contain at least one of your chosen pairs as a subsequence.

Can you see the solution? Can you prove it?

Finally, Russian Code Cup 2017 Qualification Round 1 wrapped up the week on Sunday evening (problems, results, top 5 on the left, analysis). Even though the main goal here was to get into the top 200, there was still a scoreboard and it was possible to get the first place - which eatmore did thanks to flawless execution without any incorrect attempts. Way to go!

In my previous summary, I have mentioned a couple of problems. The first one went like this: you are given 300 trees on the same set of 300 vertices. You want to pick exactly one edge in each tree, remove those, then add the edge you removed from i-th tree to the (i+1)-th tree, for all i (the edge from the last tree is added to the first one), in such a way that the resulting graphs are still trees. How many ways are there to do that, modulo 109+7?

Let's say we picked the edge x-y in the first tree. After we add it to the second tree, a cycle consisting of this edge and the (only) simple path from x to y in the second tree will be formed, so we need to remove any edge in the said simple path to obtain a tree again. This edge in turn will define a path on the third tree where we need to pick an edge to remove, and so on. After we make a full cycle, we need to get back to the first edge.

If we fix the edge we pick in the first tree, we can use dynamic programming to find the number of ways to pick the edge to remove in the first a trees, such that b-th edge of the a-th tree is removed. This dynamic programming has O(n2) states, each state can be processed in O(n) (we need to traverse the corresponding path in the next tree), and we have O(n) outside iterations for the edge of the first tree, so the total running time is O(n4).

However, we can notice that the innermost O(n) is used to support an operation "add a constant to all edges of the tree along the given path". Here's how we can do it faster. Every path in a tree always goes towards the root until we reach the lowest common ancestor, then away from root. Now, instead of having values on edges and adding a constant c to the entire path from x to y, we can have values in vertices in such a way that the value for each edge is computed as the sum of values of vertices in the corresponding subtree, and then in order to add a constant to a path we need to add c to vertices x and y, and subtract 2c from lca(x, y) - just 3 numbers to be updated. This makes handling of each dynamic programming state O(1), and the total running time is now O(n3).

The second problem was: you are given a directed acyclic graph with 200000 vertices and 500000 arcs. We're going to remove one of its arcs. For each vertex v of the graph, you have to answer a question: is it true that v is guaranteed to be reachable from vertex 1, no matter which arc we remove?

This problem is closely related to the concept of dominator tree. Here, we would define that arc a dominates vertex v, if one must pass through a when going from 1 to v. Our goal is to tell if there are any dominating arcs for each vertex.

It turns out the following approach (corrected by Shubham in comments below - thanks!) works: assuming we have topologically sorted our graph from left to right, let's compute the leftmost dominating arc (or the fact there's no dominating arc) for each vertex. The computation will also go from left to right, and work like this for vertex u: if for each vertex reachable from 1 that have arcs into u the leftmost dominating arc is the same, then this arc is the leftmost dominating arc for u as well. If that's not the case, but there's just one vertex v reachable from 1 that has an arc into u, then the arc from v to u is the leftmost dominating arc for u (note that in this case v has no dominating arcs).

So far, everything is somewhat intuitive and easy to prove, but here comes the insight: in all other cases, there are no dominating arcs for u! Suppose it's not the case, and some arc a dominates u. We have at least two vertices reachable from 1 that have arcs into u, and the don't have the same dominating arc. Thus a can't be directly incoming to u, and at least one of the vertices that have arcs to u doesn't have a as its dominating arc, which leads to a contradiction since then we can bypass a on our way to u, too.

Thanks for reading, and check back next week!

Saturday, April 1, 2017

A moejy0viiiiiv week

Last week's contests started with Codeforces Round 406 on Thursday (problems, results, top 5 on the left, analysis). Quite surprisingly, this time the winning strategy involved going with the cheaper problem in the end instead of the more expensive one, although that might have involved squeezing through a solution that was not supposed to pass :) Nevertheless, congratulations to moejy0viiiiiv on his amazing non-asymptotic optimization skills!

TopCoder SRM 711 took place on Saturday (problems, results, top 5 on the left, my screencast). This time I've recorded the usual silent screencast, but it didn't seem to improve my results :) There was a moment with about 30 minutes left in the coding phase when I had 1224 points, which would be enough for the second place in this SRM. However, there was quite a bit of uncertainty: the solution I've submitted for the hardest problem was theoretically O(n4) with n up to 300, although it seemed to behave more like O(n3) on random testcases, and even that already took 1.5 seconds.

I've tried to come up with a case where it would time out, could not do that, but it seemed very likely that such cases exist, so I've decided that the authors would see the obvious O(n4) approach and make sure it fails, and implemented and submitted a true O(n3) solution losing lots of points but avoiding the possible loss of all points for that problem. After the contest ended, I've submitted my possibly-O(n4) solution in the practice room, and guess what? It passed the system tests as well. I could have pulled a moejy0viiiiiv :)

The natural question, of course, is this: can you challenge that solution? It should be available in the practice room, submitted by me. The problemsetter mentions a possible idea that could make this solution fail, but I'm wondering if it actually fails in practice.

I should also mention that even without the resubmission I could not challenge rng_58 for the first place, and his solution for that problem was O(n3) from the start. Congratulations on the well-deserved victory!

Here's the actual problem I've talked so much about. You are given 300 trees on the same set of 300 vertices. You want to pick exactly one edge in each tree, remove those, then add the edge you removed from i-th tree to the (i+1)-th tree, for all i (the edge from the last tree is added to the first one), in such a way that the resulting graphs are still trees. How many ways are there to do that, modulo 109+7?

Open Cup 2016-17 Grand Prix of Poland has wrapped up the week on Sunday (results, top 5 on the left). Makoto continued to be on fire, winning a team contest right after winning the individual one on Saturday. Congratulations again!

This round had several nice problems, but if I have to pick one, it would be problem C. You are given a directed acyclic graph with 200000 vertices and 500000 arcs. We're going to remove one of its arcs. For each vertex v of the graph, you have to answer a question: is it true that v is guaranteed to be reachable from vertex 1, no matter which arc we remove?

In my previous summary, I have mentioned a Codeforces problem. You are given a grid with two rows and n columns, each cell containing a (not necessarily positive) integer. Your goal is to find as many non-intersecting rectangles on this grid as possible, such that each rectangle has a sum of 0. n is up to 300000.

First, we can find all rectangles with zero sum. Of course, there might be too many of those, but we can notice that for each "type" of rectangle (row 1, row 2, or both rows) we can find the partial sums from column 0 to column i, and a zero-sum rectangle corresponds to two equal partial sums, and since in the maximum answer there would never be a rectangle that can be split into two, we only need to consider pairing each partial sum with one previous occurrence of the same value, and thus have at most n useful rectangles per type, at most one ending in each column.

That observation yields a straightforward O(n2) dynamic programming solution: we'll compute aij - the maximum number of rectangles that we can choose in the first i cells of the first row, and in the first j cells of the second row. In order to compute this value, we consider whether we take the single-row rectangles ending in columns i and j, and also the double-row rectangle in case i=j. But O(n2) is obviously too slow with n=300000, so how to speed this up further?

The next idea is based on the fact that we currently have too much leeway in constructing the optimal solution. Whenever we pick the single-row rectangles between two double-row ones, we can pick them in any order, and the current solution effectively considers all such ways. But let's assume we always pick them "from right to left": if our current state is (i,j) and i>j, then we'll consider what happens with cell i in the first row, and if i<j, we'll consider what happens with cell j in the second row. If we do that, then the states that are important for computing the final answer have a peculiar property: if i<j, then ajj-1<=aij<=ajj (and similar for i>j), because if aij<=ajj-2, we could have skipped the last rectangle we took in the first row (that led us to i), and get a better answer.

Because of this, we just need to compute the "diagonal" ajj of the matrix, and also for each j find the smallest i that aij is still equal to ajj, and the smallest i that that aji is still equal to ajj, which we can do with binary search to obtain a O(nlogn) solution.

Finally, please note that Google Code Jam 2017 is just around the corner - the qualification round starts on April 7! I'm setting some of the problems this year, and I hope you'll find them interesting.  

A week with two rows

On the Mar 13 - Mar 19 week, VK Cup 2017 started with its Round 1 (problems, results, top 5 on the left, parallel Codeforces round results, my screencast of the Codeforces round with commentary, analysis). Congratulations to Zlobober and zemen on the win!

I've made a second attempt at explaining myself to the camera during the competition, and this time it it went even worse. I've managed to implement an incorrect solution for C before realizing it and rewriting from scratch, and if that wasn't bad enough followed up with doing the same - implementing an incorrect solution that needs to be dropped on the floor - twice for problem D. Maybe talking is indeed incompatible with critical thinking?..

Here is problem D that has stopped me in my tracks. You are given a grid with two rows and n columns, each cell containing a (not necessarily positive) integer. Your goal is to find as many non-intersecting rectangles on this grid as possible, such that each rectangle has a sum of 0. n is up to 300000.

In my previous summary, I have mentioned a cute AtCoder problem. You are given a huge number n with at most 500000 (decimal) digits, and need to find the smallest number k such that n can be represented as a sum of k numbers with non-decreasing decimal representation (for example, 1157888).

The approach I talk about in my screencast is a bit vague, and I like the approach from the editorial more. First, we notice that numbers with non-decreasing decimal representation are exactly those numbers that can be represented as a sum of 9 numbers which consist only of ones (like 1 or 1111; we will also count 0 as such a number with zero ones) - the main Young diagram trick strikes back :) So our goal is to represent n as a sum of 9k such numbers.

Each such number is equal to (10m-1)/9 for some m, so we actually need to represent 9n+9k as a sum of exactly 9k numbers of form 10m. The smallest amount of such numbers needed to achieve 9n+9k is naturally given by the sum of digits in the decimal representation of 9n+9k, and is divisible by 9. We can then repeatedly replace a power of 10 with ten smaller powers to get any bigger amount divisible by 9, up to 9n+9k (when the sum is all ones). We need to achieve 9k which is also divisible by 9, so we just need to check if the sum of digits in the decimal representation of 9n+9k is less than or equal to 9k.

Now that we know how to check any given k, we can use binary search to find the minimum possible value of k. VoilĂ !

Thanks for reading, and check back soon for the last week's summary.