Saturday, February 6, 2016

A week with boomerangs

The end of Jan 25 - Jan 31 week was tightly packed with contests.

First off, TopCoder SRM 680 took place on Thursday evening  (problems, results, top 5 on the left). tourist was once again the only contestant to solve the hard problem, and thus enjoyed a 400+-point victory margin - great job! It was especially impressive given that at the end of the coding phase there were 23 submissions for this problem, and one could not suspect all but one would fail.

Here's the tricky problem: you have a string with 2500 characters, each character being one of the 6 letters from 'a' to 'f', and a number k up to 6 of the same parity as n. You need to select k of the string's characters, and split the remaining (n-k) characters into (n-k)/2 pairs, in such a way that each pair has distinct characters. Pairing character i and character j together costs c[i] + c[j] + 100 * abs(i - j), where c is a given array. What is the minimum total cost to build all those pairs? The statement is a bit convoluted, but once you wrap your head around it, the problem itself is quite interesting.

Wunder Fund Round 2016 happened on Codeforces about one day later (problems, results, top 5 on the left). Egor has managed to pull out an amazing last-minute victory, submitting problem F on the last minute - and getting it pass the system test.

That problem is similar to the ones from mathematical competitions. You are given two multisets of n integers each, each integer between 1 and n inclusive. Find a non-empty subset of the first multiset and a non-empty subset of the second multiset, such that they have the same sum.

Come the next evening - and there's one more competition, this time the final elimination round of the Facebook Hacker Cup, with 25 spots in the onsite competition in London up for grabs (problems with Facebook login, results with Facebook login, top 5 on the left). The difficulty of the problemset was chosen very appropriately, and the contest was mostly about solving difficult problems instead of speed coding, which is awesome. Bruce has solved 4 problems with the smallest penalty time, and will be one of the favorites to win the final round!

I think the second problem was the most beautiful one in this round (picture on the left from the official website). You were given a 100x100 grid with each cell containing a boomerang pointing to two adjacent sides of this cell. When two boomerangs point towards one another, we call it a conflict. We can rotate a boomerang by 90 degrees in either direction at most K times (K <= 1000, we can pick K different boomerangs for rotation, or maybe rotate some boomerang several times). What is the smallest number of conflicts we can have after at most K rotations by 90 degrees?

Finally, the Open Cup returned to action for the first time in 2016 on Sunday with the Grand Prix of Asia (problems, results, top 5 on the left). In line with their awesome performance at the ongoing Petrozavodsk training camp, the Warsaw Eagles team won this contest as well - congratulations!

This round had so many interesting and beautiful problems, that I will highlight two. First, problem D asked you to count the number of ways to interleave two given permutations of size N (two ways are considered different if the resulting sequences of 2N numbers are different). N is up to 2000, and you need to find the answer modulo 109+7. For example, if the given permutations are "1, 2" and "2, 1", then there are four such ways: "1, 2, 2, 1", "1, 2, 1, 2", "2, 1, 1, 2" and "2, 1, 2, 1".

Problem H was concerned with random walks on a 2D grid. You start from cell (0, 0), and make N random independent steps, moving in one of the four cardinal directions at each step. What is the expected number of cells you will visit? N is up to 5000.

That was one hell of a problem-packed post :) I hope you enjoy solving them!

A week without FFT

The Jan 18 - Jan 24 week started with TopCoder SRM 679 on Wednesday (problems, results, top 5 on the left). The hard problem in this round is somewhat weird: the solution is quite straightforward, but somehow a few people, including myself, missed it and tried to squeeze in an incorrect FFT-based solution that obviously times out. It didn't stop tourist, who won the round with an impressive 350+-point margin.

Here's how the problem went: you were given 500 bags, each containing a lot of cards, with each card containing an integer between 1 and 500 (more precisely, you have a 500x500 matrix, describing how many cards contain each integer in each bag). You're also given a set of "good" integers, each between 1 and 1000. For each pair of bags, you need to answer the following question: how many ways are there to pick a card from the first bag and a card from the second bag in such a way that the sum of their numbers is a good number?

The correct solution runs in O(n^3), while the somewhat obvious FFT-based approach is only O(n^3*logn), and thus too slow. Can you see the O(n^3) one?

Facebook Hacker Cup selection continued with Round 2 on Saturday (problems with Facebook login, results with Facebook login, top 5 on the left). This was the first round where the time of submitting one's solutions was taken into account, and thus the first round where the contestants were competing against each other, not just against the problems. Eryx has claimed the first place with a solid margin - congratulations!

The problems were quite technical again, but the hardest problem at least allowed one to simplify the solution using a creative idea. You were given a tree with 1000 nodes, and needed to paint it with 30 colors, with the cost of painting each node with each colors given as a matrix. So far, the solution is very straightforward - just pick the cheapest color for each node. However, there was a small twist: you also paid a given penalty P for each node that has neighbors of the same color. You pay P only once per such node, no matter how many neighbors of the same color does it have. Can you solve this problem in O(N*K4) (N=1000, K=30)? How about O(N*K3)?

Let me also explain the solution to a problem from the previous post: you were given a tree with 100000 vertices, and at most 100000 queries. Each query highlighted some subset of vertices, and to answer a query you needed to compute the minimum number of non-highlighted vertices to remove that would leave all highlighted vertices disconnected, or to report there isn’t a way (in case two highlighted vertices are directly connected with an edge). The total number of highlighted vertices in all queries also doesn’t exceed 100000.

First, let's learn to solve the problem for just one query. It's not hard to see that this can be done greedily: we go from leaves to the root, returning a boolean value from the subtree: does this subtree contain a highlighted vertex? When a vertex has more than one subtree containing a highlighted vertex, or when its parent is highlighted and it has at least one highlighted child, we need to remove it.

Given that our solution returns just one boolean value for each subtree, we can handle 64 queries at once by returning a bitmask from it, using bitwise operations to implement the logic described above. This is (arguably?) not an asymptotic optimization, but 1000002/64 for this to pass under the time limit.

Thanks for reading, and check back soon for the summary of the last week of January!

An Ewok week

The Jan 11 – Jan 17 week saw the return of the main algorithmic competition anchors in 2016: TopCoder and Codeforces.

A Star Wars-themed TopCoder SRM 678 happened very early on Wednesday (problems, results, top 5 on the left). Congratulations to freak93 on the victory and on being the only contestant to solve the hard problem correctly! Apparently the Ewoks are really good at algorithms, as only one human could solve their problem :)

Codeforces Round 339 took place at the usual Codeforces time on Thursday evening (problems, results, top 5 on the left). All problems were somewhat technical, but at the same time allowed one to demonstrate algorithmic competition experience. Problem D, for example, tested one’s skills in batch processing of requests on a tree. You were given a tree with 100000 vertices, and at most 100000 queries. Each query highlighted some subset of vertices, and to answer a query you needed to compute the minimum number of non-highlighted vertices to remove that would leave all highlighted vertices disconnected, or to report there isn’t a way (in case two highlighted vertices are directly connected with an edge). The total number of highlighted vertices in all queries also doesn’t exceed 100000. Such problems usually allow O(n^2), O(N*sqrt(n) and O(n*polylog(n)) solutions, with the former supposed not to run in time. However, in this particular case a O(n^2) solution could pass if optimized using bitmasks – can you see how?

Facebook Hacker Cup Round 1 on the weekend gave one a chance to get closer to the first onsite competition of the year, to happen in London in March (problems with Facebook login, results with Facebook login, top 5 on the left). The round also featured somewhat standard technical problems. The hardest problem was perhaps the least technical: you were given 16 players, a 16x16 matrix of who wins whom, and needed to find the lowest and the highest possible elimination round for each player in an Olympic elimination tournament with 4 rounds, over all possible seedings.

I’ve described a New Year problem last week: you are given a positive integer A with at most 1000 digits. Find any two positive integers X and Y such that X+Y=A, and both X and Y are palindromes when written in decimal notation without leading zeroes.

In order to solve this problem, let’s first assume there’s no carry – each digit of A is formed directly as the sum of corresponding digits of X and Y, which is always less than 10. In case X and Y have different lengths, there’s at most one solution with given lengths of X and Y and it’s easy to find. Without loss of generality, assume X has more digits than Y. The first digit of X is then equal to the first digit of A (remember, there’s no carry yet!). Since X is a palindrome, we know its last digit as well. But then we can find the last digit of Y by subtraction. And then we can find the second digit of X, and so on. In case they have equal lengths, there might be a whole lot of solutions, but all of them are equivalent in some sense – in effect, we know the sum of corresponding digits of X and Y, and have several ways to decompose each sum into two summands independently. For example, 4=1+3=2+2, so 44=11+33=22+22=12+32=21+23.

Now how does one handle the carry? First of all, notice that for the last digits, we can determine that carry as we go from right to left. For the first digits, however, we go from left to right, and thus need to know the carry from the next digit to determine a given digit by subtraction. When we don’t know it, we will do backtracking – try both a carry of 0 and a carry of 1 recursively. At the first glance, this would seem exponential, but after taking some time to contemplate, one can see that most branches would be truncated very early because some digit would get negative or more than 9, and in fact this backtracking runs very fast.

Thanks for reading, and check back soon for the next week’s problems!

Tuesday, January 26, 2016

A prime week

The second week of 2016 was very calm, with the main anchor being the SnarkNews New Year Prime Contest (problems requiring Yandex login, results, top 5 on the left), containing the problems that were previously given in 2015 but were not solved by anybody then. XZ Team and Swistakk showed amazing skill in solving 11 and 8 problems respectively - remember, nobody could solve any one of those problems in actual contests!

Here's a problem from this contest that I could solve. You are given a positive integer A with at most 1000 digits. Find any two positive integers X and Y such that X+Y=A, and both X and Y are palindromes when written in decimal notation without leading zeroes.

Finally, let's remember the problem I highlighted in the previous post: you're given a list of all edges in a tree with n vertices, n is at most 200000. Each edge is given by the numbers of its vertices, but the numbers are not given directly - instead, you just now the number of digits in each number. For example, the edge "123 45" would be given as "??? ??". Your goal is to replace the question marks by digits, such that each number has no leading zeros, and the edges form a valid tree.

We have at most 6 types of vertices in this problem: from "?" to "??????". Since we know the number n, we know how many vertices of each type we have. Let's construct the tree a-la the Prim algorithm: start with a single vertex, let's say the one with label "?", and then add edges one by one, each edge connecting a vertex already in the tree with a new vertex. When we add an edge "??? ??", for example, we either connect an existing "??" vertex with a new "???" vertex, or an existing "???" vertex with a new "??" vertex, so we either increase the number of existing "??" vertices by 1, or increase the number of existing "???" vertices by 1. And our goal is to achieve the predetermined number of vertices of each type in the end.

This looks to be a standard flow problem now: our network will have a source, a sink, a vertex for each edge type in the input (i.e. "?? ???" and "??? ??" are the same type), and a vertex for each vertex type (i.e. "??") in the resulting graph. We add an arc from the source to each edge type with capacity equal to the number of edges of this type in the input, two infinite capacity arcs from each edge type to the corresponding vertex types, and an arc from each vertex type to the sink with capacity equal to the number of vertices of this type in the result, minus one for "?" type. A unit of flow from "?? ???" to "??" will tell us that this edge should connect an existing vertex of type "???" to a new vertex of type "??".

However, this solution doesn't yet work as written. As we start to construct the actual tree from the flow, we might find that we want to connect an existing vertex of type "???" to something, but there's no existing vertex of type "???" yet! And there might not even be an ordering of edges that avoids this situation. In order to overcome this difficulty, let's split all edges of the tree into two types: an edge that adds the first vertex of some type, and all other edges. It's not hard to see that we can change the order of adding the edges in such a way that the edges of the first type are added before all others, and that for all other edges we can use the flow solution above since now we have at least one vertex of each type. So the only remaining idea is to just iterate over all possible sequences of the edges of the first type - since there at most 5 such edges, there are very few possibilities to try.

Thanks for reading, and check back soon for the Jan 11 - Jan 17 week!

Monday, January 25, 2016

A multiple language week

The New Year week contained a few topical contests, the first of which was Good Bye 2015 at Codeforces (problems, results, top 5 on the left). Congratulations to Gennady on wrapping up 2015 with another victory, to go with all other amazing victories he has achieved last year!

The hardest problem H required quite a bit of creativity to solve, but had a very simple problem statement: you're given a list of all edges in a tree with n vertices, n is at most 200000. Each edge is given by the numbers of its vertices, but the numbers are not given directly - instead, you just now the number of digits in each number. For example, the edge "123 45" would be given as "??? ??". Your goal is to replace the question marks by digits, such that each number has no leading zeros, and the edges form a valid tree. How would you approach this one?

SnarkNews New Year Blitz contest was another traditional contest for this week (results, top 5 on the left), with specific New Year-themed rules: it runs from a few hours before the New Year to a few hours after it, and the penalty time for each problem is counted as the distance from the New Year - so if you solve a problem before the New Year, it might make sense to wait before submitting it. Of course, submitting exactly at New Year usually clashes with the New Year celebrations themselves, so this contest usually attracts the most devoted participants :) This year an additional rule has complicated the matters: one could earn a -100 penalty minute bonus by submitting a solution in a different programming language.

Congratulations to ariacas on solving all 12 problems and submitting most of them right at New Year, and to Xellos and Alexander Udalov on successfully using 10 different languages each - an amazing accomplishment!

I'm sorry for the interruption in my posts, caused by some pretty busy weeks at work and amazing winter weekends - winter is my favorite season, and I've been doing some cross-country skiing instead of updating this blog. I hope to get back to the current week soon, so please come back for updates!

Thursday, December 31, 2015

A Christmas week

Last week got going with Codeforces Round 336 (problems, results, top 5 on the left). Nobody was able to solve the hardest problem, which allowed matthew99 to win by solving the remaining four problems in about an hour. Successful challenges by tourist and ACRush brought them closer, but it was barely not enough for the victory. Congratulations matthew99!

On Saturday, TopCoder SRM 677 (problems, results, top 5 on the left) featured problems by cgy4ever, which are usually very entertaining. I didn't manage to take part since I was on a plane during the time of the SRM. Maybe I'll still manage to solve the problems in the practice room, will then share the best one in a later post.

ACRush continued his impressive comeback by winning the second SRM in a row - great job! In my last post, I've mentioned him climbing into the third place in the all-time SRM victory stats, but I was somewhat expecting tourist to take that third place soon, as he was just two SRMs short. Well, now I'm not so certain :)

(Christmas in Moscow on the left) Finally, let's discuss the tough problem C from the previous week's Open Cup: two strings of 0s and 1s are considered equivalent, if one can be obtained from the other by repeatedly removing or inserting (anywhere in the string) the following substrings: 00, 111, 0101010101. For example, 100110 and 010011 are equivalent, since we can do the following sequence of transformations: 100110 --> 1110 --> 0 --> 0111 --> 010011. Given 4000 strings of total length at most 50000, find out the groups of equivalent strings among them.

The first approach that comes to mind is to build some kind of "reduction" process - in the example above, both strings could be reduced to just "0" by repeatedly removing the 00/111 substrings. However, it is very tricky since sometimes you have to first add something in order to be able to remove a bigger pattern. For example, if you have substring 001010101, then you can insert 111 to get 011101010101, then insert 00 to get 01100101010101, and finally remove the last 10 characters to get 0110. Building a system of reduction that always works is hard, and can probably only be accomplished by implementing a stress-test that will give you missing reductions with small lengths until you can catch all of them. Here's the description of such approach by darnley in Russian.

However, the problem becomes much more tractable if we bring in some powerful theory. One can notice that we have a presentation of a group with two generators and three relations here, and need to check if the given products represent the same element of the group. Which group is it? Well, the Wikipedia article referenced above actually has an answer: this is the group of rotations of an icosahedron, with 1 corresponding to a 120-degree rotation around a line passing through centers of two opposing faces, and 0 corresponding to a 180-degree rotation that swaps two opposing vertices. Now the only remaining challenge is to see how the vertices of an icosahedron are permuted by each of those two rotations, and the actual code simply boils down to multiplying permutations of 12 elements.

Reading the Wikipedia article more carefully, we can notice that the group is actually isomorphic to A5, so there exist an even permutation of degree 2 and an even permutation of degree 3 with just 5 elements that will also work in the above solution. How do we find those permutations? Well, there's just one permutation of each kind up to renumbering of elements (two transpositions and a cycle of length 3), and we have to pick such two permutations that their product has degree 5. Now it's very easy to find two permutations that fit.

However, the Wikipedia article in question is not easy to find if you forget the right terms. What to do if you remember that this is related to groups but don't know about the icosahedral symmetry? We can approach the problem from the other side. Suppose we can come up with _any_ two permutations p and q such that p*p=e, q*q*q=e, and p*q*p*q*p*q*p*q*p*q=e. Then, we can multiply those permutations according to the sequences from the input, and in case two sequences resulted in different products we know that they're not equivalent. In case they resulted in the same product, we can't be sure (for example, if both p=e and q=e, this will always happen). But if we try enough (p, q) pairs, most likely we'll be able to differentiate. So we can use simple brute force to find a few such (p, q) pairs, and use them to compare our sequences!

Thanks for reading, and Happy New Year! Now on to the New Year Contest, let's see how many different languages I can remember in the remaining time :)

Thursday, December 24, 2015

An icosahedral week

TopCoder SRM 676 was the first event of last week (problems, results, top 5 on the left). The match has once again provided ample challenge opportunities, but ACRush made sure he can relax during the challenge phase by being the only one to submit three correct solutions - congratulations! As a result, he has also moved to the clear third place in the all-time SRM victory stats.

Open Cup 2015-16 Grand Prix of Peterhof wrapped up the week on Sunday (problems, results, top 5 on the left). Problem C was very beautiful as it allowed several different approaches, and yet required a significant insight to come up with any of them. Here's what it was about: two strings of 0s and 1s are considered equivalent, if one can be obtained from the other by repeatedly removing or inserting (anywhere in the string) the following substrings: 00, 111, 0101010101. For example, 100110 and 010011 are equivalent, since we can do the following sequence of transformations: 100110 --> 1110 --> 0 --> 0111 --> 010011. Given 4000 strings of total length at most 50000, find out the groups of equivalent strings among them.

Finally, let me give the final pieces of the puzzle for the interactive string guessing problem from NEERC I've mentioned in two previous posts: the judges have a hidden binary string of 1000 zeroes and ones, and you have to guess it using at most 1500 attempts. For each attempt, you output some string of 1000 zeroes and ones, and the judge responds with one of three answers: either your guess is correct (and that means you've won); it's incorrect but _exactly_ half (500) of all characters are correct; all other incorrect situations grouped together.

Let's just make a random guess, 1000 random bits. What's the probability that we'll get 1000 matches? It's just 1/2**1000, a very, very small number. However, the probability that we'll get the other interesting answer - exactly 500 matches - is much higher, since there are many ways for the matches to happen. More precisely, it's C(1000, 500)/2**1000, which is about 2.5%! Which means that, on average, we'll get at least one "500" answer after just 40 random attempts.

After we get one "500" answer, we can do the following: let's flip exactly two bits in our string - bit 0 and bit x, for all values of x from 1 to 999. If the answer stays "500", it means that exactly one of bit 0 and bit x is correct, and in the other case they're either both correct or both incorrect. After doing this for all 999 pairs, we know the correct value of each bit if we assume that bit 0 is correct, and the correct value of each bit if we assume that bit 0 is incorrect, so we have just two remaining candidate strings to check, and one of them is guaranteed to give answer 1000.

In total, we're using about 1040 queries on average. Of course, on some testcases we might require more, but it's not hard to check that we're extremely unlikely to come even close to the allowed limit of 1500.

An interesting fact about this problem is that it's still not clear if any deterministic solution that fits into 1500 queries exists. Can you come up with one?