Sunday, October 26, 2014

2014 NEERC Southern Subregional

The ICPC NEERC South Subregional was mirrored on Codeforces. It was a very nice contest, with some approachable but also challenging problems. Here are my thoughts on the solutions (I solved everything except A, J and L during the contest).

A: Nasta Rabbara

This is quite a nasty one, and my initial attempt during the contest was all wrong. The first thing to be aware of is that a single query can be answered in O(L log L) time (or less) using a modification of the standard union-find data structure: each edge in the union-find structure is labelled to indicate whether its end-points have the same or opposite parity, and the find operation tracks whether the returned root has the same or opposite parity as the query point. That way, an edge that closes a cycle can be classified as closing either an odd- or an even-length cycle. Of course, this won't be fast enough if all the queries are large.
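
As a sketch, here is roughly what such a parity-tracking union-find looks like (the names are mine; each edge of the graph asserts opposite parity between its end-points, which is what makes an odd cycle detectable):

```c++
#include <utility>
#include <vector>

// Union-find where each node records its parity relative to its parent.
struct ParityDSU {
    std::vector<int> parent, rnk, parity;

    explicit ParityDSU(int n) : parent(n), rnk(n, 0), parity(n, 0) {
        for (int i = 0; i < n; i++) parent[i] = i;
    }

    // Returns (root, parity of x relative to the root).
    std::pair<int, int> find(int x) {
        int r = x, px = 0;
        while (parent[r] != r) { px ^= parity[r]; r = parent[r]; }
        // Path compression, keeping the parity labels consistent.
        int cur = x, p = px;
        while (parent[cur] != r) {
            int next = parent[cur], nextP = p ^ parity[cur];
            parent[cur] = r; parity[cur] = p;
            cur = next; p = nextP;
        }
        return {r, px};
    }

    // Add an edge, which asserts opposite parity between x and y.
    // Returns true if this edge closes an odd cycle.
    bool addEdge(int x, int y) {
        auto [rx, px] = find(x);
        auto [ry, py] = find(y);
        if (rx == ry) return px == py;   // same parity => odd cycle
        if (rnk[rx] < rnk[ry]) { std::swap(rx, ry); std::swap(px, py); }
        parent[ry] = rx;
        parity[ry] = px ^ py ^ 1;        // forces x and y to opposite parity
        if (rnk[rx] == rnk[ry]) rnk[rx]++;
        return false;
    }
};
```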

Ideally, one would like a data structure that allows both new edges to be added and existing edges to be removed. That would allow for a sliding-window approach, in which we identify the maximal left end-point for each right end-point. However, the 10 second time limit suggests that this is not the right approach.

Instead, there is an \(O(N+(M+Q)\sqrt{M}\log N)\) solution. Divide the series into blocks of length \(\sqrt{M}\). For each block, identify all the queries with a right end-point inside the block. Now build up a union-find structure going right-to-left, starting from the left edge of the block. Whenever you hit the left end-point of one of the identified queries, add the remaining episodes for the query (which will all come from inside the block) to answer the query, then undo the effect of these extra episodes before continuing. As long as you don't do path compression, each union operation can be unwound in O(1) time. This will miss queries that lie entirely inside a block, but those are short enough to be answered by the algorithm from the first paragraph.
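
The undo operation is the only non-standard part, so here is a sketch of a union-find with rollback (union by rank, no path compression; in the real solution the parity labels from above would be carried along as well). Without path compression, find is O(log N) thanks to union by rank, which is where the log N factor comes from.

```c++
#include <utility>
#include <vector>

// Union by rank without path compression, so each union undoes in O(1).
struct RollbackDSU {
    std::vector<int> parent, rnk;
    struct Op { int child, root, rankBump; };
    std::vector<Op> history;

    explicit RollbackDSU(int n) : parent(n), rnk(n, 0) {
        for (int i = 0; i < n; i++) parent[i] = i;
    }

    int find(int x) const {
        while (parent[x] != x) x = parent[x];   // no compression!
        return x;
    }

    void unite(int x, int y) {
        x = find(x); y = find(y);
        if (x == y) { history.push_back({-1, -1, 0}); return; }
        if (rnk[x] < rnk[y]) std::swap(x, y);
        int bump = (rnk[x] == rnk[y]) ? 1 : 0;
        parent[y] = x; rnk[x] += bump;
        history.push_back({y, x, bump});
    }

    void undo() {                     // undo the most recent unite()
        Op op = history.back(); history.pop_back();
        if (op.child < 0) return;     // that unite was a no-op
        parent[op.child] = op.child;
        rnk[op.root] -= op.rankBump;
    }
};
```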

B: Colored blankets

It turns out that it is always possible to find a solution. Firstly, blankets with no colour can be given an arbitrary colour on one side. It is fairly easy to see that we need only allocate blankets to kits such that each kit contains blankets of at most two colours. Repeat the following until completion:
  1. If any colour has at most K/N blankets remaining and that colour has not been painted onto any kit, put those blankets into a new kit and paint it with that colour (this might put zero blankets into that kit).
  2. Otherwise, pick any colour that has not been painted on a kit. There must be more than K/N blankets of that colour. Use enough of them to fill up any non-full painted kit. There must be such a kit, otherwise there would be more than K blankets in total.
Since each kit is painted with a unique colour, this generates at most N kits; and since each kit has K/N blankets in it, it must generate exactly N kits.

C: Component tree

The only difficulty with this problem is that the tree might be very deep, causing the naive algorithm to spend a lot of time walking up the tree. This can be solved with a heavy-light decomposition. On each heavy path, for each attribute that appears anywhere on the path, store a sorted list (by depth) of the nodes containing that attribute. When arriving on a heavy path during a walk, a binary search can tell whether and where the nearest ancestor with that attribute occurs on the heavy path. A walk crosses O(log N) heavy paths and does a binary search on each, so each query takes O(log² N) time.
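
To illustrate, here is a sketch of just the query side, assuming a standard heavy-light decomposition has already been built (the construction is omitted, and all the names are mine):

```c++
#include <algorithm>
#include <map>
#include <vector>

struct AttrHLD {
    // Filled in by a standard heavy-light decomposition (not shown):
    std::vector<int> head;      // head[v]: topmost vertex of v's heavy path
    std::vector<int> pos;       // pos[v]: index of v along its path (head = 0)
    std::vector<int> par;       // par[v]: parent of v, -1 for the root
    std::vector<std::vector<int>> pathVerts;  // per head: the path's vertices
    // occ[h][a]: sorted pos[] values of the vertices carrying attribute a
    // on the heavy path headed by h.
    std::vector<std::map<int, std::vector<int>>> occ;

    // Nearest ancestor of v (inclusive) carrying attribute a, or -1.
    int nearestWith(int v, int a) const {
        while (v != -1) {
            int h = head[v];
            auto it = occ[h].find(a);
            if (it != occ[h].end()) {
                const std::vector<int>& ps = it->second;
                // Deepest occurrence at or above v on this path.
                auto jt = std::upper_bound(ps.begin(), ps.end(), pos[v]);
                if (jt != ps.begin()) return pathVerts[h][*std::prev(jt)];
            }
            v = par[h];             // hop to the heavy path above
        }
        return -1;
    }
};
```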

D: Data center

This is a reasonably straightforward sliding window problem. Sort the servers of each type, and start with the minimum number of low voltage servers (obviously taking biggest first); note that this might mean taking all the low voltage servers plus some high voltage servers. Then remove the low voltage servers one at a time (smallest first), and after each removal, add high voltage servers (largest first) until the capacity is made up. Compare each combination along the way to the best solution so far.
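
Prefix sums over the two sorted lists turn this into a two-pointer scan. A sketch, with two caveats: I am assuming from memory that a total capacity of at least `need` is required, and the criterion for comparing feasible combinations is left abstract (`better` below is a placeholder, not from the problem):

```c++
#include <algorithm>
#include <functional>
#include <utility>
#include <vector>

// Enumerate every (i low-voltage, j high-voltage) combination reaching
// `need`, where both types are taken biggest-first, keeping the best
// according to the (placeholder) comparison function.
std::pair<int, int> bestCombination(
        std::vector<long long> low, std::vector<long long> high,
        long long need,
        const std::function<bool(std::pair<int, int>,
                                 std::pair<int, int>)>& better) {
    std::sort(low.rbegin(), low.rend());     // biggest first
    std::sort(high.rbegin(), high.rend());
    int n = (int)low.size(), m = (int)high.size();
    std::vector<long long> sl(n + 1, 0), sh(m + 1, 0);
    for (int i = 0; i < n; i++) sl[i + 1] = sl[i] + low[i];
    for (int j = 0; j < m; j++) sh[j + 1] = sh[j] + high[j];

    std::pair<int, int> best{-1, -1};
    int j = 0;
    for (int i = n; i >= 0; i--) {           // drop low servers one at a time
        while (j <= m && sl[i] + sh[j] < need) j++;  // make up the capacity
        if (j > m) break;                    // no longer feasible
        if (best.first < 0 || better({i, j}, best)) best = {i, j};
    }
    return best;                             // {-1, -1} if nothing is feasible
}
```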

E: Election

Firstly, ties can be treated as losses (both within a station and in the election), because we want the mayor to win. When merging two stations, there are two useful results that can occur: win+loss -> win, or loss+loss -> loss; in both cases the number of wins stays the same, while the number of losses goes down by one. So they are equally useful. We can determine the number of viable merges by DP: either the last two can be merged, and we solve for the other N-2; or the last one is untouched, and we solve for the other N-1.

F: Ilya Muromets

Note that the gap closing up after a cut is just a distraction: any set of heads we can cut can also be cut as two independent cuts of the original set of heads.

We can determine the best cut for every prefix in linear time, by keeping a sliding window of sums (or using prefix sums). Similarly, we can determine the best cut for every suffix. Any pair of cuts can be separated into a cut of a prefix and a cut of the corresponding suffix, so we need only consider each split point in turn.
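
Concretely, a sketch under my assumptions about the statement (each swing cuts exactly k consecutive heads, and we want the maximum total value of two disjoint cuts):

```c++
#include <algorithm>
#include <climits>
#include <vector>

long long bestTwoCuts(const std::vector<long long>& v, int k) {
    int n = (int)v.size();
    // Sliding sums: win[i] = sum of v[i..i+k-1].
    std::vector<long long> win;
    long long s = 0;
    for (int i = 0; i < n; i++) {
        s += v[i];
        if (i >= k) s -= v[i - k];
        if (i >= k - 1) win.push_back(s);
    }
    int w = (int)win.size();                 // w = n - k + 1 windows
    // pre[i] = best window among win[0..i]; suf[i] = best among win[i..w-1].
    std::vector<long long> pre(win), suf(win);
    for (int i = 1; i < w; i++) pre[i] = std::max(pre[i], pre[i - 1]);
    for (int i = w - 2; i >= 0; i--) suf[i] = std::max(suf[i], suf[i + 1]);
    long long best = LLONG_MIN;
    // The second window must start at least k positions after the first.
    for (int i = 0; i + k < w; i++)
        best = std::max(best, pre[i] + suf[i + k]);
    return best;    // LLONG_MIN means two disjoint cuts don't fit
}
```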

G: FacePalm

Let's consider each window of k contiguous days in turn, going left to right. If the current sum is non-negative, we need to reduce some of the values. We might as well reduce the right-most value we can, since that affects as many future windows as possible (past windows have already been made negative, so they are not worth revisiting). So we reduce the last value as much as needed, or until it reaches the lower limit. If necessary, we then reduce the second-last value, and so on until the sum is negative.

The only catch is that we need a way to quickly skip over long runs of the minimum value, to avoid quadratic running time. I kept a cache of the previous non-minimum value (similar to path compression in union-find structures); a stack of the positions of non-minimum values should work too.
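
Here is a sketch of the greedy with the path-compressed cache, under my assumptions about the statement: the values are integers, each value may be lowered to at least `lo` (with k·lo < 0, so a fix within the window always exists), and every window of k consecutive values must end up with a strictly negative sum:

```c++
#include <algorithm>
#include <vector>

// Returns the minimum total reduction; modifies `a` in place.
long long fixWindows(std::vector<long long>& a, int k, long long lo) {
    int n = (int)a.size();
    std::vector<int> skip(n);      // cache: skips over values pinned at `lo`
    for (int i = 0; i < n; i++) skip[i] = i;

    // Right-most index <= i whose value is still above `lo` (or -1).
    auto findReducible = [&](int i) {
        int r = i;
        while (r >= 0 && a[r] == lo) r = (skip[r] == r ? r - 1 : skip[r]);
        while (i >= 0 && a[i] == lo && skip[i] != r) {   // path compression
            int next = (skip[i] == i ? i - 1 : skip[i]);
            skip[i] = r; i = next;
        }
        return r;
    };

    long long total = 0, wsum = 0;
    for (int i = 0; i < n; i++) {
        wsum += a[i];
        if (i >= k) wsum -= a[i - k];
        if (i >= k - 1 && wsum >= 0) {
            long long excess = wsum + 1;       // drive the sum down to -1
            int j = i;
            while (excess > 0) {
                j = findReducible(j);          // stays inside the window,
                                               // since k*lo < 0
                long long cut = std::min(excess, a[j] - lo);
                a[j] -= cut; wsum -= cut; excess -= cut; total += cut;
            }
        }
    }
    return total;
}
```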

H: Minimal Agapov code

The first triangle will clearly consist of the minimum three labels. This divides the polygon into (up to) three sections. Within each section, the next triangle will always stand on the existing diagonal, with the third vertex being the lowest label in the section. This recursively subdivides the polygon into two more sections with one diagonal each, and so on. One catch is that ties need to be broken carefully: among vertices sharing the smallest label, pick the one furthest from the base (this led to me getting a WA). Rather than considering all the tie-breaking cases for the first triangle, I started with a first diagonal with the two smallest labels, and found the first triangle through the recursion.

The main tool needed for this is an efficient range minimum query. There are a number of data structures for this, and any of them should work. I used two RMQ structures to cater for the two possible tie-breaking directions. The queries are cyclic rather than linear in nature, but that is just a matter of being careful.
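
For reference, the structure I would reach for is a sparse table: O(N log N) preprocessing and O(1) queries. It returns the position of the minimum, which is what is wanted here (the vertex with the smallest label in a range). This is a sketch only; the cyclic wrap-around and the second, reverse tie-breaking table are left out.

```c++
#include <vector>

struct RMQ {
    std::vector<std::vector<int>> table;  // table[j][i]: argmin on [i, i+2^j)
    const std::vector<int>* vals;         // caller must keep the values alive

    explicit RMQ(const std::vector<int>& v) : vals(&v) {
        int n = (int)v.size();
        int levels = 1;
        while ((1 << levels) <= n) levels++;
        table.assign(levels, std::vector<int>(n));
        for (int i = 0; i < n; i++) table[0][i] = i;
        for (int j = 1; j < levels; j++)
            for (int i = 0; i + (1 << j) <= n; i++)
                table[j][i] =
                    pick(table[j - 1][i], table[j - 1][i + (1 << (j - 1))]);
    }

    // Tie-break: prefer the left index (the second table reverses this).
    int pick(int a, int b) const { return (*vals)[a] <= (*vals)[b] ? a : b; }

    int query(int l, int r) const {       // argmin over [l, r], inclusive
        int j = 31 - __builtin_clz((unsigned)(r - l + 1));
        return pick(table[j][l], table[j][r - (1 << j) + 1]);
    }
};
```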

I: Sale in GameStore

This was the trivial problem: sort, get your friends to buy the most expensive item, start filling up on cheap items until you can't buy any more or you've bought everything.
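
Something like this, assuming I remember the setup correctly (a budget b of your own money, with the gifted item counting towards the total):

```c++
#include <algorithm>
#include <vector>

// Greedy: the most expensive item is the gift; buy the cheapest items with
// the budget.  (The budget parameter and the gift counting toward the total
// are my assumptions about the statement.)
int maxItems(std::vector<long long> price, long long b) {
    if (price.empty()) return 0;
    std::sort(price.begin(), price.end());
    price.pop_back();              // friends buy the most expensive item
    int count = 1;                 // the gift counts as bought
    for (long long p : price) {    // fill up on cheap items
        if (b < p) break;
        b -= p;
        count++;
    }
    return count;
}
```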

J: Getting Ready for the VIPC

I got a wrong answer to test case 53 on this, and I still don't know why. But I think my idea is sound.

The basic approach is dynamic programming, where one computes the minimum tiredness one can have after completing each contest, assuming it is possible to complete it (this also determines the resulting skill). To avoid some corner cases, I banned entering a contest with tiredness greater than \(h_i - l_i\). However, this is \(O(N^2)\), because for each contest, one must consider all possible previous contests.

The first optimisation one can make is to build a list of outcomes for each day, assuming one enters a contest that day: a list of (resulting skill, tiredness) pairs, one for each contest. If one contest results in both less skill and more tiredness than another, it can be pruned, so one ends up with a list that increases in both skill and tiredness. Now one can compute the DP for a contest by considering only each previous day, and finding the minimum element in the list whose skill is great enough to enter the current contest. The search can be done by binary search, so if there are few distinct days with many contests each, this will be efficient; but if every contest is on a different day, we're no better off.
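
The pruned list is just a Pareto frontier, and the lookup a single binary search. A sketch of that piece in isolation (the names are mine):

```c++
#include <algorithm>
#include <climits>
#include <utility>
#include <vector>

// One day's outcomes, pruned so that skill and tiredness both increase.
struct DayOutcomes {
    std::vector<std::pair<long long, long long>> list;  // (skill, tiredness)

    void build(std::vector<std::pair<long long, long long>> all) {
        std::sort(all.begin(), all.end());
        // Keep only the least-tired outcome for each skill value.
        std::vector<std::pair<long long, long long>> perSkill;
        for (const auto& p : all)
            if (perSkill.empty() || perSkill.back().first != p.first)
                perSkill.push_back(p);
        // Drop outcomes dominated by a higher-skill, less-tired one.
        list.clear();
        long long bestTired = LLONG_MAX;
        for (int i = (int)perSkill.size() - 1; i >= 0; i--)
            if (perSkill[i].second < bestTired) {
                bestTired = perSkill[i].second;
                list.push_back(perSkill[i]);
            }
        std::reverse(list.begin(), list.end());  // ascending in both again
    }

    // Minimum tiredness over outcomes with skill >= s, or -1 if none
    // (tiredness is assumed non-negative).
    long long minTired(long long s) const {
        auto it = std::lower_bound(list.begin(), list.end(),
                                   std::make_pair(s, LLONG_MIN));
        return it == list.end() ? -1 : it->second;
    }
};
```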

The second optimisation is to note the interesting decay function for tiredness. After about 20 days of inactivity, tiredness is guaranteed to reach 0. Thus, there is no need to look more than this distance into the past: beyond 20 days, we only care about the maximum skill that can be reached on that day, regardless of how tired one is. This reduces the cost to \(O(N \log N \log T_{\max})\).

K: Treeland

Pick any vertex and take its nearest neighbour: this is guaranteed to be an edge; call the end-points A and B. An edge in a tree partitions the rest of the tree into two parts. For any vertex C, either d(A, C) < d(B, C) or vice versa, and this tells us which partition C belongs to. We can thus compute the partition, and then recursively solve the problem within each partition.
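
A sketch of the recursion, assuming d is the given distance matrix (consistent with some positively-weighted tree):

```c++
#include <utility>
#include <vector>

// Recover the edges of the tree from its distance matrix.  Note that the
// recursion depth can be linear; an explicit stack avoids overflow for
// large inputs.
void solve(const std::vector<std::vector<int>>& d,
           const std::vector<int>& verts,
           std::vector<std::pair<int, int>>& edges) {
    if (verts.size() <= 1) return;
    int a = verts[0], b = -1;
    for (int c : verts)
        if (c != a && (b == -1 || d[a][c] < d[a][b]))
            b = c;                           // nearest neighbour of a
    edges.push_back({a, b});                 // guaranteed to be an edge
    std::vector<int> sideA, sideB;           // the two halves left after
    for (int c : verts)                      // removing the edge (a, b)
        (d[a][c] < d[b][c] ? sideA : sideB).push_back(c);
    solve(d, sideA, edges);                  // a lands in sideA,
    solve(d, sideB, edges);                  // b in sideB
}
```

Each call does linear work on its part of the vertex set, so the whole reconstruction is O(n²) in the worst case, which matches the size of the input matrix anyway.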

L: Useful roads

I didn't solve this during the contest, and I don't know exactly how to solve it yet.

M: Variable shadowing

This is another reasonably straightforward implementation problem. For each variable I keep a stack of declarations, with each declaration tagged with the source position and a "scope id" (each new left brace creates a new scope id). I also keep a stack of open scope ids. When a right brace arrives, I check the top-of-stack for each variable to see if it matches the just closed scope id, and if so, pop the stack.
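
A sketch of the bookkeeping (parsing the source into brace and declaration events is omitted, and the interface is mine):

```c++
#include <map>
#include <string>
#include <utility>
#include <vector>

struct ShadowChecker {
    struct Decl { int line, col, scope; };
    std::map<std::string, std::vector<Decl>> decls;  // per-variable stack
    std::vector<int> openScopes;                     // stack of open scope ids
    int nextScope = 0;

    ShadowChecker() { openScopes.push_back(nextScope++); }  // global scope

    void onLeftBrace() { openScopes.push_back(nextScope++); }

    void onRightBrace() {
        int closed = openScopes.back();
        openScopes.pop_back();
        // Pop every variable whose innermost declaration was in this scope.
        for (auto& entry : decls) {
            std::vector<Decl>& stack = entry.second;
            if (!stack.empty() && stack.back().scope == closed)
                stack.pop_back();
        }
    }

    // Record a declaration; returns the declaration it shadows, if any.
    std::pair<bool, Decl> onDeclaration(const std::string& name,
                                        int line, int col) {
        auto& stack = decls[name];
        std::pair<bool, Decl> shadowed{false, {}};
        if (!stack.empty()) shadowed = {true, stack.back()};
        stack.push_back({line, col, openScopes.back()});
        return shadowed;
    }
};
```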