Tuesday, November 22, 2016

TCO 2016 finals

The final round of TCO 2016 did not go well for me. I implemented an over-complicated solution to the 500 that did not work (failed about 1 in 1000 of my random cases), and I started implementing the 400 before having properly solved it, leading to panic and rushed attempts to fix things in the code rather than solving the problem.

Easy

The fact that there are only four mealtimes is a clue that this will probably require some exponential time solution. I haven't worked out the exact complexity, but it has at least a factor of $$2^{\binom{n}{n/2}}$$ for n mealtimes.

Let's restrict our attention to meal plans that contain no meals numbered higher than m and have t meals in the plan (t can be less than 4). We can further categorise these meals by which subsets of the mealtimes can accommodate the plan. This will form the state space for dynamic programming. To propagate forwards from this state, we iterate over the number of meals of type m+1 (0 to 4 of them). For each choice, we determine which subsets of mealtimes can accommodate this new meal plan (taking every subset that can accommodate the original plan and combining it with any disjoint subset that can accommodate the new meals of type m+1).

Medium

Let's start by working out whether a particular bracket sequence can be formed. Consider splitting the sequence into a prefix and suffix at some point. If the number of ('s in the prefix differs between the initial and target sequence, then the swaps that cross the split need to be used to increase or decrease the number of ('s. How many ('s can we possibly move to the left? We can answer that greedily, by running through all the operations and applying them if and only if they swap )( to (). Similarly, we can find the least possible number of ('s in the final prefix.
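As a minimal sketch of the two greedy passes, assume each operation is given as an index i meaning "optionally swap positions i and i+1", applied in the given order. That encoding and the names `greedy_extreme` and `prefix_opens` are my own for illustration, not taken from the problem statement.

```python
def greedy_extreme(seq, ops, maximise=True):
    """Apply each optional adjacent swap only if it moves a '(' left
    (maximise=True) or right (maximise=False)."""
    s = list(seq)
    # The pair we are willing to swap: ")(" when maximising, "()" when minimising.
    want = (")", "(") if maximise else ("(", ")")
    for i in ops:
        if (s[i], s[i + 1]) == want:
            s[i], s[i + 1] = s[i + 1], s[i]
    return "".join(s)


def prefix_opens(seq):
    """counts[j] = number of '(' among the first j characters."""
    counts = [0]
    for ch in seq:
        counts.append(counts[-1] + (1 if ch == "(" else 0))
    return counts


hi = prefix_opens(greedy_extreme("))((", [1, 0, 2, 1]))
lo = prefix_opens(greedy_extreme("))((", [1, 0, 2, 1], maximise=False))
```

Under this encoding a swap at i changes only the count for the prefix of length i+1, which is why one pass can maximise every prefix at once.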

If our target sequence qualifies by not having too many or too few ('s in any prefix, can we definitely obtain it? It turns out that we can, although it is not obvious why, and I did not bother to prove it during the contest. Firstly, if there is any (proper) prefix which has exactly the right number of ('s, we can split the sequence at this point, choose not to apply any operations that cross the split, and recursively figure out how to satisfy each half (note that at this point we are not worrying whether the brackets are balanced — this argument applies to any sequence of brackets).

We cannot cross between having too few and too many ('s when growing the prefix without passing through having the right number, so at the bottom of our recursion we have sequences where every proper prefix has too few ('s (or too many, but that can be handled by symmetry). We can solve this using a greedy algorithm: for each operation, apply it if and only if it swaps )( to () and the number of ('s in the prefix is currently too low. We can compare the result after each operation to the maximising algorithm that tries to maximise the number of ('s in each prefix. It is not hard to see (and prove, using induction) that for any prefix and after each operation is considered, the number of ('s in the prefix is:
• between the original value and the target value (inclusive); and
• the smaller of the target value and the value found in the maximising algorithm.
Thus, after all operations have been considered, we will have reached the target.
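The target-directed greedy can be sketched as follows, again assuming each operation is an index i for an optional swap of positions i and i+1 (my encoding, not the problem's). Rather than recursing on the splits, this hypothetical `steer_to_target` handles both the "too few" and "too many" cases in a single pass, and it never applies an operation across a prefix that already has the right count.

```python
def prefix_opens(seq):
    """counts[j] = number of '(' among the first j characters."""
    counts = [0]
    for ch in seq:
        counts.append(counts[-1] + (1 if ch == "(" else 0))
    return counts


def steer_to_target(seq, ops, target):
    """Greedily apply optional adjacent swaps to move seq towards target.

    Returns the resulting sequence; it equals target only when the target
    is reachable.
    """
    s = list(seq)
    have = prefix_opens(seq)
    goal = prefix_opens(target)
    for i in ops:
        j = i + 1  # the only prefix length affected by swapping i and i+1
        if have[j] < goal[j] and s[i] == ")" and s[i + 1] == "(":
            s[i], s[i + 1] = "(", ")"
            have[j] += 1
        elif have[j] > goal[j] and s[i] == "(" and s[i + 1] == ")":
            s[i], s[i + 1] = ")", "("
            have[j] -= 1
    return "".join(s)
```

For example, starting from `"))(("` with operations `[1, 0, 2, 1]`, the targets `"()()"`, `"(())"` and `")()("` are all reached.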

So now we know how to determine whether a specific bracket sequence can be obtained. Counting the number of possible bracket sequences is straightforward dynamic programming: for each prefix length and each possible nesting depth, we count the prefixes that are valid bracket sequence prefixes (no unmatched right brackets) and whose number of ('s lies within the bounds found above.
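A minimal sketch of that counting DP is below. It assumes the bounds are already available as lists `lo` and `hi`, where `lo[j]` and `hi[j]` are the least and greatest possible number of ('s in a prefix of length j. The function name is hypothetical, and a contest solution would presumably reduce the counts modulo the required prime instead of using big integers.

```python
def count_balanced(lo, hi):
    """Count balanced bracket sequences of length n = len(lo) - 1 whose
    number of '(' in every prefix of length j lies in [lo[j], hi[j]]."""
    n = len(lo) - 1
    # ways[c] = number of valid prefixes of the current length with c opens
    ways = {0: 1} if lo[0] <= 0 <= hi[0] else {}
    for length in range(1, n + 1):
        nxt = {}
        for opens, cnt in ways.items():
            for new_opens in (opens, opens + 1):  # append ')' or '('
                depth = 2 * new_opens - length  # opens minus closes
                if depth >= 0 and lo[length] <= new_opens <= hi[length]:
                    nxt[new_opens] = nxt.get(new_opens, 0) + cnt
        ways = nxt
    # A balanced sequence has even length and exactly n/2 opens.
    return ways.get(n // 2, 0) if n % 2 == 0 else 0
```

Tracking the number of ('s is equivalent to tracking the nesting depth, since the depth of a prefix of length j with c opens is 2c - j.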

Hard

This is an exceptionally mean problem, and I very much doubt I would have solved it in the full 85 minutes; congratulations to Makoto for solving all three!

For convenience, let H be the complement of G2 (i.e. n copies of G1). A Hamiltonian path in G2 is simply a permutation of the vertices, such that no two consecutive vertices form an edge in H. We can count them using inclusion-exclusion, which requires us to count, for each m, the number of paths that pass through any m chosen edges. Once we pick a set of m edges and assign them directions (in such a way that we have a forest of paths), we can permute all the vertices that are not tails of edges, and the remaining vertices have their positions uniquely determined.
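Written out as a formula, in my own notation rather than the problem's: let N = nk be the total number of vertices (assuming G1 has k vertices), and let f(m) be the number of ways to choose m edges of H, with directions, so that they form a forest of directed paths. The inclusion-exclusion sum should then be

$$\sum_{m=0}^{N-1} (-1)^m \, f(m) \, (N-m)!$$

since each chosen edge glues one vertex directly after its predecessor, leaving N - m blocks that can be permuted freely.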

Let's start by solving the case n=1. Since k is very small, we can expect an exponential-time solution. For every subset S of the k vertices and every number m of edges, we can count the number of ways to pick m edges with orientation. We can start by solving this for only connected components, which means that m = |S| - 1 and the edges form a path. This is a standard exponential DP that is used in the Travelling Salesman problem, where one counts the number of paths within each subset ending at each vertex.
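A minimal sketch of that subset DP, assuming the graph is given as a k×k adjacency matrix `adj` (the function name and input format are mine, for illustration only):

```python
def count_directed_paths(k, adj):
    """paths[mask] = number of directed paths that visit every vertex of
    mask exactly once, using only edges of the graph.

    A single vertex counts as one path; a longer path is counted once
    per direction.
    """
    # end_at[mask][v] = directed paths covering exactly mask and ending at v
    end_at = [[0] * k for _ in range(1 << k)]
    for v in range(k):
        end_at[1 << v][v] = 1
    for mask in range(1, 1 << k):
        for v in range(k):
            ways = end_at[mask][v]
            if ways == 0:
                continue
            for w in range(k):
                # Extend the path by an unvisited neighbour w of v.
                if not (mask >> w) & 1 and adj[v][w]:
                    end_at[mask | (1 << w)][w] += ways
    return [sum(row) for row in end_at]
```

Masks are processed in increasing order, and extending a path always produces a larger mask, so every state is final by the time it is read.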

Now we need to generalise to m < |S| - 1. Let v be the first vertex in the set. We can consider every subset for the path containing v, and then use dynamic programming to solve for the remaining set of vertices. This is another exponential DP, with an $$O(3^k)$$ term.
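Here is a sketch of that second DP for the n=1 case. It assumes a list `paths`, indexed by bitmask, holding the number of directed paths that cover exactly that subset (one for a single vertex). It indexes the table by the number of paths c instead of the number of edges, with m = |S| - c. All names are hypothetical.

```python
from math import factorial


def path_covers(k, paths):
    """cover[mask][c] = ways to split mask into c vertex-disjoint directed paths."""
    cover = [[0] * (k + 1) for _ in range(1 << k)]
    cover[0][0] = 1
    for mask in range(1, 1 << k):
        low = mask & -mask  # the first vertex v of the set
        rest = mask ^ low
        sub = rest
        while True:  # enumerate every subset of rest, including the empty one
            first = sub | low  # vertex set of the path containing v
            remaining = mask ^ first
            if paths[first]:
                for c in range(k):
                    if cover[remaining][c]:
                        cover[mask][c + 1] += paths[first] * cover[remaining][c]
            if sub == 0:
                break
            sub = (sub - 1) & rest
    return cover


def count_avoiding_permutations(k, paths):
    """Inclusion-exclusion over the number of chosen edges m = k - c."""
    full = (1 << k) - 1
    cover = path_covers(k, paths)
    return sum((-1) ** (k - c) * cover[full][c] * factorial(c)
               for c in range(1, k + 1))
```

For a triangle, `paths` is `[0, 1, 1, 2, 1, 2, 2, 6]`, and `count_avoiding_permutations(3, paths)` gives 0, as expected: every ordering of a triangle's vertices has adjacent consecutive vertices.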

Now we need to generalise to n > 1. If a graph contains two parts that are not connected, then edges can be chosen independently, and so the counts for the graph are obtained by convolving the counts for the two parts. Unfortunately, a naïve implementation of this convolution would require something like O(n²k) time, which will be too slow. But wait, why are we returning the answer modulo 998244353? Where is our old friend 1000000007, and his little brother 1000000009? In fact, $$998244353 = 2^{23} \times 7 \times 17 + 1$$, which strongly suggests that the Fast Fourier Transform will be involved. Indeed, we can perform a 1048576-element FFT of the counts for a single copy of G1 in the field modulo 998244353, raise each element to the power of n, and then invert the FFT.
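The FFT step can be sketched as a number-theoretic transform. This is a minimal illustration and not the library code I used. It relies on 3 being a primitive root modulo 998244353, and it picks the smallest sufficient power of two as the transform size instead of a fixed 1048576. The names `ntt` and `poly_power` are my own.

```python
MOD = 998244353
ROOT = 3  # a primitive root modulo MOD


def ntt(values, invert=False):
    """Iterative radix-2 transform modulo MOD; len(values) must be a power of two."""
    a = list(values)
    n = len(a)
    # Bit-reversal permutation.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        w_len = pow(ROOT, (MOD - 1) // length, MOD)
        if invert:
            w_len = pow(w_len, MOD - 2, MOD)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for pos in range(start, start + half):
                u = a[pos]
                v = a[pos + half] * w % MOD
                a[pos] = (u + v) % MOD
                a[pos + half] = (u - v) % MOD
                w = w * w_len % MOD
        length *= 2
    if invert:
        n_inv = pow(n, MOD - 2, MOD)
        a = [x * n_inv % MOD for x in a]
    return a


def poly_power(coeffs, n):
    """Coefficients of (sum coeffs[i] x^i)^n modulo MOD, for n >= 1."""
    degree = (len(coeffs) - 1) * n
    size = 1
    while size <= degree:
        size *= 2
    padded = list(coeffs) + [0] * (size - len(coeffs))
    transformed = ntt(padded)
    powered = [pow(x, n, MOD) for x in transformed]
    return ntt(powered, invert=True)[:degree + 1]
```

As a sanity check, `poly_power([1, 1], 5)` yields the binomial coefficients `[1, 5, 10, 10, 5, 1]`.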

Despite the large number of steps above, the solution actually requires surprisingly little new code, as long as one has library code for the FFT and modulo arithmetic.
