Saturday, January 20, 2018

COCI 2017/2018 r5 analysis

For the problem texts, see here. I found this contest easier than usual for a COCI.


Given two wands A < B and two boxes X > Y, there is never any point in putting A in X and B in Y; if they fit, then they also fit the other way around. So sort the boxes and wands, match them up in this order and check if it all fits.


This is really just an implementation challenge. For each spiral, walk along it, filling in the grid where the spiral overlaps it. With more than one spiral, set each grid point to the minimum of the values induced by each spiral. Of course, it is not necessary to go to \(10^{100}\); about 10000 is enough to ensure that the grid is completely covered by the partial spiral.


The rule for determining which employee does the next task is a red herring. Each employee does exactly one task, and the order doesn't matter. The money earned by an employee is the sum of the distance from that employee to all underlings. The can be computed by a tree DP (computing both this sum and the size of each subtree).


Suppose there is a solution. It can be modified by a series of transformations into a canonical form.Firstly, moving a false claim below an adjacent true claim will not alter either of them; thus, we can push all the false claims to the bottom and true claims to the top. At this point, the true claims can be freely reordered amongst themselves. Similarly, if we have a false claim with small a above one with a larger a, we can swap them without making either true. So we can assume that the false claims are increasing from bottom to top in the deck. Finally, given a true claim with large a and a false claim with small a, we can swap them and they will both flip (so the positions of false claims stay the same).

After these transformations, the deck will, from bottom to top, contain the largest K cards in increasing order, followed by the rest in arbitrary order (let's say increasnig). To solve the problem, we simply construct this canonical form, then check if it indeed satisfies the conditions.


When building roads on the day with factor F, we don't actually need to build roads between every pair of multiples of F: it is equivalent to connect every multiple of F to F. This gives O(N log M) roads. We can represent the connected components after each day with a standard union-find structure. For reasons we'll see later, we won't use path compression, but always putting the smaller component under the larger one in the tree is sufficient to ensure a shallow tree (in theory O(log N), but I found the maximum depth was 6). A slow solution would be to check every remaining query after each day to see whether the two mathematicians are in the same component yet.

To speed this up, we can record extra information in the tree: for each edge, we record the day on which it was added. If the largest label on a path from A to B is D, then D was the first day on which they were connected (this property would be broken by path compression, which is why we cannot use it). Thus, to answer a query, we need only walk up the tree from each side to the least common ancestor; and given the shallowness of the tree, is this cheap.


I really liked this problem. Take a single starting peak P. Let A be the size of the maximum matching the graph, and let B be the size of the maximum matching excluding P. Suppose A = B. Then Mirko can win as follows: take the latter matching, and whenever Slavko moves to a valley, Mirko moves to the matched peak. Slavko can never move to a valley without a match, because otherwise the journey would form an augmenting path that would give a matching for the graph of size B + 1.

Conversely, suppose A > B. Then take a whole-graph matching, which by the assumption must include a match for P. Slavko can win by always moving to the matched valley. By a similar argument, Mirko can never reach an unmatched peak, because otherwise toggling all the edges on their journey would give a maximum matching that excludes P.

To implement it, it may not be efficient enough to construct a new subgraph matching from every peak. Instead, one can start with a full-graph matching, remove P and its match from the graph (if any), then re-augment starting from that match. This should give an O(NM) algorithm (O(NM) for the initial matching, then O(M) per query).

Sunday, November 26, 2017

Analysis of Croatian Open contest 2017r3

This round of COCI was more challenging than usual, with some of the lower-scoring problems being particularly difficult.


This was quite straightforward: run through the letters, and each time a letter is different to the previous one (or is the first one), increment a counter.


Firstly, one word can be made by rearranging the letters of another if and only if they have the same letter frequencies. So if we have a fast way to find the letter frequencies in any substring, the problem is solved. Consider the sequence whose ith element is 1 if S[i] is 'a' and 0 otherwise: the number of a's in a substring is then just a contiguous sum of this sequence. By precomputing the prefix sum, we can find the sum of any interval in constant time. We thus just need to store 26 prefix sum tables, one for each letter.


The first part (finding the length) is a reasonably standard although fiddly dynamic programming problem. At any point in the process, one has a position relative to the original grid and a nesting depth, and need to know the longest suffix that will complete a valid bracket sequence. For each of the (up to) three possible moves, one can compute the state would be reached, and find the longest possible sequence from previously computed values.

My attempt at the second part during the contest was overly-complex and algorithmically too slow, involving trying to rank all length-L strings represented by DP states, for each L in increasing order. This required a lot of sorting, making the whole thing O(N³ log N) (where N is O(R + C)).

A simpler and faster solution I wrote after the contest is to reconstruct the path non-deterministically, one bracket at a time. Instead of working from a single source state, keep a set of candidate states (in super-position). From those, run a breadth-first (or depth-first) search to find all states where one might encounter the next bracket. Any transition that is optimal (for length) in the original DP now becomes an edge in this search. Once the candidate states for the next bracket has been found, there might be a mix of opening and closing brackets. If so, discard those states corresponding to closing brackets, since they will not be lexicographically minimal.


I didn't attempt this problem, and am not sure how to solve it. I suspect it requires coming up with and proving a number of simplifying assumptions which allow the search space to be reduced.


I really liked this problem. First of all, M = 1 is a bit tricky, because it's the only case where you can't swap two numbers without having an effect. It's easy enough to solve by hand and treat as a special case.

Now let's count all the cases where the player can't win. Let's say that the lit elements have an XOR of K. For M > 1, we can't have \(K = 2^M - 1\), since otherwise the player could swap either two lit or two unlit numbers and win. Let \(K' = K \oplus 2^M - 1\), and consider a lit number A. To win, the player has to swap it with B such that  \(A \oplus B = K'\). As the RHS can't be 0, the lit numbers must be grouped into pairs with XOR of K'. The XOR of all the lit numbers must thus be either 0 or K', depending on whether there are an even or odd number of such pairs. But this is K (by definition); and since K = K' is impossible, we must have K = 0.

So now we need to find intervals such that the XOR of all elements is 0, and for every pair of complementary elements A and B, they are either both inside or both outside the interval. The first part is reasonably easy to handle: we can take a prefix sums (using \(\oplus\) instead of addition), and an interval with XOR of 0 is one where two prefix sums are equal. For the second part, we can use some sweeps: for each potential left endpoint, establish an upper bound on the right endpoint (considering only complementary pairs that lie on opposite sides of the left endpoint), and vice versa - doable with an ordered set or a priority queue to keep track of right elements of complementary pairs during the sweep. Now perform another left-to-right sweep, keeping track of possible left endpoints that are still "active" (upper bound not yet reached), and each time a right endpoint is encountered, query how many of the left endpoints with the same suffix sum are above the lower bound for the right end-point. Actually getting the count in logarithmic time requires a bit of fancy footwork with data structures that I won't go into (e.g. a Fenwick tree, a segment tree or a balanced binary tree with node sizes in the internal nodes).


I somehow miscounted and didn't realise that this problem even existed until after the contest (maybe because I don't expect more than three harder problems in COCI). It's disappointing, because it only took me about 15 minutes to solve after the contest, yet was worth the most points.

For each K-summary, draw a line between each pair of adjacent segments of that summary. For example, with a 2-summary and 3-summary and N=9, you would get |01|2|3|45|67|8|9|. Clearly in any marked interval with more than one number, none of them can be determined. It is also not hard to see that in any interval with exactly one number, that number can be determined (hint: find the sum of all numbers except that one). Thus, we need the count of numbers X such that X is a multiple of some Ki and X+1 is a multiple of another Kj. Note that if any K=1 then the answer is simply N, which we'll assume is handled as a special case.

If one has chosen a single Ki and Kj, it is not difficult to determine the numbers that satisfy the property, using the Chinese Remainder Theorem (they will be in arithmetic progression, with period Ki×Kj, and so are easy to count). However, this can easily lead to double-counting, and we thus need to apply some form of inclusion-exclusion principle. That is, we can pick any two subsets S and T of the K's, with products P and Q, and count the number of X's such that P divides X and Q divides X + 1. Note that S and T must be disjoint, as otherwise the common element would need to divide X and X+1 (and we assumed Ki > 1). We also need to determine the appropriate coefficient to scale this term in the sum. Rather than trying to derive equations, I let the computer do the work: for each value of |S| and |T| I consider every possible subset of S and of T (actually, just each possible size), add up the coefficients they're already contributing, and set the coefficient for this value of |S| and |T| to ensure that the total is 1.

Monday, June 12, 2017

Extra DCJ 2017 R2 analysis


My solution during the contest was essentially the same as the official analysis. Afterwards I realised a potential slight simplification: if one starts by computing the second-order differences (i.e., the differences of the differences), then one is looking for the longest run of zeros, rather than the longest run of the same value. That removes the need to communicate the value used in the runs at the start and end of each section.

Number Bases

I missed the trick of being able to uniquely determine the base from the first point at which X[i] + Y[i] ≠ Z[i]. Instead, at every point where X[i] + Y[i] ≠ Z[i], I determine two candidate bases (depending on whether there is a carry of not). Then I collect the candidates and test each of them. If more than three candidates are found, then the test case is impossible, since there must be two disjoint candidate pairs.

Broken Memory

My approach was slightly different. Each node binary searches for its broken value, using two other nodes to help (and simultaneously helping two other nodes). Let's say we know the broken value is in a particular interval. Split that interval in half, and compute hashes for each half on the node (h1 and h2) and on two other nodes (p1 and p2, q1 and q2). If h1 equals p1 or q1, then the broken value must be in interval 2, or vice versa. If neither applies, then nodes p and q both have broken values, in the opposite interval to that of the current node. We can tell which by checking whether p1 = q1 or p2 = q2.

This does rely on not having collisions in the hash function. In the contest I relied on the contest organisers not breaking my exact choice of hash function, but it is actually possible to write a solution that works on all test data. Let P be a prime greater than \(10^{18}\). To hash an interval, compute the sums \(\sum m_i\) and \(\sum i m_i\), both mod P, giving a 128-bit hash. Suppose two sequences p and q collide, but differ in at most two positions. The sums are the same, so they must differ in exactly two positions j and k, with \(p_j - q_j = q_k - p_k\) (all mod P). But then the second sums will differ by
\(jp_j + kp_k - jq_j - kq_k = (j - k)(p_j - q_j)\), and since P is prime and each factor is less than P, this will be non-zero.

Alternative Code Jam 2017 R3 solution

This last weekend was round 3 of Code Jam 2017 and round 2 of Distributed Code Jam 2017. I'm not going to describe how to solve all the problems since there are official analyses (here and here), but just mention some alternatives. For this post I'll just talk about one problem from Code Jam; some commentary on DCJ will follow later.

Slate Modern (Code Jam)

The idea I had for this seems simpler (in my opinion, without having tried implementing both) than the official solution, but unfortunately I had a bug that I couldn't find until about 10 minutes after the contest finished.

As noted in the official analysis, one can first check whether a solution is possible by comparing each pair of fixed cells: if the difference is value is greater than D times the Manhattan distance, then it is impossible; if no such pair exists, it is possible. The best solution is then found by setting each cell to the smallest lower bound imposed by any of the fixed cells.

Let's try to reduce the complexity by a factor of C, by computing the sum of a single row quickly. If we look at the upper bound imposed by one fixed cell, it has the shape \(b + |x - c| D\), where \(x\) is the column and b, c are constants. When combining the upper bounds, we take the lowest of them. Each function will be the smallest for some contiguous (possibly empty) interval. By sweeping through the fixed cells in order of c, we can identify those that contribute, similar to finding a Pareto front. Then, by comparing adjacent functions one can find the range in which each function is smallest, and a bit of algebra gives the sum over that range.

This reduces the complexity to O(RN + N²) (the N² is to check whether a solution is possible, but also hides an O(N log N) to sort the fixed cells). That's obviously still too slow. The next insight is that most of the time, moving from one row to the next changes very little: each of the functions increases or decreases by one (depending on whether the corresponding fixed cell is above or below the row), and the range in which each function is smallest grows or shrinks slightly. It thus seems highly likely that the row sum will be a low-degree polynomial in the row number. From experimentation with the small dataset, I found that it is actually a quadratic (conceptually it shouldn't be too hard to prove, but I didn't want to get bogged down in the details).

Note that I said "most of the time". This will only be true piecewise, and we need to find the "interesting" rows where the coefficients change. It is fairly obvious that rows containing fixed cells will be interesting. Additionally, when the range in which a function is smallest disappears or appears will be interesting. Consider just two fixed cells, and colour the whole grid according to which of the two fixed cells gives the lower upper bound. The boundary between the two colours can have diagonal, horizontal and vertical portions, and it is these horizontal portions that are (potentially) interesting. I took the conservative approach of adding all O(N²) such rows as interesting.

Now that we have partitioned the rows into homogeneous intervals, we need to compute the sum over each interval efficiently. Rather than determine the quadratic coefficients analytically, I just interfered them by taking the row sums of three consecutive rows (if there are fewer than three rows in the interval, just add up their row sums directly). A bit more algebra to find the sum of a quadratic series, and we're done! There are O(N²) intervals to sum and it requires O(N) to evaluate each of the row sums needed, giving a running time of O(N³). I suspect that this could probably be reduced to O(N² log N) by being smarter about how interesting rows are picked, but it is not necessary given the constraints on N.

Tuesday, November 22, 2016

An alternative DCJ 2016 solution

During this year's Distributed Code Jam I had an idea for solving the hard part of Toothpick Sculptures, but had no time to implement it, or even to work out all the details. When the official solution was presented, I was surprised that it was very different to mine. Recently Pablo from Google has been able to help me check that my solution does indeed pass the system tests (thanks Pablo), so it seems like a good time to describe the solution.

There are going to be a lot of different trees involved, so let's give them names. A "minitree" is one of the original sculptures of N toothpicks, and its root is a "miniroot". The "maxitree" is the full tree of 1000N toothpicks. Let's assume that each minitree is built out of toothpicks of one colour. We can then consider a tree whose vertices are colours, called the "colour tree", which contains an edge between two colours if and only if the maxitree contains an edge between two vertices of those colours.

For each colour, we can find its children in the colour tree by walking the corresponding minitree looking for children which are miniroots. This can be done in parallel over the colours.

For a given vertex, we can use DP to solve the following two problems for the subtree: what is the minimum cost to stabilise the subtree if we stabilise the root, and what is the minimum cost if we do not stabilise the root? But this will not be so easy to parallelise. We can go one step further: if we fix some vertex C and cut off the subtree rooted at C, then we can answer the same two problems for each case of C being stabilised or not stabilised (four questions in total). If we later obtain the answers to the original two queries for C, then we can combine this with the four answer we have to answer the original queries for the full tree.

This is sufficient to solve the small problem, and more generally any test case where the colour tree does not branch. For each miniroot, compute the DP, where the miniroot corresponding to the single child (if any) in the colour tree is the cutoff. These DPs can be done in parallel, and the master can combine the results.

Let's consider another special case, where the colour tree is shallow. In this case, one can solve one layer at a time, bottom up, without needing to use the cutoff trick at all. The colours in each layer of the colourtree are independent and can be solved in parallel, so the latency is proportional to the depth of the tree. The results of each layer are fed into the calculations of the layer above.

So, we have a way to handle long paths, and a way to handle shallow trees. This should immediately suggest light-heavy decomposition. Let the "light-depth" of a node in the colour tree be the number of light edges between it and the root. The native of light-heavy decomposition guarantees that light-depth is at most logarithmic in the tree size (which is 1000). We will process all nodes with the same light-depth in parallel. This means that a node in a tree may be processed at the same time as its children, but only along a heavy path. We thus handle each heavy path using the same technique as in the small problem. For other children in the colour tree, the subtree results were computed in a previous pass and are sent to the slave.

TCO 2016 finals

The final round of TCO 2016 did not go well for me. I implemented an over-complicated solution to the 500 that did not work (failed about 1 in 1000 of my random cases), and I started implementing the 400 before having properly solved it, leading to panic and rushed attempts to fix things in the code rather than solving the problem.


The fact that there are only four mealtimes is a clue that this will probably require some exponential time solution. I haven't worked out the exact complexity, but it has at least a factor of \(2^\binom{n}{n/2}\) for n mealtimes.

Let's restrict our attention to meal plans that contain no meals numbered higher than m and have t meals in the plan (t can be less than 4). We can further categorise these meals by which subsets of the mealtimes can accommodate the plan. This will form the state space for dynamic programming. To propagate forwards from this state, we iterate over the number of meals of type m+1 (0 to 4 of them). For each choice, we determine which subsets of mealtimes can accommodate this new meal plan (taking every subset that can accommodate the original plan and combining it with any disjoint subset that can accommodate the new meals of type m+1).


Let's start by working out whether a particular bracket sequence can be formed. Consider splitting the sequence into a prefix and suffix at some point. If the number of ('s in the prefix differs between the initial and target sequence, then the swaps that cross the split need to be used to increase or decrease the number of ('s. How many ('s can we possibly move to the left? We can answer that greedily, by running through all the operations and applying them if and only if they swap )( to (). Similarly, we can find the least possible number of ('s in the final prefix.

If our target sequence qualifies by not having too many or too few ('s in any prefix, can we definitely obtain it? It turns out that we can, although it is not obvious why, and I did not bother to prove it during the contest. Firstly, if there is any (proper) prefix which has exactly the right number of ('s, we can split the sequence at this point, choose not to apply any operations that cross the split, and recursively figure out how to satisfy each half (note that at this point we are not worrying whether the brackets are balanced — this argument applies to any sequence of brackets).

We cannot cross between having too few and too many ('s when growing the prefix without passing through having the right number, so at the bottom of our recursion we have sequences where every proper prefix has too few ('s (or too many, but that can be handled by symmetry). We can solve this using a greedy algorithm: for each operation, apply it if and only if it swaps )( to () and the number of ('s in the prefix is currently too low. We can compare the result after each operation to the maximising algorithm that tries to maximise the number of ('s in each prefix. It is not hard to see (and prove, using induction) that for any prefix and after each operation is considered, the number of ('s in the prefix is:
  • between the original value and the target value (inclusive); and
  • the smaller of the target value and the value found in the maximising algorithm.
Thus, after all operations have been considered, we will have reached the target.

So now we know how to determine whether a specific bracket sequence can be obtained. Counting the number of possible bracket sequences is straightforward dynamic programming: for each prefix length, we count the number of prefixes
that are valid bracket sequence prefixes (no unmatched right brackets) for each possible nesting depth.


This is an exceptionally mean problem, and I very much doubt I would have solved it in the fully 85 minutes; congratulations to Makoto for solving all three!

For convenience, let H be the complement of G2 (i.e. n copies of G1). A Hamiltonian path in G2 is simply a permutation of the vertices, such that no two consecutive vertices form an edge in H. We can count them using inclusion-exclusion, which requires us to count, for each m, the number of paths that pass through any m chosen edges. Once we pick a set of m edges and assign them directions (in such a way that we have a forest of paths), we can permute all the vertices that are not tails of edges, and the remaining vertices have their positions uniquely determined.

Let's start by solving the case n=1. Since k is very small, we can expect an exponential-time solution. For every subset S of the k vertices and every number m of edges, we can count the number of ways to pick m edges with orientation.  We can start by solving this for only connected components, which means that m = |S| - 1 and the edges form a path. This is a standard exponential DP that is used in the Travelling Salesman problem, where one counts the number of paths within each subset ending at each vertex.

Now we need to generalise to m < |S| - 1. Let v be the first vertex in the set. We can consider every subset for the path containing v, and then use dynamic programming to solve for the remaining set of vertices. This is another exponential DP, with a \(O(3^n)\) term.

Now we need to generalise to n > 1. If a graph contains two parts that are not connected, then edges can be chosen independently, and so the counts for the graph are obtained by convolving the counts for the two parts. Unfortunately, a naïve implementation of this convolution would require something like O(n²k) time, which will be too slow. But wait, why are we returning the answer modulo 998244353? Where is our old friend 1000000007, and his little brother 1000000009? In fact, \(998244353 = 2^{23} \times 7 \times 17 + 1\), which strongly suggests that the Fast Fourier Transform will be involved. Indeed, we can perform an 1048576-element FFT in the field modulo 998244353, raise each element to the power of n, and then invert the FFT.

Despite the large number of steps above, the solution actually requires surprisingly little new code, as long as one has library code for the FFT and modulo arithmetic.

Monday, November 21, 2016

TCO 2016 semifinal 2 solutions

Semifinal 2 was much tougher than semifinal 1, with only one contestant correctly finishing two problems. The first two problems each needed a key insight, after which very little coding is needed (assuming one has the appropriate library code available). The hard problem was more traditional, requiring a serious of smaller steps to work towards the solution.


Consider the smallest special clique. What happens if we remove an element? Then we will still have a clique, but the AND of all the elements is non-zero. In particular, there is some bit position which is 1 in every element of the special clique except for 1.

We can turn that around: it suffices to find a bit position B, a number T which is zero on bit B, and a special clique containing T whose other elements are all 1 on bit B. We can easily make a list of candidates to put in this special clique: any element which is 1 on bit B and which is connected to T. What's more, these candidates are all connected to each other (through bit B), so taking T plus all candidates forms a clique. What's more, if we can make a special clique by using only a subset of the candidates, then the clique formed using all candidates will be special too, so the latter is the only one we need to test.


The key insight is to treat this as a flow/matching problem. Rather than thinking of filling and emptying an unlabelled tree, we can just build a labelled tree with the properties that if A's parent is B, then B < A and A appears earlier in p than B does. A few minutes thinking should convince you that any labelled tree satisfying these constraints can be used for the sorting.

We thus need to match each number to a parent (either another number or the root). We can set this up with a bipartite graph in the usual way: each number appears on the left, connected to a source (for network flow) with an edge of capacity 1. Each number also appears on the right, along with the root, all connected to the sink with N edges of capacity 1 (we'll see later why I don't say one edge of capacity N). Between the two sides, we add an edge A → B if B is a viable parent for A, again with capacity 1. Each valid assignment can be represented by a maxflow on this network (with flow N), where the edges across the middle are parent-child relationships. Note that normally using this approach to building a tree risks creating cycles, but the heap requirement on the tree makes that impossible.

That tells us which trees are viable, but we still need to incorporate the cost function. That is where the edges from the right numbers to the sink come in. A (non-root) vertex starts with a cost of 1, then adding more children increases the cost by 3, 5, 7, ... Thus, we set the costs of the edges from each number to the sink to this sequence. Similarly, the costs of the edges from the root to the sink have costs 1, 3, 5, 7, ... The min-cost max-flow will automatically ensure that the cheaper edges get used first, so the cost of the flow will match the cost of the tree (after accounting for the initial cost of 1 for each number, which can be handled separately).


This one requires a fairly solid grasp on linear algebra and vector spaces. The set with \(2^k\) elements is a vector subspace of \(\mathbb{Z}_2^N\) of dimension k, for some unknown N. It will have a basis of size k. To deal more easily with ordering, we will choose to consider a particular canonical basis. Given an arbitrary initial basis, one can use Gaussian reduction and back propagation to obtain a (unique) basis with the following property: the leading 1 bit in each basis element is a 0 bit in every other basis element. With this basis, it is not difficult to see that including any basis element in a sum will increase rather than decrease the sum. The combination of basis elements forming the ith sorted element is thus given exactly by the binary representation of i.

We can now take the information we're given an recast it is a set of linear equations. Furthermore, we can perform Gaussian elimination on these linear equations (which we actually do in code, unlike the thought experiment Gaussian elimination above). This leaves some basis elements fixed (uniquely determined by the smaller basis elements) and others free. Because of the constraints of the basis elements, we also get some more information. If basis i appears in an equation, its leading bit must correspond to a 1 bit in the value (and to the first 1 bit, if i is the leading basis in the equation). Similarly, if basis i does not appear, then it must correspond to a 0 bit in the value. This allows us to build a mask of where the leading bit for each basis element can be.

We now switch to dynamic programming to complete the count. We count the number of ways to assign the first B basis elements using the first C value bits. Clearly dp[0][C] = 1. To add new basis, we can consider the possible positions of its leading bit (within the low C bits), and then count the degrees of freedom for the remaining bits. If the basis is fixed then there is 1 degree of freedom; otherwise, there is a degree of freedom for each bit that isn't the leading bit of a previous basis, and it is easy to count these.

There are a few details that remain to do with checking whether there are an infinite number of solutions (which can only happen if the largest basis is unconstrained), and distinguishing between a true 0 and a 0 that is actually a multiple of 1000000007, but those are left as exercises for the reader.

Sunday, November 20, 2016

Topcoder Open 2016 Semifinal 1

Here are my solutions to the first semifinal of the TCO 2016 algorithm contest.


Let's start with some easy cases. If s = t, then the answer is obviously zero. Otherwise, it is at least 1. If s and t share a common factor, then the answer is 1, otherwise it is at least 2.
We can eliminate some cases by noting that some vertices are isolated. Specifically, if either s or t is 1 or is a prime greater than n/2 (and s≠t) then there is no solution.
Assuming none of the cases above apply, can the answer be exactly 2? This would mean a path s → x → t, where x must have a common factor with s and a (different) common factor with t. The smallest x can be is thus the product of the smallest prime factors of s and t. If this is at most n, then we have a solution of length 2, otherwise the answer is at least 3.
If none of the above cases applies, then answer is in fact 3. Let p be the smallest prime factor of s and q be the smallest prime factor of t. Then the path s → 2p → 2q → t works (note that 2p and 2q are at most n because we eliminated cases where s or t is a prime greater than n/2 above).


We can create a graph with the desired properties recursively. If n is 1, then we need just two vertices, source and sink, with an edge from source to sink of arbitrary weight.
If n is even, we can start with a graph with n/2 min cuts and augment it. We add a new vertex X and edges source → X and X → sink, both of weight 1. For every min cut of the original graph, X can be placed on either side of the cut to make a min cut for the new graph.
If n is odd, we can start with a graph with n - 1 min cuts and augment it. Let the cost of the min cut of the old graph be C. We create a new source, and connect it to the original source with an edge of weight C. The cost of the min cut of this new graph is again C (this can easily be seen by considering the effect on the max flow rather than the min cut). There are two ways to achieve a cut of cost C: either cut the new edge only, or make a min cut on the original graph.
We just need to check that this will not produce too many vertices. In the binary representation of n, we need a vertex per bit and a vertex per 1 bit. n can have at most 10 bits and at most 9 one bits, so we need at most 19 vertices.


I made a complete mess of this during the contest, solving the unweighted version and then trying to retrofit it to handle a weighted version. That doesn't work.

Here's how to do it, which I worked out based on some hints from tourist and coded at 3am while failing to sleep due to jetlag. Firstly, we'll make a change that makes some of the later steps a little easier: for each pair of adjacent vertices, add an edge with weight ∞. This won't violate the nesting property, and guarantees that there is always a solution. Let the length of an edge be the difference in indices of the vertices (as opposed to the cost, which is an input). Call an edge right-maximal if it's the longest edge emanating from a particular vertex, and left-maximal if it is the longest edge terminating at a particular vertex. A useful property is that an edge is always either right-maximal, left-maximal, or both. To see this, imagine an edge that is neither: it must then be nested inside longer edges terminating at each end, but these longer edges would cross, violating the given properties.

We can now make the following observations:
  1. If x → y is right-maximal and color[x] is used, then color[y] must be used too. This is because the path must pass through x, and after x cannot skip over y.
  2. If x → y is the only edge leaving x, then the edge will appear in a path if and only if color[x] is used.
  3. If x → y is right-maximal and x →  z is the next-longest edge from x, then x → y is in a path if and only if color[x] is used and color[z] is not used.
These statements all apply similarly for left-maximal edges of course. We can consider vertices 0 and n to be of colour 0 in the above, which is a color that must be picked. What is less obvious is that condition 1 (together with its mirror) is sufficient to determine whether a set of colours is admissible. Suppose we have a set of colours that satisfies condition 1, and consider the subset S of vertices whose colours come from this set. We want to be sure that every pair of consecutive vertices in S are linked by an edge. Suppose there is a pair (u, v) which are not. Consider the right-maximal edge from u: by condition 1, it must terminate at some vertex in S, which must then be to the right of v. But the left-maximal edge to v must start to the left of u, and so these two edges cross.

We can now use this information to construct a new graph, whose minimum cut will give us the cost of the shortest path. It has a vertex per colour, a source vertex corresponding to the special colour 0, and a sink vertex not corresponding to a colour. The vertices on the source side of the cut correspond to the colours that are visited. Every time colour C implies colour D, add an edge C → D of weight ∞, to prevent C being picked while D is not. If an edge falls into case 2 above, add an edge from color[x] to the sink with weight equal to the edge cost. Otherwise it falls into case 3; add an edge from color[x] to color[y] with weight equal to the edge cost. Each edge in the original graph now corresponds to an edge in this new graph, and the original edge is traversed if and only if the new edge forms part of the cut.

Monday, June 13, 2016

Code Jam and Distributed Code Jam 2016 solutions

Code Jam

Since the official analysis page still says "coming soon", I thought I'd post my solutions to both the Code Jam and Distributed Code Jam here.

Teaching Assistant

We can start by observing that when we take a new task, it might as well match the mood of the assistant: doing so will give us at least 5 points, not doing so will give us at most 5. Also, there is no point in getting to the end of the course with unsubmitted assignments: if we did, we should not have taken that assignment, instead submitting a previous assignment (a parity argument proves that there was a previous assignment). So what we're looking for is a way to pair up days of the course, obeying proper nesting, scoring 10 points if the paired days have the same mood and 5 points otherwise.

That's all we need to start on the small case, which can be done with a standard DP. For every even-length interval of the course, we compute the maximum score. We iterate to pick the day to match up to the first day of the interval, then use previous results to find the best scores for the two subintervals this generates.

That's O(N³), which is too slow for the large case. It turns out that a greedy approach works: every day, if the mood matches the top of the stack, submit, otherwise take a new assignment. The exception is that once the stack height matches the number of remaining days, one must submit every day to ensure that the stack is emptied.

I haven't quite got my head around a proof, but since it worked on the test case I had for the small input I went ahead and submitted it.

Forest university

The very weak accuracy requirement, and the attention drawn to it, is a strong hint that the answer should be found by simulation rather than analytically. Thus, we need to find a way to uniformly sample a topological walk of the forest. This is not as simple as always uniformly picking one of the available courses. Consider one tree in the forest: it can be traversed in some number of ways (all of them starting with the root of the tree), and the rest of the forest can be traversed in some number of ways, and then these can be interleaved arbitrarily. If we consider all ways to interleave A items and B items, then A/(A+B) of them will start with an element of A. Thus, the probability that the root of a particular tree will be chosen is proportional to the size of that tree.

After picking the first element, the available courses again form a forest, and one can continue. After uniformly sampling a sequence of courses, simply check which of the cool words appear on the hat.

Rebel against the empire

I consider this problem to be quite a bit harder than D. It's easier to see what to do, but reams of tricky code.

For the small case, time is not an issue because the asteroids are static. Thus, one just needs to find the bottleneck distance to connect asteroids 0 and 1. I did this by adding edges from shortest to longest to a union-find structure until 0 and 1 were connected (ala Kruskal's algorithm), but it can also be done by priority-first search (Prim's algorithm) or by binary searching for the answer and then checking connectivity.

For the large case we'll take that last approach, binary searching over the answers and then checking whether it is possible. Naturally, we will need to know at which times it's possible to make jumps between each pair of asteroids. This is a bit of geometry/algebra, which gives a window of time for each pair (possibly empty, possibly infinite). Now consider a particular asteroid A. In any contiguous period of time during which at least one window is open, it's possible to remain on A, regardless of S, by simply jumping away and immediately back any time security are about to catch up. Also, if two of these intervals are separated in time by at most S, they can be treated as one interval, because one can sit tight on A during the gap between windows. On the other hand, any period longer than S with no open windows is of no use.

The key is to ask for the earliest time at which one can arrive at each interval. This can be done with a minor modification of Dijkstra's algorithm. The modification is that the outgoing edges from an interval are only those windows which have not already closed by the time of arrival at the interval.


I liked this problem, and I really wish the large had been worth fewer points, because I might then have taken it on and solved it.

The key is to realise that the good strings don't really matter. Apart from the trivial case of the bad string also being a good string, it is always possible to solve the problem by producing a pair of programs that can produce every string apart from the bad one.

For the small case, we can use 0?0?0?0? (N repetitions) for the first program, and 111 (N-1 repetitions) for the second (but be careful of the special case N=1!) Each of the 1's can be interleaved once into the first program to produce a 1 in the output, but because there are only N-1 1's, it is impossible to produce the bad string.

For the large case, the same idea works, but we need something a bit more cunning. The first program is basically the same: alternating digits and ?'s, with the digits forming the complement of the bad string. To allow all strings but the bad string to be output, we want the second program to be a string which does not contain the bad string as a subsequence, but which contains every subsequence of the bad string as a subsequence. This can be achieved by replacing every 1 in the bad string by 01 and every 0 by 10, concatenating them all together, then dropping the last character. Proving that this works is left as an exercise for the reader.

Distributed Code Jam


The code appears to be summing up every product of an element in A with an element in B. However, closer examination shows that those where the sum of the indices is a multiple of M (being the number of nodes) are omitted. However, once we pick an index modulo M for each, we're either adding all or none. The sum of all products of pairs is the same as the product of the sums. So, for each remainder modulo M, we can add up all elements of A, and all elements of B. Then, for each pair of remainders, we either multiply these sums together and accumulate into the grand total, or not. There are some details regarding the modulo 1000000007, but they're not difficult.

For the large case, we obviously need to distribute this, which can be done by computing each sum on a different node, then sending all the sums back to the master.


Here, a valid lisp program is just a standard bracket sequence (although it is required to be non-empty). It's well-known that a sequence is valid if and only if:
  • the nesting level (number of left brackets minus number of right brackets) of every prefix is non-negative; and
  • the nesting level of the whole sequence is zero.
If the nesting level ever goes negative in a left-to-right scan, then the point just before it went negative is the longest sequence that can be completed. This makes the small case trivial to implement.

To distribute this, we can do the usual thing of assigning each node an interval to work on. We can use a parallel prefix sum to find the nesting level at every point (there is lots of information on the internet about how to do this). Then, each node can find it's first negative nesting level, and send this back to a master to find the globally first one. We also need to know the total nesting level by sending the sum for each interval to the master, but that's already done as part of the parallel prefix sum.


I liked this one better than the Code Jam asteroids problem, but it was also rather fiddly. The small case is a straightforward serial DP: going from bottom to top, compute the maximum sum possible when ending one's turn on each location.

At first I thought that this could be parallelised in the same manner as Rocks or Mutex from last year's problems. However, those had only bottom-up and left-to-right propagation of DP state. In this problem, we have bottom-up, left-to-right and right-to-left. On the other hand, the left-to-right and right-to-left propagation is slow: at one most unit for every unit upwards. We can use this!

Each node will be responsible for a range of columns. However, let's say it starts with knowledge of the prior state for B extra columns on either side. Then after one row of DP iteration, it will correctly know the state for B-1 extra columns, and after B iterations it will still have the correct information for the columns its responsible for. At this stage it needs to exchange information with its neighbours: it will receive information about the B edge columns on either side, allowing it to continue on its way for another B rows.

There are a few issues involved in picking B. If it's too small (less than 60), then nodes will need to send more than 1000 messages. If it's too large, then nodes will do an excessive number of extra GetPosition queries. Also, B must not be larger than the number of columns a node is handling; in narrow cases we actually need to reduce the number of nodes being used.

Gas stations

The serial version of this problem is a known problem, so I'll just recap a standard solution quickly. At each km, you need to decide how much petrol to add. If there is a town within T km that is cheaper, you should add just enough petrol to reach it (the first one if there are multiple), since any extra you could have waited until you got there. Otherwise, you should fill up completely. A sweep algorithm can give the next cheaper town for every town in linear time.

What about the large case? The constraint is interesting: if we had enough memory, and if GetGasPrice was faster, there is enough time to do it serially! In fact, we can do it serially, just passing the baton from one node to the next as the car travels between the intervals assigned to each node.

What about computing the next cheaper station after each station? That would be difficult, but we don't actually need it. We only need to know whether there is a cheaper one within T km. We can start by checking the current node, using a node-local version of the next-cheapest array we had before. If that tells us that there is nothing within the current node, and T km away is beyond the end of the current node, then we will just find the cheapest within T km and check if that is cheaper than the current value. The range will completely overlap some nodes; for those nodes, we can use a precomputed array of the cheapest for each node. There will then be some prefix of another node. We'll just call GetGasPrice to fetch values on this node. Each iteration this prefix will grow by one, so we only need one call to GetGasPrice per iteration (plus an initial prefix). This means we're calling GetGasPrice at most three times per station (on average), which turns out to the fast enough. It's possible to reduce it to 2 by computing the minimum of the initial prefix on the node that already has the data, then sending it to the node that needs it.

There is one catch, which caused me to make a late resubmission. I was calling GetGasPrice to grow the prefix in the serial part of the code! You need to make all the calls up front in the parallel part of the code and cache the results.