tag:blogger.com,1999:blog-318472812017-02-16T21:30:33.547-08:00Entropy always increasesBruce Merrynoreply@blogger.comBlogger100125tag:blogger.com,1999:blog-31847281.post-53295121464490044972016-11-22T22:45:00.001-08:002016-11-22T22:45:15.422-08:00An alternative DCJ 2016 solution<div dir="ltr" style="text-align: left;" trbidi="on">During this year's Distributed Code Jam I had an idea for solving the hard part of Toothpick Sculptures, but had no time to implement it, or even to work out all the details. When the official solution was presented, I was surprised that it was very different to mine. Recently Pablo from Google has been able to help me check that my solution does indeed pass the system tests (thanks Pablo), so it seems like a good time to describe the solution.<br /><br />There are going to be a lot of different trees involved, so let's give them names. A "minitree" is one of the original sculptures of N toothpicks, and its root is a "miniroot". The "maxitree" is the full tree of 1000N toothpicks. Let's assume that each minitree is built out of toothpicks of one colour. We can then consider a tree whose vertices are colours, called the "colour tree", which contains an edge between two colours if and only if the maxitree contains an edge between two vertices of those colours.<br /><br />For each colour, we can find its children in the colour tree by walking the corresponding minitree looking for children which are miniroots. This can be done in parallel over the colours.<br /><br />For a given vertex, we can use DP to solve the following two problems for the subtree: what is the minimum cost to stabilise the subtree if we stabilise the root, and what is the minimum cost if we do not stabilise the root? But this will not be so easy to parallelise. We can go one step further: if we fix some vertex C and cut off the subtree rooted at C, then we can answer the same two problems for each case of C being stabilised or not stabilised (four questions in total). If we later obtain the answers to the original two queries for C, then we can combine this with the four answer we have to answer the original queries for the full tree.<br /><br />This is sufficient to solve the small problem, and more generally any test case where the colour tree does not branch. For each miniroot, compute the DP, where the miniroot corresponding to the single child (if any) in the colour tree is the cutoff. These DPs can be done in parallel, and the master can combine the results.<br /><br />Let's consider another special case, where the colour tree is shallow. In this case, one can solve one layer at a time, bottom up, without needing to use the cutoff trick at all. The colours in each layer of the colourtree are independent and can be solved in parallel, so the latency is proportional to the depth of the tree. The results of each layer are fed into the calculations of the layer above.<br /><br />So, we have a way to handle long paths, and a way to handle shallow trees. This should immediately suggest light-heavy decomposition. Let the "light-depth" of a node in the colour tree be the number of light edges between it and the root. The native of light-heavy decomposition guarantees that light-depth is at most logarithmic in the tree size (which is 1000). We will process all nodes with the same light-depth in parallel. This means that a node in a tree may be processed at the same time as its children, but only along a heavy path. We thus handle each heavy path using the same technique as in the small problem. For other children in the colour tree, the subtree results were computed in a previous pass and are sent to the slave.</div>Bruce Merrynoreply@blogger.com1tag:blogger.com,1999:blog-31847281.post-45148395329081301192016-11-22T22:37:00.003-08:002016-11-22T22:46:24.051-08:00TCO 2016 finals<div dir="ltr" style="text-align: left;" trbidi="on">The final round of TCO 2016 did not go well for me. I implemented an over-complicated solution to the 500 that did not work (failed about 1 in 1000 of my random cases), and I started implementing the 400 before having properly solved it, leading to panic and rushed attempts to fix things in the code rather than solving the problem.<br /><h3 style="text-align: left;">Easy</h3><div style="text-align: left;">The fact that there are only four mealtimes is a clue that this will probably require some exponential time solution. I haven't worked out the exact complexity, but it has at least a factor of \(2^\binom{n}{n/2}\) for n mealtimes.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Let's restrict our attention to meal plans that contain no meals numbered higher than m and have t meals in the plan (t can be less than 4). We can further categorise these meals by which subsets of the mealtimes can accommodate the plan. This will form the state space for dynamic programming. To propagate forwards from this state, we iterate over the number of meals of type m+1 (0 to 4 of them). For each choice, we determine which subsets of mealtimes can accommodate this new meal plan (taking every subset that can accommodate the original plan and combining it with any disjoint subset that can accommodate the new meals of type m+1).<br /><h3 style="text-align: left;">Medium</h3><div style="text-align: left;">Let's start by working out whether a particular bracket sequence can be formed. Consider splitting the sequence into a prefix and suffix at some point. If the number of ('s in the prefix differs between the initial and target sequence, then the swaps that cross the split need to be used to increase or decrease the number of ('s. How many ('s can we possibly move to the left? We can answer that greedily, by running through all the operations and applying them if and only if they swap )( to (). Similarly, we can find the least possible number of ('s in the final prefix.<br /><br />If our target sequence qualifies by not having too many or too few ('s in any prefix, can we definitely obtain it? It turns out that we can, although it is not obvious why, and I did not bother to prove it during the contest. Firstly, if there is any (proper) prefix which has exactly the right number of ('s, we can split the sequence at this point, choose not to apply any operations that cross the split, and recursively figure out how to satisfy each half (note that at this point we are not worrying whether the brackets are balanced — this argument applies to any sequence of brackets).<br /><br />We cannot cross between having too few and too many ('s when growing the prefix without passing through having the right number, so at the bottom of our recursion we have sequences where every proper prefix has too few ('s (or too many, but that can be handled by symmetry). We can solve this using a greedy algorithm: for each operation, apply it if and only if it swaps )( to () and the number of ('s in the prefix is currently too low. We can compare the result after each operation to the maximising algorithm that tries to maximise the number of ('s in each prefix. It is not hard to see (and prove, using induction) that for any prefix and after each operation is considered, the number of ('s in the prefix is:</div><ul style="text-align: left;"><li>between the original value and the target value (inclusive); and</li><li>the smaller of the target value and the value found in the maximising algorithm.</li></ul><div style="text-align: left;">Thus, after all operations have been considered, we will have reached the target.<br /><br />So now we know how to determine whether a specific bracket sequence can be obtained. Counting the number of possible bracket sequences is straightforward dynamic programming: for each prefix length, we count the number of prefixes<br />that are valid bracket sequence prefixes (no unmatched right brackets) for each possible nesting depth.</div><h3 style="text-align: left;">Hard</h3><div style="text-align: left;">This is an exceptionally mean problem, and I very much doubt I would have solved it in the fully 85 minutes; congratulations to Makoto for solving all three!<br /><br />For convenience, let H be the complement of G2 (i.e. n copies of G1). A Hamiltonian path in G2 is simply a permutation of the vertices, such that no two consecutive vertices form an edge in H. We can count them using inclusion-exclusion, which requires us to count, for each m, the number of paths that pass through any m chosen edges. Once we pick a set of m edges and assign them directions (in such a way that we have a forest of paths), we can permute all the vertices that are not tails of edges, and the remaining vertices have their positions uniquely determined.<br /><br />Let's start by solving the case n=1. Since k is very small, we can expect an exponential-time solution. For every subset S of the k vertices and every number m of edges, we can count the number of ways to pick m edges with orientation. We can start by solving this for only connected components, which means that m = |S| - 1 and the edges form a path. This is a standard exponential DP that is used in the Travelling Salesman problem, where one counts the number of paths within each subset ending at each vertex.<br /><br />Now we need to generalise to m < |S| - 1. Let v be the first vertex in the set. We can consider every subset for the path containing v, and then use dynamic programming to solve for the remaining set of vertices. This is another exponential DP, with a \(O(3^n)\) term.<br /><br />Now we need to generalise to n > 1. If a graph contains two parts that are not connected, then edges can be chosen independently, and so the counts for the graph are obtained by convolving the counts for the two parts. Unfortunately, a naïve implementation of this convolution would require something like O(n²k) time, which will be too slow. But wait, why are we returning the answer modulo 998244353? Where is our old friend 1000000007, and his little brother 1000000009? In fact, \(998244353 = 2^{23} \times 7 \times 17 + 1\), which strongly suggests that the Fast Fourier Transform will be involved. Indeed, we can perform an 1048576-element FFT in the field modulo 998244353, raise each element to the power of n, and then invert the FFT.<br /><br />Despite the large number of steps above, the solution actually requires surprisingly little new code, as long as one has library code for the FFT and modulo arithmetic.</div></div></div>Bruce Merrynoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-42541078882516302972016-11-21T03:59:00.003-08:002016-11-21T03:59:32.870-08:00TCO 2016 semifinal 2 solutions<div dir="ltr" style="text-align: left;" trbidi="on">Semifinal 2 was much tougher than semifinal 1, with only one contestant correctly finishing two problems. The first two problems each needed a key insight, after which very little coding is needed (assuming one has the appropriate library code available). The hard problem was more traditional, requiring a serious of smaller steps to work towards the solution.<br /><h3 style="text-align: left;">Easy</h3><div style="text-align: left;">Consider the smallest special clique. What happens if we remove an element? Then we will still have a clique, but the AND of all the elements is non-zero. In particular, there is some bit position which is 1 in every element of the special clique except for 1.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">We can turn that around: it suffices to find a bit position B, a number T which is zero on bit B, and a special clique containing T whose other elements are all 1 on bit B. We can easily make a list of candidates to put in this special clique: any element which is 1 on bit B and which is connected to T. What's more, these candidates are all connected to each other (through bit B), so taking T plus all candidates forms a clique. What's more, if we can make a special clique by using only a subset of the candidates, then the clique formed using all candidates will be special too, so the latter is the only one we need to test.</div><h3 style="text-align: left;">Medium</h3><div style="text-align: left;">The key insight is to treat this as a flow/matching problem. Rather than thinking of filling and emptying an unlabelled tree, we can just build a labelled tree with the properties that if A's parent is B, then B < A and A appears earlier in p than B does. A few minutes thinking should convince you that any labelled tree satisfying these constraints can be used for the sorting.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">We thus need to match each number to a parent (either another number or the root). We can set this up with a bipartite graph in the usual way: each number appears on the left, connected to a source (for network flow) with an edge of capacity 1. Each number also appears on the right, along with the root, all connected to the sink with N edges of capacity 1 (we'll see later why I don't say one edge of capacity N). Between the two sides, we add an edge A → B if B is a viable parent for A, again with capacity 1. Each valid assignment can be represented by a maxflow on this network (with flow N), where the edges across the middle are parent-child relationships. Note that normally using this approach to building a tree risks creating cycles, but the heap requirement on the tree makes that impossible.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">That tells us which trees are viable, but we still need to incorporate the cost function. That is where the edges from the right numbers to the sink come in. A (non-root) vertex starts with a cost of 1, then adding more children increases the cost by 3, 5, 7, ... Thus, we set the costs of the edges from each number to the sink to this sequence. Similarly, the costs of the edges from the root to the sink have costs 1, 3, 5, 7, ... The min-cost max-flow will automatically ensure that the cheaper edges get used first, so the cost of the flow will match the cost of the tree (after accounting for the initial cost of 1 for each number, which can be handled separately).</div><h3 style="text-align: left;">Hard</h3><div style="text-align: left;">This one requires a fairly solid grasp on linear algebra and vector spaces. The set with \(2^k\) elements is a vector subspace of \(\mathbb{Z}_2^N\) of dimension k, for some unknown N. It will have a basis of size k. To deal more easily with ordering, we will choose to consider a particular canonical basis. Given an arbitrary initial basis, one can use Gaussian reduction and back propagation to obtain a (unique) basis with the following property: the leading 1 bit in each basis element is a 0 bit in every other basis element. With this basis, it is not difficult to see that including any basis element in a sum will increase rather than decrease the sum. The combination of basis elements forming the ith sorted element is thus given exactly by the binary representation of i.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">We can now take the information we're given an recast it is a set of linear equations. Furthermore, we can perform Gaussian elimination on these linear equations (which we actually do in code, unlike the thought experiment Gaussian elimination above). This leaves some basis elements fixed (uniquely determined by the smaller basis elements) and others free. Because of the constraints of the basis elements, we also get some more information. If basis i appears in an equation, its leading bit must correspond to a 1 bit in the value (and to the first 1 bit, if i is the leading basis in the equation). Similarly, if basis i does not appear, then it must correspond to a 0 bit in the value. This allows us to build a mask of where the leading bit for each basis element can be.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">We now switch to dynamic programming to complete the count. We count the number of ways to assign the first B basis elements using the first C value bits. Clearly dp[0][C] = 1. To add new basis, we can consider the possible positions of its leading bit (within the low C bits), and then count the degrees of freedom for the remaining bits. If the basis is fixed then there is 1 degree of freedom; otherwise, there is a degree of freedom for each bit that isn't the leading bit of a previous basis, and it is easy to count these.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">There are a few details that remain to do with checking whether there are an infinite number of solutions (which can only happen if the largest basis is unconstrained), and distinguishing between a true 0 and a 0 that is actually a multiple of 1000000007, but those are left as exercises for the reader.</div></div>Bruce Merrynoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-76213296937327593642016-11-20T09:10:00.005-08:002016-11-20T09:15:22.277-08:00Topcoder Open 2016 Semifinal 1<div dir="ltr" style="text-align: left;" trbidi="on">Here are my solutions to the first semifinal of the TCO 2016 algorithm contest.<br /><h3 style="text-align: left;">Easy</h3><div style="text-align: left;">Let's start with some easy cases. If s = t, then the answer is obviously zero. Otherwise, it is at least 1. If s and t share a common factor, then the answer is 1, otherwise it is at least 2.<br /></div><div style="text-align: left;">We can eliminate some cases by noting that some vertices are isolated. Specifically, if either s or t is 1 or is a prime greater than n/2 (and s≠t) then there is no solution.<br /></div><div style="text-align: left;">Assuming none of the cases above apply, can the answer be exactly 2? This would mean a path s → x → t, where x must have a common factor with s and a (different) common factor with t. The smallest x can be is thus the product of the smallest prime factors of s and t. If this is at most n, then we have a solution of length 2, otherwise the answer is at least 3.<br /></div><div style="text-align: left;">If none of the above cases applies, then answer is in fact 3. Let p be the smallest prime factor of s and q be the smallest prime factor of t. Then the path s → 2p → 2q → t works (note that 2p and 2q are at most n because we eliminated cases where s or t is a prime greater than n/2 above).</div><h3 style="text-align: left;">Medium</h3><div style="text-align: left;">We can create a graph with the desired properties recursively. If n is 1, then we need just two vertices, source and sink, with an edge from source to sink of arbitrary weight.<br /></div><div style="text-align: left;">If n is even, we can start with a graph with n/2 min cuts and augment it. We add a new vertex X and edges source → X and X → sink, both of weight 1. For every min cut of the original graph, X can be placed on either side of the cut to make a min cut for the new graph.<br /></div><div style="text-align: left;">If n is odd, we can start with a graph with n - 1 min cuts and augment it. Let the cost of the min cut of the old graph be C. We create a new source, and connect it to the original source with an edge of weight C. The cost of the min cut of this new graph is again C (this can easily be seen by considering the effect on the max flow rather than the min cut). There are two ways to achieve a cut of cost C: either cut the new edge only, or make a min cut on the original graph.<br /></div><div style="text-align: left;">We just need to check that this will not produce too many vertices. In the binary representation of n, we need a vertex per bit and a vertex per 1 bit. n can have at most 10 bits and at most 9 one bits, so we need at most 19 vertices.</div><h3 style="text-align: left;">Hard</h3><div style="text-align: left;">I made a complete mess of this during the contest, solving the unweighted version and then trying to retrofit it to handle a weighted version. That doesn't work.<br /><br />Here's how to do it, which I worked out based on some hints from tourist and coded at 3am while failing to sleep due to jetlag. Firstly, we'll make a change that makes some of the later steps a little easier: for each pair of adjacent vertices, add an edge with weight ∞. This won't violate the nesting property, and guarantees that there is always a solution. Let the <i>length</i> of an edge be the difference in indices of the vertices (as opposed to the <i>cost</i>, which is an input). Call an edge <i>right-maximal</i> if it's the longest edge emanating from a particular vertex, and <i>left-maximal</i> if it is the longest edge terminating at a particular vertex. A useful property is that an edge is always either right-maximal, left-maximal, or both. To see this, imagine an edge that is neither: it must then be nested inside longer edges terminating at each end, but these longer edges would cross, violating the given properties.<br /><br />We can now make the following observations:<br /><ol style="text-align: left;"><li>If x → y is right-maximal and color[x] is used, then color[y] must be used too. This is because the path must pass through x, and after x cannot skip over y.</li><li>If x → y is the only edge leaving x, then the edge will appear in a path if and only if color[x] is used.</li><li>If x → y is right-maximal and x → z is the next-longest edge from x, then x → y is in a path if and only if color[x] is used and color[z] is not used. </li></ol>These statements all apply similarly for left-maximal edges of course. We can consider vertices 0 and n to be of colour 0 in the above, which is a color that must be picked. What is less obvious is that condition 1 (together with its mirror) is sufficient to determine whether a set of colours is admissible. Suppose we have a set of colours that satisfies condition 1, and consider the subset S of vertices whose colours come from this set. We want to be sure that every pair of consecutive vertices in S are linked by an edge. Suppose there is a pair (u, v) which are not. Consider the right-maximal edge from u: by condition 1, it must terminate at some vertex in S, which must then be to the right of v. But the left-maximal edge to v must start to the left of u, and so these two edges cross.<br /><br />We can now use this information to construct a new graph, whose minimum cut will give us the cost of the shortest path. It has a vertex per colour, a source vertex corresponding to the special colour 0, and a sink vertex not corresponding to a colour. The vertices on the source side of the cut correspond to the colours that are visited. Every time colour C implies colour D, add an edge C → D of weight ∞, to prevent C being picked while D is not. If an edge falls into case 2 above, add an edge from color[x] to the sink with weight equal to the edge cost. Otherwise it falls into case 3; add an edge from color[x] to color[y] with weight equal to the edge cost. Each edge in the original graph now corresponds to an edge in this new graph, and the original edge is traversed if and only if the new edge forms part of the cut.</div></div>Bruce Merrynoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-81301064002682603772016-06-13T01:44:00.003-07:002016-06-13T01:44:48.519-07:00Code Jam and Distributed Code Jam 2016 solutions<div dir="ltr" style="text-align: left;" trbidi="on"><h2 style="text-align: left;">Code Jam</h2><div style="text-align: left;">Since the official analysis page still says "coming soon", I thought I'd post my solutions to both the Code Jam and Distributed Code Jam here.</div><h3 style="text-align: left;">Teaching Assistant</h3><div style="text-align: left;">We can start by observing that when we take a new task, it might as well match the mood of the assistant: doing so will give us at least 5 points, not doing so will give us at most 5. Also, there is no point in getting to the end of the course with unsubmitted assignments: if we did, we should not have taken that assignment, instead submitting a previous assignment (a parity argument proves that there was a previous assignment). So what we're looking for is a way to pair up days of the course, obeying proper nesting, scoring 10 points if the paired days have the same mood and 5 points otherwise.<br /><br />That's all we need to start on the small case, which can be done with a standard DP. For every even-length interval of the course, we compute the maximum score. We iterate to pick the day to match up to the first day of the interval, then use previous results to find the best scores for the two subintervals this generates.<br /><br />That's O(N³), which is too slow for the large case. It turns out that a greedy approach works: every day, if the mood matches the top of the stack, submit, otherwise take a new assignment. The exception is that once the stack height matches the number of remaining days, one must submit every day to ensure that the stack is emptied.<br /><br />I haven't quite got my head around a proof, but since it worked on the test case I had for the small input I went ahead and submitted it.<br /><h3 style="text-align: left;">Forest university</h3><div style="text-align: left;">The very weak accuracy requirement, and the attention drawn to it, is a strong hint that the answer should be found by simulation rather than analytically. Thus, we need to find a way to uniformly sample a topological walk of the forest. This is not as simple as always uniformly picking one of the available courses. Consider one tree in the forest: it can be traversed in some number of ways (all of them starting with the root of the tree), and the rest of the forest can be traversed in some number of ways, and then these can be interleaved arbitrarily. If we consider all ways to interleave A items and B items, then A/(A+B) of them will start with an element of A. Thus, the probability that the root of a particular tree will be chosen is proportional to the size of that tree.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">After picking the first element, the available courses again form a forest, and one can continue. After uniformly sampling a sequence of courses, simply check which of the cool words appear on the hat.</div><h3 style="text-align: left;">Rebel against the empire</h3><div style="text-align: left;">I consider this problem to be quite a bit harder than D. It's easier to see what to do, but reams of tricky code.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">For the small case, time is not an issue because the asteroids are static. Thus, one just needs to find the bottleneck distance to connect asteroids 0 and 1. I did this by adding edges from shortest to longest to a union-find structure until 0 and 1 were connected (ala Kruskal's algorithm), but it can also be done by priority-first search (Prim's algorithm) or by binary searching for the answer and then checking connectivity.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">For the large case we'll take that last approach, binary searching over the answers and then checking whether it is possible. Naturally, we will need to know at which times it's possible to make jumps between each pair of asteroids. This is a bit of geometry/algebra, which gives a window of time for each pair (possibly empty, possibly infinite). Now consider a particular asteroid A. In any contiguous period of time during which at least one window is open, it's possible to remain on A, regardless of S, by simply jumping away and immediately back any time security are about to catch up. Also, if two of these intervals are separated in time by at most S, they can be treated as one interval, because one can sit tight on A during the gap between windows. On the other hand, any period longer than S with no open windows is of no use.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The key is to ask for the earliest time at which one can arrive at each interval. This can be done with a minor modification of Dijkstra's algorithm. The modification is that the outgoing edges from an interval are only those windows which have not already closed by the time of arrival at the interval.</div><h3 style="text-align: left;">Go++</h3><div style="text-align: left;">I liked this problem, and I really wish the large had been worth fewer points, because I might then have taken it on and solved it.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The key is to realise that the good strings don't really matter. Apart from the trivial case of the bad string also being a good string, it is always possible to solve the problem by producing a pair of programs that can produce <i>every</i> string apart from the bad one.</div><div style="text-align: left;"><br />For the small case, we can use 0?0?0?0? (N repetitions) for the first program, and 111 (N-1 repetitions) for the second (but be careful of the special case N=1!) Each of the 1's can be interleaved once into the first program to produce a 1 in the output, but because there are only N-1 1's, it is impossible to produce the bad string.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">For the large case, the same idea works, but we need something a bit more cunning. The first program is basically the same: alternating digits and ?'s, with the digits forming the complement of the bad string. To allow all strings but the bad string to be output, we want the second program to be a string which does not contain the bad string as a subsequence, but which contains every subsequence of the bad string as a subsequence. This can be achieved by replacing every 1 in the bad string by 01 and every 0 by 10, concatenating them all together, then dropping the last character. Proving that this works is left as an exercise for the reader.</div><h2 style="text-align: left;">Distributed Code Jam</h2><h3 style="text-align: left;">Again</h3><div style="text-align: left;">The code appears to be summing up every product of an element in A with an element in B. However, closer examination shows that those where the sum of the indices is a multiple of M (being the number of nodes) are omitted. However, once we pick an index modulo M for each, we're either adding all or none. The sum of all products of pairs is the same as the product of the sums. So, for each remainder modulo M, we can add up all elements of A, and all elements of B. Then, for each pair of remainders, we either multiply these sums together and accumulate into the grand total, or not. There are some details regarding the modulo 1000000007, but they're not difficult.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">For the large case, we obviously need to distribute this, which can be done by computing each sum on a different node, then sending all the sums back to the master.</div><h3 style="text-align: left;">lisp_plus_plus</h3><div style="text-align: left;">Here, a valid lisp program is just a standard bracket sequence (although it is required to be non-empty). It's well-known that a sequence is valid if and only if:</div><ul style="text-align: left;"><li>the nesting level (number of left brackets minus number of right brackets) of every prefix is non-negative; and</li><li>the nesting level of the whole sequence is zero.</li></ul></div><div style="text-align: left;">If the nesting level ever goes negative in a left-to-right scan, then the point just before it went negative is the longest sequence that can be completed. This makes the small case trivial to implement.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">To distribute this, we can do the usual thing of assigning each node an interval to work on. We can use a <i>parallel prefix sum</i> to find the nesting level at every point (there is lots of information on the internet about how to do this). Then, each node can find it's first negative nesting level, and send this back to a master to find the globally first one. We also need to know the total nesting level by sending the sum for each interval to the master, but that's already done as part of the parallel prefix sum.</div><div style="text-align: left;"><h4 style="text-align: left;">Asteroids</h4><div style="text-align: left;">I liked this one better than the Code Jam asteroids problem, but it was also rather fiddly. The small case is a straightforward serial DP: going from bottom to top, compute the maximum sum possible when ending one's turn on each location.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">At first I thought that this could be parallelised in the same manner as Rocks or Mutex from last year's problems. However, those had only bottom-up and left-to-right propagation of DP state. In this problem, we have bottom-up, left-to-right <i>and</i> right-to-left. On the other hand, the left-to-right and right-to-left propagation is slow: at one most unit for every unit upwards. We can use this!</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Each node will be responsible for a range of columns. However, let's say it starts with knowledge of the prior state for B extra columns on either side. Then after one row of DP iteration, it will correctly know the state for B-1 extra columns, and after B iterations it will still have the correct information for the columns its responsible for. At this stage it needs to exchange information with its neighbours: it will receive information about the B edge columns on either side, allowing it to continue on its way for another B rows.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">There are a few issues involved in picking B. If it's too small (less than 60), then nodes will need to send more than 1000 messages. If it's too large, then nodes will do an excessive number of extra GetPosition queries. Also, B must not be larger than the number of columns a node is handling; in narrow cases we actually need to reduce the number of nodes being used.</div><h3 style="text-align: left;">Gas stations</h3><div style="text-align: left;">The serial version of this problem is a known problem, so I'll just recap a standard solution quickly. At each km, you need to decide how much petrol to add. If there is a town within T km that is cheaper, you should add just enough petrol to reach it (the first one if there are multiple), since any extra you could have waited until you got there. Otherwise, you should fill up completely. A sweep algorithm can give the next cheaper town for every town in linear time.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">What about the large case? The constraint is interesting: if we had enough memory, and if GetGasPrice was faster, there is enough time to do it serially! In fact, we <i>can</i> do it serially, just passing the baton from one node to the next as the car travels between the intervals assigned to each node.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">What about computing the next cheaper station after each station? That would be difficult, but we don't actually need it. We only need to know whether there is a cheaper one within T km. We can start by checking the current node, using a node-local version of the next-cheapest array we had before. If that tells us that there is nothing within the current node, and T km away is beyond the end of the current node, then we will just find the cheapest within T km and check if that is cheaper than the current value. The range will completely overlap some nodes; for those nodes, we can use a precomputed array of the cheapest for each node. There will then be some prefix of another node. We'll just call GetGasPrice to fetch values on this node. Each iteration this prefix will grow by one, so we only need one call to GetGasPrice per iteration (plus an initial prefix). This means we're calling GetGasPrice at most three times per station (on average), which turns out to the fast enough. It's possible to reduce it to 2 by computing the minimum of the initial prefix on the node that already has the data, then sending it to the node that needs it.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">There is one catch, which caused me to make a late resubmission. I was calling GetGasPrice to grow the prefix in the <i>serial</i> part of the code! You need to make all the calls up front in the parallel part of the code and cache the results.</div></div></div>Bruce Merrynoreply@blogger.com1tag:blogger.com,1999:blog-31847281.post-43314163602563016682016-03-06T10:52:00.000-08:002016-03-06T10:52:05.508-08:00Solutions to Facebook Hacker Cup 2016<div dir="ltr" style="text-align: left;" trbidi="on"><div><div>Since the official solutions aren't out yet, here are my solutions. Although I scored 0 in the contest, four of my solutions were basically correct, one I was able to solve fairly easily after another contestant pointed out that my approach was wrong, and I figured out Grundy Graphs on the way home.<br /><h3 style="text-align: left;">Snake and Ladder</h3><div style="text-align: left;">I'll describe two solutions to this. There are a few things one can start off with. Firstly, if there are rows at the top or bottom that are completely full of flowers, just cut them off — they make no difference. Then, handle N=1 specially: if K=1, the answer is 1, if K=0, it is 2. A number of people (including all the contest organisers!) messed up the first, and I messed up the second. Also, if one rung has two flowers on it, the answer is 0.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">My contest solution used dynamic programming. If one cuts the ladder half-way between two rungs and considers the state of the bottom half, it can be divided into four cases:</div><ol style="text-align: left;"><li>Only the left-hand side has the snake on it.</li><li>Only the right-hand side has the snake on it.</li><li>Both sides have the snake on it, and the two are connected somewhere below.</li><li>Both sides have the snake on it, and the two do not connect.</li></ol></div>It is reasonably straightforward (as long as one is careful) to determine the dynamic programming transitions to add a rung, for each possibility of a flower on the left, on the right, or no flower. Advancing one rung at a time will be too slow for the bound on N, but the transitions can be expressed as 4×4 matrices and large gaps between flowers can be crossed with fast matrix exponentiation. Finally, one must double the answer to account for the choice of which end is the head.<br /><br />The alternative solution is to argue on a case-by-case basis. If there no flowers, then one can pick a row for the head and a row for the tail, and pick a column for the head (the column for the tail is then forced by parity). Once these choices are made, there is only one way to complete the snake. Also, the head and tail can only occupy the same row if it is the top or bottom row, giving 2N(N-1)+4 ways.<br /><br />If there is at least one flower, then the head must be above the top flower and the tail below the bottom flower (or vice versa). Again, one can choose a row for each, with the column being forced by parity. One must also consider special cases for a flower in the top/bottom row.<br /><h3 style="text-align: left;">Boomerang Crew</h3><div style="text-align: left;">I think this was possibly the easiest of the problems, provided you thought clearly about it (which I failed to do). Clearly any opponents weaker than (or equal to) your champion can be defeated just be putting them up against your champion. Also, if you're going to defeat P players, they might as well be the weakest P players of the opposition. For a given P, can this be done? There will be a certain number of strong players (stronger than our champion), which we will have to sacrifice our weaker players against to wear them down. What's more, once we've worn one down and beaten him/her, we have to use the winner of that game to wear down the next strong opponent. Thus, we should match up the S strong players against our best S players in some order, and use our remaining players as sacrifices. Ideally we'd like to wear down each opponent to exactly the skill level of the player we will use to win; anything more is wasted effort. We can apply this idea greedily: for each opponent, find the remaining strong player of ours who can win by the smallest margin.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">With that method to test a particular P, it is a standard binary search to determine the number of players we can beat.</div><h3 style="text-align: left;">Grundy Graph</h3><div style="text-align: left;">I think this was the hardest problem: my submission wasn't a real solution and was just for a laugh; several other people I spoke to had submitted but didn't believe in their solution. During the contest I suspected that the solution would be related to 2-SAT, but it wasn't until the flight home that I could figure out how to incorporate turn ordering.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Let us construct an auxiliary directed graph A whose vertices correspond to those of the original. An edge u->v in A means that if u is black, then v must be black as well for Alice to win. All the edges in the original graph are clearly present in A. However, for each u->v in the original, we also add v'->u', where x' is the vertex assigned a colour at the same time as x. This is because if v' is black but u' is white, then it implies that u is black and v is white. This is the same as the contrapositive edge in a 2-SAT graph. Let x=>y indicate that y is reachable from x in A.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">There are a number of situations in which we can see that Bob can win:</div><ul style="text-align: left;"><li>If any strongly connected component contains both x and x', then Bob wins regardless of what Alice and Bob play.</li><li>If Bob controls vertices x and y and x=>y, then Bob wins by assigning x black and y white, regardless of Alice's play.</li><li>If Bob controls x, Alice controls y, x=>y, x' => y', and y is played before x, then Bob wins by assigning y the opposite colour to x.</li></ul>We claim that if none of the above apply, then Alice wins. Her strategy is that on her turn, she must colour x black if there is any u such that u => x and either u has already been coloured black, u could later be coloured black by Bob, or u = x'. This ensures that it does not immediately allow Bob to win, nor allows Bob to use it to win later. The only way this could fail is if both x and x' are forced to be black (if neither is forced, Alice can pick freely).<br /><br />Suppose u => x and v => x', where u and v are as described above. Then x => v', so u => v'. Firstly consider u = x'. We cannot have v = x (because otherwise x, x' are in the same SCC). Now x' => v', so v => v'. If Bob controls v, then we hit the second case above; if Alice controls v, then v => v' means she would never have chosen v. So we have a contradiction. Now, if u ≠ x' (and by symmetry, v ≠ x), then consider u => v' again. Alice cannot control v/v', since she would then have chosen v' rather than v. But similarly, v => u', so Alice couldn't have chosen u. It follows that Bob controls u and v, which can only be possible if u = v'. But u and u' can only both be relevant if Bob hasn't decided their colour yet, which is the third case above.<br /><br />The implementation details are largely left as an exercise. Hint: construct the SCCs of A, and the corresponding DAG of the SCCs, then walk it in topological order to check the conditions. The whole thing takes linear time.<br /><h3 style="text-align: left;">RNG</h3><div style="text-align: left;">The concept here is a reasonably straightforward exponential DP, but needs a careful implementation to get all the formulae right, particularly in the face of potential numeric instability. The graph itself isn't particularly relevant; the only "interesting" vertices are the start and the gifts, and the only useful information is the distance from each interesting vertex to each other one (if reachable) and the distance from each gift to the nearest dead-end. The DP state is the set of gifts collected and the current position (one of the interesting vertices). When one is at a gift, there are two possible options:</div><ol style="text-align: left;"><li>Pick another gift, and make towards it along the shortest path. Based on the path length L, there is a probability P^L of reaching it in time DL, and a probability 1 - P^L of respawning, with an expected time computable from a formula.</li><li>Aim to respawn by heading for the nearest dead-end. Again, one can work out a formula for the expected time to respawn.</li></ol></div>When one is at the start, one should of course pick a gift and head for it. Again, one can compute an expected time to reach it (one obtains a formula that includes its own result, but a little algebra sorts that out).<br /><h3 style="text-align: left;">Maximinimax flow</h3><div style="text-align: left;">I think I liked this problem the best, even though I screwed up the limits on the binary search. Firstly, what is the minimax flow? For a general graph it would be nasty to compute, but this graph has the same number of edges as vertices. This means that it has exactly one cycle with trees hanging off of it. It shouldn't be too hard to convince yourself that the minimax flow is the smaller of the smallest edge outside the cycle and the sum of the two smallest edges inside the cycle.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Veterans will immediately reach for a binary search, testing whether it is possible to raise the minimax flow to a given level. For the edges outside the ring, this is a somewhat standard problem that can be solved with query structures e.g. Fenwick trees, indexed by edge value. To raise the minimum to a given level, one needs to know the number of edges below that level (one Fenwick tree, storing a 1 for each edge), and their current sum (a second tree, storing the edge weight for each edge).</div><div style="text-align: left;"><br /></div><div style="text-align: left;">For the vertices inside the cycle it is only slightly more complex. If the required level is less than double the second-smallest edge, then one need only augment the smallest edge. Otherwise one raises all edges to half the target, with a bit of special-casing when the target is odd. The Fenwick tree can also be queried to find the two smallest edges, but I just added a std::multiset to look this up.</div><h3 style="text-align: left;">Rainbow strings</h3><div style="text-align: left;">This seemed to be one of the most-solved problems, and is possibly the easiest if you have a suffix tree/array routine in your toolbox. A right-to-left sweep will yield the shortest and longest possible substring for each starting position (next green for the shortest, next red for the longest). The trick is to process queries in order of length. Each entry in the suffix array will become usable at some length, and unusable again at another length, and these events, along with the queries, can be sorted by length for processing. The only remaining work is to determine the Kth of the active suffixes, which can be done with a Fenwick tree or other query structure.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Finally, one needs the frequency table for each prefix to be able to quickly count the frequency in each substring.</div></div>Bruce Merrynoreply@blogger.com2tag:blogger.com,1999:blog-31847281.post-72473165469976218642015-05-20T13:58:00.001-07:002015-05-21T06:52:37.720-07:00Analysis of ICPC 2015 world finals problems<div dir="ltr" style="text-align: left;" trbidi="on"><h2 style="text-align: left;">Analysis of ICPC 2015 world finals problems</h2><div style="text-align: left;">I haven't solved all the problems yet, but here are solutions to those I have solved (I've spent quite a bit more than 5 hours on it though).</div><h3 style="text-align: left;">A: Amalgamated Artichokes</h3><div style="text-align: left;">This is a trivial problem that basically everybody solved - I won't go into it.</div><h3 style="text-align: left;">B: Asteroids</h3><div style="text-align: left;">This looks quite nasty, but it can actually be solved by assembling a number of fairly standard computational geometry tools, and the example input was very helpful. To simplify things, let's switch to the frame of reference of the first asteroid, and assume only the second asteroid moves (call them P and Q for short). The first thing to notice is that the area is a piecewise quadratic function of time, with changes when a vertex of one polygon crosses an edge of the other. This is because the derivative of area depends on the sections of boundary of Q inside P, and those vary linearly in those intervals. Finding the boundaries between intervals is a standard line-line intersection problem. To distinguish between zero and "never", touching is considered to be an intersection too.</div><div style="text-align: left;">We also need a way to compute the area of the intersection. Clipping a convex polygon to a half-space is a standard problem too - vertices are classified as inside or outside, and edges that go from inside to outside or vice versa introduce a new interpolated vertex, so repeated clipping will give the intersection of convex polygons. And finding the area of a polygon is also very standard.</div><div style="text-align: left;">Finally, we need to handle the case where the best area occurs midway through one of the quadratic pieces. Rather than try to figure out the quadratic coefficients geometrically, I just sampled it at three points (start, middle, end) and solved for the coefficients, and hence the maximum, with algebra.</div><h3 style="text-align: left;">C: Catering</h3><div style="text-align: left;">I'm sure I'm seen something similar, but I'm not sure where. It can be set up as a min-cost max-flow problem. Split every site into two, an arrival area and a departure area. Connect the source to each departure area, with cap 1 for clients and cap K for the company. Connect each arrival area to the sink similarly. Connect every departure area to every later arrival area with cap 1, except for the company->company link with cap K. Edge costs match the table of costs where appropriate, zero elsewhere. The maximum flow is now K+N (K teams leave the company, and equipment arrives at and leaves every client), and the minimum cost is the cost of the operation.</div><h3 style="text-align: left;">D: Cutting cheese</h3><div style="text-align: left;">This is basically just a binary search for each cut position, plus some mathematics to give the volume of a sphere that has been cut by a plane.</div><h3 style="text-align: left;">E: Parallel evolution</h3><div style="text-align: left;">Start by sorting the strings by length. A simplistic view is that we need to go through these strings in order, assigning each to one path or the other. This can be done with fairly standard dynamic programming, but it will be far too slow.<br />Add an "edge" from each string to the following one if it is a subsequence of this following one. This will break the strings up into connected chains. A key observation is that when assigning the strings in a chain, it is never necessary to switch sides more than once: if two elements in a chain are on one path, one can always put all the intervening elements on the same path without invalidating a solution. Thus, one need only consider two cases for a chain: put all elements in one path (the opposite path to the last element of the previous chain), or start the chain on one path then switch to the other path part-way through. In the latter case, one should make the switch as early as possible (given previous chains), to make it easier to place subsequent strings. A useful feature is that in either case, we know the last element in both paths, which is all that matters for placing future chains. Dynamic programming takes care of deciding which option to use. Only a linear number of subsequence queries is required.<br />I really liked this problem, even though the implementation in messy. It depends only on the compatibility relationship being a partial order and there being a cheap way to find a topological sort.</div><h3 style="text-align: left;">F: Keyboarding</h3><div style="text-align: left;">This can be done with a fairly standard BFS. The graph has a state for each cursor position and number of completed letters. There are up to 5 edges from each state, corresponding to the 5 buttons. I precomputed where each arrow key moves to from each grid position, but I don't know if that is needed. It's also irrelevant that keys form connected regions.<br /><h3 style="text-align: left;">H: Qanat</h3><div style="text-align: left;">This is a good one for a mathsy person to work out on paper while someone else is coding - the code itself is trivial. Firstly, the slope being less than one implies that dirt from the channel always goes out one of the two nearest shafts - whichever gets it to the surface quicker (it helps to think of there being a zero-height shaft at x=0). Between each pair of shafts, one can easily find the crossover point.</div><div style="text-align: left;">Consider just three consecutive shafts, and the cost to excavate them and the section of channel between them. If the outer shafts are fixed, the cost is a quadratic in the position of the middle shaft, from which one can derive a formula for its position in terms of its neighbours. This gives a relationship of the form x<sub>i+2</sub> - gx<sub>i+1</sub> + x<sub>i</sub> = 0. In theory, this can now be used to iteratively find the positions of all the shafts, up to a scale factor which is solved by the requirement for the final shaft (the mother well) to be at x=W.<br />I haven't tested it, but I think evaluating the recurrence could lead to catastrophic rounding problems, because errors introduced early on are exponentially scaled up, faster than the sequence itself grows. The alternative I used is to find a closed formula for the recurrence. This is a known result: let a and b be the roots of x<sup>2</sup> - gx + 1 = 0; then the ith term is r(a<sup>i</sup> - b<sup>i</sup>) for the scale factor r. Finding the formula for g shows that the roots will always be real (and distinct) rather than complex.</div></div><h3 style="text-align: left;">I: Ship traffic</h3><div style="text-align: left;">This mostly needed some careful implementation. For each ship, there is an interval during which the ferry cannot launch. Sort the start and end times of all these intervals and sweeping through them, keeping track of how many ships will be hit for each interval between events. The longest such interval with zero collisions is the answer.</div><h3 style="text-align: left;">J: Tile cutting</h3><div style="text-align: left;">Firstly, which sizes can be cut, and in how many ways? If the lengths from the corners of the parallelograms to the corners of the tile are a, b, c, d, then the area is ab + cd. So the number of ways to make a size is the number of ways it can be written as ab + cd, for a, b, c, d > 0. The number of ways to write a number as ab can be computed by brute force (iterate over all values of a and b for which ab <= 500000). The number of ways to write a number as ab + cd is the convolution of this function with itself (ala polynomial multiplication). There are well-known algorithms to do this in O(N log N) time (with an FFT) or in O(N<sup>1.58</sup>) time (divide-and-conquer), and either is fast enough.<br /><h3 style="text-align: left;">K: Tours</h3><div style="text-align: left;">I've got a solution that passes, but I don't think I can fully prove why.<br />I start by running a DFS to decompose the graph into a forest plus edges from a node to an ancestor in the forest. Each such "up" edge corresponds to a simple cycle. Clearly, the number of companies must divide into the length of any cycle, so we can take the GCD of these cycle lengths.<br />If the cycles were all disjoint, this would (I think) be sufficient, but overlapping cycles are a problem. Suppose two cycles overlap. That means there are two points A and B, with three independent paths between them, say X, Y, Z. If X has more than its share of routes from some company, then Y must have less than its share, to balance the tour X-Y. Similarly, Z must have less than its share. But then the tour Y-Z has too few routes from that company. It follows that X, Y and Z must all have equal numbers of routes from each company, and hence the number of companies must divide into each of their lengths.<br />And now for the bit I can't prove: it seems to be sufficient to consider only pairs of the simple cycles corresponding to using a single up edge. I just iterate over all pairs and compute the overlap. That is itself not completely trivial, requiring an acceleration structure to compute kth ancestors and least common ancestors.</div><h3 style="text-align: left;">L: Weather report</h3><div style="text-align: left;">This is an interesting variation of a classic result in compression theory, Hamming codes. The standard way to construct an optimal prefix-free code from a frequency table is to keep a priority queue of coding trees, and repeatedly combine the two least frequent items by giving them a common parent. That can't be done directly here because there are trillions of possible strings to code, but it can be done indirectly by noting that permutations have identical frequency, and storing them as a group rather than individual items. The actions of the original algorithm then need to be simulated, pairing up large numbers of identical items in a single operation.</div></div></div>Bruce Merrynoreply@blogger.com5tag:blogger.com,1999:blog-31847281.post-8725731193559615342014-10-26T02:42:00.001-07:002014-10-26T02:42:54.172-07:002014 NEERC Southern Subregional<div dir="ltr" style="text-align: left;" trbidi="on">The ICPC NEERC South Subregional was mirrored on Codeforces. It was a very nice contest, with some approachable but also challenging problems. Here are my thoughts on the solutions (I solved everything except A, J and L during the contest).<br /><h3 style="text-align: left;">A: Nasta Rabbara</h3><div style="text-align: left;">This is quite a nasty one, and my initial attempt during the contest was all wrong. The first thing to be aware of is that a single query can be answered in O(L log L) time (or less) using a modification on the standard union-find data structure: each edge in the union-find structure is labelled to indicate whether end-points have the same or the opposite parity, and the find operation tracks whether the returned root has the same or opposite parity as the query point. That way, edges that create cycles can be found to be either odd- or even-length cycles. Of course, this won't be fast enough if all the queries are large.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Ideally, one would like a data structure that allows both new edges to be added and existing edges to be removed. That would allow for a sliding-window approach, in which we identify the maximal left end-point for each right end-point. However, the 10 second time limit suggests that this is not the right approach.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Instead, there is a \(O(N+(M+Q)\sqrt{M}\log N)\) solution. Dividing the series into blocks of length \(\sqrt{M}\). For each block, identify all the queries with a right end-point inside the block. Now build up a union-find structure going right-to-left, starting from the left edge of the block. Whenever you hit the left end-point of one of the identified queries, add the remaining episodes for the query (which will all come from inside the block) to answer the query, then undo the effect of these extra episodes before continuing. As long as you don't do path compression, each union operation can be unwound in O(1) time. This will miss queries that occur entirely inside a block, but these can be answered by the algorithm from the first paragraph as they are short.</div><h3 style="text-align: left;">B: Colored blankets</h3><div style="text-align: left;">It turns out that it is always possible to find a solution. Firstly, blankets with no colour can be given an arbitrary colour on one side. It is fairly easy to see that we need only allocate blankets to kits such that each kit contains blankets of at most two colours. Repeat the following until completion:</div><ol style="text-align: left;"><li>If any colour has at most K/N blankets remaining and that colour has not been painted onto any kit, put those blankets into a new kit and paint it with that colour (this might involve zero blankets into that kit).</li><li>Otherwise, pick any colour that has not been painted on a kit. There must be a more than K/N blankets of that colour. Use enough of them to fill up any non-full painted kit. There must be such a kit, otherwise there are more than K blankets in total.</li></ol>Since each kit is painted with a unique colour, this generates at most N kits; and since each kit has K/N blankets in it, it must generate exactly N kits.<br /><h3 style="text-align: left;">C: Component tree</h3><div style="text-align: left;">The only difficulty with this problem is that the tree might be very deep, causing the naive algorithm to spend a lot of time walking up the tree. This can be solved with a heavy-light decomposition. On each heavy path, for each attribute that appears anywhere on the path, store a sorted list (by depth) of the nodes containing that attribute. When arriving on a heavy path during a walk, a binary search can tell where the nearest ancestor with that property occurs on the heavy path. I think this makes each query O(log N) time.</div><h3 style="text-align: left;">D: Data center</h3><div style="text-align: left;">This is a reasonably straightforward sliding window problem. Sort the servers of each type, and start with the minimum number of low voltage servers (obviously taking biggest first). One might also be required to take all the low voltage servers plus some high voltage servers. Then remove the low voltage servers one at a time (smallest first), and after each removal, add high voltage servers (largest first) until the capacity is made up. Then compare this combination to the best solution so far.</div><h3 style="text-align: left;">E: Election</h3><div style="text-align: left;">Firstly, ties can be treated as losses (both within a station and in the election), because we want the mayor to win. When merging two stations, there are two useful results that can occur: win+loss -> win, or loss+loss -> loss; in both cases the number of wins stays the same, while the number of losses goes down by one. So they are equally useful. We can determine the number of viable merges by DP: either the last two can be merged, and we solve for the other N-2; or the last one is untouched, and we solve for the other N-1.</div><h3 style="text-align: left;">F: Ilya Muromets</h3><div style="text-align: left;">Note that the gap closing up after a cut is just a distraction: any set of heads we can cut, can also be cut as two independent cuts of the original set of heads.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">We can determine the best cut for every prefixes in linear time, by keeping a sliding window of sums (or using a prefix sum). Similarly, we can determine the best cut for every suffix. Any pair of cuts can be separated into a cut of a prefix and of the corresponding suffix, so we need only consider each split point in turn.</div><h3 style="text-align: left;">G: FacePalm</h3><div style="text-align: left;">Let's consider each k contiguous days in turn, going left to right. If the current sum is non-negative, we need to reduce some of the values. We might as well reduce the right-most value we can, since that will have the effect on as many future values as possible (past values have already been fixed to be negative, so that is not worth considering). So we reduce the last value as much as needed, or until it reaches the lower limit. If necessary, we then reduce the second-last value, and so on until the sum is negative.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The only catch is that we need a way to quickly skip over long sequences of the minimum value, to avoid quadratic running time. I kept a cache of previous non-minimum value (similar to path compression in union-find structures); a stack of the positions of non-minimum values should work too.</div><h3 style="text-align: left;">H: Minimal Agapov code</h3><div style="text-align: left;">The first triangle will clearly consist of the minimum three labels. This divides the polygon into (up to) three sections. Within each section, the next triangle will always stand on the existing diagonal, with the third vertex being the lowest label in the section. This will recursively subdivide the polygon into two more sections with one diagonal, and so on. One catch is that ties need to be broken carefully: pick the one furthest from the base with the smaller label (this lead to me getting a WA). Rather than considering all the tie-breaking cases for the first triangle, I started with a first diagonal with the two smallest labels, and found the first triangle through the recursion.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The main tool needed for this is an efficient range minimum query. There are a number of data structures for this, and any of them should work. I used two RMQ structures to cater for the two possible tie-breaking directions. The cyclic rather than linear nature of the queries, but it is just a matter of being careful.</div><h3 style="text-align: left;">I: Sale in GameStore</h3><div style="text-align: left;">This was the trivial problem: sort, get your friends to buy the most expensive item, start filling up on cheap items until you can't buy any more or you've bought everything.</div><h3 style="text-align: left;">J: Getting Ready for the VIPC</h3><div style="text-align: left;">I got a wrong answer to test case 53 on this, and I still don't know why. But I think my idea is sound.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The basic approach is dynamic programming, where one computes the minimum tiredness one can have after completing each contest, assuming it is possible to complete it (this also determines the resulting skill). To avoid some corner cases, I banned entering a contest with tiredness greater than \(h_i - l_i\). However, this is \(O(N^2)\), because for each contest, one must consider all possible previous contests.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The first optimisation one can do is that one can make a list of outcomes for each day, assuming one enters a contest that day: a list of (result skill, tiredness), one for each contest. If one contest results in both less skill and more tiredness than another, it can be pruned, so that one ends up with a list that increases in both skill and tiredness. Now one can compute the DP for a contest by considering only each previous day, and finding the minimum element in the list for which the skill in great enough to enter the current contest. The search can be done by binary search, so if there are few distinct days with lots of contests each day, this will be efficient; but if every contest is on a different day, we're no better off.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The second optimisation is to note the interesting decay function for tiredness. After about 20 days of inactivity, tireness is guaranteed to reach 0. Thus, there is no need to look more than this distance into the past: beyond 20 days, we only care about the maximum skill that can be reached on that day, regardless of how tired one is. This reduces the cost to \(O(N\log N\log maxT\). </div><h3 style="text-align: left;">K: Treeland</h3><div style="text-align: left;">Pick any vertex and take its nearest neighbour: this is guaranteed to be an edge; call the end-points A and B. An edge in a tree partitions the rest of the tree into two parts. For any vertex C, either d(A, C) < d(B, C) or vice versa, and this tells us which partition C belongs to. We can thus compute the partition, and then recursively solve the problem within each partition.</div><h3 style="text-align: left;">L: Useful roads</h3><div style="text-align: left;">I didn't solve this during the contest, and I don't know exactly how to solve it yet.</div><h3 style="text-align: left;">M: Variable shadowing</h3><div style="text-align: left;">This is another reasonably straightforward implementation problem. For each variable I keep a stack of declarations, with each declaration tagged with the source position and a "scope id" (each new left brace creates a new scope id). I also keep a stack of open scope ids. When a right brace arrives, I check the top-of-stack for each variable to see if it matches the just closed scope id, and if so, pop the stack.</div></div>Bruce Merrynoreply@blogger.com4tag:blogger.com,1999:blog-31847281.post-56399056529668202882014-07-18T01:54:00.002-07:002014-07-31T12:03:04.609-07:00IOI 2014 day 2 analysis<div dir="ltr" style="text-align: left;" trbidi="on"><div><div><div>I found day 2 much harder than day 1, and I still don't know how to solve all the problems (I am seriously impressed by those getting perfect scores). Here's what I've managed to figure out so far.<br /><br /><b>Update</b>: I've now solved everything (in theory), and the solutions are below. The <a href="http://www.ioi2014.org/index.php/competition/contest-tasks">official solutions</a> are now also available on the IOI website. I'll try coding the solutions at some point if I get time.<br /><h3 style="text-align: left;">Gondola</h3><div style="text-align: left;">This was the easiest of the three. Firstly, what makes a valid gondola sequence? In all the subtasks of this problem, there will be two cases. If you see any of the numbers 1 to n, that immediately locks in the phase, and tells you the original gondola for every position. Otherwise, the phase is unknown. So, the constraints are that</div><ul style="text-align: left;"><li>if the phase is known, every gondola up to n must appear in the correct spot if it appears;</li><li>no two gondolas can have the same number.</li></ul></div>Now we can consider how to construct a replacement sequence (and also to count them), which also shows that these conditions are sufficient. If the phase is not locked, pick it arbitrarily. Now the "new gondola" column is simply the numbers from n+1 up to the largest gondola, so picking a replacement sequence is equivalent to deciding which gondola replaces each broken gondola. We can assign each gondola greater than n that we can't see to a position (one where the final gondola number is larger), and this will uniquely determine the replacement sequence. We'll call such gondolas <i>hidden</i>.<br /><br />For the middle set of subtasks, the simplest thing is to assign all hidden gondolas to one position, the one with the highest-numbered gondola in the final state. For counting the number of possible replacement sequences, each hidden gondola can be assigned independently, so we just multiply together the number of options, and also remember to multiply by n if the phase is unknown. In the last subtask there are too many hidden gondolas to deal with one at a time, but they can be handled in batches (those between two visible gondolas), using fast exponentiation.<br /><h3>Friend</h3><div style="text-align: left;">This is a weighted maximum independent set problem. On a general graph this is NP-hard, so we will need to exploit the curious way in which the graph is constructed. I haven't figured out how to solve the whole problem, but let's work through the subtasks:</div><ol style="text-align: left;"><li>This is small enough to use brute force (consider all subsets and check whether they are independent).</li><li>The graph will be empty, so the sample can consist of everyone. </li><li>The graph will be complete, so only one person can be picked in a sample. Pick the best one.</li><li>The graph will be a tree. There is a fairly standard tree DP to handle this case: for every subtree, compute the best answer, either with the root excluded or included. If the root is included, add up the root-excluded answers for every subtree; otherwise add up the best of the two for every subtree. This takes linear time.</li><li>In this case the graph is bipartite and the vertices are unweighted. This is a standard problem which can be solved by finding the maximum bipartite matching. The relatively simple flow-based algorithm for this is theoretically \(O(n^3)\), but it is one of those algorithms that tends to run much faster in most cases, so it may well be sufficient here.</li></ol></div>The final test-case clearly requires a different approach, since n can be much larger. I only managed to figure this out after getting a big hint from the SA team leader, who had seen the official solution.<br /><br />We will process the operations in reverse order. For each operation, we will transform the graph into one that omits the new person, but for which the optimal solution has the same score. Let's say that the last operation had A as the host and B as the invitee, and consider the different cases:<br /><ul style="text-align: left;"><li>YourFriendsAreMyFriends: this is the simplest: any solution using B can also use A, and vice versa. So we can collapse the two vertices into one whose weight is the sum of the original weights, and use it to replace A.</li><li>WeAreYourFriends: this is almost the same, except now we can use at most one of A and B, and which one we take (if either) has no effect on the rest of the graph. So we can replace A with a single vertex having the larger of the two weights, and delete B.</li><li>IAmYourFriend: this is a bit trickier. Let's start with the assumption that B will form part of the sample, and add that to the output value before deleting it. However, if we later decide to use A, there will be a cost to remove B again; so A's weight <i>decreases</i> by the weight of B. If it ends up with negative weight, we can just clamp it to 0.</li></ul></div>Repeat this deletion process until only the original vertex is left; the answer will be the weight of this vertex, plus the saved-up weights from the IAmYourFriend steps.<br /><div><h3 style="text-align: left;">Holiday</h3><div style="text-align: left;">Consider the left-most and right-most cities that Jian-Jia visits. Regardless of where he stops, he will need to travel from the start city to one of the ends, and from there to the other end. There is no point in doing any other back-tracking, so we can tell how many days he spends travelling just from the end-points. This then tells us how many cities he has time to see attractions in, and obviously we will pick the best cities within the range.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">That's immediately sufficient to solve the first test case. To solve more, we can consider an incremental approach. Fix one end-point, and gradually extend the other end-point, keeping track of the best cities (and their sum) in a priority queue (with the worst of the best cities at the front). As the range is extended, the number of cities that can be visited shrinks, so items will need to be popped. Of course, the next city in the range needs to be added each time as well. Using a binary heap, this gives an \(O(n^2\log n)\) algorithm: a factor of n for each endpoint, and the \(\log n\) for the priority queue operations. That's sufficient for subtask 3. It's also good enough for subtask 2, because the left endpoint will be city 0, saving a factor of n.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">For subtask 4, it is clearly not possible to consider every pair of end-points. Let's try to break things up. Assume (without loss of generality) that we move first left, then back to the start, then right. Let's compute the optimal solution for the left part and the right part separately, then combine them. The catch is that we need to know how we are splitting up our time between the two sides. So we'll need to compute the answer for each side for all possible number of days spent within each side. This seems to leave us no better off, since we're still searching within a two-dimensional space (number of days and endpoint), but it allows us to do some things differently.<br /><br />We'll just consider the right-hand side. The left-hand side is similar, with minor changes because we need two days for travel (there and back) instead of one. Let f(d) be the optimal end-point if we have d days available. Then with a bit of work one can show that f is non-decreasing (provided one is allowed to pick amongst ties). If we find f(d) for d=1, 2, 3, ... in that order, it doesn't really help: we're only, on average, halving the search space. But we can do better by using a divide-and-conquer approach: if we need to find f for all \(d \in [0, D)\) then we start with \(d = \frac{D}{2}\) to subdivide the space, and then recursively process each half of the interval on disjoint subintervals of the cities. This reduces the search space to \(O(n\log n)\).<br /><br />This still leaves the problem of efficiently finding the total number of attractions that can be visited for particular intervals and available days. The official solution uses one approach, based on a segment tree over the cities, sorted by number of attractions rather than position. The approach I found is, I think, simpler. Visualise the recursion described above as a tree; instead of working depth-first (i.e., recursively), we work breadth-first. We make \(O(\log n)\) passes, and in each pass we compute f(d) where d is an odd multiple of \(2^i\) (with \(i\) decreasing with each pass). Each pass can be done in a single incremental process, similar to the way we tackled subpass 2. The difference is that each time we cross into the next subinterval, we need to increase \(d\), and hence bring more cities into consideration. To do this, we need either a second priority queue of excluded cities, or we can replace the priority queue with a balanced binary tree. Within each pass, d can only be incremented \(O(n)\) times, so the total running time will be \(O(n\log n)\) per pass, or \(O(n\log n \log n)\) overall.</div></div></div>Bruce Merrynoreply@blogger.com2tag:blogger.com,1999:blog-31847281.post-30935558131283065492014-07-16T08:00:00.000-07:002014-07-17T03:41:53.826-07:00IOI 2014 day 1 analysis<div dir="ltr" style="text-align: left;" trbidi="on"><div><h2 style="text-align: left;">IOI 2014 day 1</h2><div style="text-align: left;">Since there is no online judge, I haven't tried actually coding any of these. So these ideas are not validated yet. You can find the problems <a href="http://www.ioi2014.org/index.php/competition/contest-tasks">here</a>.</div><h3 style="text-align: left;">Rails</h3><div style="text-align: left;">I found this the most difficult of the three to figure out, although coding it will not be particularly challenging.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Firstly, we can note that distances are symmetric: a route from A to B can be reflected in the two tracks to give a route from B to A. So having only \(\frac{n(n-1)}{2}\) queries is not a limitation, as we can query all distances. This might be useful in tackling the first three subtasks, but I'll go directly to the hardest subtask.</div><div style="text-align: left;"><br /></div>If we know the position and type of a station, there is one other that we can immediately locate: the closest one. It must have the opposite type and be reached by a direct route. Let station X be the closest to station 0. The other stations can be split into three groups:<br /><ol style="text-align: left;"><li>d(X, Y) < d(X, 0): these are reached directly from station X and of type C, so we can locate them exactly.</li><li>d(0, X) + d(X, Y) = d(0, Y), but not of type 1: these are reached from station 0 via station X, so they lie to the left of station 0.</li><li>All other stations lie to the right of station X.</li></ol>Let's now consider just the stations to the right of X, and see how to place them. Let's take them in increasing distance from 0. This ensures that we encounter all the type D stations in order, and any type C station will be encountered at some point after the type D station used to reach it. Suppose Y is the right-most type D station already encountered, and consider the distances for a new station Z. Let \(z = d(0, Z) - d(0, Y) - d(Y, Z)\). If Z is type C, then there must be a type D at distance \(\frac{z}{2}\) to the left of Y. On the other hand, if Z is of type D (and lies to the right of Y), then there must be a type C station at distance \(\frac{z}{2}\) to the left of Y. In the first case, we will already have encountered the station, so we can always distinguish the two cases, and hence determine the position and type of Z.</div><div></div><div>The stations to the left of station zero can be handled similarly, using station X as the reference point instead of station 0.</div><div></div><div>How many queries is this? Every station Z except 0 and X accounts for at most three queries: d(0, Z), d(X, Z) and d(Y, Z), where Y can be different for each Z. This gives \(3(n-2) + 1\), which I think can be improved to \(3(n-2)\) just by counting more carefully. Either way, it is sufficient to solve all the subtasks.<br /><h3 style="text-align: left;">Wall </h3></div><div style="text-align: left;">This is a fairly standard interval tree structure problem, similar to Mountain from IOI 2005 (but a little easier). Each node of the tree contains a range to which its children are clamped. To determine the value of any element of the wall, start at the leaf with a value of 0 and read up the tree, clamping the value to the range in each node in turn. Initially, each node has the range [0, inf). When applying a new instruction, it is done top-down, and clamps are pushed down the tree whenever recursion is necessary.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">An interesting aspect of the problem is that it is offline, in that only the final configuration is requested and all the information is provided up-front. This makes me think that there may be an alternative solution that processes the data in a different order, but I can't immediately see a nicer solution than the one above. </div><h3 style="text-align: left;">Game</h3><div style="text-align: left;">I liked this problem, partly because I could reverse-engineer a solution from the assumption that it is always possible to win, and partly because it requires neither much algorithm/data-structure training (like Wall) nor tricky consideration of cases (like Rails). Suppose Mei-Yu knows that certain cities are connected. If there are any flights between the cities that she has not asked about, then she can win simply by saving one of these flights for last, since it will not affect whether the country is connected. It follows that for Jian-Jia to win, he must always answer no when asked about a flight between two components that Mei-Yu does not know to be connected, <i>unless</i> this is the last flight between these components?</div><div style="text-align: left;"><br /></div><div style="text-align: left;">What if he always answers yes to the last flight between two components? In this case he will win. As long as there are at least two components left, there are uncertain edges between every pair of them, so Mei-Yu can't know whether any of them is connected any other. All edges within a component are known, so the number of components can only become one after the last question.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">What about complexity? We need to keep track of the number of edges between each pair of components, which takes \(O(N^2)\) space. Most operations will just decrement one of these counts. There will be \(N - 1\) component-merging operations, each of which requires a linear-time merging of these edge counts and updating a vertex-to-component table. Thus, the whole algorithm requires \(O(N^2)\) time. This is optimal given that Mei-Yu will ask \(O(N^2)\) questions.</div></div>Bruce Merrynoreply@blogger.com2tag:blogger.com,1999:blog-31847281.post-72428844748534215182014-06-30T09:18:00.000-07:002014-06-30T09:27:26.945-07:00ICPC Problem H: Pachinko<div dir="ltr" style="text-align: left;" trbidi="on">Problem A has been covered quite well <a href="http://keet.wordpress.com/2014/06/28/acm-icpc-2014-solution-to-problem-a-baggage/">elsewhere</a>, so I won't discuss it. That leaves only problem H. I started by reading the Google translation of a <a href="http://fajnezadania.wordpress.com/2014/06/28/pachinko/">Polish writeup</a> by gawry. The automatic translation wasn't very good, but it gave me one or two ideas I borrowed. I don't know how my solution compares in length of code, but I consider it much simpler conceptually.<br /><br />This is a fairly typical Markov process, where a system is in some state, and each timestep it randomly selects one state as a function of the current state. One variation is that the process stops once the ball reaches a target, whereas Markov processes don't terminate. I was initially going to model that as the ball always moving from the target to itself, but things would have become slightly complicated.<br /><br />Gawry has a nice way of making this explicitly a matrix problem. Set up the matrix M as for a Markov process i.e., \(M_{i,j}\) is the probability of a transition from state j to state i. However, for a target state j, we set \(M_{i,j}=0\) for all i. Now if \(b\) is our initial probability vector (equal probability for each empty spot in the first row), then \(M^t b\) represents the probability of the ball being in each position (and the game not having previously finished) after \(t\) timesteps. We can then say that the expected amount of time the ball spends in each position is given by \(\sum_{t=0}^{\infty} M^t b\). The sum of the elements in this vector is the expected length of a game and we're told that it is less than \(10^9\), so we don't need to worry about convergence. However, that doesn't mean that the matrix series itself converges: Gawry points out that if there are unreachable parts of the game with no targets, then the series won't converge. We fix that by doing an initial flood-fill to find all reachable cells and only use those in the matrix. Gawry then shows that under the right conditions, the series converges to \((I - M)^{-1} b\).<br /><br />This is where my solution differs. Gawry dismisses Gaussian elimination, because the matrix can be up to 200,000 square. However, this misses the fact that it is <i>banded</i>: by numbering cells left-to-right then top-to-bottom, we ensure that every non-zero entry in the matrix is at most W cells away from the main diagonal. Gaussian elimination (without pivoting) preserves this property. We can exploit this both to store the matrix compactly, and to perform Gaussian elimination in \(O(W^3H)\) time.<br /><br />One concern is the "without pivoting" caveat. I was slightly surprised that my first submission passed. I think it is possible to prove correctness, however. Gaussian elimination without pivoting is known (and easily provable) to work on strictly column diagonally dominant matrices. In our case the diagonal dominance is weak: columns corresponding to empty cells have a sum of zero, those corresponding to targets have a 1 on the diagonal and zeros elsewhere. However, the matrix is also irreducible, which I think is enough to guarantee that there won't be any division by zero.<br /><br />EDIT: actually it's not guaranteed to be irreducible, because the probabilities can be zero and hence it's possible to get from A to B without being able to get from B to A. But I suspect that it's enough that one can reach a target from every state. </div>Bruce Merrynoreply@blogger.com4tag:blogger.com,1999:blog-31847281.post-10944906129548100282014-06-28T09:59:00.001-07:002014-06-28T12:27:23.563-07:00ICPC Problem L: Wires<div dir="ltr" style="text-align: left;" trbidi="on">While this problem wasn't too conceptually difficult, it requires a <i>lot</i> of code (my solution is about 400 lines), and careful implementation of a number of geometric algorithms. A good chunk of the code comes from implementing a rational number class in order to precisely represent the intersection points of the wires. It is also very easy to suffer from overflow: I spent a long time banging my head against an assertion failure on the server until I upgraded my rational number class to use 128-bit integers everywhere, instead of just for comparisons.<br /><br />The wires will divide the space up into connected regions. The regions can be represented in a planar graph, with edges between regions that share an edge. The problem is then to find the shortest path between the regions containing the two new end-points.<br /><br />My solution works in a number of steps:<br /><ol style="text-align: left;"><li>Find all the intersection points between wires, and the segments between intersection points. This just tests every wire against every other wire. The case of two parallel wires sharing an endpoint needs to be handled carefully. For each line, I sort the intersection points along the line. I used a dot produce for this, which is where my rational number class overflowed, but would probably have been safer to just sort lexicographically. More than two lines can meet at an intersection point, so I used a std::map to assign a unique ID to each intersection point (I'll call them vertices from here on).</li><li>Once the intersection points along a line have been sorted, one can identify the segments connecting them. I create two copies of each segment, one in each direction. With each vertex A I store a list of all segments A->B. Each pair is stored contiguously so that it is trivial to find its partner. Each segment is considered to belong to the region to its left as one travels A->B.</li><li>The segments emanating from each vertex are sorted by angle. These comparisons could easily cause overflows again, but one can use a handy trick: instead of using the vector for the segment in an angle comparison, one can use the vector for the entire wire. It has identical direction but has small integer coordinates.</li><li>Using the sorted lists from the previous step, each segment is given a pointer to its following segment from the same region. In other words, if one is tracing the boundary of the region and one has just traced A->B, the pointer will point to B->C.</li><li>I extract the <i>contours</i> of the regions. A region typically consists of an outer contour and optionally some holes. The outermost region lacks an outer contour (one could add a big square if one needed to, but I didn't). A contour is found by following the next pointers. A case that turns out to be inconvenient later is that some segments might be part of the contour but not enclose any area. This can make a contour disappear completely, in which case it is discarded. Any remaining contours have the property that two adjacent segments are not dual to each other, although it is still possible to both sides of an edge to belong to the same contour.</li><li>Each contour is identified as an outer contour or a hole. With integer coordinates I could just measure the signed area of the polygon, but that gets nasty with rational coordinates. Instead, I pick the lexicographically smallest vertex in the contour and examine the angle between the two incident segments (this is why it is important that there is a non-trivial angle between them). I also sort the contours by this lexicographically smallest vertex, which causes any contour to sort before any other contours it encloses.</li><li>For each segment I add an edge of weight 1 from its containing region to the containing region of its dual.</li><li>For each hole, I search backwards through the other contours to find the smallest non-hole that contains it. I take one vertex of the hole and do a point-in-polygon test. Once again, some care is needed to avoid overflows, and using the vectors for the original wires proves useful. One could then associate the outer contour and the holes it contains into a single region object, but instead I just added an edge to the graph to join them with weight 0. In other words, one can travel from the boundary of a region to the outside of a hole at no cost.</li><li>Finally, I identify the regions containing the endpoints of the new wire, using the same search as in the previous step.</li></ol>After all this, we still need to implement a shortest path search - but by this point that seems almost trivial in comparison.<br /><br />What is the complexity? There can be \(O(M^2)\) intersections and hence also \(O(M^2)\) contours, but only \(O(M)\) of them can be holes (because two holes cannot be part of the same connected component). The slowest part is the fairly straightforward point-in-polygon test which tests each hole against each non-hole segment, giving \(O(M^3)\) time. There are faster algorithms for doing point location queries, so it is probably theoretically possible to reduce this to \(O(M^2\log N)\) or even \(O(M^2)\), but certainly not necessary for this problem.</div>Bruce Merrynoreply@blogger.com1tag:blogger.com,1999:blog-31847281.post-86068129023016434372014-06-28T02:23:00.000-07:002014-06-30T09:18:51.138-07:00ICPC Problem J: Skiing<div dir="ltr" style="text-align: left;" trbidi="on">I'll start by mentioning that there is also some analysis discussion on the <a href="http://apps.topcoder.com/forums/?module=Thread&threadID=823018&start=0&mc=5#1894272">Topcoder forums</a>, which includes alternative solutions that in some cases I think are much nicer than mine.<br /><br />So, on to problem J - which no team solved, and only one team attempted. I'm a little surprised by this, since it's the sort of problem where one can do most of the work on paper while someone else is using the machine. A possible reason is that it's difficult to determine the runtime: I just coded it up and hoped for the best.<br /><br />It is a slightly annoying problem for a few reasons. Firstly, there are a few special cases, because \(v_y\) or \(a\) can be zero. We also have to find the lexicographically smallest answer.<br /><br />Let's deal with those special cases first. If \(v_y = 0\) then we can only reach targets with \(y = 0\). If \(a \not= 0\) then we can reach all of them in any order, otherwise we can only reach those with \(x = 0\). To get the lexicographically smallest answer, just go through them in order and print out those that are reachable. I actually dealt with the \(v_y \not= 0\), \(a = 0\) case as part of my general solution, but is also easy enough to deal with. We can only reach targets with \(x = 0\), and we can only reach them in increasing order of y. One catch is that targets are not guaranteed to have distinct coordinates, so if two targets have the same coordinates we must be careful to visit them in lexicographical order. I handled this just by using a stable sort.<br /><br />So, onwards to the general case. Before we can start doing any high-level optimisation, we need to start by answering this question: if we are at position P with X velocity \(v\) and want to pass through position Q, what possible velocities can we arrive with? I won't prove it formally, but it's not hard to believe that to arrive with the smallest velocity, you should start by accelerating (at the maximum acceleration) for some time, then decelerate for the rest of the time. Let's say that the total time between P and Q is \(T\), the X separation is \(x\), the initial acceleration time is \(T - t\) and the final deceleration time is \(t\). Some basic integration then gives<br />\[x = v(T-t)+\tfrac{1}{2}a(T-t)^2 + (v+a(T-t))t - \tfrac{1}{2}at^2.\]<br />This gives a quadratic in \(t\) where the linear terms conveniently cancel, leaving<br />\[t = \sqrt{\frac{vT+\tfrac{1}{2}aT^2-x}{a}}\]<br />The final velocity is just \(v+aT-2at\). We can also find the largest possible arrival velocity simply by replacing \(a\) with \(-a\). Let's call these min and max velocity functions \(V_l(v, T, x)\) and \(V_h(v, T, x)\).<br /><br />More generally, if we can leave P with a given range of velocities \([v_0, v_1]\), with what velocities can we arrive at Q? We first need to clamp the start range to the range from which we can actually reach Q i.e., that satisfy \(|vT - x| \le \frac{1}{2}aT^2\). A bit of calculus shows that \(V_l\) and \(V_h\) are decreasing functions of \(v\), so the arrival range will be \([V_l(v_1), V_h(v_0)]\).<br /><br />Finally, we are ready to tackle the problem as a whole, with multiple targets. We will use dynamic programming to answer this question, starting from the last target (highest Y): if I start at target <i>i</i> and want to hit a total of <i>j</i> targets, what are the valid X velocities at <i>i</i>? The answer will in general be a set of intervals. We will make a pseudo-target at (0, 0), and the answer will then be the largest <i>j</i> for which 0.0 is a valid velocity at this target.<br /><br />Computing the DP is generally straightforward, except that the low-level queries we need are the reverse of those we discussed above i.e. knowing the arrival velocities we need to compute departure velocities. No problem, this sort of physics is time-reversible, so we just need to be careful about which signs get flipped. For each (<i>i, j</i>) we consider all options for the next target, back-propagate the set of intervals from that next target, and merge them into the answer for (<i>i, j</i>). Of course, for \(j = 1\) we use the interval \((-\infty, \infty)\).<br /><br />The final inconvenience is that we must produce a lexicographical minimum output. Now we will see the advantage of doing the DP in the direction we chose. We build a sequence forwards, keeping track of the velocity interval for our current position. Initially the position will be (0, 0) and the velocity interval will be [0.0, 0.0]. To test whether a next position is valid, we forward-propagate the velocity interval to this next position, and check whether it has non-empty intersection with the set of velocities that would allow us to hit the required number of remaining targets. We then just take the next valid position with the lowest ID.<br /><br />What is the complexity? It could in theory be exponential, because every path through the targets could induce a separate interval in the interval set. However, in reality the intervals merge a lot, and I wouldn't be surprised if there is some way to prove that there can only be a polynomial number of intervals to consider. My solution still ran pretty close to the time limit, though.</div>Bruce Merrynoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-84951348545507614602014-06-27T02:40:00.000-07:002014-06-27T02:40:42.907-07:00More ICPC analysis<div dir="ltr" style="text-align: left;" trbidi="on"><div>Now we're getting on to the harder problems. Today I've cover two that I didn't know how to solve. Some of the others I have ideas for, but I don't want to say anything until I've had a chance to code them up.<br /><br />Firstly, problem I. This is a maximum clique problem, which on a general graph is NP-complete. So we will need to use the geometry in some way. Misof has a nice set of slides showing how it is done: http://people.ksp.sk/~misof/share/wf_pres_I.pdf. I had the idea for the first step (picking the two points that are furthest apart), but it didn't occur to me that the resulting conflict graph would be bipartite.<br /><br />Now problem G. I discovered that this is a problem that has had a number of research papers published on the topic, one of which achieves \(O(N^3)\). Fortunately, we don't need to be quite that efficient. Let's start by finding a polynomial-time solution. Let's suppose we've already decided the diameters of the two clusters, D(A) and D(B), and just want to find out whether this is actually possible. For each shipment we have a boolean variable that says whether it goes into part A (false) or part B (true). The constraints become boolean expressions: if d(i, j) > D(A) then we must have i || j, and if d(i, j) > D(B) then we must have !i || !j. Determining whether the variables can be satisfied is just 2-SAT, which can be solved in \(O(N^2)\) time.<br /><br />Now, how do we decide which diameters to test? There are \(O(N^2)\) choices for each, so the naive approach will take \(O(N^6)\) time. We can reduce this to \(O(N^4\log N)\) by noting that once we've chosen one, we can binary search the other (it's also possible to eliminate the log factor, but it's still too slow). So far, this is what I deduced during the contest.<br /><br />The trick (which I found in one of those research papers) is that one can eliminate most candidates for the larger diameter. If there is an odd-length cycle, at least one of the edges must be within a cluster, and so that is a lower bound for the larger diameter. What's more, we can ignore the shortest edge of an even cycle (with some tie-breaker), because if the shortest edge lies within a cluster, then so must at least one other edge.<br /><br />We can exploit this to generate an O(N)-sized set of candidates: process the edges from longest to shortest, adding each to a new bipartite graph (as for constructing a maximum spanning tree with Kruskal's algorithm). There are three cases:<br /><ol style="text-align: left;"><li>The next edge connects two previously disconnected components. Add the edge to the graph (which will remain bipartite, since one can always re-colour one of the components. This edge length is a candidate diameter.</li><li>The next edge connects two vertices in the same component, but the graph remains bipartite. This edge is thus part of an even cycle, and so can be ignored.</li><li>The next edge connects two vertices, forming an odd cycle. This edge length is a candidate diameter, and the algorithm terminates.</li></ol></div>The edges selected in step 1 form a tree, so there are only O(N) of them.</div>Bruce Merrynoreply@blogger.com2tag:blogger.com,1999:blog-31847281.post-39471291094526605002014-06-25T11:43:00.004-07:002014-06-28T10:17:31.896-07:00ACM ICPC 2014<div dir="ltr" style="text-align: left;" trbidi="on">ACM ICPC 2014 is over. The contest was incredibly tough: solving less than half the problems was enough to get a medal, and 20 teams didn't solve any of the problems (unfortunately including the UCT team, who got bogged down in problem K).<br /><br />I managed to solve 5 problems during the contest (in 4:30, since I didn't wake up early enough to be in at the start), and I've solved one more since then. Here are my solutions, in approximately easy-to-hard order.<br /><br />EDIT: note that I've made follow-up blog posts with more analysis. <br /><h3 style="text-align: left;">D: game strategy</h3><div style="text-align: left;">This was definitely the easiest one, and this is reflected in the scores. We can use DP to determine the set of states from which Alice can force a win within <i>i</i> moves. Let's call this w[i]. Of course, w[0] = {target}. To compute w[i+1], consider each state s that is not already a winning state. If one of Alice's options is the set S and \(S \subseteq w[i]\), then Alice can win from this state in i+1 moves.<br /><br />We can repeat this N times (since no optimal winning sequence can have more than N steps), or just until we don't find any new winning states. We don't actually need to maintain all the values of w, just the latest one. </div><h3 style="text-align: left;">K: surveillance</h3><div style="text-align: left;">Let's first construct a slower polynomial-time algorithm. Firstly, for each i, we can compute the longest contiguous sequence of walls we can cover starting at i. This can be done in a single pass. We sort all the possible cameras by their starting wall, and as we sweep through the walls we keep track of the largest-numbered ending wall amongst cameras whose starting wall we have passed. The camera with \(a > b\) are a bit tricky: we can handle them by treating them as two cameras \([a - N, b]\) and \([a, b + N]\).</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Let's suppose that we know we will use a camera starting at wall i. Then we can use this jump table to determine the minimum number of cameras: cover as much wall starting from i as possible with one camera, then again starting from the next empty slot, and so on until we've wrapped all the way around and covered at least N walls. We don't know the initial i, but we can try all values of i. Unfortunately, this requires \(O(N^2)\) time.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">To speed things up, we can compute accelerated versions of the jump table. If \(J_c[i]\) is the number of walls we can cover starting from i by using c cameras, then we already have \(J_1\), and we can easily calculate \(J_{a+b}\) given \(J_a\) and \(J_b\). In particular, we can compute \(J_2, J_4, J_8\) and so on, and them combine these powers of two to make any other number. This can then be used in a binary search to find the smallest c such that \(J_c\) contains a value that is at least N. The whole algorithm thus requires \(O(N\log N)\) time.<br /><br />One catch that broke my initial submission is that if the jump table entries are computed naively, they can become much bigger than N. This isn't a problem, until they overflow and become negative. Clamping them to N fixed the problem.</div><h3 style="text-align: left;">C: crane balancing</h3><div style="text-align: left;">This is a reasonably straightforward geometry problem, requiring some basic mechanics. Let the mass of the crane be \(q\), let the integral of the x position over the crane be \(p\), let the base be the range \([x_0, x_1]\), and let the x position of the weight be \(x\). The centre of mass of the crane is \(\frac p q\), but with a weight of mass \(m\), it will be \(\frac{p+mx}{q+m}\). The crane is stable provided that this value lies in the interval \([x_0, x_1]\). This turns into two linear inequalities in \(m\). Some care is then needed to deal with the different cases of the coefficient being positive, negative or zero.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Computing the area and the integral of the x value is also reasonably straightforward: triangulate the crane by considering triangles with vertices at \((0, i, i+1)\) and computing the area (from the cross product) and centre of mass (average of the vertices) and then multiply the area by the x component of the centre of mass to get the integral.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">I prefer to avoid floating point issues whenever possible, so I multiplied all the X coordinates by 6 up front and worked entirely in integers. This does also cause the masses to be 6 times larger, so they have to be adjusted to compensate.</div><h3 style="text-align: left;">E: maze reduction</h3><div style="text-align: left;">This is effectively the same problem I set in Topcoder SRM 378 (thanks to ffao on Topcoder for reminding me where I'd seen this). If you have an account on Topcoder you can read about an alternative solution in the match analysis, in which it is easier to compute worst-case bounds.<br /><br />The approach I used in this contest is similar in concept to D, in that you incrementally update a hypothesis about which things are distinguishable. We will assign a label to every door of each chamber. Two doors are given the same label if we do not (yet) know of a way to distinguish them. Initially, we just label each door with the degree of the room it is in, since there is nothing else we can tell by taking zero steps.</div><div style="text-align: left;">Now we can add one inference step. Going clockwise from a given door, we can explore all the corridors leading from the current chamber and note down the labels on the matching doors at the opposite end of each corridor. This forms a signature for the door. Having done this for each door, we can now assign new labels, where each unique signature is assigned a corresponding label. We can repeat this process until we achieve stability, i.e., the number of labels does not change. We probably ought to use each door's original label in the signature too, but my solution passed without requiring this.</div><div style="text-align: left;">Finally, two rooms are effectively identical if their sequence of door labels is the same up to rotation. I picked the lexicographically minimum rotation for each room and used this as a signature for the room.</div><div style="text-align: left;">I'm not sure what the theoretical work-case performance is, but I suspect it would be quite large. For a start, by algorithm requires \(O(NE\log N)\) time <i>per iteration</i>, which I suspect could be improved by using a hash table with some kind of rolling hash to rotation in O(1) time. I was surprised when my first submission ran in time.</div><h3 style="text-align: left;">B: buffed buffet</h3><div style="text-align: left;">This one was deceptively sneaky, but the principles are reasonably simple. It's not too hard to guess that one will solve the continuous and discrete problems separately, and then consider all partitions between the two.</div><div style="text-align: left;">Let's start with the continuous problem, since it is a little easier. I don't have a formal proof, but it shouldn't be too hard to convince yourself that a greedy strategy works. We start by eating the tastiest food. Once it degrades to being as tasty as the second-tastiest food, we then each both, in the proportion that makes them decay at the same rate. In fact, at this point we can treat them as a combined food with a combined (slower) decay rate. We continue eating this mix until tastiness decays to match the third-tastiest, and so on. There are some corner cases that need to be handled if there are foods that don't decay.</div><div style="text-align: left;">Now what about the discrete case? Because the items have different weights, a greedy strategy won't work here, as with any knapsack problem. There is a reasonably straightforward \(O(DW^2)\) dynamic programming, where you consider each possible number of serving of each discrete food type, but this will be too slow. But there is some structure in the problem, so let's try to exploit it. Let's say that \(f(i)\) is the maximum tastiness for a weight of \(i\) using only the foods we've already considered, and we're now computing an updated \(f'\) by adding a new discrete food with weight \(w\), initial tastiness \(t\) and decay rate \(d\). For a given i, \(f'(i)\) clearly only depends on \(f(j)\) where \(i - j\) is a multiple of \(w\), so let's split things up by the remainder modulo \(i\). Fix a remainder \(r\) and let \(g(i) = f(iw + r)\) and \(g'(i) = f'(iw + r)\). Now we can say that</div><div style="text-align: left;">\[<br />\begin{aligned}<br />g'(i) &= \max_{0 \le j \le i}\big\{ g(j) + \sum_{n=1}^{i-j}(t - (n-1)d\big\}\\<br />&= \max_{0 \le j \le i}\big\{ g(j) + (i-j)t - \frac{(i-j-1)(i-j)}{2}\cdot d\big\}\\<br />&= \max_{0 \le j \le i}\big\{ g(j) + (i-j)t - \frac{i(i-1)+j(j+1)-2ij}{2}\cdot d\big\}\\<br />&= it - \frac{i(i-1)d}{2} + \max_{0 \le j \le i}\big\{ g(j)-\frac{j(j+1)d}{2} - jt + ijd\big\}\\<br />&= it - \frac{i(i-1)d}{2} + \max_{0 \le j \le i}\big\{ h(j) + ijd \big\}<br />\end{aligned}<br />\]<br />Here we have defined \(h(j) = g(j)-\frac{j(j+1)d}{2} - jt\), which can be computed in constant time per \(j\).<br /><br />The key observation is that we have turned the expression inside the maximum into a linear function of \(j\). Thus, if we plot \(h\) on a graph, only the upper convex hull is worth considering. We can maintain this upper hull as we increase \(i\) (remember, \(j \le i\)). The second observation is that the optimal choice of \(j\) will increase as \(i\) increases, because increasing \(i\) will increase the \(ijd\) more for larger values of \(j\). It follows that we can incrementally find the optimal \(j\) for each \(i\) by increasing the previous \(j\) along the upper hull until increasing it further would start decreasing the objective function. There are a few corner cases: the upper hull might not have any valid entries (because there might be no way to make up a weight of exactly \(jw+r\)), and we might have popped off the previous \(j\) as part of updating the hull. These just need careful coding. Another snag to watch out for is that if all the foods decay very fast, the optimal result may in fact overload a 32-bit integer in the negative direction.<br /><br />The discrete part of the algorithm now requires O(DW), and the continuous part requires O(D\log D + W).</div><h3 style="text-align: left;">F: messenger (solved after contest)</h3><div style="text-align: left;">This is a nasty problem mostly because of numerical stability problems. The problem itself is numerically unstable: a rounding error in measuring the total lengths of the paths is sufficient to change the answer between possible and impossible. It requires the black art of choosing good epsilons. I'll only describe the ideal case in which none of these nasty rounding problems occur.<br /><br />Suppose we pick a point on both Misha and Nadia's path. In general this won't work, because the courier will arrive too late or too early at this point to intercept Nadia. Let L(A, B) be the number of time units that the courier is late when travelling from A to B. L can be late if the courier is early. Valid solutions are those where L = 0.<br /><br />First, let's prove that L is an increasing function of A (where "increasing" means that A moves further along Misha's path). Suppose that the courier decided to, instead of moving straight to B, instead kept pace with Misha for a while, then went to B. Obviously this would mean arriving later (or at the same time). But this is equivalent to the arrival time if A increases. A similar argument shows that L is a decreasing function of B. In both cases, this is not strict.<br /><br />This means that there can be at most \(O(N)\) pairs of segments for which L=0, rather than \(O(N^2)\), because we cannot move forward on one path and backwards on another. We can iterate over the candidate pairs by successively advancing either one segment or the other, by measuring L at pairs of segment endpoints.<br /><br />Now we have reduced the problem to a single segment from each path. Given a point on one path, we can find the corresponding point on the other by solving what looks like a quadratic but which decays into a linear equation. Again, there are corner cases where the linear coefficient is also zero, in which case we have to break ties so as to minimise the travel time. Using this mapping function, we can identify the range of positions on Misha's path that correspond to valid positions on Nadia's path. I then used ternary search to find the optimal position along this segment. I didn't actually prove that the function is convex, but it seemed to work.<br /><br />The solution I implemented that passed is actually rather messier than what I've described, and ran close to the time limit. I tried implementing it as described above but it hits assertions in my code due to numeric instabilities, but I think it should still be possible to fix it.</div></div>Bruce Merrynoreply@blogger.com12tag:blogger.com,1999:blog-31847281.post-78518432906570376112014-06-02T03:23:00.004-07:002014-06-02T03:23:39.904-07:00New website<div dir="ltr" style="text-align: left;" trbidi="on">I've finally gotten around to replacing my ugly nineties-looking <a href="http://www.brucemerry.org.za/">personal web page</a> that was cobbled together with raw HTML and m4, with a much prettier version that nevertheless still contains roughly the same content. It is generated using <a href="http://sphinx-doc.org/">Sphinx</a>, with the individual pages written in <a href="http://docutils.sf.net/rst.html">reStructuredText</a>. I've also merged my publications page into the main website instead of leaving it on the equally ugly but separately cobbled-together page I originally set up for my PhD.</div>Bruce Merrynoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-45467314060592092412014-05-10T08:45:00.002-07:002014-05-10T08:45:23.889-07:00Challenge 24 finals 2014<div dir="ltr" style="text-align: left;" trbidi="on">Here are my thoughts/solutions for the Challenge 24 Finals problems from this year.<br /><h3 style="text-align: left;">A: Halting problem</h3><div style="text-align: left;">Firstly, be aware that the problem had a typo: the condition should be "A != 0" rather than "A == 0". The intuition is that "most" of the time the g edge will be taken, but occasionally the h edge might be taken. Given a starting point, let us suppose that we have an efficient way to iterate the algorithm until either an h edge is taken or we terminate (or we determine that neither condition will ever hold). In the former case, we have just determine that we need f(0) for some f. Since there are only O(N) of these values, we can cache them as needed, and so only require O(Q + N) rounds of the inner algorithm (namely, iterating until we reach A = 0 or terminate).</div><div style="text-align: left;">Now let us solve this inner problem. Consider the directed graph in which each function f is connected to g. A graph where every vertex has out-degree one has a specific structure, in which each component contains a single loop from which hang upward-directed trees. Iteration thus consists of walking up a tree until reaching the loop, then going round and round the loop until we exit. Of course, we might finish early, or we might go around the loop forever.</div><div style="text-align: left;">For the walking up the tree part, let's use brute force for now. Assuming we make it to the loop, how do we find the exit point? Let's say that every complete time around the loop we increase A by C, and that at some point around the loop we have A = a. Then to exit at this point after t full cycles, we must have a + tC = 0 (mod 2<sup>64</sup>). We can solve this using the extended GCD algorithm. We can measure this for every point around the loop, and the smallest valid t value tells us where we will exit. If there are no valid t values, then we will go round forever.</div><div style="text-align: left;">This is essentially the solution I used in the contest. It is somewhat slow: O(N(N+Q)) I think. I did a lot of microoptimisation, and even then it took hours for the largest test case. The important optimisations were precomputing parameters for each loop (particularly the inverse of C) once, and reordering the data so that points around a loop are consecutive in memory. In fact over 75% of time ended up being spent in pointer-chasing up the trees, because this was cache-unfriendly.</div><div style="text-align: left;">I think the problem can be solved in something like O((N+Q)log (N+Q)), but it is rather more complex, and a blog doesn't really lend itself to the mathematical notation. I might write it up later if I get around to it. I'll give a few hints. If C is odd then increasing the A value on entry to the loop by a certain amount will increase the time-to-exit for each point by the same amount; this can be generalized if C is an odd multiple of 2^k by partitioning things by their remainder mod 2^k. The tree part can be handled by considering all queries at the same time, in a preorder walk of the tree.</div><h3 style="text-align: left;">C: Complete program</h3><div style="text-align: left;">I had a nasty super-polynomial time solution during the contest, which was complex and only managed to solve about 7 test cases. There is a much simpler and polynomial-time DP solution, however.</div><div style="text-align: left;">Let's start with some observations. Any O not followed by a valid number will need one, so let's add it immediately (it can be anything valid). After this, O and the number might as well be treated as a single unit (there is no point inserted anything between an O and a valid number either). There is no point ever adding an I. Adding F's is useful only to increment the nesting depth, so we might as well add F's only right at the front, since that way everything will be nested inside them. Finally, whenever we add a number, it might as well be a 0.<br />What is less obvious is that it is only necessary to add an A immediately before a number. This requires checking a few cases, but is because an A added anywhere else can be shuffled to the right until it reaches a number, without affecting whether a token string is valid. Similar rotation arguments show that we never need to add two As or Os in front of a number.</div><div style="text-align: left;">For the DP, let's compute, for each contiguous substring of the input and each F nesting level, the minimum number of tokens we need to add to make it into a valid parse tree. There is a special case for the empty range: we must just add a 0, and we must check that the nesting depth at that point is at least 1. Otherwise, we can either</div><ul style="text-align: left;"><li>Insert an A or an O.</li><li>Use the first token in the substring as the root.</li></ul>In the first case, the A or O will "swallow" the number, so if it is at most 127 we can just use an O, otherwise we must use an A and the nesting level must be sufficiently large. We then use DP to find the best way to handle the rest of the substring. The second case is largely straightforward, unless the first given token is an A. In this case, we must consider all ways in which the rest of the substring may be split into two valid trees.<br />Finally, we need to determine the nesting level to use at the root. It is possible, although a little slow, to just keep incrementing this value until there are so many initial Fs that it is clearly impossible to improve the solution. A more efficient solution in practice is to compute the DP over each substring as a piecewise constant function of the nesting level; this is more complex since one needs to do various operations on these piecewise constant functions, but it guarantees an O(N<sup>4</sup>) solution independent of the size of the numbers in the input.<br /><h3 style="text-align: left;">D: Firing game</h3><div style="text-align: left;">I solved 9 out of the 10 test cases. The primary observation is that any manager that is unfirable in the initial setup creates a completely independent subtree: firings within this subtree have no effect on the rest of the tree, and vice versa. This partitions the tree into separate Nim piles, and we just need to determine the Grundy number for each such subproblem. In all but one test case, these trees are reasonably small, and can be solved by DP/memoized recursion.</div><h3 style="text-align: left;">N: Dog tags</h3><div style="text-align: left;">This problem looked as though it could be extremely nasty, but fortunately the data was somewhat friendly. For example, each bubble was intersected enough times to get a good guess at the volume, even though the problem only guaranteed a single intersection.</div><div style="text-align: left;">Firstly one must identify the bubbles. The loops can be considered to be a graph, with an edge between two loops if they are on adjacent slices and their interiors intersect in the XY plane. I was lazy and considered two loops to intersect if a vertex of one lay inside another, which is good enough if the loops are sampled well. Each connected component in this graph constitutes a bubble. I estimated the volume as the sum of areas of the loops, and also found the centroid (mean of all points contained in the loops).</div><div style="text-align: left;">Finding the fiduciary bubbles is easy: they are the 3 bubbles with greatest volume. Determining which one is the corner is not quite as easy, since one does not know the inter-slice distance and hence distances cannot easily be measured. I identified the first plane of bubbles (see later), and noted that the mid-point of the two corner bubbles should be close to the centroid of this set of bubbles. I just took all 3 edges to find which is best.</div><div style="text-align: left;">This now gives the grid coordinate system XYZ. One does need to check that the bubbles are on the positive Z side, otherwise the two corners need to be swapped.</div><div style="text-align: left;">Now we can partition the bubbles into planes. I sorted by Z, computed all the differences between adjacent Z values, sorted this list, and looked for a big jump between adjacent values (small values are noise within a plane, big values are plane-to-plane steps).</div><div style="text-align: left;">The same process can then be used to find each row within a plane, and then bubbles in a row can be sorted by X.</div></div>Bruce Merrynoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-78303376044283592462014-02-08T09:39:00.001-08:002014-02-08T09:41:56.929-08:00Challenge 24 EC round 2014<div dir="ltr" style="text-align: left;" trbidi="on">Here are my thoughts on the problems we solved from this year's electronic round of Challenge 24.<br /><h2 style="text-align: left;">A. Safe</h2><div style="text-align: left;">I didn't get a complete solution to this, but I was able to solve cases 1-9. I noticed that in most cases the state space for the wheels (keys, but the pictures made me think of them as wheels) is reasonably small. For each password, the solution will involve first pressing the keys a number of times to set up the initial state, and then typing the password. So we want to find the initial states from which we can input the password, and pick the cheapest amongst them.<br />This can be done by dynamic programming, running backwards over the password. When there are no more characters to type, any state is fine. Considering each letter in turn, we need to find the states in which we can type the next letter on one of the keys, and from which the resulting state allows us to complete the password.</div><div style="text-align: left;"></div><div style="text-align: left;">For A10 there are 6.4 billion states, and it was going to take a long time (it did churn out the first two passwords in about 3 hours). I suspect it might be possible to do it by keeping an explicit list/set of reachable states and working from there, rather than iterating over the entire state space.</div><h2 style="text-align: left;">B. Wiretapping</h2><div style="text-align: left;">This problem basically just requires an application of <a href="http://en.wikipedia.org/wiki/Kirchhoff%27s_theorem">Kirchhoff's Theorem</a>, which gives the number of spanning trees of the graph in polynomial time. To count the number of spanning trees that use the tapped link, just collapse the two endpoints into a single vertex and count the number of spanning trees of the remaining graph. The ratio between the number of spanning trees in this reduced graph and the original graph is the answer.<br />One complication is that the number of spanning trees may overflow a double-precision float. I worked around that by scaling down the Laplacian matrix by some factor f (I used 1/32 for the larger cases), which reduces the determinant of the cofactor by f<sup>N-1</sup>. </div><div style="text-align: left;"><h2 style="text-align: left;">C. Visual Programming - VM</h2><div style="text-align: left;">This was a fairly straightforward implementation problem, just requiring careful interpretation of the rules, and recognising that a lot of the information given is unnecessary (e.g., for lines you only need to keep the two endpoints).</div><h2 style="text-align: left;">D. Visual Programming</h2><div style="text-align: left;">Due to lack of time I only solved D1, D2 and D4, and had a broken solution to D3. As with most VM problems, I invented a slightly higher-level language that is possible (but still hard) to code in by hand, and write a simpler compiler. I made a few simplifying assumptions</div><ul style="text-align: left;"><li>Polygons are laid out one below the other. Each polygon is rectangular, except that the right hand edge may be ragged.</li><li>Each line is only ever used in one direction. To ensure this, the outgoing lines from a polygon all emanate from the top part of the right edge, and incoming lines arrive at the bottom part of the right edge. There is also an unconditional jump from each polygon to the next one to ensure that the routing terminates before encountering any of the incoming lines.</li><li>To avoid lines cutting through polygons, each line emanates directly horizontally from the polygon to (256, y), then has a second segment to the target point. The x value of the start is chosen depending on which register is to be used in the comparison.</li><li>I reserved registers 0 and 1 to always contain the values 0 and 1, which made it easier to construct some other operations, as well as to build an unconditional jump.</li></ul></div><div style="text-align: left;">The programming mini-language just specifies instructions and conditional a list of conditional jumps that takes place after each instruction (falling through to the next instruction if none match). The instructions I wrote at the level of setting the parameters (A-F, X-Z, T, Q, L, H), but I had useful defaults for all of them so that most instructions only changed one or two defaults e.g. setting A=3 B=15 would generate an instruction that copies from SIO into register 3.</div><div style="text-align: left;">Once all that is done, it is not particularly difficult for anyone who has done any assembly to write code for D1, D2, and D4. The only slightly tricky part was to extract the lower 8 bits from the result, which I did by multiplying to 256 then dividing by 256.</div><div style="text-align: left;"><h2 style="text-align: left;">G. Spy Union</h2><div style="text-align: left;">At first I thought this could be done greedily, but it doesn't work: reinstating a fired person might let you fire an additional person <i>in each hierarchy</i>. Instead, we can use network flow. Let's consider just one hierarchy for now, and connect the source to every node, connect each node to its parent, and connect the root to the sink (directed edges). The edges from the source have capacity 1, with the interpretation that they have flow if the corresponding employee is fired. The flow through each intermediate edge will then be the total number of employees fired from the corresponding subtree, so we will set the capacities to be the maximum number we can fire each subtree. The maximum flow will thus be the maximum number of people we can fire. </div><div style="text-align: left;"></div><div style="text-align: left;">However, we have two hierarchies. Since network flow is symmetric, we can apply the same treatment to the second hierarchy, but connect the source to the root, and the individual employees to the sink. To join up the problems, we get rid of the source in the first graph and the sink in the second graph, and create capacity-1 edges directly from each employee in the second graph to the matching employee in the first. Once again, these edges have flow if the corresponding employee is fired, and the maximum flow gives the solution.</div><div style="text-align: left;"><br /></div></div></div>Bruce Merrynoreply@blogger.com3tag:blogger.com,1999:blog-31847281.post-1822817891268781262013-11-23T01:44:00.004-08:002013-11-23T01:44:46.861-08:00Fun with C++11 variadic templates<div dir="ltr" style="text-align: left;" trbidi="on"><div>The other day I decided to experiment a bit with the support for variadic template support in C++11 (the new C++ standard). Like many C++ features, they're incredibly powerful, but not always that easy to use. After some experimentation I figured out how to do what I wanted, and I thought I'd write it up in case anyone else wanted to try to do the same thing.<br /><br />I wanted to create a fixed-size vector class (similar to <a href="http://en.cppreference.com/w/cpp/container/array">std::array</a>), but with a constructor that accepts the appropriate number of arguments. So for example, Vec<float 3=""> should have a constructor that accepts 3 floats.</float><br /><br />The simplest solution is to use initializer lists, which allow a constructor to take an arbitrary sequence of arguments of the same type. However, this has a number of disadvantages:<br /><ol style="text-align: left;"><li>It can only be used with brace-initializer syntax, not parentheses.</li><li>It does not enforce the correct length at compile time.</li></ol></div>Variadic templates allow a function to take a variable number of arguments. My second attempt at a constructor started by looking something like this:<br /><br /><span style="font-family: "Courier New",Courier,monospace;">template<typename ...="" args=""> Vec(Args&&... args);</typename></span><br /><br />This is just a constructor that accepts an arbitrary set of arguments, so it doesn't seem to fix the second problem, and additionally it doesn't even require the types to be correct. We can fix that using SFINAE to statically validate the arguments before the template is instantiated. The tricky part is figuring out where to insert the <a href="http://en.cppreference.com/w/cpp/types/enable_if">enable_if</a>: the constructor has no return type, so it can't go there; and we can't add a default argument, because that makes the interpretation of the variadic argument list ambiguous. However, we can add an extra template parameter with a default value:<br /><span style="font-family: "Courier New",Courier,monospace;"><br /></span><span style="font-family: "Courier New",Courier,monospace;">template<typename ...="" args="" span=""></typename></span><br /><span style="font-family: "Courier New",Courier,monospace;"> typename E = typename enable_if_all<sizeof ...="" n="" rgs="" span=""></sizeof></span><br /><span style="font-family: "Courier New",Courier,monospace;"> std::is_convertible<args t="">::value...>::type></args></span><br /><span style="font-family: "Courier New",Courier,monospace;">Vec(Args&&... args);</span><br /><span style="font-family: inherit;"><br /></span><span style="font-family: inherit;">Here </span><span style="font-family: "Courier New",Courier,monospace;">enable_if_all</span> is a variadic version of enable_if, which has a type member if all its arguments are true. It's not part of the standard, so I had to implement it myself. I won't go into details since this isn't the solution I used. It works, but it's roundabout and I'm not convinced that it won't differ in some subtle ways from a constructor that actually accepts only type Twhen in the presence of other constructors (because every parameter will match exactly, where as a constructor with float parameters might require a conversion sequence which would make this a worse match than another constructor).<br /><br />In the end I realised that the solution was to create a template parameter pack with N copies of T. At first I thought this couldn't be done, because parameter packs are not types and so one cannot typedef one in a helper class to be expanded in the constructor. The solution is to create it in the vector class itself, using recursion to generate the appropriate type. You can see the code <a href="http://pastebin.com/1FqXUawP">here</a>. The <span style="font-family: "Courier New",Courier,monospace;">VecBase</span> class is templated with the type T, followed by N copies of T as a variadic parameter pack. Thus, for example, <span style="font-family: "Courier New",Courier,monospace;">VecBase<int int=""></int></span> is a vector with 3 ints. This makes it easy to write the constructor, since we just expand the parameter pack.<br /><br />The tricky part is that we really want to write <span style="font-family: "Courier New",Courier,monospace;">Vec<int 3=""></int></span> rather than <span style="font-family: "Courier New",Courier,monospace;">VecBase<int int=""></int></span>. This is where <span style="font-family: "Courier New",Courier,monospace;">VecHelper</span> comes in: it recursively adds copies of T to the type name. Thus, <span style="font-family: "Courier New",Courier,monospace;">VecHelper<int 3="">::type</int></span> is <span style="font-family: "Courier New",Courier,monospace;">typedef</span>ed to <span style="font-family: "Courier New",Courier,monospace;">VecHelper<int 2="" int="">::type</int></span>, which in turn is typedefed to <span style="font-family: "Courier New",Courier,monospace;">VecHelper<int 1="" int=""></int></span> and finally <span style="font-family: "Courier New",Courier,monospace;">VecHelper<int 0="" int=""></int></span>. This is the base case that is handled in the partial specialization at line 29 to be <span style="font-family: "Courier New",Courier,monospace;">VecBase<int int=""></int></span>.<br /><br />Finally, we make use of the new <span style="font-family: "Courier New",Courier,monospace;">using</span> feature to typedef <span style="font-family: "Courier New",Courier,monospace;">Vec<int 3=""></int></span> to be <span style="font-family: "Courier New",Courier,monospace;">VecHelper<int 3="">::type</int></span>, and the problem is solved.</div>Bruce Merrynoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-81162523441104920392013-07-19T00:49:00.000-07:002013-07-19T00:49:02.603-07:00IOI 2013 day 2 analysis<div dir="ltr" style="text-align: left;" trbidi="on"><h2 style="text-align: left;">Cave</h2><div style="text-align: left;">This is a nice problem following a recent IOI trend of having problems where the limiting factor is not execution time, but some abstract measure of efficiency. This is slightly easier than some of the previous problems in each vein.</div><div style="text-align: left;">To achieve 100%, we can use up to 14 queries per door. Since 2<sup>14</sup> is slightly more than the number of doors, this strongly suggests some form of binary search on each door. It will be easiest if we attack the doors in order. Once we know which switch and position opens a door, we can lock that switch in place so that we will always be able to see the next door, and thereafter pretend the switch does not even exist.<br />To solve for a door, we can start with all switches down, which will immediately tell us which switch position opens the door. At this point every switch is a candidate. We can then test with half the candidates down and half up, which will eliminate half the candidates. This process is then repeated until only one candidate remains.<br /><h2 style="text-align: left;">Robots</h2><div style="text-align: left;">As is commonly the case for problems that ask how long some agents need to achieve a goal, the answer can be found by a binary search on the answer. So we need to decide whether the robots can clear the floor in S seconds.</div><div style="text-align: left;">We can simplify the problem slightly by noting that for each toy, we only need to know how many weak and how many small robots are able to move it, which can be found by binary searching the two lists (after sorting them). Of course, if a toy cannot be moved by any robot, return -1.</div><div style="text-align: left;">Let's first decide the actions of the weak robots, starting from the weakest. There will be some set of toys it can handle. Since they effectively differ only in size, the weakest robot should work from the largest downwards, so as to make the job of the small robots easier. Also, there is never any reason for it to move fewer than S toys, unless it runs out. Now consider the 2nd-weakest robot. There will be extra toys it can handle, plus any light toys that the weakest robot didn't have time for. Since the weakest robot is finished, the difference in weights are irrelevant, and the 2nd-weakest robot should again work in decreasing order of size amongst the toys it can handle. The same process can be continued for the remaining weak robots.</div><div style="text-align: left;">Now consider the small robots, from largest to smallest. These can again take up to S toys, starting from the largest remaining one. If a robot is unable to handle the largest remaining toy, then S was too small.</div><div style="text-align: left;">Implementation can be done using a priority queue, implemented as a binary heap, representing toys that are light enough to be handled by the current robot and ordered with the largest at the head. The toys are initially sorted by weight. Each time a new weak robot is considered, new elements are added from the list of toys to the priority queue, and the robot then removes items starting from the head of the queue.</div><div style="text-align: left;">Assuming that T is larger than A, B, the running time will be O(T (log T)<sup>2</sup>): one log T for the binary search, the other for the priority queue operations. I think that it may be possible to reduce this to something like O(T·log T·log(max(A, B))) using the right sort of data structure for the priority queue (to allow S items to be removed in log time): something like an interval tree for the number of toys of each size.</div><h2 style="text-align: left;">Game</h2><div style="text-align: left;">I disliked this problem because it has a nice solution that takes just a bit too much memory. I only managed to get 80% for it in the time I spent on it, and I didn't feel inspired to modify my solution to pass fully.</div><div style="text-align: left;">In 1D, this can be solved by a fairly straightforward use of a segment tree: each node stores the GCD of its two children. Since R can be quite big, this needs to be a sparse segment tree; another alternative would be a balanced binary tree.</div><div style="text-align: left;">In 2D, it is tempting to use a quadtree, but in fact that doesn't guarantee poly-logarithmic time. A 1xC query will force refinement down to the individual non-zero cell entries. Instead, we can use a range tree, which is a tree of trees: an outer tree is built over the columns, and for each column span corresponding to a node in this tree, we have an inner tree over the rows. Each node in this inner tree corresponds to a rectangle of the grid, and stores the GCD of elements from this rectangle. A query now uses the columns to select O(log C) nodes from the outer tree i.e., O(log C) inner trees, and applies a query to each of them. Queries thus take O(log R·log C) time when the implementation uses segment trees for the inner and outer trees. With balanced binary trees, it would only be O((log Nu)<sup>2</sup>).<br />Unfortunately, using segment trees also requires O(log R·log C) memory per non-zero element, which just exceeds the available memory. Using balanced binary trees instead should fit within memory, but is a lot more painful to implement. I think it might also be possible to make it work by sticking with segment trees, but compressing the representation by compacting chains of nodes with one child. </div></div></div>Bruce Merrynoreply@blogger.com1tag:blogger.com,1999:blog-31847281.post-86884706926313303962013-07-17T02:52:00.002-07:002013-07-17T05:04:55.411-07:00IOI 2013 day 1 analysis<div dir="ltr" style="text-align: left;" trbidi="on">I finally got around to actually solving the IOI day 1 tasks in the contest system - I'd had ideas, but no time to test them out. So here are my solutions (I haven't looked around for any other writeups, although no doubt they exist). I'm hoping to have time to tackle day 2 before the contest system is shut down.<br /><br /><h2 style="text-align: left;">Art class</h2><div style="text-align: left;">This is a heuristic problem that can probably be solved in many ways. Since it was a hammer I have, I decided to hit it with a very simple wavelet-based spectral analysis. To find the highest-frequency components, downsample the image, upsample it again, and subtract it from the original. Then take the sum of squared values in the residual. Now start with the downsampled image and repeat recursively to get progressively lower frequencies. For the downsample and upsample I took a simple box filter. I kept 6 frequencies, since 2<sup>6</sup> is less than the minimum image size.</div><div style="text-align: left;">For each style, I computed the mean and standard deviation of each of the 6 power values from the examples. To classify a picture, I sum up the squared differences from the means, with the differences scaled using the standard deviations. It's not 100% accurate, but good enough to get a perfect score.<br /><h2 style="text-align: left;">Dreaming</h2><div style="text-align: left;">This is a combination of a few common tree processing tasks. Firstly, the longest path might just be within one of the original trees, i.e., a tree diameter. This can be computed recursively on a tree by determining, for each node, the two longest paths downwards via different children (one or both can be zero if there are fewer than 2 children). The diameter path will have a highest node, and so the diameter will be the sum of these two lengths.</div><div style="text-align: left;">When add an edge to a tree, we must decide where to make the connection. The longest path from the connection point to anywhere in the tree ought to be as short as possible, and so for each point in the tree we need to know the distance to the furthest point. This is slightly more complex than before, since we also have to consider paths that start upwards. However, a second recursive walk (this time computing top-down instead of bottom-up) allows the shortest such paths to be found. For a given tree, let the <i>radius</i> be the distance from the optimal connection point to the furthest point in the tree.<br />Finally, we must decide how to connect the trees together. Sort the trees by decreasing radius r<sub>1</sub> > r<sub>2</sub> > .... Clearly, there will be a path of at least length r<sub>1</sub> + r<sub>2</sub> + L. If there at least three trees, they can't all be connected to each other, so there must also be a path of at least length r<sub>2</sub> + r<sub>3</sub> + 2L. Conversely, by connecting the first tree to every other tree (always using the optimal connection points), it is not hard to see that there are no other paths that can be longer than the worst of these.<br /><h2 style="text-align: left;">Wombats</h2><div style="text-align: left;">I found this to be the most difficult of the tasks. Apart from being conceptually difficult, it also required a reasonably amount of tuning, and my solution still takes over 10s in many cases.</div><div style="text-align: left;">The basis of the solution is to note that C is relatively small, and so it is feasible to precompute the costs to get from any point on row X to any point on row Y, for some X and Y. Let's write such a table as {X, Y}. What's less obvious is that it's possible to combine {X, Y} and {Y, Z} to produce {X, Z} in O(C<sup>2</sup>) time. The trick is to use the fact that optimal paths won't cross over each other. Thus, if i < j, (X, i) to (Z, j-1) goes via (Y, p), and (X, i+1) to (Z, j) goes via (Y, q), then the optimal path from (X, i) to (Z, j) will go via (Y, r) where p ≤ r ≤ q. By iterating in order of increasing j - i, it is possible to compute {X, Z} in quadratic time.</div><div style="text-align: left;">We can combine this observation with a segment tree: for each appropriate i and a, we maintain {a·2<sup>i</sup>, (a+1)·2<sup>i</sup>}, computing it either directly (i = 0) or by combining two smaller intervals as above (where the upper bound exceeds R - 1, it is clamped). Each change invalidates O(log R) of these, so the time to update after a change is O(C^2 log R). Queries can be answered in O(1) time using the root of the segment tree.</div><div style="text-align: left;">The catch with this approach is that it requires too much memory: we can't even afford R different {X, Y} tables. Instead of keeping {X, X+1} at the finest level of the segment tree, we can instead store, say, {10X, 10X+10} for each X, and use 1/10th the memory. The cost is that updating the base level of the segment tree will now take 10 times as long.</div></div></div></div>Bruce Merrynoreply@blogger.com5tag:blogger.com,1999:blog-31847281.post-54990287453819149392013-02-24T07:49:00.002-08:002013-02-24T07:49:59.046-08:00Challenge 24 Electronic ContestHere's how my team approached some of the problems in the online round of this year's Challenge 24. You can read the problems <a href="http://ch24.org/static/archive/2013/2013_ec.pdf">here</a>.<br /><h2>A. Cutting back middle management</h2><div>I didn't completely solve this: A10 was too big for my O(NM<sup>2</sup>) algorithm. Consider a generalised version of the problem, in which we want to know how many managers can be fired for each possible number of subordinates the CEO ends up managing (0 to M). This can be solved by dynamic programming. Suppose we have solved it for some tree, and wish to graft on another subtree to the root. Given L, the number of subordinates that will end up reporting to the root, we pick the number A that are due to the new subtree, with the remaining L - A coming from the original tree. There is a special case when A = 1, in which case the root of the subtree may be retained. Merging in a subtree in this way requires iterating over L and over A, so takes O(M<sup>2</sup>) time per edge and hence O(NM<sup>2</sup>) for the whole tree.<br /><br />I optimised this slightly by noting that near the bottom of the tree, most values of A are impossible to achieve, and do not need to be iterated over. This is sufficient to get A9 (after running for quite a long time though), but not A10.<br /><h2>B. Requirements</h2></div><div>Firstly, how does the number of manuals grow? In one iteration, K manuals generate K(K - 1) new manuals, which together with the originals makes K<sup>2</sup> manuals. Thus, after T iterations there will be K<sup>2<sup>T</sup></sup> manuals. Next, which of them will be ultimate manuals? We can compute this using inclusion-exclusion: for each subset of requirements, count the number of manuals that are missing (at least) that set of requirements, and either add or subtract it from the total depending on the parity. The final manuals that are missing a set of requirements are simply those produced from the initial manuals missing those requirements. Counting the initial manuals for each subset can be achieved in O(3<sup>N</sup>) time.<br /><br />The only thing left is to figure out how to efficiently compute K<sup>2<sup>T </sup></sup>modulo P (where P = 1,000,000,007). Since P is a prime, we can reduce 2<sup>T</sup> module P-1 without affecting the result. This is all that is needed to solve all test cases in this problem.<br /><h2>C. Road roller</h2><div>We didn't have time to do anything with this problem.</div><h2>D. Octal CNC</h2></div><div>We didn't solve this during the contest, but I wrote a solution afterwards using OpenCV. It consists of three parts:</div><div><ol><li>Locating the symbols</li><li>Organising the symbols into rows</li><li>Identifying the symbols</li></ol>To locate the symbols, I first eroded the image by 8 pixels (which makes the black blobs larger), thresholded the image to classify each pixel as ink or paper, and detected connected components. For each connected component I took the bounding box (corrected for the erosion) and used this in subsequent steps.</div><div><br /></div><div>To organise the bounding boxes into rows, I processed them left-to-right (by left edge), and for each symbol picked the row it best matched (by difference in height compared to the last symbol in the row). If it fails some simple heuristics it starts its own row.</div><div><br /></div><div>To identify the symbols I used a pretty crude image descriptor: I essentially downsampled the bounding box to a 3x3 image, and normalized the result (to compensate for the ink being lighter in some symbols). These 9 values worked fairly well but there were still some ambiguities, particularly with symbol 1. I added a tenth value, the logarithm of the bounding box aspect ratio. Descriptors were compared just on Euclidean distance.</div><div><br /></div><div>With this approach I solved all test cases except 5 (where this is a checksum failure). I haven't yet investigated which stage failed for this test case.</div><h2>E. Stack compressor</h2><div>This stack language is quite difficult to work with, because there is no way to rotate the stack or otherwise make any changes below the TOS without losing data. I realised afterwards that one can just copy the numbers you need from lower in the stack and keep growing the stack indefinitely (well, up to the 20 million limit). During the contest I didn't think of this, which limited how I managed to </div><div><br /></div><div>The simplest way to do text compression is <a href="http://en.wikipedia.org/wiki/Huffman_coding">Huffman coding</a>. I start with a lot of PUSH instructions to put the coded text on the stack, backwards. In each 32-bit word I use only 30 bits. Bit 30 I set to 1 to use as a sentinel (to detect when to move to the next word), and I use only non-negative values (because C99 has no yucky semantics for negative mod). The encoding table is represented in code rather than data. At each state it does a divide by 2 to extract another bit and branches to a child state depending on whether the bit is zero or one. The leaves have an OUT instruction and a jump back to the start.</div><div><br /></div><div>Putting the table into code causes quite a lot of code bloat. After the contest I made a different version that encoded the transitions into the stack, and which would have scored 100% on many of the test cases. However, there are still some test cases where the 100% goal is below the entropy limit for Huffman encoding, even if all 32 bits per word could be used. Getting 100% for those presumably requires some other encoding, such as encoding of repeated strings, or possibly making use of digraphs.</div><h2>F. Backup communication</h2><div>A little physics shows that the cost of a launch is proportional to the distance achieved. Thus, we need to find the point from which the sum of distances is maximised. This is difficult to do directly, but starting with the centroid and using Newton-Raphson iteration solves it easily.</div><h2>G. Trains</h2><div>We didn't manage to solve this one.</div>Bruce Merrynoreply@blogger.com2tag:blogger.com,1999:blog-31847281.post-90051001067872925922012-09-27T11:49:00.002-07:002012-09-29T23:19:06.339-07:00IOI 2012 Day 2 analysis<h3>Last Supper</h3><div>Firstly, it's not even obvious how to simulate Leonardo's strategy efficiently. It can be done though. First, you need to be able to tell, for each request, what the next request of the same colour will be (we'll see why later). This can be computed by making a backwards pass over the data, while keeping track of the last seen request for each colour. Now we can implement Leonardo's strategy in a forward pass. Over time, let's keep track of the next time each colour will be needed. After each request, we need to update this for the just-requested colour to the point to the next instance for this colour. Each time Leonardo requests a paint colour that isn't on the scaffold, we can examine all the colours on the scaffold to determine which will be needed the furthest in the future. That would still require O(NK) time, but we can keep the next-needed-time for all the scaffold colours in a max priority queue. Since keys need to be changed as we go, a <span style="font-family: 'Courier New', Courier, monospace;">multi_set</span><span style="font-family: inherit;"> is a good choice in C++ for O(N log K) time.</span><br /><span style="font-family: inherit;"><br /></span><span style="font-family: inherit;">With that in place, it's straightforward to implement the cases where we have at least ceil(log K) bits per request, because we can just encode the index of the colour on the scaffold to replace. However, to solve subtasks 4 and 5 you need something smarter. The key point is that you don't have to exactly implement Leonardo's algorithm. Let's do a thought experiment: Leonardo has two shelves on his scaffold, top and bottom. The top shelf contains the colours he will use at least once more before they're taken away. The bottom shelf contains the colours he knows will be taken away before they're used again. Since he knows the sequence of colours and he assumes his assistant follows his strategy, he can arrange the initial K paints on the right shelves, and every time he uses a paint he also puts it back on the right shelf. His assistant will only ever take away a colour from the bottom shelf, and Leonardo will only pick up paint either from his assistant or from the top shelf.</span><br /><br />And now for the key observation: <i>Leonardo never looks at the bottom shelf.</i> Thus, it doesn't actually matter which colour the assistant takes off the bottom shelf. If he instead takes away an arbitrary colour from the bottom shelf, the set of top shelf paints will still be exactly as before, step after step. The only possible difference is that a requested colour might turn out now to already be on the bottom shelf, but this can't happen because it would contradict the optimality of Leonardo's strategy.<br /><br />Thus, it is sufficient for the advice to provide enough information to know which colours are on which shelf, and the assistant can pick bottom-shelf colours arbitrary to take away. We need K bits to indicate the shelf for each of the initial K colours, and we need N bits to indicate which shelf each paint goes back on after it has been used. N + K bits is enough to solve all subtasks, and in fact subtask 5 can be solved with only 125000 bits.<br /><h3>Jousting tournament</h3></div><div>There are a number of observations one can make. Firstly, let's say a knight is a <i>candidate</i> for a tournament if he either participates in the tournament or in another tournament whose winner was a candidate (recursively). Then it is easy to see that the candidates for a tournament form a contiguous interval of the original line of knights, and that the location of the interval is independent of the ranks of the knights. Computing this interval efficiently is not completely trivial, but let's come back to that and assume we know the candidate interval for every tournament. We can also see that the winner of any tournament is the candidate with highest rank.<br /><br />Suppose the popular knight is a candidate for a tournament. We can immediately determine who all the other candidates for that tournament are, regardless of exactly where he is in the line. We can also compute whether he will win. Now, for each tournament going backwards in time, we can compute how many times he will win if that is his first tournament: if he loses it will be 0, otherwise it will be more one that the number for the tournament he advances to. We can also find the leftmost slot (if any) he could occupy that doesn't require advancement from a previous tournament. We then pick the tournament with the highest such score and take the leftmost slot (breaking ties leftwards as required).<br /><br />Done naively this will require O(N<sup>2</sup>) time, which should suffice for the first 2 subtasks. We need a few operations to make this small enough to tackle the final subtask. Firstly, we need to be able to determine the candidate ranges. This can be done with a <a href="http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=binaryIndexedTrees">binary indexed tree</a>. As the tournaments are played, store an array of N values in a BIT, where each value is 1 for knights that haven't competed or that had the lowest index (not rank) of all candidates in their last tournament, and 0 for all others. For a new tournament (S, E), the first candidate will be the first knight whose (inclusive) prefix sum is S+1, and the last candidate will be one to the left of the first knight whose (inclusive) prefix sum is E+2. Binary searching for a given prefix sum can be implemented in O(log N) time, but we also need to find the bits to zero out in the BIT. These can be found by keeping the locations of all 1 bits in a linked list. Each bit is zeroed at most once, so this process takes O(N log N) time.<br /><br />For each tournament, we also need to know to which tournament the winner advances. This can be computed at the same time as the previous step, by storing the tournament index in the linked list and building a tournament tree as we go.<br /><br />Finally, we need to be able to determine the winner of each tournament, assuming it contains the popular knight. For this we need to determine the maximum value within a range of values. This is the standard range max query problem, for which there are a number of O(N log N) and even O(N) algorithms to answer O(N) queries on N items. A simple one is to precompute for each element the maximum of the next 1, 2, 4, 8, ... elements in O(N log N) time, which then allows queries to be answered in O(1) time.<br /><br /><b>UPDATE</b>: a commenter made a good observation: we don't care about the ranks of the other knights, only whether they are higher or lower than that of the popular knight. The range max queries can all be done on a boolean array, which is simpler. For example, by precomputing prefix sums one can determine the number of higher-ranked knights in an interval, and the popular knight will win a tournament if and only if the corresponding sum is zero.<br /><h3>Ideal City</h3></div><div>This one had me totally stumped yesterday, but I woke up this morning with what I think is a solution. It depends on a conjecture. Let's call the <i>x-weight</i> of a path the number of horizontal steps, and similarly for <i>y-weight</i>. Let's call a path from A to B <i>x-shortest</i> if it has minimal x-weight, and let the <i>x-distance</i> from A to B be x-weight of this path (again, similarly for <i>y-shortest</i> and <i>y-weight</i>). I conjecture that the distance from A to B is the sum of the x-distance and the y-distance. Put another way, I conjecture that the shortest path from A to B is both x-shortest and y-shortest. I can't formally prove it, but the intuition is that any x-shortest path can be "straightened out" until it matches a shortest path, because there are no internal obstacles to get in the way.</div><div>Given that, we can separate the problem into finding the sum of x-distances and the sum of y-distances. Let's look at just the y-distances. The city can be divided into contiguous horizontal strips. Consider each such strip as a node in a graph, where two nodes are connected if the strips are directly connected. In fact, the graph must be a tree, since a cycle would imply that the city is not ideal. Also, travel within a strip is free when computing y-distance, so the y-distance between two blocks is equal to the number of tree edges between their corresponding nodes.</div><div>Computing the sum of all distances in a tree is a more conventional problem with conventional recursive techniques. For each subtree, compute the sum of all distances within the subtree, the number of blocks, and the sum of distances from every block to the root. When merging two trees at their roots, these values can be computed in O(1) time for the combined tree using the values for the original tree and the new subtree, so the process takes O(N) time.<br /><br /><b>UPDATE</b>: I realised that this construction can also prove the conjecture (as did a commenter). Consider the shortest path, and assume it is not y-shortest. Project it onto the tree of horizontal strips. It is then not a shortest path through this tree, since all y steps are projected to steps in this tree. But a non-shortest path through a tree must visit some vertex twice, which means that the original path exits some strip and then re-enters it. We could then shorten the path by short-cutting directly between the exit and entry points along the path.</div>Bruce Merrynoreply@blogger.com2tag:blogger.com,1999:blog-31847281.post-65684613855105798342012-09-26T05:59:00.002-07:002012-09-27T01:50:57.955-07:00IOI 2012 day 1 analysisI'm not at IOI this year, but since I didn't spot anything on the web I thought I'd post my thoughts on the Day 1 tasks. It looks like quite a tough problem set, although mostly because there are a lot of subtasks to think about and cases to handle.<br /><br /><b>UPDATE:</b> The official webpage now has a page with <a href="http://www.ioi2012.org/hints-for-tasks-of-day-1/">hints</a>. They seem to have essentially the same solutions as me for Crayfish Scrivener and Parachute Rings.<br /><h3>Crayfish Scrivener</h3>This is a task that I proposed.<br /><br />Half the challenge here is thinking about the problem in the right way. Rather than trying to undo undo's, just think of undo as a time machine: it takes you back to exactly the same state as you were N operations ago. So a simple implementation is just to maintain an array of strings through time, and whenever an undo is encountered, copy the string that was N steps ago.<br /><br />That's going to use up too much memory for the later subtasks. However, there is a lot of duplication, because for every L command, the next string looks almost the same as the previous one. We can compress the whole thing by using a <i>trie</i>: a tree where each edge is a letter, and each string is a path from the root through the tree. Any string can be encoded as just a pointer to a node in this tree. Adding a letter requires finding the child node with a particular letter (creating it if it doesn't already exist). An undo requires just a copy of a previous pointer. Thus, each step can be simulated in O(1) time and space.<br /><br />There is still the problem of answering queries. For subtasks where the queries all follow the commands, this is quite easy: just extract the final string once into an array, and answer queries from the array. For the final subtask something smarter is required: some extra information must be stored in each node. A start is to store the node's depth and its parent. Each query can be translated into a query for the kth ancestor of the current node, and the parent pointers can be used to ascend the tree. To make it fast enough, one can also store pointers to the 2<sup>i</sup>-th ancestor for all relevant i, which makes it possible to answer queries in O(log N) time. It also requires O(N log N) space: since the task description doesn't list the memory requirements I'm not sure if that will fit. It's also possible to store only the parent and the 2<sup>i</sup>-th ancestor where 2<sup>i</sup> is the largest power of 2 dividing the depth, for O(N) space.<br /><h3>Parachute Rings </h3>For this problem, the trick was keeping track of all the cases. Firstly, it is quite easy to keep a histogram of the degree of each ring. We can also keep track of the components, and the number of vertices and edges in each component (this can be kept in the root of a standard union-find structure). Now, let's consider cases:<br /><ol><li>If there is a ring with at least 4 connections, it is the only possible critical ring, because removing any other ring will leave it with at least three connections. If there is more than one such ring, there can be no more critical rings.</li><li>If there is a ring with exactly 3 connections, only it and its neighbours can be critical rings. There are thus at most four candidates to consider.</li><li>If all rings have at most 2 connections, there could be many critical rings. Every component will be either a chain or a cycle, and these can be distinguished by comparing the number of edges and vertices in the component. If there are no cycles, all rings are critical. If there is one cycle, all rings in that cycle are critical. Otherwise there are no critical rings.</li></ol>The last case can be checked as we go, but the others are a little more tricky because they identify only candidates, not definitely critical rings. However, at most four rings can become candidates during the lifetime of the program: the first to have at least three connections, plus those three neighbours. As soon as a ring becomes a candidate, we can replay all the previous connections on another union-find data structure with that ring removed, and thereafter maintain that structure so that it can be examined after future connections.<br /><br />I've omitted one minor detail: we also need to maintain in each union-find structure whether there are 0, 1, or more cycles, and if 1, the size of the component with 1 cycle. This can quite easily be maintained as the connections are made. Thus, total memory is O(N) and total time is O(N α(N)).<br /><h3>Pebbling odometer</h3>This seems like the hardest problem, if only because it takes the most time to work through all the different subtasks. I haven't tried to implement these, so my analysis may be a little off. In terms of implementation, the major annoyance of this language is the lack of subroutine calls. It is likely that you will want to write metaprograms to generate the programs, where a subroutine call is implemented by duplicating code and adjusting internal labels. There is also no mechanism for storing internal state, so state needs to be stored either in the instruction pointer (by duplicating code), or in the grid itself through counts of pebbles.<br /><h4> Subtask 1</h4>Since we don't need to preserve the pebbles, we can just pair off pebbles until one of the piles is empty. Here's an outline of the process, ignoring all the details of moving around the grid.<br /><ol><li>If there is no pebble at (0, 0), then terminate at (0, 0).</li><li>Pick up a pebble from (0, 0).</li><li>If there is no pebble at (0, 1), then terminate at (0, 1).</li><li>Pick up a pebble from (0, 1).</li><li>Repeat.</li></ol><h4>Subtask 2</h4>We can no longer destroy the state by picking up pebbles. However, we can just move them somewhere for safe keeping. Start by moving all the pebbles in (0, 0) to (1, 0) and from (0, 1) to (1, 1) (using a loop in each case). Then apply the algorithm for subtask 1, and once the result has been decided, move any remaining pebbles back to the top row. Note that there will need to be two code-paths for restoring the pebbles, depending on where you want to terminate.<br /><h4>Subtask 3</h4>This should be doable by moving the pebbles towards each other until they meet. First move right until finding the first pebble. Then<br /><ol><li>Pick up a pebble.</li><li>Move forward one square.</li><li>If there is already a pebble on this square, terminate.</li><li>Add a pebble to this square.</li><li>Continue forward until reaching the other pebble.</li><li>Turn around.</li><li>Repeat.</li></ol><h4>Subtask 4</h4>Here it gets a little trickier to tell whether the programs will actually fit, but I think this should work. Firstly, note that there is room for only three commands per cell on average. That seems too little even to do a move, a pebble test, a border test and a jump back to the start of a loop. However, loop unrolling can trade program size for execution length e.g. ((move, pebble) * 4, border, jump) to handle 4 cells gives an average of 2.5 steps per cell. I'm assuming a walk which goes alternatively left and right across rows.<br /><br />That makes it possible to walk the board in few enough steps, but what do we do when we find a pebble? There probably isn't enough program size to store the number of pebbles in the program counter, so we need to move the pebbles around. We can move the pebble back to the start of the row, then restart the row from the next spot. Even if there are several pebbles in a row, we will still make forward progress. Once we've reached the end of the board, we can process the two extreme columns in a similar way to move all the pebbles up to the top-left corner. Moving pebbles will be expensive (each will cost up to a few thousand instructions, depending on implementation), but as there are at most 15 of them it is not a major issue.<br /><h4>Subtask 5</h4>I've got a few ideas for this one, but nothing that I'm confident will fit into 444 instructions. The fundamental issue is that there is no way to save state in the board, since the initial board can be in any legal state and a valid solution cannot destroy information (because it has to match the final board). Thus, the only way to save state is in the program counter, which leads to massive code duplication. One is quite likely to need to store the minimum count.<br /><br />This probably requires some very careful micro-optimisation: Johnny Ho's perfect score in fact takes 449 instructions, and is rounded up.<br /><br /><b>UPDATE:</b> see the <a href="http://www.ioi2012.org/hints-for-tasks-of-day-1/">official hints page</a> for a solution to subtask 5.Bruce Merrynoreply@blogger.com2tag:blogger.com,1999:blog-31847281.post-47183581619587864692011-05-14T07:47:00.000-07:002011-05-14T08:59:36.713-07:00The myth of "up" and "down" in computer graphicsToday is another of my rare technical rants, er, blog posts. It's a topic that I struggled with a number of years ago until I understood how to think about it properly, and which I still see people struggling with regularly.<br /><br />The OpenGL coordinate system causes a lot of confusion because it uses a "bottom-left" origin, which is seen as "upside-down" relative to other coordinate systems typically used in window systems. This leads to people putting in lots of extra code to "flip images over" to make them right, often in the wrong part of a pipeline and requiring further reflections elsewhere to correct for it; as well as mis-guided requests for features in various APIs.<br /><br />The way to stop your brain hurting is to stop labelling things as "up" or "down". Up and down are functions of an eyeball, and so it can only be directly applied to things that you see on a screen. Windows on a screen are visible, and this is the one place where OpenGL really is a bottom-up coordinate system. However, textures are <span style="font-weight: bold;">not</span> directly visible on the screen, either for normal sampling or for render-to-texture, so the use of up and down in the specification is purely a convenience for describing the behaviour. When uploading a texture, the first texel that is provided has texel coordinates (0, 0), and when this texture is treated as a rendertarget, that same texel has window coordinates (0, 0). This is true in both OpenGL and in Direct3D - there's no need to make any changes.<br /><br />Things get more complicated at interface boundaries with other formats that do specify an up and a down. For example, PNG files do not have a standard coordinate system, but they do have a well-defined top and a bottom. So if a PNG file is used as a texture, where should the coordinate system origin be placed? That's not obvious, and needs to be consistent for an entire toolchain e.g. if your 3D modelling package shows you previews in which texture coordinates of (0, 0) map to the bottom-left of your image, then you should probably do the same thing in an application that consumes those models. Note that not all image file formats work the same way: formats created specifically for use with textures (e.g. <a href="http://www.khronos.org/opengles/sdk/tools/KTX/file_format_spec/">KTX</a>) may specify an origin rather than an up/down orientation (KTX also has an optional hint to tell viewer/editor applications how to display the image). The asset format may also provide a convention e.g. COLLADA explicitly indicates that (0, 0) corresponds to the lower-left corner of a texture image.<br /><br />So, to summarize:<br /><ul><li>Avoid using the terms "up" and "down" where they are not absolutely necessary. They'll just confuse you.</li><li>Correct applications never flip images "upside-down" - they just sometimes have to re-arrange pixels in memory to conform to an interface. An upside-down image is a bug.<br /></li><li>When defining interfaces between systems which have a defined "up" and "down" (e.g. a PNG file) and systems which have a defined coordinate system (e.g. OpenGL textures), make sure you know what the correspondence is (using precedent set by thirdparty tools or file formats where possible), then stick it to throughout your toolchain.</li></ul>Hopefully this will reduce the amount of confusion in the world.Bruce Merrynoreply@blogger.com0