tag:blogger.com,1999:blog-318472812020-10-08T02:17:21.856-07:00Entropy always increasesUnknownnoreply@blogger.comBlogger105125tag:blogger.com,1999:blog-31847281.post-57345543162377941592020-01-26T02:45:00.003-08:002020-01-26T02:45:59.984-08:00Thoughts on Rust<div dir="ltr" style="text-align: left;" trbidi="on">I've recently spent a few days learning to program in Rust, and thought I'd write down my thoughts so far. I used programming contest problems as away to get practical experience, which probably biases things somewhat. For example, I didn't look into multi-threading, testing, or see very much of the standard library.<br /><h3 style="text-align: left;">References</h3><div style="text-align: left;">This is obviously the stand-out feature of Rust. It seems like a nice idea, and eliminating use-after-free, double-free, null pointer dereferences, and many forms of memory leak sounds great from a C/C++ perspective. While modern C++ makes it much easier to manage ownership safely, it doesn't really address borrowing.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The mutability rules also sound like a huge win for safe concurrency, but I haven't looked into any of the details.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The one thing I've found non-intuitive is that it seems sometimes references are implicitly dereferenced (e.g., you can add two references to integers) but in other places they aren't (e.g., adding in place to a mutable reference to integer).</div><div style="text-align: left;"><br /></div><div style="text-align: left;">I haven't really tried using references very heavily: my contest coding style tends not to use a lot of pointers anywhere, preferring standard library structures like vectors. </div><h3 style="text-align: left;">Types</h3><div style="text-align: left;">I found it really annoying that there is no automatic coercion between integer types. What's worse, the typecast operator ("as") doesn't follow the same rules as arithmetic operations, namely panicking on overflow in debug builds. It's also annoying that indexing requires an unsigned type. If you have an index (of type usize) and an offset to it (of type i32, possibly negative) it is a real pain to add them to produce a new index.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The type inference also seems like spooky action at a distance: I'm all for omitting the type when declaring a variable and inferring it from the initialiser, but inferring it from usage elsewhere in the code takes some getting used to. It leads to weirdness where changing/removing some apparently unrelated line of code can cause a compilation failure, or even silently change the type of a variable. Given that the indexing requires unsigned, I worry that some variable that might need to store negative values could end up unsigned without one being aware of it just because of an inference chain from an index.</div><h3 style="text-align: left;">Overloading</h3><div style="text-align: left;">Coming from C++, it's disappointing to have no function/method overloading, not even default arguments. It leads to having to invent names for lots of variants of basically the same function. And unfortunately I don't think it can be easily fixed, because of the aggressive type inference: overloading uses the types of arguments to select a function overload, but type inference uses the types of formal parameters to infer the types of arguments. So as soon as you add a new overload, you're reducing the power of type inference and potentially making old code no longer valid.</div><h3 style="text-align: left;">Traits and generics</h3><div style="text-align: left;">While it took a bit of getting used to when coming from traditional OO languages like C++ and Python, I quite like the traits system. It's a lot better than Java interfaces, because you can have default implementations, and you can define new traits and bolt them on to existing classes. It's a much cleaner way to put type bounds on generics than SFINAE in C++, and seems like it probably has many of the same advantages as C++ concepts (not that I've looked at the latest incarnations of C++ concepts).</div><div style="text-align: left;"><br /></div><div style="text-align: left;">I also really like the way traits allow you to use a trait with either static dispatch (ala C++ templates) or dynamic dispatch (ala C++ virtual functions), rather than forcing the API designer to choose one or the other. I suspect there are also some performance advantages: in C++ when using a polymorphic class, you pay for a virtual function call even when the exact class is known to the programmer, unless the compiler can also determine that it is known or the programmer uses final classes/methods. With Rust there is no inheritance, so an object whose type is a concrete struct will have exactly that type.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The one thing I disliked is that methods defined in traits end up looking the same as normal methods on the class. If you're not aware that a particular method is actually provided by a trait (or you don't know which trait), it can mysteriously fail to exist if you forget to import the trait into your namespace.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The generics system still has some way to go before it catches up to C++ e.g. there are no non-type template parameters (although it's being worked on), no variadic templates, and from what I could see, no real specialisation to override more generic implementations.</div><h3 style="text-align: left;">Enums</h3><div style="text-align: left;">I think this is one of the more under-rated features of Rust. Rust enums are really discriminated unions (ala boost::variant), with first-class language support. I particularly like the "?" operator: given an expression of type Result<t e=""> (which holds either an "ok" or an error), put a "?" after it, and if it is an error it will immediately return it from the function. This means that although Rust doesn't have exceptions in the same way as C++/Java/Python, one can propagate errors with very little boilerplate, and with the benefit that it's explicit where early returns might happen. It certainly looks nicer than what I've seen of Go. </t></div><h3 style="text-align: left;">Performance</h3><div style="text-align: left;">I translated a few contest solutions from C++ to Rust, and was pleasantly surprised by the performance: generally faster than the C++ code (which might just be because Rust uses LLVM, which is pretty good and often better than GCC, particularly since Codeforces uses a 32-bit GCC). The one area where it was much worse is writing large outputs to stdout, because Rust always makes stdout line-buffered while GCC makes it fully-buffered when it is not a TTY (which is a known issue in Rust). After wrapping a buffer around stdout the performance was good again.</div><h3 style="text-align: left;">Summary</h3><div style="text-align: left;">In general I like Rust, although I think it still needs a few years to mature before I'd consider abandoning C++ for it. While Go seems to be getting all the popularity, I think a high-performance language really needs to avoid garbage collection and provide a strong compile-time generics system.</div></div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-81731269198305258312018-01-20T10:12:00.004-08:002018-01-20T10:12:51.639-08:00COCI 2017/2018 r5 analysis<div dir="ltr" style="text-align: left;" trbidi="on"><div style="text-align: left;">For the problem texts, see <a href="http://hsin.hr/coci/">here</a>. I found this contest easier than usual for a COCI.</div><h2 style="text-align: left;">Olivander</h2><div style="text-align: left;">Given two wands A < B and two boxes X > Y, there is never any point in putting A in X and B in Y; if they fit, then they also fit the other way around. So sort the boxes and wands, match them up in this order and check if it all fits.</div><h2 style="text-align: left;">Spirale</h2><div style="text-align: left;">This is really just an implementation challenge. For each spiral, walk along it, filling in the grid where the spiral overlaps it. With more than one spiral, set each grid point to the minimum of the values induced by each spiral. Of course, it is not necessary to go to \(10^{100}\); about 10000 is enough to ensure that the grid is completely covered by the partial spiral.</div><h2 style="text-align: left;">Birokracija</h2><div style="text-align: left;">The rule for determining which employee does the next task is a red herring. Each employee does exactly one task, and the order doesn't matter. The money earned by an employee is the sum of the distance from that employee to all underlings. The can be computed by a tree DP (computing both this sum and the size of each subtree).</div><h2 style="text-align: left;">Karte</h2><div style="text-align: left;">Suppose there is a solution. It can be modified by a series of transformations into a canonical form.Firstly, moving a false claim below an adjacent true claim will not alter either of them; thus, we can push all the false claims to the bottom and true claims to the top. At this point, the true claims can be freely reordered amongst themselves. Similarly, if we have a false claim with small <i>a</i> above one with a larger a, we can swap them without making either true. So we can assume that the false claims are increasing from bottom to top in the deck. Finally, given a true claim with large <i>a</i> and a false claim with small <i>a</i>, we can swap them and they will both flip (so the positions of false claims stay the same).</div><div style="text-align: left;"><br /></div><div style="text-align: left;">After these transformations, the deck will, from bottom to top, contain the largest K cards in increasing order, followed by the rest in arbitrary order (let's say increasnig). To solve the problem, we simply construct this canonical form, then check if it indeed satisfies the conditions.</div><h2 style="text-align: left;">Pictionary</h2><div style="text-align: left;">When building roads on the day with factor F, we don't actually need to build roads between every pair of multiples of F: it is equivalent to connect every multiple of F to F. This gives O(N log M) roads. We can represent the connected components after each day with a standard union-find structure. For reasons we'll see later, we won't use path compression, but always putting the smaller component under the larger one in the tree is sufficient to ensure a shallow tree (in theory O(log N), but I found the maximum depth was 6). A slow solution would be to check every remaining query after each day to see whether the two mathematicians are in the same component yet.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">To speed this up, we can record extra information in the tree: for each edge, we record the day on which it was added. If the largest label on a path from A to B is D, then D was the first day on which they were connected (this property would be broken by path compression, which is why we cannot use it). Thus, to answer a query, we need only walk up the tree from each side to the least common ancestor; and given the shallowness of the tree, is this cheap.</div><h2 style="text-align: left;">Planinarenje</h2><div style="text-align: left;">I really liked this problem. Take a single starting peak P. Let A be the size of the maximum matching the graph, and let B be the size of the maximum matching excluding P. Suppose A = B. Then Mirko can win as follows: take the latter matching, and whenever Slavko moves to a valley, Mirko moves to the matched peak. Slavko can never move to a valley without a match, because otherwise the journey would form an augmenting path that would give a matching for the graph of size B + 1.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Conversely, suppose A > B. Then take a whole-graph matching, which by the assumption must include a match for P. Slavko can win by always moving to the matched valley. By a similar argument, Mirko can never reach an unmatched peak, because otherwise toggling all the edges on their journey would give a maximum matching that excludes P.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">To implement it, it may not be efficient enough to construct a new subgraph matching from every peak. Instead, one can start with a full-graph matching, remove P and its match from the graph (if any), then re-augment starting from that match. This should give an O(NM) algorithm (O(NM) for the initial matching, then O(M) per query).</div></div>Unknownnoreply@blogger.com6tag:blogger.com,1999:blog-31847281.post-38381065761615770402017-11-26T12:26:00.004-08:002017-11-26T12:26:54.309-08:00Analysis of Croatian Open contest 2017r3<div dir="ltr" style="text-align: left;" trbidi="on">This round of <a href="http://hsin.hr/coci/">COCI</a> was more challenging than usual, with some of the lower-scoring problems being particularly difficult.<br /><h3 style="text-align: left;">Aron</h3><div style="text-align: left;">This was quite straightforward: run through the letters, and each time a letter is different to the previous one (or is the first one), increment a counter.</div><h3 style="text-align: left;">Programiranje</h3><div style="text-align: left;">Firstly, one word can be made by rearranging the letters of another if and only if they have the same letter frequencies. So if we have a fast way to find the letter frequencies in any substring, the problem is solved. Consider the sequence whose ith element is 1 if S[i] is 'a' and 0 otherwise: the number of a's in a substring is then just a contiguous sum of this sequence. By precomputing the prefix sum, we can find the sum of any interval in constant time. We thus just need to store 26 prefix sum tables, one for each letter.</div><h3 style="text-align: left;">Retro</h3><div style="text-align: left;">The first part (finding the length) is a reasonably standard although fiddly dynamic programming problem. At any point in the process, one has a position relative to the original grid and a nesting depth, and need to know the longest suffix that will complete a valid bracket sequence. For each of the (up to) three possible moves, one can compute the state would be reached, and find the longest possible sequence from previously computed values.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">My attempt at the second part during the contest was overly-complex and algorithmically too slow, involving trying to rank all length-L strings represented by DP states, for each L in increasing order. This required a lot of sorting, making the whole thing O(N³ log N) (where N is O(R + C)).</div><div style="text-align: left;"><br /></div><div style="text-align: left;">A simpler and faster solution I wrote after the contest is to reconstruct the path non-deterministically, one bracket at a time. Instead of working from a single source state, keep a set of candidate states (in super-position). From those, run a breadth-first (or depth-first) search to find all states where one might encounter the next bracket. Any transition that is optimal (for length) in the original DP now becomes an edge in this search. Once the candidate states for the next bracket has been found, there might be a mix of opening and closing brackets. If so, discard those states corresponding to closing brackets, since they will not be lexicographically minimal.</div><h3 style="text-align: left;">Portal</h3><div style="text-align: left;">I didn't attempt this problem, and am not sure how to solve it. I suspect it requires coming up with and proving a number of simplifying assumptions which allow the search space to be reduced.</div><h3 style="text-align: left;">Dojave</h3><div style="text-align: left;">I really liked this problem. First of all, M = 1 is a bit tricky, because it's the only case where you can't swap two numbers without having an effect. It's easy enough to solve by hand and treat as a special case.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Now let's count all the cases where the player <i>can't</i> win. Let's say that the lit elements have an XOR of K. For M > 1, we can't have \(K = 2^M - 1\), since otherwise the player could swap either two lit or two unlit numbers and win. Let \(K' = K \oplus 2^M - 1\), and consider a lit number A. To win, the player has to swap it with B such that \(A \oplus B = K'\). As the RHS can't be 0, the lit numbers must be grouped into pairs with XOR of K'. The XOR of all the lit numbers must thus be either 0 or K', depending on whether there are an even or odd number of such pairs. But this is K (by definition); and since K = K' is impossible, we must have K = 0.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">So now we need to find intervals such that the XOR of all elements is 0, and for every pair of complementary elements A and B, they are either both inside or both outside the interval. The first part is reasonably easy to handle: we can take a prefix sums (using \(\oplus\) instead of addition), and an interval with XOR of 0 is one where two prefix sums are equal. For the second part, we can use some sweeps: for each potential left endpoint, establish an upper bound on the right endpoint (considering only complementary pairs that lie on opposite sides of the left endpoint), and vice versa - doable with an ordered set or a priority queue to keep track of right elements of complementary pairs during the sweep. Now perform another left-to-right sweep, keeping track of possible left endpoints that are still "active" (upper bound not yet reached), and each time a right endpoint is encountered, query how many of the left endpoints with the same suffix sum are above the lower bound for the right end-point. Actually getting the count in logarithmic time requires a bit of fancy footwork with data structures that I won't go into (e.g. a Fenwick tree, a segment tree or a balanced binary tree with node sizes in the internal nodes).</div><h3 style="text-align: left;">Sažetak</h3><div style="text-align: left;">I somehow miscounted and didn't realise that this problem even existed until after the contest (maybe because I don't expect more than three harder problems in COCI). It's disappointing, because it only took me about 15 minutes to solve after the contest, yet was worth the most points.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">For each K-summary, draw a line between each pair of adjacent segments of that summary. For example, with a 2-summary and 3-summary and N=9, you would get |01|2|3|45|67|8|9|. Clearly in any marked interval with more than one number, none of them can be determined. It is also not hard to see that in any interval with exactly one number, that number can be determined (hint: find the sum of all numbers except that one). Thus, we need the count of numbers X such that X is a multiple of some Ki and X+1 is a multiple of another Kj. Note that if any K=1 then the answer is simply N, which we'll assume is handled as a special case.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">If one has chosen a single Ki and Kj, it is not difficult to determine the numbers that satisfy the property, using the Chinese Remainder Theorem (they will be in arithmetic progression, with period Ki×Kj, and so are easy to count). However, this can easily lead to double-counting, and we thus need to apply some form of inclusion-exclusion principle. That is, we can pick any two subsets S and T of the K's, with products P and Q, and count the number of X's such that P divides X and Q divides X + 1. Note that S and T must be disjoint, as otherwise the common element would need to divide X and X+1 (and we assumed Ki > 1). We also need to determine the appropriate coefficient to scale this term in the sum. Rather than trying to derive equations, I let the computer do the work: for each value of |S| and |T| I consider every possible subset of S and of T (actually, just each possible size), add up the coefficients they're already contributing, and set the coefficient for this value of |S| and |T| to ensure that the total is 1.</div></div>Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-31847281.post-18479708433254053842017-06-12T23:19:00.000-07:002017-06-12T23:19:01.320-07:00Extra DCJ 2017 R2 analysis<div dir="ltr" style="text-align: left;" trbidi="on"><h3 style="text-align: left;">Flagpoles</h3><div style="text-align: left;">My solution during the contest was essentially the same as the official analysis. Afterwards I realised a potential slight simplification: if one starts by computing the second-order differences (i.e., the differences of the differences), then one is looking for the longest run of zeros, rather than the longest run of the same value. That removes the need to communicate the value used in the runs at the start and end of each section.</div><h3 style="text-align: left;">Number Bases</h3><div style="text-align: left;">I missed the trick of being able to uniquely determine the base from the <i>first</i> point at which X[i] + Y[i] ≠ Z[i]. Instead, at every point where X[i] + Y[i] ≠ Z[i], I determine two candidate bases (depending on whether there is a carry of not). Then I collect the candidates and test each of them. If more than three candidates are found, then the test case is impossible, since there must be two disjoint candidate pairs.</div><h3 style="text-align: left;">Broken Memory</h3><div style="text-align: left;">My approach was slightly different. Each node binary searches for its broken value, using two other nodes to help (and simultaneously helping two other nodes). Let's say we know the broken value is in a particular interval. Split that interval in half, and compute hashes for each half on the node (h1 and h2) and on two other nodes (p1 and p2, q1 and q2). If h1 equals p1 or q1, then the broken value must be in interval 2, or vice versa. If neither applies, then nodes p and q both have broken values, in the opposite interval to that of the current node. We can tell which by checking whether p1 = q1 or p2 = q2.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">This does rely on not having collisions in the hash function. In the contest I relied on the contest organisers not breaking my exact choice of hash function, but it is actually possible to write a solution that works on all test data. Let P be a prime greater than \(10^{18}\). To hash an interval, compute the sums \(\sum m_i\) and \(\sum i m_i\), both mod P, giving a 128-bit hash. Suppose two sequences p and q collide, but differ in at most two positions. The sums are the same, so they must differ in exactly two positions j and k, with \(p_j - q_j = q_k - p_k\) (all mod P). But then the second sums will differ by</div><div style="text-align: left;">\(jp_j + kp_k - jq_j - kq_k = (j - k)(p_j - q_j)\), and since P is prime and each factor is less than P, this will be non-zero.</div></div>Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-31847281.post-43450258690438233972017-06-12T00:13:00.001-07:002017-06-12T00:15:26.674-07:00Alternative Code Jam 2017 R3 solution<div dir="ltr" style="text-align: left;" trbidi="on">This last weekend was round 3 of Code Jam 2017 and round 2 of Distributed Code Jam 2017. I'm not going to describe how to solve all the problems since there are official analyses (<a href="https://code.google.com/codejam/contest/8304486/dashboard#s=a">here</a> and <a href="https://code.google.com/codejam/contest/3284486/dashboard#s=a">here</a>), but just mention some alternatives. For this post I'll just talk about one problem from Code Jam; some commentary on DCJ will follow later.<br /><h3 style="text-align: left;">Slate Modern (Code Jam)</h3><div style="text-align: left;">The idea I had for this seems simpler (in my opinion, without having tried implementing both) than the official solution, but unfortunately I had a bug that I couldn't find until about 10 minutes after the contest finished.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">As noted in the official analysis, one can first check whether a solution is possible by comparing each pair of fixed cells: if the difference is value is greater than D times the Manhattan distance, then it is impossible; if no such pair exists, it is possible. The best solution is then found by setting each cell to the smallest lower bound imposed by any of the fixed cells.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Let's try to reduce the complexity by a factor of C, by computing the sum of a single row quickly. If we look at the upper bound imposed by one fixed cell, it has the shape \(b + |x - c| D\), where \(x\) is the column and b, c are constants. When combining the upper bounds, we take the lowest of them. Each function will be the smallest for some contiguous (possibly empty) interval. By sweeping through the fixed cells in order of c, we can identify those that contribute, similar to finding a Pareto front. Then, by comparing adjacent functions one can find the range in which each function is smallest, and a bit of algebra gives the sum over that range.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">This reduces the complexity to O(RN + N²) (the N² is to check whether a solution is possible, but also hides an O(N log N) to sort the fixed cells). That's obviously still too slow. The next insight is that most of the time, moving from one row to the next changes very little: each of the functions increases or decreases by one (depending on whether the corresponding fixed cell is above or below the row), and the range in which each function is smallest grows or shrinks slightly. It thus seems highly likely that the row sum will be a low-degree polynomial in the row number. From experimentation with the small dataset, I found that it is actually a quadratic (conceptually it shouldn't be too hard to prove, but I didn't want to get bogged down in the details).</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Note that I said "most of the time". This will only be true piecewise, and we need to find the "interesting" rows where the coefficients change. It is fairly obvious that rows containing fixed cells will be interesting. Additionally, when the range in which a function is smallest disappears or appears will be interesting. Consider just two fixed cells, and colour the whole grid according to which of the two fixed cells gives the lower upper bound. The boundary between the two colours can have diagonal, horizontal and vertical portions, and it is these horizontal portions that are (potentially) interesting. I took the conservative approach of adding all O(N²) such rows as interesting.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Now that we have partitioned the rows into homogeneous intervals, we need to compute the sum over each interval efficiently. Rather than determine the quadratic coefficients analytically, I just interfered them by taking the row sums of three consecutive rows (if there are fewer than three rows in the interval, just add up their row sums directly). A bit more algebra to find the sum of a quadratic series, and we're done! There are O(N²) intervals to sum and it requires O(N) to evaluate each of the row sums needed, giving a running time of O(N³). I suspect that this could probably be reduced to O(N² log N) by being smarter about how interesting rows are picked, but it is not necessary given the constraints on N.</div></div>Unknownnoreply@blogger.com2tag:blogger.com,1999:blog-31847281.post-53295121464490044972016-11-22T22:45:00.001-08:002016-11-22T22:45:15.422-08:00An alternative DCJ 2016 solution<div dir="ltr" style="text-align: left;" trbidi="on">During this year's Distributed Code Jam I had an idea for solving the hard part of Toothpick Sculptures, but had no time to implement it, or even to work out all the details. When the official solution was presented, I was surprised that it was very different to mine. Recently Pablo from Google has been able to help me check that my solution does indeed pass the system tests (thanks Pablo), so it seems like a good time to describe the solution.<br /><br />There are going to be a lot of different trees involved, so let's give them names. A "minitree" is one of the original sculptures of N toothpicks, and its root is a "miniroot". The "maxitree" is the full tree of 1000N toothpicks. Let's assume that each minitree is built out of toothpicks of one colour. We can then consider a tree whose vertices are colours, called the "colour tree", which contains an edge between two colours if and only if the maxitree contains an edge between two vertices of those colours.<br /><br />For each colour, we can find its children in the colour tree by walking the corresponding minitree looking for children which are miniroots. This can be done in parallel over the colours.<br /><br />For a given vertex, we can use DP to solve the following two problems for the subtree: what is the minimum cost to stabilise the subtree if we stabilise the root, and what is the minimum cost if we do not stabilise the root? But this will not be so easy to parallelise. We can go one step further: if we fix some vertex C and cut off the subtree rooted at C, then we can answer the same two problems for each case of C being stabilised or not stabilised (four questions in total). If we later obtain the answers to the original two queries for C, then we can combine this with the four answer we have to answer the original queries for the full tree.<br /><br />This is sufficient to solve the small problem, and more generally any test case where the colour tree does not branch. For each miniroot, compute the DP, where the miniroot corresponding to the single child (if any) in the colour tree is the cutoff. These DPs can be done in parallel, and the master can combine the results.<br /><br />Let's consider another special case, where the colour tree is shallow. In this case, one can solve one layer at a time, bottom up, without needing to use the cutoff trick at all. The colours in each layer of the colourtree are independent and can be solved in parallel, so the latency is proportional to the depth of the tree. The results of each layer are fed into the calculations of the layer above.<br /><br />So, we have a way to handle long paths, and a way to handle shallow trees. This should immediately suggest light-heavy decomposition. Let the "light-depth" of a node in the colour tree be the number of light edges between it and the root. The native of light-heavy decomposition guarantees that light-depth is at most logarithmic in the tree size (which is 1000). We will process all nodes with the same light-depth in parallel. This means that a node in a tree may be processed at the same time as its children, but only along a heavy path. We thus handle each heavy path using the same technique as in the small problem. For other children in the colour tree, the subtree results were computed in a previous pass and are sent to the slave.</div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-45148395329081301192016-11-22T22:37:00.003-08:002016-11-22T22:46:24.051-08:00TCO 2016 finals<div dir="ltr" style="text-align: left;" trbidi="on">The final round of TCO 2016 did not go well for me. I implemented an over-complicated solution to the 500 that did not work (failed about 1 in 1000 of my random cases), and I started implementing the 400 before having properly solved it, leading to panic and rushed attempts to fix things in the code rather than solving the problem.<br /><h3 style="text-align: left;">Easy</h3><div style="text-align: left;">The fact that there are only four mealtimes is a clue that this will probably require some exponential time solution. I haven't worked out the exact complexity, but it has at least a factor of \(2^\binom{n}{n/2}\) for n mealtimes.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Let's restrict our attention to meal plans that contain no meals numbered higher than m and have t meals in the plan (t can be less than 4). We can further categorise these meals by which subsets of the mealtimes can accommodate the plan. This will form the state space for dynamic programming. To propagate forwards from this state, we iterate over the number of meals of type m+1 (0 to 4 of them). For each choice, we determine which subsets of mealtimes can accommodate this new meal plan (taking every subset that can accommodate the original plan and combining it with any disjoint subset that can accommodate the new meals of type m+1).<br /><h3 style="text-align: left;">Medium</h3><div style="text-align: left;">Let's start by working out whether a particular bracket sequence can be formed. Consider splitting the sequence into a prefix and suffix at some point. If the number of ('s in the prefix differs between the initial and target sequence, then the swaps that cross the split need to be used to increase or decrease the number of ('s. How many ('s can we possibly move to the left? We can answer that greedily, by running through all the operations and applying them if and only if they swap )( to (). Similarly, we can find the least possible number of ('s in the final prefix.<br /><br />If our target sequence qualifies by not having too many or too few ('s in any prefix, can we definitely obtain it? It turns out that we can, although it is not obvious why, and I did not bother to prove it during the contest. Firstly, if there is any (proper) prefix which has exactly the right number of ('s, we can split the sequence at this point, choose not to apply any operations that cross the split, and recursively figure out how to satisfy each half (note that at this point we are not worrying whether the brackets are balanced — this argument applies to any sequence of brackets).<br /><br />We cannot cross between having too few and too many ('s when growing the prefix without passing through having the right number, so at the bottom of our recursion we have sequences where every proper prefix has too few ('s (or too many, but that can be handled by symmetry). We can solve this using a greedy algorithm: for each operation, apply it if and only if it swaps )( to () and the number of ('s in the prefix is currently too low. We can compare the result after each operation to the maximising algorithm that tries to maximise the number of ('s in each prefix. It is not hard to see (and prove, using induction) that for any prefix and after each operation is considered, the number of ('s in the prefix is:</div><ul style="text-align: left;"><li>between the original value and the target value (inclusive); and</li><li>the smaller of the target value and the value found in the maximising algorithm.</li></ul><div style="text-align: left;">Thus, after all operations have been considered, we will have reached the target.<br /><br />So now we know how to determine whether a specific bracket sequence can be obtained. Counting the number of possible bracket sequences is straightforward dynamic programming: for each prefix length, we count the number of prefixes<br />that are valid bracket sequence prefixes (no unmatched right brackets) for each possible nesting depth.</div><h3 style="text-align: left;">Hard</h3><div style="text-align: left;">This is an exceptionally mean problem, and I very much doubt I would have solved it in the fully 85 minutes; congratulations to Makoto for solving all three!<br /><br />For convenience, let H be the complement of G2 (i.e. n copies of G1). A Hamiltonian path in G2 is simply a permutation of the vertices, such that no two consecutive vertices form an edge in H. We can count them using inclusion-exclusion, which requires us to count, for each m, the number of paths that pass through any m chosen edges. Once we pick a set of m edges and assign them directions (in such a way that we have a forest of paths), we can permute all the vertices that are not tails of edges, and the remaining vertices have their positions uniquely determined.<br /><br />Let's start by solving the case n=1. Since k is very small, we can expect an exponential-time solution. For every subset S of the k vertices and every number m of edges, we can count the number of ways to pick m edges with orientation. We can start by solving this for only connected components, which means that m = |S| - 1 and the edges form a path. This is a standard exponential DP that is used in the Travelling Salesman problem, where one counts the number of paths within each subset ending at each vertex.<br /><br />Now we need to generalise to m < |S| - 1. Let v be the first vertex in the set. We can consider every subset for the path containing v, and then use dynamic programming to solve for the remaining set of vertices. This is another exponential DP, with a \(O(3^n)\) term.<br /><br />Now we need to generalise to n > 1. If a graph contains two parts that are not connected, then edges can be chosen independently, and so the counts for the graph are obtained by convolving the counts for the two parts. Unfortunately, a naïve implementation of this convolution would require something like O(n²k) time, which will be too slow. But wait, why are we returning the answer modulo 998244353? Where is our old friend 1000000007, and his little brother 1000000009? In fact, \(998244353 = 2^{23} \times 7 \times 17 + 1\), which strongly suggests that the Fast Fourier Transform will be involved. Indeed, we can perform an 1048576-element FFT in the field modulo 998244353, raise each element to the power of n, and then invert the FFT.<br /><br />Despite the large number of steps above, the solution actually requires surprisingly little new code, as long as one has library code for the FFT and modulo arithmetic.</div></div></div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-42541078882516302972016-11-21T03:59:00.003-08:002016-11-21T03:59:32.870-08:00TCO 2016 semifinal 2 solutions<div dir="ltr" style="text-align: left;" trbidi="on">Semifinal 2 was much tougher than semifinal 1, with only one contestant correctly finishing two problems. The first two problems each needed a key insight, after which very little coding is needed (assuming one has the appropriate library code available). The hard problem was more traditional, requiring a serious of smaller steps to work towards the solution.<br /><h3 style="text-align: left;">Easy</h3><div style="text-align: left;">Consider the smallest special clique. What happens if we remove an element? Then we will still have a clique, but the AND of all the elements is non-zero. In particular, there is some bit position which is 1 in every element of the special clique except for 1.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">We can turn that around: it suffices to find a bit position B, a number T which is zero on bit B, and a special clique containing T whose other elements are all 1 on bit B. We can easily make a list of candidates to put in this special clique: any element which is 1 on bit B and which is connected to T. What's more, these candidates are all connected to each other (through bit B), so taking T plus all candidates forms a clique. What's more, if we can make a special clique by using only a subset of the candidates, then the clique formed using all candidates will be special too, so the latter is the only one we need to test.</div><h3 style="text-align: left;">Medium</h3><div style="text-align: left;">The key insight is to treat this as a flow/matching problem. Rather than thinking of filling and emptying an unlabelled tree, we can just build a labelled tree with the properties that if A's parent is B, then B < A and A appears earlier in p than B does. A few minutes thinking should convince you that any labelled tree satisfying these constraints can be used for the sorting.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">We thus need to match each number to a parent (either another number or the root). We can set this up with a bipartite graph in the usual way: each number appears on the left, connected to a source (for network flow) with an edge of capacity 1. Each number also appears on the right, along with the root, all connected to the sink with N edges of capacity 1 (we'll see later why I don't say one edge of capacity N). Between the two sides, we add an edge A → B if B is a viable parent for A, again with capacity 1. Each valid assignment can be represented by a maxflow on this network (with flow N), where the edges across the middle are parent-child relationships. Note that normally using this approach to building a tree risks creating cycles, but the heap requirement on the tree makes that impossible.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">That tells us which trees are viable, but we still need to incorporate the cost function. That is where the edges from the right numbers to the sink come in. A (non-root) vertex starts with a cost of 1, then adding more children increases the cost by 3, 5, 7, ... Thus, we set the costs of the edges from each number to the sink to this sequence. Similarly, the costs of the edges from the root to the sink have costs 1, 3, 5, 7, ... The min-cost max-flow will automatically ensure that the cheaper edges get used first, so the cost of the flow will match the cost of the tree (after accounting for the initial cost of 1 for each number, which can be handled separately).</div><h3 style="text-align: left;">Hard</h3><div style="text-align: left;">This one requires a fairly solid grasp on linear algebra and vector spaces. The set with \(2^k\) elements is a vector subspace of \(\mathbb{Z}_2^N\) of dimension k, for some unknown N. It will have a basis of size k. To deal more easily with ordering, we will choose to consider a particular canonical basis. Given an arbitrary initial basis, one can use Gaussian reduction and back propagation to obtain a (unique) basis with the following property: the leading 1 bit in each basis element is a 0 bit in every other basis element. With this basis, it is not difficult to see that including any basis element in a sum will increase rather than decrease the sum. The combination of basis elements forming the ith sorted element is thus given exactly by the binary representation of i.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">We can now take the information we're given an recast it is a set of linear equations. Furthermore, we can perform Gaussian elimination on these linear equations (which we actually do in code, unlike the thought experiment Gaussian elimination above). This leaves some basis elements fixed (uniquely determined by the smaller basis elements) and others free. Because of the constraints of the basis elements, we also get some more information. If basis i appears in an equation, its leading bit must correspond to a 1 bit in the value (and to the first 1 bit, if i is the leading basis in the equation). Similarly, if basis i does not appear, then it must correspond to a 0 bit in the value. This allows us to build a mask of where the leading bit for each basis element can be.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">We now switch to dynamic programming to complete the count. We count the number of ways to assign the first B basis elements using the first C value bits. Clearly dp[0][C] = 1. To add new basis, we can consider the possible positions of its leading bit (within the low C bits), and then count the degrees of freedom for the remaining bits. If the basis is fixed then there is 1 degree of freedom; otherwise, there is a degree of freedom for each bit that isn't the leading bit of a previous basis, and it is easy to count these.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">There are a few details that remain to do with checking whether there are an infinite number of solutions (which can only happen if the largest basis is unconstrained), and distinguishing between a true 0 and a 0 that is actually a multiple of 1000000007, but those are left as exercises for the reader.</div></div>Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-31847281.post-76213296937327593642016-11-20T09:10:00.005-08:002016-11-20T09:15:22.277-08:00Topcoder Open 2016 Semifinal 1<div dir="ltr" style="text-align: left;" trbidi="on">Here are my solutions to the first semifinal of the TCO 2016 algorithm contest.<br /><h3 style="text-align: left;">Easy</h3><div style="text-align: left;">Let's start with some easy cases. If s = t, then the answer is obviously zero. Otherwise, it is at least 1. If s and t share a common factor, then the answer is 1, otherwise it is at least 2.<br /></div><div style="text-align: left;">We can eliminate some cases by noting that some vertices are isolated. Specifically, if either s or t is 1 or is a prime greater than n/2 (and s≠t) then there is no solution.<br /></div><div style="text-align: left;">Assuming none of the cases above apply, can the answer be exactly 2? This would mean a path s → x → t, where x must have a common factor with s and a (different) common factor with t. The smallest x can be is thus the product of the smallest prime factors of s and t. If this is at most n, then we have a solution of length 2, otherwise the answer is at least 3.<br /></div><div style="text-align: left;">If none of the above cases applies, then answer is in fact 3. Let p be the smallest prime factor of s and q be the smallest prime factor of t. Then the path s → 2p → 2q → t works (note that 2p and 2q are at most n because we eliminated cases where s or t is a prime greater than n/2 above).</div><h3 style="text-align: left;">Medium</h3><div style="text-align: left;">We can create a graph with the desired properties recursively. If n is 1, then we need just two vertices, source and sink, with an edge from source to sink of arbitrary weight.<br /></div><div style="text-align: left;">If n is even, we can start with a graph with n/2 min cuts and augment it. We add a new vertex X and edges source → X and X → sink, both of weight 1. For every min cut of the original graph, X can be placed on either side of the cut to make a min cut for the new graph.<br /></div><div style="text-align: left;">If n is odd, we can start with a graph with n - 1 min cuts and augment it. Let the cost of the min cut of the old graph be C. We create a new source, and connect it to the original source with an edge of weight C. The cost of the min cut of this new graph is again C (this can easily be seen by considering the effect on the max flow rather than the min cut). There are two ways to achieve a cut of cost C: either cut the new edge only, or make a min cut on the original graph.<br /></div><div style="text-align: left;">We just need to check that this will not produce too many vertices. In the binary representation of n, we need a vertex per bit and a vertex per 1 bit. n can have at most 10 bits and at most 9 one bits, so we need at most 19 vertices.</div><h3 style="text-align: left;">Hard</h3><div style="text-align: left;">I made a complete mess of this during the contest, solving the unweighted version and then trying to retrofit it to handle a weighted version. That doesn't work.<br /><br />Here's how to do it, which I worked out based on some hints from tourist and coded at 3am while failing to sleep due to jetlag. Firstly, we'll make a change that makes some of the later steps a little easier: for each pair of adjacent vertices, add an edge with weight ∞. This won't violate the nesting property, and guarantees that there is always a solution. Let the <i>length</i> of an edge be the difference in indices of the vertices (as opposed to the <i>cost</i>, which is an input). Call an edge <i>right-maximal</i> if it's the longest edge emanating from a particular vertex, and <i>left-maximal</i> if it is the longest edge terminating at a particular vertex. A useful property is that an edge is always either right-maximal, left-maximal, or both. To see this, imagine an edge that is neither: it must then be nested inside longer edges terminating at each end, but these longer edges would cross, violating the given properties.<br /><br />We can now make the following observations:<br /><ol style="text-align: left;"><li>If x → y is right-maximal and color[x] is used, then color[y] must be used too. This is because the path must pass through x, and after x cannot skip over y.</li><li>If x → y is the only edge leaving x, then the edge will appear in a path if and only if color[x] is used.</li><li>If x → y is right-maximal and x → z is the next-longest edge from x, then x → y is in a path if and only if color[x] is used and color[z] is not used. </li></ol>These statements all apply similarly for left-maximal edges of course. We can consider vertices 0 and n to be of colour 0 in the above, which is a color that must be picked. What is less obvious is that condition 1 (together with its mirror) is sufficient to determine whether a set of colours is admissible. Suppose we have a set of colours that satisfies condition 1, and consider the subset S of vertices whose colours come from this set. We want to be sure that every pair of consecutive vertices in S are linked by an edge. Suppose there is a pair (u, v) which are not. Consider the right-maximal edge from u: by condition 1, it must terminate at some vertex in S, which must then be to the right of v. But the left-maximal edge to v must start to the left of u, and so these two edges cross.<br /><br />We can now use this information to construct a new graph, whose minimum cut will give us the cost of the shortest path. It has a vertex per colour, a source vertex corresponding to the special colour 0, and a sink vertex not corresponding to a colour. The vertices on the source side of the cut correspond to the colours that are visited. Every time colour C implies colour D, add an edge C → D of weight ∞, to prevent C being picked while D is not. If an edge falls into case 2 above, add an edge from color[x] to the sink with weight equal to the edge cost. Otherwise it falls into case 3; add an edge from color[x] to color[y] with weight equal to the edge cost. Each edge in the original graph now corresponds to an edge in this new graph, and the original edge is traversed if and only if the new edge forms part of the cut.</div></div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-81301064002682603772016-06-13T01:44:00.003-07:002016-06-13T01:44:48.519-07:00Code Jam and Distributed Code Jam 2016 solutions<div dir="ltr" style="text-align: left;" trbidi="on"><h2 style="text-align: left;">Code Jam</h2><div style="text-align: left;">Since the official analysis page still says "coming soon", I thought I'd post my solutions to both the Code Jam and Distributed Code Jam here.</div><h3 style="text-align: left;">Teaching Assistant</h3><div style="text-align: left;">We can start by observing that when we take a new task, it might as well match the mood of the assistant: doing so will give us at least 5 points, not doing so will give us at most 5. Also, there is no point in getting to the end of the course with unsubmitted assignments: if we did, we should not have taken that assignment, instead submitting a previous assignment (a parity argument proves that there was a previous assignment). So what we're looking for is a way to pair up days of the course, obeying proper nesting, scoring 10 points if the paired days have the same mood and 5 points otherwise.<br /><br />That's all we need to start on the small case, which can be done with a standard DP. For every even-length interval of the course, we compute the maximum score. We iterate to pick the day to match up to the first day of the interval, then use previous results to find the best scores for the two subintervals this generates.<br /><br />That's O(N³), which is too slow for the large case. It turns out that a greedy approach works: every day, if the mood matches the top of the stack, submit, otherwise take a new assignment. The exception is that once the stack height matches the number of remaining days, one must submit every day to ensure that the stack is emptied.<br /><br />I haven't quite got my head around a proof, but since it worked on the test case I had for the small input I went ahead and submitted it.<br /><h3 style="text-align: left;">Forest university</h3><div style="text-align: left;">The very weak accuracy requirement, and the attention drawn to it, is a strong hint that the answer should be found by simulation rather than analytically. Thus, we need to find a way to uniformly sample a topological walk of the forest. This is not as simple as always uniformly picking one of the available courses. Consider one tree in the forest: it can be traversed in some number of ways (all of them starting with the root of the tree), and the rest of the forest can be traversed in some number of ways, and then these can be interleaved arbitrarily. If we consider all ways to interleave A items and B items, then A/(A+B) of them will start with an element of A. Thus, the probability that the root of a particular tree will be chosen is proportional to the size of that tree.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">After picking the first element, the available courses again form a forest, and one can continue. After uniformly sampling a sequence of courses, simply check which of the cool words appear on the hat.</div><h3 style="text-align: left;">Rebel against the empire</h3><div style="text-align: left;">I consider this problem to be quite a bit harder than D. It's easier to see what to do, but reams of tricky code.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">For the small case, time is not an issue because the asteroids are static. Thus, one just needs to find the bottleneck distance to connect asteroids 0 and 1. I did this by adding edges from shortest to longest to a union-find structure until 0 and 1 were connected (ala Kruskal's algorithm), but it can also be done by priority-first search (Prim's algorithm) or by binary searching for the answer and then checking connectivity.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">For the large case we'll take that last approach, binary searching over the answers and then checking whether it is possible. Naturally, we will need to know at which times it's possible to make jumps between each pair of asteroids. This is a bit of geometry/algebra, which gives a window of time for each pair (possibly empty, possibly infinite). Now consider a particular asteroid A. In any contiguous period of time during which at least one window is open, it's possible to remain on A, regardless of S, by simply jumping away and immediately back any time security are about to catch up. Also, if two of these intervals are separated in time by at most S, they can be treated as one interval, because one can sit tight on A during the gap between windows. On the other hand, any period longer than S with no open windows is of no use.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The key is to ask for the earliest time at which one can arrive at each interval. This can be done with a minor modification of Dijkstra's algorithm. The modification is that the outgoing edges from an interval are only those windows which have not already closed by the time of arrival at the interval.</div><h3 style="text-align: left;">Go++</h3><div style="text-align: left;">I liked this problem, and I really wish the large had been worth fewer points, because I might then have taken it on and solved it.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The key is to realise that the good strings don't really matter. Apart from the trivial case of the bad string also being a good string, it is always possible to solve the problem by producing a pair of programs that can produce <i>every</i> string apart from the bad one.</div><div style="text-align: left;"><br />For the small case, we can use 0?0?0?0? (N repetitions) for the first program, and 111 (N-1 repetitions) for the second (but be careful of the special case N=1!) Each of the 1's can be interleaved once into the first program to produce a 1 in the output, but because there are only N-1 1's, it is impossible to produce the bad string.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">For the large case, the same idea works, but we need something a bit more cunning. The first program is basically the same: alternating digits and ?'s, with the digits forming the complement of the bad string. To allow all strings but the bad string to be output, we want the second program to be a string which does not contain the bad string as a subsequence, but which contains every subsequence of the bad string as a subsequence. This can be achieved by replacing every 1 in the bad string by 01 and every 0 by 10, concatenating them all together, then dropping the last character. Proving that this works is left as an exercise for the reader.</div><h2 style="text-align: left;">Distributed Code Jam</h2><h3 style="text-align: left;">Again</h3><div style="text-align: left;">The code appears to be summing up every product of an element in A with an element in B. However, closer examination shows that those where the sum of the indices is a multiple of M (being the number of nodes) are omitted. However, once we pick an index modulo M for each, we're either adding all or none. The sum of all products of pairs is the same as the product of the sums. So, for each remainder modulo M, we can add up all elements of A, and all elements of B. Then, for each pair of remainders, we either multiply these sums together and accumulate into the grand total, or not. There are some details regarding the modulo 1000000007, but they're not difficult.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">For the large case, we obviously need to distribute this, which can be done by computing each sum on a different node, then sending all the sums back to the master.</div><h3 style="text-align: left;">lisp_plus_plus</h3><div style="text-align: left;">Here, a valid lisp program is just a standard bracket sequence (although it is required to be non-empty). It's well-known that a sequence is valid if and only if:</div><ul style="text-align: left;"><li>the nesting level (number of left brackets minus number of right brackets) of every prefix is non-negative; and</li><li>the nesting level of the whole sequence is zero.</li></ul></div><div style="text-align: left;">If the nesting level ever goes negative in a left-to-right scan, then the point just before it went negative is the longest sequence that can be completed. This makes the small case trivial to implement.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">To distribute this, we can do the usual thing of assigning each node an interval to work on. We can use a <i>parallel prefix sum</i> to find the nesting level at every point (there is lots of information on the internet about how to do this). Then, each node can find it's first negative nesting level, and send this back to a master to find the globally first one. We also need to know the total nesting level by sending the sum for each interval to the master, but that's already done as part of the parallel prefix sum.</div><div style="text-align: left;"><h4 style="text-align: left;">Asteroids</h4><div style="text-align: left;">I liked this one better than the Code Jam asteroids problem, but it was also rather fiddly. The small case is a straightforward serial DP: going from bottom to top, compute the maximum sum possible when ending one's turn on each location.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">At first I thought that this could be parallelised in the same manner as Rocks or Mutex from last year's problems. However, those had only bottom-up and left-to-right propagation of DP state. In this problem, we have bottom-up, left-to-right <i>and</i> right-to-left. On the other hand, the left-to-right and right-to-left propagation is slow: at one most unit for every unit upwards. We can use this!</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Each node will be responsible for a range of columns. However, let's say it starts with knowledge of the prior state for B extra columns on either side. Then after one row of DP iteration, it will correctly know the state for B-1 extra columns, and after B iterations it will still have the correct information for the columns its responsible for. At this stage it needs to exchange information with its neighbours: it will receive information about the B edge columns on either side, allowing it to continue on its way for another B rows.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">There are a few issues involved in picking B. If it's too small (less than 60), then nodes will need to send more than 1000 messages. If it's too large, then nodes will do an excessive number of extra GetPosition queries. Also, B must not be larger than the number of columns a node is handling; in narrow cases we actually need to reduce the number of nodes being used.</div><h3 style="text-align: left;">Gas stations</h3><div style="text-align: left;">The serial version of this problem is a known problem, so I'll just recap a standard solution quickly. At each km, you need to decide how much petrol to add. If there is a town within T km that is cheaper, you should add just enough petrol to reach it (the first one if there are multiple), since any extra you could have waited until you got there. Otherwise, you should fill up completely. A sweep algorithm can give the next cheaper town for every town in linear time.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">What about the large case? The constraint is interesting: if we had enough memory, and if GetGasPrice was faster, there is enough time to do it serially! In fact, we <i>can</i> do it serially, just passing the baton from one node to the next as the car travels between the intervals assigned to each node.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">What about computing the next cheaper station after each station? That would be difficult, but we don't actually need it. We only need to know whether there is a cheaper one within T km. We can start by checking the current node, using a node-local version of the next-cheapest array we had before. If that tells us that there is nothing within the current node, and T km away is beyond the end of the current node, then we will just find the cheapest within T km and check if that is cheaper than the current value. The range will completely overlap some nodes; for those nodes, we can use a precomputed array of the cheapest for each node. There will then be some prefix of another node. We'll just call GetGasPrice to fetch values on this node. Each iteration this prefix will grow by one, so we only need one call to GetGasPrice per iteration (plus an initial prefix). This means we're calling GetGasPrice at most three times per station (on average), which turns out to the fast enough. It's possible to reduce it to 2 by computing the minimum of the initial prefix on the node that already has the data, then sending it to the node that needs it.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">There is one catch, which caused me to make a late resubmission. I was calling GetGasPrice to grow the prefix in the <i>serial</i> part of the code! You need to make all the calls up front in the parallel part of the code and cache the results.</div></div></div>Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-31847281.post-43314163602563016682016-03-06T10:52:00.000-08:002016-03-06T10:52:05.508-08:00Solutions to Facebook Hacker Cup 2016<div dir="ltr" style="text-align: left;" trbidi="on"><div><div>Since the official solutions aren't out yet, here are my solutions. Although I scored 0 in the contest, four of my solutions were basically correct, one I was able to solve fairly easily after another contestant pointed out that my approach was wrong, and I figured out Grundy Graphs on the way home.<br /><h3 style="text-align: left;">Snake and Ladder</h3><div style="text-align: left;">I'll describe two solutions to this. There are a few things one can start off with. Firstly, if there are rows at the top or bottom that are completely full of flowers, just cut them off — they make no difference. Then, handle N=1 specially: if K=1, the answer is 1, if K=0, it is 2. A number of people (including all the contest organisers!) messed up the first, and I messed up the second. Also, if one rung has two flowers on it, the answer is 0.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">My contest solution used dynamic programming. If one cuts the ladder half-way between two rungs and considers the state of the bottom half, it can be divided into four cases:</div><ol style="text-align: left;"><li>Only the left-hand side has the snake on it.</li><li>Only the right-hand side has the snake on it.</li><li>Both sides have the snake on it, and the two are connected somewhere below.</li><li>Both sides have the snake on it, and the two do not connect.</li></ol></div>It is reasonably straightforward (as long as one is careful) to determine the dynamic programming transitions to add a rung, for each possibility of a flower on the left, on the right, or no flower. Advancing one rung at a time will be too slow for the bound on N, but the transitions can be expressed as 4×4 matrices and large gaps between flowers can be crossed with fast matrix exponentiation. Finally, one must double the answer to account for the choice of which end is the head.<br /><br />The alternative solution is to argue on a case-by-case basis. If there no flowers, then one can pick a row for the head and a row for the tail, and pick a column for the head (the column for the tail is then forced by parity). Once these choices are made, there is only one way to complete the snake. Also, the head and tail can only occupy the same row if it is the top or bottom row, giving 2N(N-1)+4 ways.<br /><br />If there is at least one flower, then the head must be above the top flower and the tail below the bottom flower (or vice versa). Again, one can choose a row for each, with the column being forced by parity. One must also consider special cases for a flower in the top/bottom row.<br /><h3 style="text-align: left;">Boomerang Crew</h3><div style="text-align: left;">I think this was possibly the easiest of the problems, provided you thought clearly about it (which I failed to do). Clearly any opponents weaker than (or equal to) your champion can be defeated just be putting them up against your champion. Also, if you're going to defeat P players, they might as well be the weakest P players of the opposition. For a given P, can this be done? There will be a certain number of strong players (stronger than our champion), which we will have to sacrifice our weaker players against to wear them down. What's more, once we've worn one down and beaten him/her, we have to use the winner of that game to wear down the next strong opponent. Thus, we should match up the S strong players against our best S players in some order, and use our remaining players as sacrifices. Ideally we'd like to wear down each opponent to exactly the skill level of the player we will use to win; anything more is wasted effort. We can apply this idea greedily: for each opponent, find the remaining strong player of ours who can win by the smallest margin.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">With that method to test a particular P, it is a standard binary search to determine the number of players we can beat.</div><h3 style="text-align: left;">Grundy Graph</h3><div style="text-align: left;">I think this was the hardest problem: my submission wasn't a real solution and was just for a laugh; several other people I spoke to had submitted but didn't believe in their solution. During the contest I suspected that the solution would be related to 2-SAT, but it wasn't until the flight home that I could figure out how to incorporate turn ordering.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Let us construct an auxiliary directed graph A whose vertices correspond to those of the original. An edge u->v in A means that if u is black, then v must be black as well for Alice to win. All the edges in the original graph are clearly present in A. However, for each u->v in the original, we also add v'->u', where x' is the vertex assigned a colour at the same time as x. This is because if v' is black but u' is white, then it implies that u is black and v is white. This is the same as the contrapositive edge in a 2-SAT graph. Let x=>y indicate that y is reachable from x in A.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">There are a number of situations in which we can see that Bob can win:</div><ul style="text-align: left;"><li>If any strongly connected component contains both x and x', then Bob wins regardless of what Alice and Bob play.</li><li>If Bob controls vertices x and y and x=>y, then Bob wins by assigning x black and y white, regardless of Alice's play.</li><li>If Bob controls x, Alice controls y, x=>y, x' => y', and y is played before x, then Bob wins by assigning y the opposite colour to x.</li></ul>We claim that if none of the above apply, then Alice wins. Her strategy is that on her turn, she must colour x black if there is any u such that u => x and either u has already been coloured black, u could later be coloured black by Bob, or u = x'. This ensures that it does not immediately allow Bob to win, nor allows Bob to use it to win later. The only way this could fail is if both x and x' are forced to be black (if neither is forced, Alice can pick freely).<br /><br />Suppose u => x and v => x', where u and v are as described above. Then x => v', so u => v'. Firstly consider u = x'. We cannot have v = x (because otherwise x, x' are in the same SCC). Now x' => v', so v => v'. If Bob controls v, then we hit the second case above; if Alice controls v, then v => v' means she would never have chosen v. So we have a contradiction. Now, if u ≠ x' (and by symmetry, v ≠ x), then consider u => v' again. Alice cannot control v/v', since she would then have chosen v' rather than v. But similarly, v => u', so Alice couldn't have chosen u. It follows that Bob controls u and v, which can only be possible if u = v'. But u and u' can only both be relevant if Bob hasn't decided their colour yet, which is the third case above.<br /><br />The implementation details are largely left as an exercise. Hint: construct the SCCs of A, and the corresponding DAG of the SCCs, then walk it in topological order to check the conditions. The whole thing takes linear time.<br /><h3 style="text-align: left;">RNG</h3><div style="text-align: left;">The concept here is a reasonably straightforward exponential DP, but needs a careful implementation to get all the formulae right, particularly in the face of potential numeric instability. The graph itself isn't particularly relevant; the only "interesting" vertices are the start and the gifts, and the only useful information is the distance from each interesting vertex to each other one (if reachable) and the distance from each gift to the nearest dead-end. The DP state is the set of gifts collected and the current position (one of the interesting vertices). When one is at a gift, there are two possible options:</div><ol style="text-align: left;"><li>Pick another gift, and make towards it along the shortest path. Based on the path length L, there is a probability P^L of reaching it in time DL, and a probability 1 - P^L of respawning, with an expected time computable from a formula.</li><li>Aim to respawn by heading for the nearest dead-end. Again, one can work out a formula for the expected time to respawn.</li></ol></div>When one is at the start, one should of course pick a gift and head for it. Again, one can compute an expected time to reach it (one obtains a formula that includes its own result, but a little algebra sorts that out).<br /><h3 style="text-align: left;">Maximinimax flow</h3><div style="text-align: left;">I think I liked this problem the best, even though I screwed up the limits on the binary search. Firstly, what is the minimax flow? For a general graph it would be nasty to compute, but this graph has the same number of edges as vertices. This means that it has exactly one cycle with trees hanging off of it. It shouldn't be too hard to convince yourself that the minimax flow is the smaller of the smallest edge outside the cycle and the sum of the two smallest edges inside the cycle.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Veterans will immediately reach for a binary search, testing whether it is possible to raise the minimax flow to a given level. For the edges outside the ring, this is a somewhat standard problem that can be solved with query structures e.g. Fenwick trees, indexed by edge value. To raise the minimum to a given level, one needs to know the number of edges below that level (one Fenwick tree, storing a 1 for each edge), and their current sum (a second tree, storing the edge weight for each edge).</div><div style="text-align: left;"><br /></div><div style="text-align: left;">For the vertices inside the cycle it is only slightly more complex. If the required level is less than double the second-smallest edge, then one need only augment the smallest edge. Otherwise one raises all edges to half the target, with a bit of special-casing when the target is odd. The Fenwick tree can also be queried to find the two smallest edges, but I just added a std::multiset to look this up.</div><h3 style="text-align: left;">Rainbow strings</h3><div style="text-align: left;">This seemed to be one of the most-solved problems, and is possibly the easiest if you have a suffix tree/array routine in your toolbox. A right-to-left sweep will yield the shortest and longest possible substring for each starting position (next green for the shortest, next red for the longest). The trick is to process queries in order of length. Each entry in the suffix array will become usable at some length, and unusable again at another length, and these events, along with the queries, can be sorted by length for processing. The only remaining work is to determine the Kth of the active suffixes, which can be done with a Fenwick tree or other query structure.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Finally, one needs the frequency table for each prefix to be able to quickly count the frequency in each substring.</div></div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-72473165469976218642015-05-20T13:58:00.001-07:002015-05-21T06:52:37.720-07:00Analysis of ICPC 2015 world finals problems<div dir="ltr" style="text-align: left;" trbidi="on"><h2 style="text-align: left;">Analysis of ICPC 2015 world finals problems</h2><div style="text-align: left;">I haven't solved all the problems yet, but here are solutions to those I have solved (I've spent quite a bit more than 5 hours on it though).</div><h3 style="text-align: left;">A: Amalgamated Artichokes</h3><div style="text-align: left;">This is a trivial problem that basically everybody solved - I won't go into it.</div><h3 style="text-align: left;">B: Asteroids</h3><div style="text-align: left;">This looks quite nasty, but it can actually be solved by assembling a number of fairly standard computational geometry tools, and the example input was very helpful. To simplify things, let's switch to the frame of reference of the first asteroid, and assume only the second asteroid moves (call them P and Q for short). The first thing to notice is that the area is a piecewise quadratic function of time, with changes when a vertex of one polygon crosses an edge of the other. This is because the derivative of area depends on the sections of boundary of Q inside P, and those vary linearly in those intervals. Finding the boundaries between intervals is a standard line-line intersection problem. To distinguish between zero and "never", touching is considered to be an intersection too.</div><div style="text-align: left;">We also need a way to compute the area of the intersection. Clipping a convex polygon to a half-space is a standard problem too - vertices are classified as inside or outside, and edges that go from inside to outside or vice versa introduce a new interpolated vertex, so repeated clipping will give the intersection of convex polygons. And finding the area of a polygon is also very standard.</div><div style="text-align: left;">Finally, we need to handle the case where the best area occurs midway through one of the quadratic pieces. Rather than try to figure out the quadratic coefficients geometrically, I just sampled it at three points (start, middle, end) and solved for the coefficients, and hence the maximum, with algebra.</div><h3 style="text-align: left;">C: Catering</h3><div style="text-align: left;">I'm sure I'm seen something similar, but I'm not sure where. It can be set up as a min-cost max-flow problem. Split every site into two, an arrival area and a departure area. Connect the source to each departure area, with cap 1 for clients and cap K for the company. Connect each arrival area to the sink similarly. Connect every departure area to every later arrival area with cap 1, except for the company->company link with cap K. Edge costs match the table of costs where appropriate, zero elsewhere. The maximum flow is now K+N (K teams leave the company, and equipment arrives at and leaves every client), and the minimum cost is the cost of the operation.</div><h3 style="text-align: left;">D: Cutting cheese</h3><div style="text-align: left;">This is basically just a binary search for each cut position, plus some mathematics to give the volume of a sphere that has been cut by a plane.</div><h3 style="text-align: left;">E: Parallel evolution</h3><div style="text-align: left;">Start by sorting the strings by length. A simplistic view is that we need to go through these strings in order, assigning each to one path or the other. This can be done with fairly standard dynamic programming, but it will be far too slow.<br />Add an "edge" from each string to the following one if it is a subsequence of this following one. This will break the strings up into connected chains. A key observation is that when assigning the strings in a chain, it is never necessary to switch sides more than once: if two elements in a chain are on one path, one can always put all the intervening elements on the same path without invalidating a solution. Thus, one need only consider two cases for a chain: put all elements in one path (the opposite path to the last element of the previous chain), or start the chain on one path then switch to the other path part-way through. In the latter case, one should make the switch as early as possible (given previous chains), to make it easier to place subsequent strings. A useful feature is that in either case, we know the last element in both paths, which is all that matters for placing future chains. Dynamic programming takes care of deciding which option to use. Only a linear number of subsequence queries is required.<br />I really liked this problem, even though the implementation in messy. It depends only on the compatibility relationship being a partial order and there being a cheap way to find a topological sort.</div><h3 style="text-align: left;">F: Keyboarding</h3><div style="text-align: left;">This can be done with a fairly standard BFS. The graph has a state for each cursor position and number of completed letters. There are up to 5 edges from each state, corresponding to the 5 buttons. I precomputed where each arrow key moves to from each grid position, but I don't know if that is needed. It's also irrelevant that keys form connected regions.<br /><h3 style="text-align: left;">H: Qanat</h3><div style="text-align: left;">This is a good one for a mathsy person to work out on paper while someone else is coding - the code itself is trivial. Firstly, the slope being less than one implies that dirt from the channel always goes out one of the two nearest shafts - whichever gets it to the surface quicker (it helps to think of there being a zero-height shaft at x=0). Between each pair of shafts, one can easily find the crossover point.</div><div style="text-align: left;">Consider just three consecutive shafts, and the cost to excavate them and the section of channel between them. If the outer shafts are fixed, the cost is a quadratic in the position of the middle shaft, from which one can derive a formula for its position in terms of its neighbours. This gives a relationship of the form x<sub>i+2</sub> - gx<sub>i+1</sub> + x<sub>i</sub> = 0. In theory, this can now be used to iteratively find the positions of all the shafts, up to a scale factor which is solved by the requirement for the final shaft (the mother well) to be at x=W.<br />I haven't tested it, but I think evaluating the recurrence could lead to catastrophic rounding problems, because errors introduced early on are exponentially scaled up, faster than the sequence itself grows. The alternative I used is to find a closed formula for the recurrence. This is a known result: let a and b be the roots of x<sup>2</sup> - gx + 1 = 0; then the ith term is r(a<sup>i</sup> - b<sup>i</sup>) for the scale factor r. Finding the formula for g shows that the roots will always be real (and distinct) rather than complex.</div></div><h3 style="text-align: left;">I: Ship traffic</h3><div style="text-align: left;">This mostly needed some careful implementation. For each ship, there is an interval during which the ferry cannot launch. Sort the start and end times of all these intervals and sweeping through them, keeping track of how many ships will be hit for each interval between events. The longest such interval with zero collisions is the answer.</div><h3 style="text-align: left;">J: Tile cutting</h3><div style="text-align: left;">Firstly, which sizes can be cut, and in how many ways? If the lengths from the corners of the parallelograms to the corners of the tile are a, b, c, d, then the area is ab + cd. So the number of ways to make a size is the number of ways it can be written as ab + cd, for a, b, c, d > 0. The number of ways to write a number as ab can be computed by brute force (iterate over all values of a and b for which ab <= 500000). The number of ways to write a number as ab + cd is the convolution of this function with itself (ala polynomial multiplication). There are well-known algorithms to do this in O(N log N) time (with an FFT) or in O(N<sup>1.58</sup>) time (divide-and-conquer), and either is fast enough.<br /><h3 style="text-align: left;">K: Tours</h3><div style="text-align: left;">I've got a solution that passes, but I don't think I can fully prove why.<br />I start by running a DFS to decompose the graph into a forest plus edges from a node to an ancestor in the forest. Each such "up" edge corresponds to a simple cycle. Clearly, the number of companies must divide into the length of any cycle, so we can take the GCD of these cycle lengths.<br />If the cycles were all disjoint, this would (I think) be sufficient, but overlapping cycles are a problem. Suppose two cycles overlap. That means there are two points A and B, with three independent paths between them, say X, Y, Z. If X has more than its share of routes from some company, then Y must have less than its share, to balance the tour X-Y. Similarly, Z must have less than its share. But then the tour Y-Z has too few routes from that company. It follows that X, Y and Z must all have equal numbers of routes from each company, and hence the number of companies must divide into each of their lengths.<br />And now for the bit I can't prove: it seems to be sufficient to consider only pairs of the simple cycles corresponding to using a single up edge. I just iterate over all pairs and compute the overlap. That is itself not completely trivial, requiring an acceleration structure to compute kth ancestors and least common ancestors.</div><h3 style="text-align: left;">L: Weather report</h3><div style="text-align: left;">This is an interesting variation of a classic result in compression theory, Hamming codes. The standard way to construct an optimal prefix-free code from a frequency table is to keep a priority queue of coding trees, and repeatedly combine the two least frequent items by giving them a common parent. That can't be done directly here because there are trillions of possible strings to code, but it can be done indirectly by noting that permutations have identical frequency, and storing them as a group rather than individual items. The actions of the original algorithm then need to be simulated, pairing up large numbers of identical items in a single operation.</div></div></div>Unknownnoreply@blogger.com5tag:blogger.com,1999:blog-31847281.post-8725731193559615342014-10-26T02:42:00.001-07:002014-10-26T02:42:54.172-07:002014 NEERC Southern Subregional<div dir="ltr" style="text-align: left;" trbidi="on">The ICPC NEERC South Subregional was mirrored on Codeforces. It was a very nice contest, with some approachable but also challenging problems. Here are my thoughts on the solutions (I solved everything except A, J and L during the contest).<br /><h3 style="text-align: left;">A: Nasta Rabbara</h3><div style="text-align: left;">This is quite a nasty one, and my initial attempt during the contest was all wrong. The first thing to be aware of is that a single query can be answered in O(L log L) time (or less) using a modification on the standard union-find data structure: each edge in the union-find structure is labelled to indicate whether end-points have the same or the opposite parity, and the find operation tracks whether the returned root has the same or opposite parity as the query point. That way, edges that create cycles can be found to be either odd- or even-length cycles. Of course, this won't be fast enough if all the queries are large.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Ideally, one would like a data structure that allows both new edges to be added and existing edges to be removed. That would allow for a sliding-window approach, in which we identify the maximal left end-point for each right end-point. However, the 10 second time limit suggests that this is not the right approach.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Instead, there is a \(O(N+(M+Q)\sqrt{M}\log N)\) solution. Dividing the series into blocks of length \(\sqrt{M}\). For each block, identify all the queries with a right end-point inside the block. Now build up a union-find structure going right-to-left, starting from the left edge of the block. Whenever you hit the left end-point of one of the identified queries, add the remaining episodes for the query (which will all come from inside the block) to answer the query, then undo the effect of these extra episodes before continuing. As long as you don't do path compression, each union operation can be unwound in O(1) time. This will miss queries that occur entirely inside a block, but these can be answered by the algorithm from the first paragraph as they are short.</div><h3 style="text-align: left;">B: Colored blankets</h3><div style="text-align: left;">It turns out that it is always possible to find a solution. Firstly, blankets with no colour can be given an arbitrary colour on one side. It is fairly easy to see that we need only allocate blankets to kits such that each kit contains blankets of at most two colours. Repeat the following until completion:</div><ol style="text-align: left;"><li>If any colour has at most K/N blankets remaining and that colour has not been painted onto any kit, put those blankets into a new kit and paint it with that colour (this might involve zero blankets into that kit).</li><li>Otherwise, pick any colour that has not been painted on a kit. There must be a more than K/N blankets of that colour. Use enough of them to fill up any non-full painted kit. There must be such a kit, otherwise there are more than K blankets in total.</li></ol>Since each kit is painted with a unique colour, this generates at most N kits; and since each kit has K/N blankets in it, it must generate exactly N kits.<br /><h3 style="text-align: left;">C: Component tree</h3><div style="text-align: left;">The only difficulty with this problem is that the tree might be very deep, causing the naive algorithm to spend a lot of time walking up the tree. This can be solved with a heavy-light decomposition. On each heavy path, for each attribute that appears anywhere on the path, store a sorted list (by depth) of the nodes containing that attribute. When arriving on a heavy path during a walk, a binary search can tell where the nearest ancestor with that property occurs on the heavy path. I think this makes each query O(log N) time.</div><h3 style="text-align: left;">D: Data center</h3><div style="text-align: left;">This is a reasonably straightforward sliding window problem. Sort the servers of each type, and start with the minimum number of low voltage servers (obviously taking biggest first). One might also be required to take all the low voltage servers plus some high voltage servers. Then remove the low voltage servers one at a time (smallest first), and after each removal, add high voltage servers (largest first) until the capacity is made up. Then compare this combination to the best solution so far.</div><h3 style="text-align: left;">E: Election</h3><div style="text-align: left;">Firstly, ties can be treated as losses (both within a station and in the election), because we want the mayor to win. When merging two stations, there are two useful results that can occur: win+loss -> win, or loss+loss -> loss; in both cases the number of wins stays the same, while the number of losses goes down by one. So they are equally useful. We can determine the number of viable merges by DP: either the last two can be merged, and we solve for the other N-2; or the last one is untouched, and we solve for the other N-1.</div><h3 style="text-align: left;">F: Ilya Muromets</h3><div style="text-align: left;">Note that the gap closing up after a cut is just a distraction: any set of heads we can cut, can also be cut as two independent cuts of the original set of heads.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">We can determine the best cut for every prefixes in linear time, by keeping a sliding window of sums (or using a prefix sum). Similarly, we can determine the best cut for every suffix. Any pair of cuts can be separated into a cut of a prefix and of the corresponding suffix, so we need only consider each split point in turn.</div><h3 style="text-align: left;">G: FacePalm</h3><div style="text-align: left;">Let's consider each k contiguous days in turn, going left to right. If the current sum is non-negative, we need to reduce some of the values. We might as well reduce the right-most value we can, since that will have the effect on as many future values as possible (past values have already been fixed to be negative, so that is not worth considering). So we reduce the last value as much as needed, or until it reaches the lower limit. If necessary, we then reduce the second-last value, and so on until the sum is negative.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The only catch is that we need a way to quickly skip over long sequences of the minimum value, to avoid quadratic running time. I kept a cache of previous non-minimum value (similar to path compression in union-find structures); a stack of the positions of non-minimum values should work too.</div><h3 style="text-align: left;">H: Minimal Agapov code</h3><div style="text-align: left;">The first triangle will clearly consist of the minimum three labels. This divides the polygon into (up to) three sections. Within each section, the next triangle will always stand on the existing diagonal, with the third vertex being the lowest label in the section. This will recursively subdivide the polygon into two more sections with one diagonal, and so on. One catch is that ties need to be broken carefully: pick the one furthest from the base with the smaller label (this lead to me getting a WA). Rather than considering all the tie-breaking cases for the first triangle, I started with a first diagonal with the two smallest labels, and found the first triangle through the recursion.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The main tool needed for this is an efficient range minimum query. There are a number of data structures for this, and any of them should work. I used two RMQ structures to cater for the two possible tie-breaking directions. The cyclic rather than linear nature of the queries, but it is just a matter of being careful.</div><h3 style="text-align: left;">I: Sale in GameStore</h3><div style="text-align: left;">This was the trivial problem: sort, get your friends to buy the most expensive item, start filling up on cheap items until you can't buy any more or you've bought everything.</div><h3 style="text-align: left;">J: Getting Ready for the VIPC</h3><div style="text-align: left;">I got a wrong answer to test case 53 on this, and I still don't know why. But I think my idea is sound.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The basic approach is dynamic programming, where one computes the minimum tiredness one can have after completing each contest, assuming it is possible to complete it (this also determines the resulting skill). To avoid some corner cases, I banned entering a contest with tiredness greater than \(h_i - l_i\). However, this is \(O(N^2)\), because for each contest, one must consider all possible previous contests.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The first optimisation one can do is that one can make a list of outcomes for each day, assuming one enters a contest that day: a list of (result skill, tiredness), one for each contest. If one contest results in both less skill and more tiredness than another, it can be pruned, so that one ends up with a list that increases in both skill and tiredness. Now one can compute the DP for a contest by considering only each previous day, and finding the minimum element in the list for which the skill in great enough to enter the current contest. The search can be done by binary search, so if there are few distinct days with lots of contests each day, this will be efficient; but if every contest is on a different day, we're no better off.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The second optimisation is to note the interesting decay function for tiredness. After about 20 days of inactivity, tireness is guaranteed to reach 0. Thus, there is no need to look more than this distance into the past: beyond 20 days, we only care about the maximum skill that can be reached on that day, regardless of how tired one is. This reduces the cost to \(O(N\log N\log maxT\). </div><h3 style="text-align: left;">K: Treeland</h3><div style="text-align: left;">Pick any vertex and take its nearest neighbour: this is guaranteed to be an edge; call the end-points A and B. An edge in a tree partitions the rest of the tree into two parts. For any vertex C, either d(A, C) < d(B, C) or vice versa, and this tells us which partition C belongs to. We can thus compute the partition, and then recursively solve the problem within each partition.</div><h3 style="text-align: left;">L: Useful roads</h3><div style="text-align: left;">I didn't solve this during the contest, and I don't know exactly how to solve it yet.</div><h3 style="text-align: left;">M: Variable shadowing</h3><div style="text-align: left;">This is another reasonably straightforward implementation problem. For each variable I keep a stack of declarations, with each declaration tagged with the source position and a "scope id" (each new left brace creates a new scope id). I also keep a stack of open scope ids. When a right brace arrives, I check the top-of-stack for each variable to see if it matches the just closed scope id, and if so, pop the stack.</div></div>Unknownnoreply@blogger.com4tag:blogger.com,1999:blog-31847281.post-56399056529668202882014-07-18T01:54:00.002-07:002014-07-31T12:03:04.609-07:00IOI 2014 day 2 analysis<div dir="ltr" style="text-align: left;" trbidi="on"><div><div><div>I found day 2 much harder than day 1, and I still don't know how to solve all the problems (I am seriously impressed by those getting perfect scores). Here's what I've managed to figure out so far.<br /><br /><b>Update</b>: I've now solved everything (in theory), and the solutions are below. The <a href="http://www.ioi2014.org/index.php/competition/contest-tasks">official solutions</a> are now also available on the IOI website. I'll try coding the solutions at some point if I get time.<br /><h3 style="text-align: left;">Gondola</h3><div style="text-align: left;">This was the easiest of the three. Firstly, what makes a valid gondola sequence? In all the subtasks of this problem, there will be two cases. If you see any of the numbers 1 to n, that immediately locks in the phase, and tells you the original gondola for every position. Otherwise, the phase is unknown. So, the constraints are that</div><ul style="text-align: left;"><li>if the phase is known, every gondola up to n must appear in the correct spot if it appears;</li><li>no two gondolas can have the same number.</li></ul></div>Now we can consider how to construct a replacement sequence (and also to count them), which also shows that these conditions are sufficient. If the phase is not locked, pick it arbitrarily. Now the "new gondola" column is simply the numbers from n+1 up to the largest gondola, so picking a replacement sequence is equivalent to deciding which gondola replaces each broken gondola. We can assign each gondola greater than n that we can't see to a position (one where the final gondola number is larger), and this will uniquely determine the replacement sequence. We'll call such gondolas <i>hidden</i>.<br /><br />For the middle set of subtasks, the simplest thing is to assign all hidden gondolas to one position, the one with the highest-numbered gondola in the final state. For counting the number of possible replacement sequences, each hidden gondola can be assigned independently, so we just multiply together the number of options, and also remember to multiply by n if the phase is unknown. In the last subtask there are too many hidden gondolas to deal with one at a time, but they can be handled in batches (those between two visible gondolas), using fast exponentiation.<br /><h3>Friend</h3><div style="text-align: left;">This is a weighted maximum independent set problem. On a general graph this is NP-hard, so we will need to exploit the curious way in which the graph is constructed. I haven't figured out how to solve the whole problem, but let's work through the subtasks:</div><ol style="text-align: left;"><li>This is small enough to use brute force (consider all subsets and check whether they are independent).</li><li>The graph will be empty, so the sample can consist of everyone. </li><li>The graph will be complete, so only one person can be picked in a sample. Pick the best one.</li><li>The graph will be a tree. There is a fairly standard tree DP to handle this case: for every subtree, compute the best answer, either with the root excluded or included. If the root is included, add up the root-excluded answers for every subtree; otherwise add up the best of the two for every subtree. This takes linear time.</li><li>In this case the graph is bipartite and the vertices are unweighted. This is a standard problem which can be solved by finding the maximum bipartite matching. The relatively simple flow-based algorithm for this is theoretically \(O(n^3)\), but it is one of those algorithms that tends to run much faster in most cases, so it may well be sufficient here.</li></ol></div>The final test-case clearly requires a different approach, since n can be much larger. I only managed to figure this out after getting a big hint from the SA team leader, who had seen the official solution.<br /><br />We will process the operations in reverse order. For each operation, we will transform the graph into one that omits the new person, but for which the optimal solution has the same score. Let's say that the last operation had A as the host and B as the invitee, and consider the different cases:<br /><ul style="text-align: left;"><li>YourFriendsAreMyFriends: this is the simplest: any solution using B can also use A, and vice versa. So we can collapse the two vertices into one whose weight is the sum of the original weights, and use it to replace A.</li><li>WeAreYourFriends: this is almost the same, except now we can use at most one of A and B, and which one we take (if either) has no effect on the rest of the graph. So we can replace A with a single vertex having the larger of the two weights, and delete B.</li><li>IAmYourFriend: this is a bit trickier. Let's start with the assumption that B will form part of the sample, and add that to the output value before deleting it. However, if we later decide to use A, there will be a cost to remove B again; so A's weight <i>decreases</i> by the weight of B. If it ends up with negative weight, we can just clamp it to 0.</li></ul></div>Repeat this deletion process until only the original vertex is left; the answer will be the weight of this vertex, plus the saved-up weights from the IAmYourFriend steps.<br /><div><h3 style="text-align: left;">Holiday</h3><div style="text-align: left;">Consider the left-most and right-most cities that Jian-Jia visits. Regardless of where he stops, he will need to travel from the start city to one of the ends, and from there to the other end. There is no point in doing any other back-tracking, so we can tell how many days he spends travelling just from the end-points. This then tells us how many cities he has time to see attractions in, and obviously we will pick the best cities within the range.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">That's immediately sufficient to solve the first test case. To solve more, we can consider an incremental approach. Fix one end-point, and gradually extend the other end-point, keeping track of the best cities (and their sum) in a priority queue (with the worst of the best cities at the front). As the range is extended, the number of cities that can be visited shrinks, so items will need to be popped. Of course, the next city in the range needs to be added each time as well. Using a binary heap, this gives an \(O(n^2\log n)\) algorithm: a factor of n for each endpoint, and the \(\log n\) for the priority queue operations. That's sufficient for subtask 3. It's also good enough for subtask 2, because the left endpoint will be city 0, saving a factor of n.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">For subtask 4, it is clearly not possible to consider every pair of end-points. Let's try to break things up. Assume (without loss of generality) that we move first left, then back to the start, then right. Let's compute the optimal solution for the left part and the right part separately, then combine them. The catch is that we need to know how we are splitting up our time between the two sides. So we'll need to compute the answer for each side for all possible number of days spent within each side. This seems to leave us no better off, since we're still searching within a two-dimensional space (number of days and endpoint), but it allows us to do some things differently.<br /><br />We'll just consider the right-hand side. The left-hand side is similar, with minor changes because we need two days for travel (there and back) instead of one. Let f(d) be the optimal end-point if we have d days available. Then with a bit of work one can show that f is non-decreasing (provided one is allowed to pick amongst ties). If we find f(d) for d=1, 2, 3, ... in that order, it doesn't really help: we're only, on average, halving the search space. But we can do better by using a divide-and-conquer approach: if we need to find f for all \(d \in [0, D)\) then we start with \(d = \frac{D}{2}\) to subdivide the space, and then recursively process each half of the interval on disjoint subintervals of the cities. This reduces the search space to \(O(n\log n)\).<br /><br />This still leaves the problem of efficiently finding the total number of attractions that can be visited for particular intervals and available days. The official solution uses one approach, based on a segment tree over the cities, sorted by number of attractions rather than position. The approach I found is, I think, simpler. Visualise the recursion described above as a tree; instead of working depth-first (i.e., recursively), we work breadth-first. We make \(O(\log n)\) passes, and in each pass we compute f(d) where d is an odd multiple of \(2^i\) (with \(i\) decreasing with each pass). Each pass can be done in a single incremental process, similar to the way we tackled subpass 2. The difference is that each time we cross into the next subinterval, we need to increase \(d\), and hence bring more cities into consideration. To do this, we need either a second priority queue of excluded cities, or we can replace the priority queue with a balanced binary tree. Within each pass, d can only be incremented \(O(n)\) times, so the total running time will be \(O(n\log n)\) per pass, or \(O(n\log n \log n)\) overall.</div></div></div>Unknownnoreply@blogger.com2tag:blogger.com,1999:blog-31847281.post-30935558131283065492014-07-16T08:00:00.000-07:002014-07-17T03:41:53.826-07:00IOI 2014 day 1 analysis<div dir="ltr" style="text-align: left;" trbidi="on"><div><h2 style="text-align: left;">IOI 2014 day 1</h2><div style="text-align: left;">Since there is no online judge, I haven't tried actually coding any of these. So these ideas are not validated yet. You can find the problems <a href="http://www.ioi2014.org/index.php/competition/contest-tasks">here</a>.</div><h3 style="text-align: left;">Rails</h3><div style="text-align: left;">I found this the most difficult of the three to figure out, although coding it will not be particularly challenging.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Firstly, we can note that distances are symmetric: a route from A to B can be reflected in the two tracks to give a route from B to A. So having only \(\frac{n(n-1)}{2}\) queries is not a limitation, as we can query all distances. This might be useful in tackling the first three subtasks, but I'll go directly to the hardest subtask.</div><div style="text-align: left;"><br /></div>If we know the position and type of a station, there is one other that we can immediately locate: the closest one. It must have the opposite type and be reached by a direct route. Let station X be the closest to station 0. The other stations can be split into three groups:<br /><ol style="text-align: left;"><li>d(X, Y) < d(X, 0): these are reached directly from station X and of type C, so we can locate them exactly.</li><li>d(0, X) + d(X, Y) = d(0, Y), but not of type 1: these are reached from station 0 via station X, so they lie to the left of station 0.</li><li>All other stations lie to the right of station X.</li></ol>Let's now consider just the stations to the right of X, and see how to place them. Let's take them in increasing distance from 0. This ensures that we encounter all the type D stations in order, and any type C station will be encountered at some point after the type D station used to reach it. Suppose Y is the right-most type D station already encountered, and consider the distances for a new station Z. Let \(z = d(0, Z) - d(0, Y) - d(Y, Z)\). If Z is type C, then there must be a type D at distance \(\frac{z}{2}\) to the left of Y. On the other hand, if Z is of type D (and lies to the right of Y), then there must be a type C station at distance \(\frac{z}{2}\) to the left of Y. In the first case, we will already have encountered the station, so we can always distinguish the two cases, and hence determine the position and type of Z.</div><div></div><div>The stations to the left of station zero can be handled similarly, using station X as the reference point instead of station 0.</div><div></div><div>How many queries is this? Every station Z except 0 and X accounts for at most three queries: d(0, Z), d(X, Z) and d(Y, Z), where Y can be different for each Z. This gives \(3(n-2) + 1\), which I think can be improved to \(3(n-2)\) just by counting more carefully. Either way, it is sufficient to solve all the subtasks.<br /><h3 style="text-align: left;">Wall </h3></div><div style="text-align: left;">This is a fairly standard interval tree structure problem, similar to Mountain from IOI 2005 (but a little easier). Each node of the tree contains a range to which its children are clamped. To determine the value of any element of the wall, start at the leaf with a value of 0 and read up the tree, clamping the value to the range in each node in turn. Initially, each node has the range [0, inf). When applying a new instruction, it is done top-down, and clamps are pushed down the tree whenever recursion is necessary.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">An interesting aspect of the problem is that it is offline, in that only the final configuration is requested and all the information is provided up-front. This makes me think that there may be an alternative solution that processes the data in a different order, but I can't immediately see a nicer solution than the one above. </div><h3 style="text-align: left;">Game</h3><div style="text-align: left;">I liked this problem, partly because I could reverse-engineer a solution from the assumption that it is always possible to win, and partly because it requires neither much algorithm/data-structure training (like Wall) nor tricky consideration of cases (like Rails). Suppose Mei-Yu knows that certain cities are connected. If there are any flights between the cities that she has not asked about, then she can win simply by saving one of these flights for last, since it will not affect whether the country is connected. It follows that for Jian-Jia to win, he must always answer no when asked about a flight between two components that Mei-Yu does not know to be connected, <i>unless</i> this is the last flight between these components?</div><div style="text-align: left;"><br /></div><div style="text-align: left;">What if he always answers yes to the last flight between two components? In this case he will win. As long as there are at least two components left, there are uncertain edges between every pair of them, so Mei-Yu can't know whether any of them is connected any other. All edges within a component are known, so the number of components can only become one after the last question.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">What about complexity? We need to keep track of the number of edges between each pair of components, which takes \(O(N^2)\) space. Most operations will just decrement one of these counts. There will be \(N - 1\) component-merging operations, each of which requires a linear-time merging of these edge counts and updating a vertex-to-component table. Thus, the whole algorithm requires \(O(N^2)\) time. This is optimal given that Mei-Yu will ask \(O(N^2)\) questions.</div></div>Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-31847281.post-72428844748534215182014-06-30T09:18:00.000-07:002014-06-30T09:27:26.945-07:00ICPC Problem H: Pachinko<div dir="ltr" style="text-align: left;" trbidi="on">Problem A has been covered quite well <a href="http://keet.wordpress.com/2014/06/28/acm-icpc-2014-solution-to-problem-a-baggage/">elsewhere</a>, so I won't discuss it. That leaves only problem H. I started by reading the Google translation of a <a href="http://fajnezadania.wordpress.com/2014/06/28/pachinko/">Polish writeup</a> by gawry. The automatic translation wasn't very good, but it gave me one or two ideas I borrowed. I don't know how my solution compares in length of code, but I consider it much simpler conceptually.<br /><br />This is a fairly typical Markov process, where a system is in some state, and each timestep it randomly selects one state as a function of the current state. One variation is that the process stops once the ball reaches a target, whereas Markov processes don't terminate. I was initially going to model that as the ball always moving from the target to itself, but things would have become slightly complicated.<br /><br />Gawry has a nice way of making this explicitly a matrix problem. Set up the matrix M as for a Markov process i.e., \(M_{i,j}\) is the probability of a transition from state j to state i. However, for a target state j, we set \(M_{i,j}=0\) for all i. Now if \(b\) is our initial probability vector (equal probability for each empty spot in the first row), then \(M^t b\) represents the probability of the ball being in each position (and the game not having previously finished) after \(t\) timesteps. We can then say that the expected amount of time the ball spends in each position is given by \(\sum_{t=0}^{\infty} M^t b\). The sum of the elements in this vector is the expected length of a game and we're told that it is less than \(10^9\), so we don't need to worry about convergence. However, that doesn't mean that the matrix series itself converges: Gawry points out that if there are unreachable parts of the game with no targets, then the series won't converge. We fix that by doing an initial flood-fill to find all reachable cells and only use those in the matrix. Gawry then shows that under the right conditions, the series converges to \((I - M)^{-1} b\).<br /><br />This is where my solution differs. Gawry dismisses Gaussian elimination, because the matrix can be up to 200,000 square. However, this misses the fact that it is <i>banded</i>: by numbering cells left-to-right then top-to-bottom, we ensure that every non-zero entry in the matrix is at most W cells away from the main diagonal. Gaussian elimination (without pivoting) preserves this property. We can exploit this both to store the matrix compactly, and to perform Gaussian elimination in \(O(W^3H)\) time.<br /><br />One concern is the "without pivoting" caveat. I was slightly surprised that my first submission passed. I think it is possible to prove correctness, however. Gaussian elimination without pivoting is known (and easily provable) to work on strictly column diagonally dominant matrices. In our case the diagonal dominance is weak: columns corresponding to empty cells have a sum of zero, those corresponding to targets have a 1 on the diagonal and zeros elsewhere. However, the matrix is also irreducible, which I think is enough to guarantee that there won't be any division by zero.<br /><br />EDIT: actually it's not guaranteed to be irreducible, because the probabilities can be zero and hence it's possible to get from A to B without being able to get from B to A. But I suspect that it's enough that one can reach a target from every state. </div>Unknownnoreply@blogger.com4tag:blogger.com,1999:blog-31847281.post-10944906129548100282014-06-28T09:59:00.001-07:002014-06-28T12:27:23.563-07:00ICPC Problem L: Wires<div dir="ltr" style="text-align: left;" trbidi="on">While this problem wasn't too conceptually difficult, it requires a <i>lot</i> of code (my solution is about 400 lines), and careful implementation of a number of geometric algorithms. A good chunk of the code comes from implementing a rational number class in order to precisely represent the intersection points of the wires. It is also very easy to suffer from overflow: I spent a long time banging my head against an assertion failure on the server until I upgraded my rational number class to use 128-bit integers everywhere, instead of just for comparisons.<br /><br />The wires will divide the space up into connected regions. The regions can be represented in a planar graph, with edges between regions that share an edge. The problem is then to find the shortest path between the regions containing the two new end-points.<br /><br />My solution works in a number of steps:<br /><ol style="text-align: left;"><li>Find all the intersection points between wires, and the segments between intersection points. This just tests every wire against every other wire. The case of two parallel wires sharing an endpoint needs to be handled carefully. For each line, I sort the intersection points along the line. I used a dot produce for this, which is where my rational number class overflowed, but would probably have been safer to just sort lexicographically. More than two lines can meet at an intersection point, so I used a std::map to assign a unique ID to each intersection point (I'll call them vertices from here on).</li><li>Once the intersection points along a line have been sorted, one can identify the segments connecting them. I create two copies of each segment, one in each direction. With each vertex A I store a list of all segments A->B. Each pair is stored contiguously so that it is trivial to find its partner. Each segment is considered to belong to the region to its left as one travels A->B.</li><li>The segments emanating from each vertex are sorted by angle. These comparisons could easily cause overflows again, but one can use a handy trick: instead of using the vector for the segment in an angle comparison, one can use the vector for the entire wire. It has identical direction but has small integer coordinates.</li><li>Using the sorted lists from the previous step, each segment is given a pointer to its following segment from the same region. In other words, if one is tracing the boundary of the region and one has just traced A->B, the pointer will point to B->C.</li><li>I extract the <i>contours</i> of the regions. A region typically consists of an outer contour and optionally some holes. The outermost region lacks an outer contour (one could add a big square if one needed to, but I didn't). A contour is found by following the next pointers. A case that turns out to be inconvenient later is that some segments might be part of the contour but not enclose any area. This can make a contour disappear completely, in which case it is discarded. Any remaining contours have the property that two adjacent segments are not dual to each other, although it is still possible to both sides of an edge to belong to the same contour.</li><li>Each contour is identified as an outer contour or a hole. With integer coordinates I could just measure the signed area of the polygon, but that gets nasty with rational coordinates. Instead, I pick the lexicographically smallest vertex in the contour and examine the angle between the two incident segments (this is why it is important that there is a non-trivial angle between them). I also sort the contours by this lexicographically smallest vertex, which causes any contour to sort before any other contours it encloses.</li><li>For each segment I add an edge of weight 1 from its containing region to the containing region of its dual.</li><li>For each hole, I search backwards through the other contours to find the smallest non-hole that contains it. I take one vertex of the hole and do a point-in-polygon test. Once again, some care is needed to avoid overflows, and using the vectors for the original wires proves useful. One could then associate the outer contour and the holes it contains into a single region object, but instead I just added an edge to the graph to join them with weight 0. In other words, one can travel from the boundary of a region to the outside of a hole at no cost.</li><li>Finally, I identify the regions containing the endpoints of the new wire, using the same search as in the previous step.</li></ol>After all this, we still need to implement a shortest path search - but by this point that seems almost trivial in comparison.<br /><br />What is the complexity? There can be \(O(M^2)\) intersections and hence also \(O(M^2)\) contours, but only \(O(M)\) of them can be holes (because two holes cannot be part of the same connected component). The slowest part is the fairly straightforward point-in-polygon test which tests each hole against each non-hole segment, giving \(O(M^3)\) time. There are faster algorithms for doing point location queries, so it is probably theoretically possible to reduce this to \(O(M^2\log N)\) or even \(O(M^2)\), but certainly not necessary for this problem.</div>Unknownnoreply@blogger.com3tag:blogger.com,1999:blog-31847281.post-86068129023016434372014-06-28T02:23:00.000-07:002014-06-30T09:18:51.138-07:00ICPC Problem J: Skiing<div dir="ltr" style="text-align: left;" trbidi="on">I'll start by mentioning that there is also some analysis discussion on the <a href="http://apps.topcoder.com/forums/?module=Thread&threadID=823018&start=0&mc=5#1894272">Topcoder forums</a>, which includes alternative solutions that in some cases I think are much nicer than mine.<br /><br />So, on to problem J - which no team solved, and only one team attempted. I'm a little surprised by this, since it's the sort of problem where one can do most of the work on paper while someone else is using the machine. A possible reason is that it's difficult to determine the runtime: I just coded it up and hoped for the best.<br /><br />It is a slightly annoying problem for a few reasons. Firstly, there are a few special cases, because \(v_y\) or \(a\) can be zero. We also have to find the lexicographically smallest answer.<br /><br />Let's deal with those special cases first. If \(v_y = 0\) then we can only reach targets with \(y = 0\). If \(a \not= 0\) then we can reach all of them in any order, otherwise we can only reach those with \(x = 0\). To get the lexicographically smallest answer, just go through them in order and print out those that are reachable. I actually dealt with the \(v_y \not= 0\), \(a = 0\) case as part of my general solution, but is also easy enough to deal with. We can only reach targets with \(x = 0\), and we can only reach them in increasing order of y. One catch is that targets are not guaranteed to have distinct coordinates, so if two targets have the same coordinates we must be careful to visit them in lexicographical order. I handled this just by using a stable sort.<br /><br />So, onwards to the general case. Before we can start doing any high-level optimisation, we need to start by answering this question: if we are at position P with X velocity \(v\) and want to pass through position Q, what possible velocities can we arrive with? I won't prove it formally, but it's not hard to believe that to arrive with the smallest velocity, you should start by accelerating (at the maximum acceleration) for some time, then decelerate for the rest of the time. Let's say that the total time between P and Q is \(T\), the X separation is \(x\), the initial acceleration time is \(T - t\) and the final deceleration time is \(t\). Some basic integration then gives<br />\[x = v(T-t)+\tfrac{1}{2}a(T-t)^2 + (v+a(T-t))t - \tfrac{1}{2}at^2.\]<br />This gives a quadratic in \(t\) where the linear terms conveniently cancel, leaving<br />\[t = \sqrt{\frac{vT+\tfrac{1}{2}aT^2-x}{a}}\]<br />The final velocity is just \(v+aT-2at\). We can also find the largest possible arrival velocity simply by replacing \(a\) with \(-a\). Let's call these min and max velocity functions \(V_l(v, T, x)\) and \(V_h(v, T, x)\).<br /><br />More generally, if we can leave P with a given range of velocities \([v_0, v_1]\), with what velocities can we arrive at Q? We first need to clamp the start range to the range from which we can actually reach Q i.e., that satisfy \(|vT - x| \le \frac{1}{2}aT^2\). A bit of calculus shows that \(V_l\) and \(V_h\) are decreasing functions of \(v\), so the arrival range will be \([V_l(v_1), V_h(v_0)]\).<br /><br />Finally, we are ready to tackle the problem as a whole, with multiple targets. We will use dynamic programming to answer this question, starting from the last target (highest Y): if I start at target <i>i</i> and want to hit a total of <i>j</i> targets, what are the valid X velocities at <i>i</i>? The answer will in general be a set of intervals. We will make a pseudo-target at (0, 0), and the answer will then be the largest <i>j</i> for which 0.0 is a valid velocity at this target.<br /><br />Computing the DP is generally straightforward, except that the low-level queries we need are the reverse of those we discussed above i.e. knowing the arrival velocities we need to compute departure velocities. No problem, this sort of physics is time-reversible, so we just need to be careful about which signs get flipped. For each (<i>i, j</i>) we consider all options for the next target, back-propagate the set of intervals from that next target, and merge them into the answer for (<i>i, j</i>). Of course, for \(j = 1\) we use the interval \((-\infty, \infty)\).<br /><br />The final inconvenience is that we must produce a lexicographical minimum output. Now we will see the advantage of doing the DP in the direction we chose. We build a sequence forwards, keeping track of the velocity interval for our current position. Initially the position will be (0, 0) and the velocity interval will be [0.0, 0.0]. To test whether a next position is valid, we forward-propagate the velocity interval to this next position, and check whether it has non-empty intersection with the set of velocities that would allow us to hit the required number of remaining targets. We then just take the next valid position with the lowest ID.<br /><br />What is the complexity? It could in theory be exponential, because every path through the targets could induce a separate interval in the interval set. However, in reality the intervals merge a lot, and I wouldn't be surprised if there is some way to prove that there can only be a polynomial number of intervals to consider. My solution still ran pretty close to the time limit, though.</div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-84951348545507614602014-06-27T02:40:00.000-07:002014-06-27T02:40:42.907-07:00More ICPC analysis<div dir="ltr" style="text-align: left;" trbidi="on"><div>Now we're getting on to the harder problems. Today I've cover two that I didn't know how to solve. Some of the others I have ideas for, but I don't want to say anything until I've had a chance to code them up.<br /><br />Firstly, problem I. This is a maximum clique problem, which on a general graph is NP-complete. So we will need to use the geometry in some way. Misof has a nice set of slides showing how it is done: http://people.ksp.sk/~misof/share/wf_pres_I.pdf. I had the idea for the first step (picking the two points that are furthest apart), but it didn't occur to me that the resulting conflict graph would be bipartite.<br /><br />Now problem G. I discovered that this is a problem that has had a number of research papers published on the topic, one of which achieves \(O(N^3)\). Fortunately, we don't need to be quite that efficient. Let's start by finding a polynomial-time solution. Let's suppose we've already decided the diameters of the two clusters, D(A) and D(B), and just want to find out whether this is actually possible. For each shipment we have a boolean variable that says whether it goes into part A (false) or part B (true). The constraints become boolean expressions: if d(i, j) > D(A) then we must have i || j, and if d(i, j) > D(B) then we must have !i || !j. Determining whether the variables can be satisfied is just 2-SAT, which can be solved in \(O(N^2)\) time.<br /><br />Now, how do we decide which diameters to test? There are \(O(N^2)\) choices for each, so the naive approach will take \(O(N^6)\) time. We can reduce this to \(O(N^4\log N)\) by noting that once we've chosen one, we can binary search the other (it's also possible to eliminate the log factor, but it's still too slow). So far, this is what I deduced during the contest.<br /><br />The trick (which I found in one of those research papers) is that one can eliminate most candidates for the larger diameter. If there is an odd-length cycle, at least one of the edges must be within a cluster, and so that is a lower bound for the larger diameter. What's more, we can ignore the shortest edge of an even cycle (with some tie-breaker), because if the shortest edge lies within a cluster, then so must at least one other edge.<br /><br />We can exploit this to generate an O(N)-sized set of candidates: process the edges from longest to shortest, adding each to a new bipartite graph (as for constructing a maximum spanning tree with Kruskal's algorithm). There are three cases:<br /><ol style="text-align: left;"><li>The next edge connects two previously disconnected components. Add the edge to the graph (which will remain bipartite, since one can always re-colour one of the components. This edge length is a candidate diameter.</li><li>The next edge connects two vertices in the same component, but the graph remains bipartite. This edge is thus part of an even cycle, and so can be ignored.</li><li>The next edge connects two vertices, forming an odd cycle. This edge length is a candidate diameter, and the algorithm terminates.</li></ol></div>The edges selected in step 1 form a tree, so there are only O(N) of them.</div>Unknownnoreply@blogger.com2tag:blogger.com,1999:blog-31847281.post-39471291094526605002014-06-25T11:43:00.004-07:002014-06-28T10:17:31.896-07:00ACM ICPC 2014<div dir="ltr" style="text-align: left;" trbidi="on">ACM ICPC 2014 is over. The contest was incredibly tough: solving less than half the problems was enough to get a medal, and 20 teams didn't solve any of the problems (unfortunately including the UCT team, who got bogged down in problem K).<br /><br />I managed to solve 5 problems during the contest (in 4:30, since I didn't wake up early enough to be in at the start), and I've solved one more since then. Here are my solutions, in approximately easy-to-hard order.<br /><br />EDIT: note that I've made follow-up blog posts with more analysis. <br /><h3 style="text-align: left;">D: game strategy</h3><div style="text-align: left;">This was definitely the easiest one, and this is reflected in the scores. We can use DP to determine the set of states from which Alice can force a win within <i>i</i> moves. Let's call this w[i]. Of course, w[0] = {target}. To compute w[i+1], consider each state s that is not already a winning state. If one of Alice's options is the set S and \(S \subseteq w[i]\), then Alice can win from this state in i+1 moves.<br /><br />We can repeat this N times (since no optimal winning sequence can have more than N steps), or just until we don't find any new winning states. We don't actually need to maintain all the values of w, just the latest one. </div><h3 style="text-align: left;">K: surveillance</h3><div style="text-align: left;">Let's first construct a slower polynomial-time algorithm. Firstly, for each i, we can compute the longest contiguous sequence of walls we can cover starting at i. This can be done in a single pass. We sort all the possible cameras by their starting wall, and as we sweep through the walls we keep track of the largest-numbered ending wall amongst cameras whose starting wall we have passed. The camera with \(a > b\) are a bit tricky: we can handle them by treating them as two cameras \([a - N, b]\) and \([a, b + N]\).</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Let's suppose that we know we will use a camera starting at wall i. Then we can use this jump table to determine the minimum number of cameras: cover as much wall starting from i as possible with one camera, then again starting from the next empty slot, and so on until we've wrapped all the way around and covered at least N walls. We don't know the initial i, but we can try all values of i. Unfortunately, this requires \(O(N^2)\) time.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">To speed things up, we can compute accelerated versions of the jump table. If \(J_c[i]\) is the number of walls we can cover starting from i by using c cameras, then we already have \(J_1\), and we can easily calculate \(J_{a+b}\) given \(J_a\) and \(J_b\). In particular, we can compute \(J_2, J_4, J_8\) and so on, and them combine these powers of two to make any other number. This can then be used in a binary search to find the smallest c such that \(J_c\) contains a value that is at least N. The whole algorithm thus requires \(O(N\log N)\) time.<br /><br />One catch that broke my initial submission is that if the jump table entries are computed naively, they can become much bigger than N. This isn't a problem, until they overflow and become negative. Clamping them to N fixed the problem.</div><h3 style="text-align: left;">C: crane balancing</h3><div style="text-align: left;">This is a reasonably straightforward geometry problem, requiring some basic mechanics. Let the mass of the crane be \(q\), let the integral of the x position over the crane be \(p\), let the base be the range \([x_0, x_1]\), and let the x position of the weight be \(x\). The centre of mass of the crane is \(\frac p q\), but with a weight of mass \(m\), it will be \(\frac{p+mx}{q+m}\). The crane is stable provided that this value lies in the interval \([x_0, x_1]\). This turns into two linear inequalities in \(m\). Some care is then needed to deal with the different cases of the coefficient being positive, negative or zero.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Computing the area and the integral of the x value is also reasonably straightforward: triangulate the crane by considering triangles with vertices at \((0, i, i+1)\) and computing the area (from the cross product) and centre of mass (average of the vertices) and then multiply the area by the x component of the centre of mass to get the integral.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">I prefer to avoid floating point issues whenever possible, so I multiplied all the X coordinates by 6 up front and worked entirely in integers. This does also cause the masses to be 6 times larger, so they have to be adjusted to compensate.</div><h3 style="text-align: left;">E: maze reduction</h3><div style="text-align: left;">This is effectively the same problem I set in Topcoder SRM 378 (thanks to ffao on Topcoder for reminding me where I'd seen this). If you have an account on Topcoder you can read about an alternative solution in the match analysis, in which it is easier to compute worst-case bounds.<br /><br />The approach I used in this contest is similar in concept to D, in that you incrementally update a hypothesis about which things are distinguishable. We will assign a label to every door of each chamber. Two doors are given the same label if we do not (yet) know of a way to distinguish them. Initially, we just label each door with the degree of the room it is in, since there is nothing else we can tell by taking zero steps.</div><div style="text-align: left;">Now we can add one inference step. Going clockwise from a given door, we can explore all the corridors leading from the current chamber and note down the labels on the matching doors at the opposite end of each corridor. This forms a signature for the door. Having done this for each door, we can now assign new labels, where each unique signature is assigned a corresponding label. We can repeat this process until we achieve stability, i.e., the number of labels does not change. We probably ought to use each door's original label in the signature too, but my solution passed without requiring this.</div><div style="text-align: left;">Finally, two rooms are effectively identical if their sequence of door labels is the same up to rotation. I picked the lexicographically minimum rotation for each room and used this as a signature for the room.</div><div style="text-align: left;">I'm not sure what the theoretical work-case performance is, but I suspect it would be quite large. For a start, by algorithm requires \(O(NE\log N)\) time <i>per iteration</i>, which I suspect could be improved by using a hash table with some kind of rolling hash to rotation in O(1) time. I was surprised when my first submission ran in time.</div><h3 style="text-align: left;">B: buffed buffet</h3><div style="text-align: left;">This one was deceptively sneaky, but the principles are reasonably simple. It's not too hard to guess that one will solve the continuous and discrete problems separately, and then consider all partitions between the two.</div><div style="text-align: left;">Let's start with the continuous problem, since it is a little easier. I don't have a formal proof, but it shouldn't be too hard to convince yourself that a greedy strategy works. We start by eating the tastiest food. Once it degrades to being as tasty as the second-tastiest food, we then each both, in the proportion that makes them decay at the same rate. In fact, at this point we can treat them as a combined food with a combined (slower) decay rate. We continue eating this mix until tastiness decays to match the third-tastiest, and so on. There are some corner cases that need to be handled if there are foods that don't decay.</div><div style="text-align: left;">Now what about the discrete case? Because the items have different weights, a greedy strategy won't work here, as with any knapsack problem. There is a reasonably straightforward \(O(DW^2)\) dynamic programming, where you consider each possible number of serving of each discrete food type, but this will be too slow. But there is some structure in the problem, so let's try to exploit it. Let's say that \(f(i)\) is the maximum tastiness for a weight of \(i\) using only the foods we've already considered, and we're now computing an updated \(f'\) by adding a new discrete food with weight \(w\), initial tastiness \(t\) and decay rate \(d\). For a given i, \(f'(i)\) clearly only depends on \(f(j)\) where \(i - j\) is a multiple of \(w\), so let's split things up by the remainder modulo \(i\). Fix a remainder \(r\) and let \(g(i) = f(iw + r)\) and \(g'(i) = f'(iw + r)\). Now we can say that</div><div style="text-align: left;">\[<br />\begin{aligned}<br />g'(i) &= \max_{0 \le j \le i}\big\{ g(j) + \sum_{n=1}^{i-j}(t - (n-1)d\big\}\\<br />&= \max_{0 \le j \le i}\big\{ g(j) + (i-j)t - \frac{(i-j-1)(i-j)}{2}\cdot d\big\}\\<br />&= \max_{0 \le j \le i}\big\{ g(j) + (i-j)t - \frac{i(i-1)+j(j+1)-2ij}{2}\cdot d\big\}\\<br />&= it - \frac{i(i-1)d}{2} + \max_{0 \le j \le i}\big\{ g(j)-\frac{j(j+1)d}{2} - jt + ijd\big\}\\<br />&= it - \frac{i(i-1)d}{2} + \max_{0 \le j \le i}\big\{ h(j) + ijd \big\}<br />\end{aligned}<br />\]<br />Here we have defined \(h(j) = g(j)-\frac{j(j+1)d}{2} - jt\), which can be computed in constant time per \(j\).<br /><br />The key observation is that we have turned the expression inside the maximum into a linear function of \(j\). Thus, if we plot \(h\) on a graph, only the upper convex hull is worth considering. We can maintain this upper hull as we increase \(i\) (remember, \(j \le i\)). The second observation is that the optimal choice of \(j\) will increase as \(i\) increases, because increasing \(i\) will increase the \(ijd\) more for larger values of \(j\). It follows that we can incrementally find the optimal \(j\) for each \(i\) by increasing the previous \(j\) along the upper hull until increasing it further would start decreasing the objective function. There are a few corner cases: the upper hull might not have any valid entries (because there might be no way to make up a weight of exactly \(jw+r\)), and we might have popped off the previous \(j\) as part of updating the hull. These just need careful coding. Another snag to watch out for is that if all the foods decay very fast, the optimal result may in fact overload a 32-bit integer in the negative direction.<br /><br />The discrete part of the algorithm now requires O(DW), and the continuous part requires O(D\log D + W).</div><h3 style="text-align: left;">F: messenger (solved after contest)</h3><div style="text-align: left;">This is a nasty problem mostly because of numerical stability problems. The problem itself is numerically unstable: a rounding error in measuring the total lengths of the paths is sufficient to change the answer between possible and impossible. It requires the black art of choosing good epsilons. I'll only describe the ideal case in which none of these nasty rounding problems occur.<br /><br />Suppose we pick a point on both Misha and Nadia's path. In general this won't work, because the courier will arrive too late or too early at this point to intercept Nadia. Let L(A, B) be the number of time units that the courier is late when travelling from A to B. L can be late if the courier is early. Valid solutions are those where L = 0.<br /><br />First, let's prove that L is an increasing function of A (where "increasing" means that A moves further along Misha's path). Suppose that the courier decided to, instead of moving straight to B, instead kept pace with Misha for a while, then went to B. Obviously this would mean arriving later (or at the same time). But this is equivalent to the arrival time if A increases. A similar argument shows that L is a decreasing function of B. In both cases, this is not strict.<br /><br />This means that there can be at most \(O(N)\) pairs of segments for which L=0, rather than \(O(N^2)\), because we cannot move forward on one path and backwards on another. We can iterate over the candidate pairs by successively advancing either one segment or the other, by measuring L at pairs of segment endpoints.<br /><br />Now we have reduced the problem to a single segment from each path. Given a point on one path, we can find the corresponding point on the other by solving what looks like a quadratic but which decays into a linear equation. Again, there are corner cases where the linear coefficient is also zero, in which case we have to break ties so as to minimise the travel time. Using this mapping function, we can identify the range of positions on Misha's path that correspond to valid positions on Nadia's path. I then used ternary search to find the optimal position along this segment. I didn't actually prove that the function is convex, but it seemed to work.<br /><br />The solution I implemented that passed is actually rather messier than what I've described, and ran close to the time limit. I tried implementing it as described above but it hits assertions in my code due to numeric instabilities, but I think it should still be possible to fix it.</div></div>Unknownnoreply@blogger.com12tag:blogger.com,1999:blog-31847281.post-78518432906570376112014-06-02T03:23:00.004-07:002014-06-02T03:23:39.904-07:00New website<div dir="ltr" style="text-align: left;" trbidi="on">I've finally gotten around to replacing my ugly nineties-looking <a href="http://www.brucemerry.org.za/">personal web page</a> that was cobbled together with raw HTML and m4, with a much prettier version that nevertheless still contains roughly the same content. It is generated using <a href="http://sphinx-doc.org/">Sphinx</a>, with the individual pages written in <a href="http://docutils.sf.net/rst.html">reStructuredText</a>. I've also merged my publications page into the main website instead of leaving it on the equally ugly but separately cobbled-together page I originally set up for my PhD.</div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-45467314060592092412014-05-10T08:45:00.002-07:002014-05-10T08:45:23.889-07:00Challenge 24 finals 2014<div dir="ltr" style="text-align: left;" trbidi="on">Here are my thoughts/solutions for the Challenge 24 Finals problems from this year.<br /><h3 style="text-align: left;">A: Halting problem</h3><div style="text-align: left;">Firstly, be aware that the problem had a typo: the condition should be "A != 0" rather than "A == 0". The intuition is that "most" of the time the g edge will be taken, but occasionally the h edge might be taken. Given a starting point, let us suppose that we have an efficient way to iterate the algorithm until either an h edge is taken or we terminate (or we determine that neither condition will ever hold). In the former case, we have just determine that we need f(0) for some f. Since there are only O(N) of these values, we can cache them as needed, and so only require O(Q + N) rounds of the inner algorithm (namely, iterating until we reach A = 0 or terminate).</div><div style="text-align: left;">Now let us solve this inner problem. Consider the directed graph in which each function f is connected to g. A graph where every vertex has out-degree one has a specific structure, in which each component contains a single loop from which hang upward-directed trees. Iteration thus consists of walking up a tree until reaching the loop, then going round and round the loop until we exit. Of course, we might finish early, or we might go around the loop forever.</div><div style="text-align: left;">For the walking up the tree part, let's use brute force for now. Assuming we make it to the loop, how do we find the exit point? Let's say that every complete time around the loop we increase A by C, and that at some point around the loop we have A = a. Then to exit at this point after t full cycles, we must have a + tC = 0 (mod 2<sup>64</sup>). We can solve this using the extended GCD algorithm. We can measure this for every point around the loop, and the smallest valid t value tells us where we will exit. If there are no valid t values, then we will go round forever.</div><div style="text-align: left;">This is essentially the solution I used in the contest. It is somewhat slow: O(N(N+Q)) I think. I did a lot of microoptimisation, and even then it took hours for the largest test case. The important optimisations were precomputing parameters for each loop (particularly the inverse of C) once, and reordering the data so that points around a loop are consecutive in memory. In fact over 75% of time ended up being spent in pointer-chasing up the trees, because this was cache-unfriendly.</div><div style="text-align: left;">I think the problem can be solved in something like O((N+Q)log (N+Q)), but it is rather more complex, and a blog doesn't really lend itself to the mathematical notation. I might write it up later if I get around to it. I'll give a few hints. If C is odd then increasing the A value on entry to the loop by a certain amount will increase the time-to-exit for each point by the same amount; this can be generalized if C is an odd multiple of 2^k by partitioning things by their remainder mod 2^k. The tree part can be handled by considering all queries at the same time, in a preorder walk of the tree.</div><h3 style="text-align: left;">C: Complete program</h3><div style="text-align: left;">I had a nasty super-polynomial time solution during the contest, which was complex and only managed to solve about 7 test cases. There is a much simpler and polynomial-time DP solution, however.</div><div style="text-align: left;">Let's start with some observations. Any O not followed by a valid number will need one, so let's add it immediately (it can be anything valid). After this, O and the number might as well be treated as a single unit (there is no point inserted anything between an O and a valid number either). There is no point ever adding an I. Adding F's is useful only to increment the nesting depth, so we might as well add F's only right at the front, since that way everything will be nested inside them. Finally, whenever we add a number, it might as well be a 0.<br />What is less obvious is that it is only necessary to add an A immediately before a number. This requires checking a few cases, but is because an A added anywhere else can be shuffled to the right until it reaches a number, without affecting whether a token string is valid. Similar rotation arguments show that we never need to add two As or Os in front of a number.</div><div style="text-align: left;">For the DP, let's compute, for each contiguous substring of the input and each F nesting level, the minimum number of tokens we need to add to make it into a valid parse tree. There is a special case for the empty range: we must just add a 0, and we must check that the nesting depth at that point is at least 1. Otherwise, we can either</div><ul style="text-align: left;"><li>Insert an A or an O.</li><li>Use the first token in the substring as the root.</li></ul>In the first case, the A or O will "swallow" the number, so if it is at most 127 we can just use an O, otherwise we must use an A and the nesting level must be sufficiently large. We then use DP to find the best way to handle the rest of the substring. The second case is largely straightforward, unless the first given token is an A. In this case, we must consider all ways in which the rest of the substring may be split into two valid trees.<br />Finally, we need to determine the nesting level to use at the root. It is possible, although a little slow, to just keep incrementing this value until there are so many initial Fs that it is clearly impossible to improve the solution. A more efficient solution in practice is to compute the DP over each substring as a piecewise constant function of the nesting level; this is more complex since one needs to do various operations on these piecewise constant functions, but it guarantees an O(N<sup>4</sup>) solution independent of the size of the numbers in the input.<br /><h3 style="text-align: left;">D: Firing game</h3><div style="text-align: left;">I solved 9 out of the 10 test cases. The primary observation is that any manager that is unfirable in the initial setup creates a completely independent subtree: firings within this subtree have no effect on the rest of the tree, and vice versa. This partitions the tree into separate Nim piles, and we just need to determine the Grundy number for each such subproblem. In all but one test case, these trees are reasonably small, and can be solved by DP/memoized recursion.</div><h3 style="text-align: left;">N: Dog tags</h3><div style="text-align: left;">This problem looked as though it could be extremely nasty, but fortunately the data was somewhat friendly. For example, each bubble was intersected enough times to get a good guess at the volume, even though the problem only guaranteed a single intersection.</div><div style="text-align: left;">Firstly one must identify the bubbles. The loops can be considered to be a graph, with an edge between two loops if they are on adjacent slices and their interiors intersect in the XY plane. I was lazy and considered two loops to intersect if a vertex of one lay inside another, which is good enough if the loops are sampled well. Each connected component in this graph constitutes a bubble. I estimated the volume as the sum of areas of the loops, and also found the centroid (mean of all points contained in the loops).</div><div style="text-align: left;">Finding the fiduciary bubbles is easy: they are the 3 bubbles with greatest volume. Determining which one is the corner is not quite as easy, since one does not know the inter-slice distance and hence distances cannot easily be measured. I identified the first plane of bubbles (see later), and noted that the mid-point of the two corner bubbles should be close to the centroid of this set of bubbles. I just took all 3 edges to find which is best.</div><div style="text-align: left;">This now gives the grid coordinate system XYZ. One does need to check that the bubbles are on the positive Z side, otherwise the two corners need to be swapped.</div><div style="text-align: left;">Now we can partition the bubbles into planes. I sorted by Z, computed all the differences between adjacent Z values, sorted this list, and looked for a big jump between adjacent values (small values are noise within a plane, big values are plane-to-plane steps).</div><div style="text-align: left;">The same process can then be used to find each row within a plane, and then bubbles in a row can be sorted by X.</div></div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-78303376044283592462014-02-08T09:39:00.001-08:002014-02-08T09:41:56.929-08:00Challenge 24 EC round 2014<div dir="ltr" style="text-align: left;" trbidi="on">Here are my thoughts on the problems we solved from this year's electronic round of Challenge 24.<br /><h2 style="text-align: left;">A. Safe</h2><div style="text-align: left;">I didn't get a complete solution to this, but I was able to solve cases 1-9. I noticed that in most cases the state space for the wheels (keys, but the pictures made me think of them as wheels) is reasonably small. For each password, the solution will involve first pressing the keys a number of times to set up the initial state, and then typing the password. So we want to find the initial states from which we can input the password, and pick the cheapest amongst them.<br />This can be done by dynamic programming, running backwards over the password. When there are no more characters to type, any state is fine. Considering each letter in turn, we need to find the states in which we can type the next letter on one of the keys, and from which the resulting state allows us to complete the password.</div><div style="text-align: left;"></div><div style="text-align: left;">For A10 there are 6.4 billion states, and it was going to take a long time (it did churn out the first two passwords in about 3 hours). I suspect it might be possible to do it by keeping an explicit list/set of reachable states and working from there, rather than iterating over the entire state space.</div><h2 style="text-align: left;">B. Wiretapping</h2><div style="text-align: left;">This problem basically just requires an application of <a href="http://en.wikipedia.org/wiki/Kirchhoff%27s_theorem">Kirchhoff's Theorem</a>, which gives the number of spanning trees of the graph in polynomial time. To count the number of spanning trees that use the tapped link, just collapse the two endpoints into a single vertex and count the number of spanning trees of the remaining graph. The ratio between the number of spanning trees in this reduced graph and the original graph is the answer.<br />One complication is that the number of spanning trees may overflow a double-precision float. I worked around that by scaling down the Laplacian matrix by some factor f (I used 1/32 for the larger cases), which reduces the determinant of the cofactor by f<sup>N-1</sup>. </div><div style="text-align: left;"><h2 style="text-align: left;">C. Visual Programming - VM</h2><div style="text-align: left;">This was a fairly straightforward implementation problem, just requiring careful interpretation of the rules, and recognising that a lot of the information given is unnecessary (e.g., for lines you only need to keep the two endpoints).</div><h2 style="text-align: left;">D. Visual Programming</h2><div style="text-align: left;">Due to lack of time I only solved D1, D2 and D4, and had a broken solution to D3. As with most VM problems, I invented a slightly higher-level language that is possible (but still hard) to code in by hand, and write a simpler compiler. I made a few simplifying assumptions</div><ul style="text-align: left;"><li>Polygons are laid out one below the other. Each polygon is rectangular, except that the right hand edge may be ragged.</li><li>Each line is only ever used in one direction. To ensure this, the outgoing lines from a polygon all emanate from the top part of the right edge, and incoming lines arrive at the bottom part of the right edge. There is also an unconditional jump from each polygon to the next one to ensure that the routing terminates before encountering any of the incoming lines.</li><li>To avoid lines cutting through polygons, each line emanates directly horizontally from the polygon to (256, y), then has a second segment to the target point. The x value of the start is chosen depending on which register is to be used in the comparison.</li><li>I reserved registers 0 and 1 to always contain the values 0 and 1, which made it easier to construct some other operations, as well as to build an unconditional jump.</li></ul></div><div style="text-align: left;">The programming mini-language just specifies instructions and conditional a list of conditional jumps that takes place after each instruction (falling through to the next instruction if none match). The instructions I wrote at the level of setting the parameters (A-F, X-Z, T, Q, L, H), but I had useful defaults for all of them so that most instructions only changed one or two defaults e.g. setting A=3 B=15 would generate an instruction that copies from SIO into register 3.</div><div style="text-align: left;">Once all that is done, it is not particularly difficult for anyone who has done any assembly to write code for D1, D2, and D4. The only slightly tricky part was to extract the lower 8 bits from the result, which I did by multiplying to 256 then dividing by 256.</div><div style="text-align: left;"><h2 style="text-align: left;">G. Spy Union</h2><div style="text-align: left;">At first I thought this could be done greedily, but it doesn't work: reinstating a fired person might let you fire an additional person <i>in each hierarchy</i>. Instead, we can use network flow. Let's consider just one hierarchy for now, and connect the source to every node, connect each node to its parent, and connect the root to the sink (directed edges). The edges from the source have capacity 1, with the interpretation that they have flow if the corresponding employee is fired. The flow through each intermediate edge will then be the total number of employees fired from the corresponding subtree, so we will set the capacities to be the maximum number we can fire each subtree. The maximum flow will thus be the maximum number of people we can fire. </div><div style="text-align: left;"></div><div style="text-align: left;">However, we have two hierarchies. Since network flow is symmetric, we can apply the same treatment to the second hierarchy, but connect the source to the root, and the individual employees to the sink. To join up the problems, we get rid of the source in the first graph and the sink in the second graph, and create capacity-1 edges directly from each employee in the second graph to the matching employee in the first. Once again, these edges have flow if the corresponding employee is fired, and the maximum flow gives the solution.</div><div style="text-align: left;"><br /></div></div></div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-1822817891268781262013-11-23T01:44:00.004-08:002013-11-23T01:44:46.861-08:00Fun with C++11 variadic templates<div dir="ltr" style="text-align: left;" trbidi="on"><div>The other day I decided to experiment a bit with the support for variadic template support in C++11 (the new C++ standard). Like many C++ features, they're incredibly powerful, but not always that easy to use. After some experimentation I figured out how to do what I wanted, and I thought I'd write it up in case anyone else wanted to try to do the same thing.<br /><br />I wanted to create a fixed-size vector class (similar to <a href="http://en.cppreference.com/w/cpp/container/array">std::array</a>), but with a constructor that accepts the appropriate number of arguments. So for example, Vec<float 3=""> should have a constructor that accepts 3 floats.</float><br /><br />The simplest solution is to use initializer lists, which allow a constructor to take an arbitrary sequence of arguments of the same type. However, this has a number of disadvantages:<br /><ol style="text-align: left;"><li>It can only be used with brace-initializer syntax, not parentheses.</li><li>It does not enforce the correct length at compile time.</li></ol></div>Variadic templates allow a function to take a variable number of arguments. My second attempt at a constructor started by looking something like this:<br /><br /><span style="font-family: "Courier New",Courier,monospace;">template<typename ...="" args=""> Vec(Args&&... args);</typename></span><br /><br />This is just a constructor that accepts an arbitrary set of arguments, so it doesn't seem to fix the second problem, and additionally it doesn't even require the types to be correct. We can fix that using SFINAE to statically validate the arguments before the template is instantiated. The tricky part is figuring out where to insert the <a href="http://en.cppreference.com/w/cpp/types/enable_if">enable_if</a>: the constructor has no return type, so it can't go there; and we can't add a default argument, because that makes the interpretation of the variadic argument list ambiguous. However, we can add an extra template parameter with a default value:<br /><span style="font-family: "Courier New",Courier,monospace;"><br /></span><span style="font-family: "Courier New",Courier,monospace;">template<typename ...="" args="" span=""></typename></span><br /><span style="font-family: "Courier New",Courier,monospace;"> typename E = typename enable_if_all<sizeof ...="" n="" rgs="" span=""></sizeof></span><br /><span style="font-family: "Courier New",Courier,monospace;"> std::is_convertible<args t="">::value...>::type></args></span><br /><span style="font-family: "Courier New",Courier,monospace;">Vec(Args&&... args);</span><br /><span style="font-family: inherit;"><br /></span><span style="font-family: inherit;">Here </span><span style="font-family: "Courier New",Courier,monospace;">enable_if_all</span> is a variadic version of enable_if, which has a type member if all its arguments are true. It's not part of the standard, so I had to implement it myself. I won't go into details since this isn't the solution I used. It works, but it's roundabout and I'm not convinced that it won't differ in some subtle ways from a constructor that actually accepts only type Twhen in the presence of other constructors (because every parameter will match exactly, where as a constructor with float parameters might require a conversion sequence which would make this a worse match than another constructor).<br /><br />In the end I realised that the solution was to create a template parameter pack with N copies of T. At first I thought this couldn't be done, because parameter packs are not types and so one cannot typedef one in a helper class to be expanded in the constructor. The solution is to create it in the vector class itself, using recursion to generate the appropriate type. You can see the code <a href="http://pastebin.com/1FqXUawP">here</a>. The <span style="font-family: "Courier New",Courier,monospace;">VecBase</span> class is templated with the type T, followed by N copies of T as a variadic parameter pack. Thus, for example, <span style="font-family: "Courier New",Courier,monospace;">VecBase<int int=""></int></span> is a vector with 3 ints. This makes it easy to write the constructor, since we just expand the parameter pack.<br /><br />The tricky part is that we really want to write <span style="font-family: "Courier New",Courier,monospace;">Vec<int 3=""></int></span> rather than <span style="font-family: "Courier New",Courier,monospace;">VecBase<int int=""></int></span>. This is where <span style="font-family: "Courier New",Courier,monospace;">VecHelper</span> comes in: it recursively adds copies of T to the type name. Thus, <span style="font-family: "Courier New",Courier,monospace;">VecHelper<int 3="">::type</int></span> is <span style="font-family: "Courier New",Courier,monospace;">typedef</span>ed to <span style="font-family: "Courier New",Courier,monospace;">VecHelper<int 2="" int="">::type</int></span>, which in turn is typedefed to <span style="font-family: "Courier New",Courier,monospace;">VecHelper<int 1="" int=""></int></span> and finally <span style="font-family: "Courier New",Courier,monospace;">VecHelper<int 0="" int=""></int></span>. This is the base case that is handled in the partial specialization at line 29 to be <span style="font-family: "Courier New",Courier,monospace;">VecBase<int int=""></int></span>.<br /><br />Finally, we make use of the new <span style="font-family: "Courier New",Courier,monospace;">using</span> feature to typedef <span style="font-family: "Courier New",Courier,monospace;">Vec<int 3=""></int></span> to be <span style="font-family: "Courier New",Courier,monospace;">VecHelper<int 3="">::type</int></span>, and the problem is solved.</div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-31847281.post-81162523441104920392013-07-19T00:49:00.000-07:002013-07-19T00:49:02.603-07:00IOI 2013 day 2 analysis<div dir="ltr" style="text-align: left;" trbidi="on"><h2 style="text-align: left;">Cave</h2><div style="text-align: left;">This is a nice problem following a recent IOI trend of having problems where the limiting factor is not execution time, but some abstract measure of efficiency. This is slightly easier than some of the previous problems in each vein.</div><div style="text-align: left;">To achieve 100%, we can use up to 14 queries per door. Since 2<sup>14</sup> is slightly more than the number of doors, this strongly suggests some form of binary search on each door. It will be easiest if we attack the doors in order. Once we know which switch and position opens a door, we can lock that switch in place so that we will always be able to see the next door, and thereafter pretend the switch does not even exist.<br />To solve for a door, we can start with all switches down, which will immediately tell us which switch position opens the door. At this point every switch is a candidate. We can then test with half the candidates down and half up, which will eliminate half the candidates. This process is then repeated until only one candidate remains.<br /><h2 style="text-align: left;">Robots</h2><div style="text-align: left;">As is commonly the case for problems that ask how long some agents need to achieve a goal, the answer can be found by a binary search on the answer. So we need to decide whether the robots can clear the floor in S seconds.</div><div style="text-align: left;">We can simplify the problem slightly by noting that for each toy, we only need to know how many weak and how many small robots are able to move it, which can be found by binary searching the two lists (after sorting them). Of course, if a toy cannot be moved by any robot, return -1.</div><div style="text-align: left;">Let's first decide the actions of the weak robots, starting from the weakest. There will be some set of toys it can handle. Since they effectively differ only in size, the weakest robot should work from the largest downwards, so as to make the job of the small robots easier. Also, there is never any reason for it to move fewer than S toys, unless it runs out. Now consider the 2nd-weakest robot. There will be extra toys it can handle, plus any light toys that the weakest robot didn't have time for. Since the weakest robot is finished, the difference in weights are irrelevant, and the 2nd-weakest robot should again work in decreasing order of size amongst the toys it can handle. The same process can be continued for the remaining weak robots.</div><div style="text-align: left;">Now consider the small robots, from largest to smallest. These can again take up to S toys, starting from the largest remaining one. If a robot is unable to handle the largest remaining toy, then S was too small.</div><div style="text-align: left;">Implementation can be done using a priority queue, implemented as a binary heap, representing toys that are light enough to be handled by the current robot and ordered with the largest at the head. The toys are initially sorted by weight. Each time a new weak robot is considered, new elements are added from the list of toys to the priority queue, and the robot then removes items starting from the head of the queue.</div><div style="text-align: left;">Assuming that T is larger than A, B, the running time will be O(T (log T)<sup>2</sup>): one log T for the binary search, the other for the priority queue operations. I think that it may be possible to reduce this to something like O(T·log T·log(max(A, B))) using the right sort of data structure for the priority queue (to allow S items to be removed in log time): something like an interval tree for the number of toys of each size.</div><h2 style="text-align: left;">Game</h2><div style="text-align: left;">I disliked this problem because it has a nice solution that takes just a bit too much memory. I only managed to get 80% for it in the time I spent on it, and I didn't feel inspired to modify my solution to pass fully.</div><div style="text-align: left;">In 1D, this can be solved by a fairly straightforward use of a segment tree: each node stores the GCD of its two children. Since R can be quite big, this needs to be a sparse segment tree; another alternative would be a balanced binary tree.</div><div style="text-align: left;">In 2D, it is tempting to use a quadtree, but in fact that doesn't guarantee poly-logarithmic time. A 1xC query will force refinement down to the individual non-zero cell entries. Instead, we can use a range tree, which is a tree of trees: an outer tree is built over the columns, and for each column span corresponding to a node in this tree, we have an inner tree over the rows. Each node in this inner tree corresponds to a rectangle of the grid, and stores the GCD of elements from this rectangle. A query now uses the columns to select O(log C) nodes from the outer tree i.e., O(log C) inner trees, and applies a query to each of them. Queries thus take O(log R·log C) time when the implementation uses segment trees for the inner and outer trees. With balanced binary trees, it would only be O((log Nu)<sup>2</sup>).<br />Unfortunately, using segment trees also requires O(log R·log C) memory per non-zero element, which just exceeds the available memory. Using balanced binary trees instead should fit within memory, but is a lot more painful to implement. I think it might also be possible to make it work by sticking with segment trees, but compressing the representation by compacting chains of nodes with one child. </div></div></div>Unknownnoreply@blogger.com1