Entropy always increases: ICPC World Finals 2024 solutions

I recently got around to looking at the problems from the 2024 ICPC World Finals, and noticed that they don't have test data or solution sketches posted. Fortunately Kattis has the problems in its online judge (https://icpc.kattis.com — the "official" online judge link is broken), so I was at least able to submit solutions to check correctness.

So here are my solution sketches below. In the cases of a few problems I had some intuition about a solution but I don't actually have a proof of correctness.

A: Billboards

It's easy enough to compute the total area of each value function. We can then easily scale the functions so that each has a total area of n (and hence each section must have a value of at least 1).

Let's suppose we already know the order in which we'll present the billboards. Then we can compute the partition points greedily: make each section in turn the length such that the value to the sponsor is 1. The last section has to end at l so it might be longer. Computing this requires a little maths. For each sponsor, we can keep a prefix sum of the areas of the piecewise-linear segments, and binary searching that gives us the segment where the area crosses 1. To find the exact x value we'll need to solve a quadratic equation, and handle a few special cases.

How do we choose the order? We can actually do this greedily too. Just repeatedly pick the sponsor that will allow us to use the smallest section of billboard. Initially, the total of the value functions of all sponsors is n². We're then eliminating one sponsor (with total value n), and a section of billboard which the other n - 1 sponsors value at 1 or less. So the remaining total of value functions of the other sponsors is at least n² - n - (n - 1) = (n - 1)². We can thus keep repeating the process, and see that at end the remaining value will still be least 0.

B: Bingo for the Win!

For a player to be last, one of their numbers must be the last number picked. Consider each distinct number that appears on a particular player P's board. If a slower player also has that number, then this number being last won't make P finish last. Otherwise, if this number is drawn last, P will finish last, and if the number appears m times in total (out of all nk numbers), it has a m/nk chance of being picked last. So for each player, add this up over all numbers on that player's board.

C: Citizenship

This is a fairly straightforward implementation problem. You have to convert dates into a more useful representation (e.g. days since 0000-01-01), and just iterate through possible answers until you find one that satisfies the conditions. One optimisation is that you can pre-compute a prefix sum of days away to quickly determine the number of days resident during any 365-day period.

D: Doubles Horseback Wrestling

I don't have a solid proof that the following works, but it is accepted. Note that two riders i and j are compatible if (and only if) \(l_i + l_j \le s\) and \(u_i + u_j \ge s\).

Sort the riders by l. Iterate through them in decreasing order of l, trying to find a match for each one (unless already matched). We can maintain a set S of riders j which satisfy \(l_i + l_j \le s\) (which grows each time l decreases). From S, take the rider with the smallest \(u_j\) such that \(u_i + u_j \ge s\) (if any exists).

The intuition is that any match currently in S will always remain in S (unless matched) later on, so there is no benefit to matching with any rider other than the one whose u value will be the hardest to match.

E: Flipping Container

We will try to consider all sequences of flips that might lead to the desired answer. Since that's a large (actually infinite) space, we'll try to identify symmetries that allow the search space to be cut down to a more reasonable size.

With "orientation" defined as it is, there are 6 possible orientations. We can write them as abc, cba, cab, bac, bca, acb, according to which sides are aligned to the x, y and z axes respectively. There are four possible transitions from each orientation: two of them will go to the next orientation in that sequence (wrapping around), and two to the previous one.

Each transition can also be labelled according to the amount it to adds to x or y (possibly negative). It follows that to determine where the container ends up, it's sufficient to know how many times each transition occurs, and the order doesn't matter. Conversely, if we decide how many times to use each transitions, then we can reach the corresponding position provided that the conditions for an Eulerian cycle are satisfied (connected and in-degree equals out-degree at every vertex).

Transitions occur in pairs with the same start and end orientation, corresponding to opposite edges to flip over. For example, from abc one can transition to cba either with x+=a or x-=c. To reduce the number of possibilities to cover, we'll start by just deciding how many times each pair appears, then later decide how to split that count amongst the elements of the pair. I'll call such a pair an "edge".

Between each adjacent pair of orientations there is an edge in each direction. Let's call the difference between the frequencies of these edges the "balance" of the pair of orientations. The Eulerian cycle condition requires the balance of each adjacent pair to be the same, and in fact this corresponds to the number of times the Eulerian cycle "winds" around the centre of the cycle. It's not hard to check that the effect of winding one way around is the same as winding the other way, so without loss of generality we can assume that this winding number/balance is either 0 or 1.

We also have to consider the connectivity requirement for an Eulerian cycle. If the window number is 1 this is trivial (we clearly visit all vertices). Otherwise, we can iterate to decide how many vertices we proceed clockwise and anticlockwise from the starting orientation.

With those outer loops (to decide winding number of which vertices are reachable), we now just need to determine how many times each edge gets used. For each pair of adjacent orientations we need only pick one number, since we've already determined the balance. We'll also need to know how to split each edge into the two component transitions. We can cut down the search space further here by noting that half of the edges affect only x and the other half only y, so we can solve for them independently.

At this point we can use dynamic programming. Consider the edges between abc and cba for example. The forward transitions provide X deltas of +a and -c, and the reverse transitions provide -a and +c. A forward-and-back thus provides either -a-c, +0 or +a+c. (Note that while the +0 case is normally useless, one might require one of them to satisfy the graph connectivity requirements.) Similarly, other pairs of edges provide -c-b or +0 or +c+b and -b-a or +0 or +b+a. This is a slightly modified knapsack problem (due to the negative values) but the usual dynamic programming approach works with minimal change (the negative values just require iterating the update in the reverse direction). However, the possible range of inputs is far too large to handle with dynamic programming. When the sum is very large, one can note that it will always be optimal to make up "most" of the sum using the largest component (in particular, if the largest component is q, then there is no point using q of a smaller component p, since this can be replaced by p instances of q). So we can bound the DP to "small" values, and then for each small value with the right remainder modulo q, determine the number of q's to add.

F: Friendly Rivalry

If the separation is to be at least d, then any pair of chapters separated by less than d must be the in same team. So create a graph with an edge for each such pair, and identify the components. Then check whether we can pick a subset of the components to form a team of size n, which is a classic knapsack problem. This gives a test for whether the answer is at least d; a binary search completes the solution.

G: Kindergarten

This is another tough one where I just kept refining my solution until it passed, but I don't have a water-tight proof.

In general, a directed graph where every vertex has out-degree 1 has a known structure: each (weakly-)connected component consists of a set of trees whose roots are joined into a cycle. For this problem, the constraints on coolness ensure that there is only one such component in the jealousy graph, although it's not clear why that's part of the problem except perhaps to make the implementation slightly easier.

We'll build a directed graph of relationships of the form "A must go before B" which is sufficient to prevent any surprises. Provided we ensure this graph is a DAG, we can then find a topological order on it. This DAG will be mostly the jealousy graph (with edges reversed), but since that graph has a cycle we will need to modify it to use "like" relationships instead in some cases (I'll call these "crushes").

Start by finding the cycle of tree roots. We'll have to remove one of the edges from this DAG; we can just iterate to pick which one. Let's say we're going to remove the edge "P must come before Q" (meaning that Q is jealous of P). To prevent surprises, we'll instead need to add Q → R and R → P where Q's crush is R. In some cases this may be sufficient to produce a DAG, but if R is a descendant of P then we have a new problem, because this will produce a cycle involving R and P. The key observation is if we now pretend that P is jealous of R, this looks a lot like the original problem, just constrained to the tree rooted at P, and hence we can solve it recursively. There are a few tweaks needed in the recursive step though:

When choosing which edge of the cycle to cut, we cannot cut the edge R → P, because that's needed to prevent Q leaving a surprise for P.
Let R be jealous of S and have a crush on T. If we cut the edge S → R then we'll be adding the edge R → T, and if T = Q we'll have created a cycle. More generally, this is a problem if T is a predecessor of Q in the DAG we're constructing. With a single level of recursion, Q has no predecessors, but when we cut the S → R edge at consecutive levels of recursion, we will accumulate a chain of predecessors.

This recursive search will either produce a DAG which we can linearise, or fail to find a solution. I don't yet have a proof that there is no solution in the latter case.

If each recursive call takes time proportional to the size of the whole tree, this will be too slow (quadratic time). Instead, each call should do self-work that scales with the size of the cycle: the edges in this cycle are not processed again at lower levels of the recursion, so this would make the algorithm linear-time. In practice, there is likely to be an extra log factor. The trickiest part is determining whether R is a descendant of P. I enumerated the vertices in a depth-first walk, which makes the descendants of any node a contiguous range. I then maintain a data structure that records the current root for each vertex. This could be done easily with a segment tree, but I used an ordered map that stores contiguous ranges that have the same root.

H: Maxwell's Demon

There are two parts to this problem. The first part is to determine the times at which a single particle will reach the demon. Note that letting a particle through just mirrors its future path, so has no impact on when it will next reach the demon again. I'll just discuss a particle in the right-hand side, since a particle on the left-hand side can be handled symmetrically (and in fact my solution maps all particles to the right-hand side and just records whether they need to swap sides).

To avoid the complications of reflections, it's easier to conceptualise the chamber as having size 2w × 2h, with the particle wrapping around to the other side whenever it hits a wall but maintaining its velocity. The other 3 copies of the chamber are like images in the mirror if the chamber has mirrored sides.

It's pretty simple to determine when and where the particle will first hit the dividing wall. It's also clear that it will hit it every \(\frac{2w}{\lvert v_x\rvert}\) seconds, and we can determine the change in y over that time. To determine when the particle will hit the demon or its mirror image at \(y = 2h - d\), we just have to solve a linear equation modulo 2h, which requires just finding a modular inverse. This becomes a little simpler if all coordinates and time values are scaled up by \(\lvert v_x\rvert\), which makes all values integral. This will give both an initial hit time and a period; or the linear equation may have no solution.

There are a few corner cases:

If d = 0 or d = h, then the demon and its reflection are the same, and one may need to handle this to avoid counting encounters with the demon twice.
If the particle is moving purely vertically, it will never hit the demon (avoid dividing by zero).
If the particle is moving purely horizontally, it might or might not hit the demon, but the calculation may need to be handled differently. More generally, if successive hits on the dividing wall have the same y value, this needs to be handled carefully to avoid division by zero.

The second part of the problem is to deal with the interference caused by multiple particles hitting the demon simultaneously. One can simulate time by keeping a priority queue of hit events ordered by time. Each time there is a hit, identify all the particles that hit at that time. These particles either all swap or all stay where they are. You need to determine whether there is a set of these times whose XOR corresponds to the necessary set of particles swapping. That's just a linear algebra problem over Z₂.

The naïve approach would be to try to solve the linear equation from scratch over all events at each time. However, one can speed things up by keeping the result vectors from the previous Gaussian reduction. The new vector is then added as a new row, and reduced using all the previous rows. If it's reduced to all zeros, it can be ignored, as it is just a linear combination of existing options. Otherwise, it can be kept as a new row, and used to reduce the right hand side.

Finally, we need to decide whether to give up. Because the velocities are all integers, after 2w seconds the X coordinates will all be the same as at the start, and after 2h seconds the same applies to the Y coordinates. Hence, the whole system is cyclic with period 2wh (or a factor thereof), and so we can give up after 2wh seconds.

I: Steppe on it

This is a fairly standard tree DP. We'll solve the problem of determining how many fire-engines are needed for a given response time. The optimal response time for a given number of fire-engines is then found by binary search.

So assume the response time must be at most T. For each subtree, calculate:

The minimum number of fire-engines (placed within the subtree) needed to cover all the cities in the tree, and the minimum possible response time this gives for the root (and hence how much coverage that engine can give above the root).
Whether it is possible to save one fire-engine by having the root and possibly some descendants get their service from an external fire-engine, and the maximum response time that engine may have for the root.

To compute these (recursively), treat the current node as a leaf and fill in the values, then add an edge to each child tree in turn. When adding an edge, one considers the different possibilities for the two trees that are being joined.

J: The Silk Road… with Robots!

Let's first consider how to solve the problem if it doesn't have any updates. Firstly, note that there is never any benefit to having two robots cross over or even meet, since they will be duplicating work. Now consider the road segment between any two locations ("locations" meaning robot or shop locations), and what tracks robots take across it. There are 5 possibilities:

A robot crosses it left-to-right.
A robot crosses it right-to-left.
A robot crosses it left-to-right, then returns (right-to-left).
A robot crosses it right-to-left, then returns (left-to-right).
No robot crosses it.

One can now sweep the locations left-to-right, computing \(d_{i,s}\) which is the maximum profit over locations [0, i] where the segment to the right of location i has type s. To compute \(d_{i,s}\), one needs to consider each possible type t to the left of location i, check whether t and s are compatible (e.g. don't require a robot to materialise from this air) and compute a profit from \(d_{i-1,t}\), the tenges collected at i (if any) and the costs of robot movements due to s. I used a 5×5 table to track which states are compatible depending on whether there is a robot or shop at the location.

Running this after every update will be too slow, and we clearly need a way to do incremental updates. Instead of a left-to-right sweep, we can treat this as a divide-and-conquer DP. Let \(d_{i,j,s,t}\) be the maximum profit over [i, j) with type s to the left of i and type t to the left of j. We can combine this with \(d_{j,k,t,u}\) to produce a candidate for \(d_{i,k,s,u}\). Thus, if we can solve the problem recursively for the left and right halves of a range, we can combine those solutions to compute the solution for the full range. These partial solutions can be stored in a segment tree.

When we receive an update, we can update one leaf in the segment tree, then propagate the update up to the root in logarithmic time. Note that since we have the full set up of updates available offline, we can determine the locations in advance. A location that has not yet received any update behaves just like a shop with no tenges.

K: Tower of noiHa

This is another very tricky one where I fiddled with my solution until it passed, but for which I don't have a formal proof of correctness.

The first step is to determine the initial state. This is relatively easy: if the first (most significant) bit is 1, then the biggest disk has been moved to the right-hand stack, and the remaining bits describe moving the remaining disks from the middle to the right. Otherwise, the biggest disk hasn't yet been moved, and the remaining bits describe moving the remaining disks from the left to the middle. This decoding process can be applied recursively.

We can dispose of a simple case first: if the initial position is valid (no disk on top of a smaller disk), then a similar procedure to the above allows the state to be turned back into a binary representation of the number of moves remaining. To let's assume that there is a disk (call it d) that is on top of smaller disk(s). We'll divide the disks into three sets: A is the disks smaller than d and not sitting under d, B is the disks bigger than or equal to d, and C is the rest (smaller disks trapped under d).

Consider the movements made just with disks in B. Since these disks are properly stacked big-to-small relative to each other, we can move them in the standard way for Hanoi. This tells us where the disks in A need to be placed to allow the first move from B, which can be similarly computed. After this, and until we move d, each time we need to make a move from B we also have to shift all the disks in A, which means the total move costs \(2^{\lvert A\rvert}\). Once we move d, the whole game will have reached a valid state, and we can solve it in a standard way.

There is another special case: if all the disks from B are already on the right-hand stack, then the algorithm above will not try to move them at all, but this will leave C out of order. So in this case one needs to move all disks from A to one of the other two stacks, then move d to reach a valid state. It's unclear which stacks to use, so I just try both possibilities and take the cheaper one.

L: Where am I now?

I was a little surprised by the low number of solutions on this — but interactive problems are always a little more difficult to get right.

There is a fairly generous limit on the number of steps. It's not actually necessary to use the exact distance information given; it's sufficient to know whether there is a wall in the next cell. My approach was:

Do a depth-first exploration of the map to determine the shape of the connected component one is in. Call this a "fragment".
Consider all possible translations and rotations to determine how your fragment could be placed on the map. Call these "transformations".
Consider each empty cell within your fragment. If there is a cell which corresponds to the same map position under all transformations, then move to that cell and declare your position. Otherwise, the problem is impossible.

Entropy always increases

Sunday, January 05, 2025

ICPC World Finals 2024 solutions