## Friday, June 27, 2014

### More ICPC analysis

Now we're getting on to the harder problems. Today I've cover two that I didn't know how to solve. Some of the others I have ideas for, but I don't want to say anything until I've had a chance to code them up.

Firstly, problem I. This is a maximum clique problem, which on a general graph is NP-complete. So we will need to use the geometry in some way. Misof has a nice set of slides showing how it is done: http://people.ksp.sk/~misof/share/wf_pres_I.pdf. I had the idea for the first step (picking the two points that are furthest apart), but it didn't occur to me that the resulting conflict graph would be bipartite.

Now problem G. I discovered that this is a problem that has had a number of research papers published on the topic, one of which achieves $$O(N^3)$$. Fortunately, we don't need to be quite that efficient. Let's start by finding a polynomial-time solution. Let's suppose we've already decided the diameters of the two clusters, D(A) and D(B), and just want to find out whether this is actually possible. For each shipment we have a boolean variable that says whether it goes into part A (false) or part B (true). The constraints become boolean expressions: if d(i, j) > D(A) then we must have i || j, and if d(i, j) > D(B) then we must have !i || !j. Determining whether the variables can be satisfied is just 2-SAT, which can be solved in $$O(N^2)$$ time.

Now, how do we decide which diameters to test? There are $$O(N^2)$$ choices for each, so the naive approach will take $$O(N^6)$$ time. We can reduce this to $$O(N^4\log N)$$ by noting that once we've chosen one, we can binary search the other (it's also possible to eliminate the log factor, but it's still too slow). So far, this is what I deduced during the contest.

The trick (which I found in one of those research papers) is that one can eliminate most candidates for the larger diameter. If there is an odd-length cycle, at least one of the edges must be within a cluster, and so that is a lower bound for the larger diameter. What's more, we can ignore the shortest edge of an even cycle (with some tie-breaker), because if the shortest edge lies within a cluster, then so must at least one other edge.

We can exploit this to generate an O(N)-sized set of candidates: process the edges from longest to shortest, adding each to a new bipartite graph (as for constructing a maximum spanning tree with Kruskal's algorithm). There are three cases:
1. The next edge connects two previously disconnected components. Add the edge to the graph (which will remain bipartite, since one can always re-colour one of the components. This edge length is a candidate diameter.
2. The next edge connects two vertices in the same component, but the graph remains bipartite. This edge is thus part of an even cycle, and so can be ignored.
3. The next edge connects two vertices, forming an odd cycle. This edge length is a candidate diameter, and the algorithm terminates.
The edges selected in step 1 form a tree, so there are only O(N) of them.