Runtime improvement for Minimum Cut algorithm first time in 25 years (2019) (arxiv.org)
153 points by master_yoda_1 on Jan 11, 2021 | 56 comments



There are a few misconceptions that some of the other commenters are making.

This min-cut is different from the min-cut in the max-flow min-cut theorem. This one is what's referred to as the global min-cut, which does not fix the terminals. The max-flow min-cut theorem is about disconnecting two fixed vertices s and t; the global min-cut is about disconnecting the graph into two non-empty parts, with no particular pair of vertices fixed in advance.

Source: thesis was on hypergraph min-cut


So the question is to find a minimum set of edges whose removal partitions the graph?

In a K_n with equally weighted edges this problem seems easy: just take a random vertex and remove the edges connecting it to its neighbourhood. Cutting off any larger set of vertices would require removing more edges, so this strategy removes the fewest edges while still partitioning the graph.

In a tree with equally weighted edges, this problem seems easy too: just cut off a leaf. Every tree has leaves.

I can't think of graphs "between" the two with a property that makes the "just take the vertex with the fewest adjacencies and cut it off" strategy moot. Is it trivial with equally weighted edges? Is the problem only interesting when edges have differing weights?


Take two copies of K_4. Connect them with a single edge. The minimum cut is the single connector edge while your proposed strategy would delete the three edges out of one of the vertices not incident to the connector edge.
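
You can sanity-check this with, e.g., the Stoer-Wagner implementation in Python's networkx (a quick sketch, assuming the library is available):

    import networkx as nx

    # Two copies of K_4, joined by a single bridge edge (0, 4).
    G = nx.disjoint_union(nx.complete_graph(4), nx.complete_graph(4))
    G.add_edge(0, 4)

    cut_value, (side_a, side_b) = nx.stoer_wagner(G)
    print(cut_value)                        # 1 -- just the bridge edge
    print(sorted(side_a), sorted(side_b))   # the two K_4's

    # The "cheapest single vertex" strategy would cut 3 edges instead.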


Makes sense. Thanks!


Can someone give a more concrete example of this problem?

I read the wiki page[1], and I'm left with the impression that for almost all cases, the minimum cut will be to pick a node and separate it from the rest of the nodes. It doesn't sound like a very computationally difficult or useful algorithm if one of the two parts of the result usually consists of a single node...

[1]: https://en.wikipedia.org/wiki/Minimum_cut


The classical example used to introduce this problem in lectures is the one about the Soviet rail network: https://jeffe.cs.illinois.edu/teaching/algorithms/book/10-ma...

Legend has it both the Soviets and the Americans studied it, each identifying the same bottleneck for opposite reasons: the Americans wanted to disrupt as much as possible while destroying as few rails as possible, while the Soviets wanted to extend the network's ability to transport goods by building the fewest connections... The two problems turn out to be equivalent, which is the max-flow min-cut theorem.


I used the minimum cut algorithm to implement a texture synthesis algorithm [1]. When the images to merge are represented as a graph of connected pixels, you can use minimum cut to find the lowest-energy path that cuts across both images.

This minimises the differences between the images along the seam, so they appear to blend together.
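
Roughly, the seam-finding step can be sketched like this (a toy Python/networkx example, not the actual implementation from the paper; the image sizes and the cost function are simplified assumptions):

    import numpy as np
    import networkx as nx

    # Toy overlap region between two grayscale images A and B (4 x 6 pixels).
    rng = np.random.default_rng(0)
    A = rng.integers(0, 256, (4, 6)).astype(float)
    B = rng.integers(0, 256, (4, 6)).astype(float)
    rows, cols = A.shape

    G = nx.DiGraph()
    for r in range(rows):
        for c in range(cols):
            for r2, c2 in ((r, c + 1), (r + 1, c)):    # right / down neighbours
                if r2 < rows and c2 < cols:
                    # Seam cost: how different the two images are at the
                    # two pixels the seam would pass between.
                    w = abs(A[r, c] - B[r, c]) + abs(A[r2, c2] - B[r2, c2])
                    G.add_edge((r, c), (r2, c2), capacity=w)
                    G.add_edge((r2, c2), (r, c), capacity=w)

    # Force the leftmost column to come from A and the rightmost from B.
    for r in range(rows):
        G.add_edge("A", (r, 0), capacity=float("inf"))
        G.add_edge((r, cols - 1), "B", capacity=float("inf"))

    seam_cost, (from_A, from_B) = nx.minimum_cut(G, "A", "B")
    # Pixels in from_A (minus the "A" label) get copied from image A, the
    # rest from image B; the cut edges trace the lowest-cost seam between them.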

[1] https://www.cc.gatech.edu/~turk/my_papers/graph_cuts.pdf


> for almost all cases, the minimum cut will be to pick a node and separate it from the rest of the nodes.

You're assuming that nodes have 1 or 2 edges coming out of them. If every node can have up to (# nodes) - 1 edges incident on it, then it's non-obvious which of the 2^(# nodes) subsets gives the min cut.

It's definitely not justified to only consider the one-node subsets. What if you have 10 nodes, 5 blue and 5 yellow, where the blue ones are all connected to each other with extremely heavy edges, similarly for the yellow nodes, and there is one very light edge between the two groups? In that case the min cut is to separate blue from yellow (i.e. cut off a 5-node subset), _not_ a single node.

The general case of arbitrary nodes and edges will be somewhere in between what I have described and what you're thinking of, so a general algorithm that works for all cases is definitely not as trivial as just trying out every 1-node subset.
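
To make the blue/yellow example concrete (a rough sketch using networkx's Stoer-Wagner implementation; the weights are made up for illustration):

    import networkx as nx
    from itertools import combinations

    G = nx.Graph()
    blue, yellow = range(0, 5), range(5, 10)

    # Heavy edges inside each colour group...
    for group in (blue, yellow):
        for u, v in combinations(group, 2):
            G.add_edge(u, v, weight=100)

    # ...and one very light edge between the groups.
    G.add_edge(0, 5, weight=1)

    cut_value, partition = nx.stoer_wagner(G)
    print(cut_value)   # 1 -- the light inter-group edge
    print(partition)   # the 5 blue nodes vs. the 5 yellow nodes

    # Cutting off any single node would cost at least 4 * 100 = 400.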


Wikipedia has a nice diagram on another page[0] - in a weighted graph where all nodes have enough connections, separating just a single node is likely to cost more than finding some cheap partition between two groups.

[0] https://en.wikipedia.org/wiki/Stoer%E2%80%93Wagner_algorithm...


Repeating a concrete example from another comment:

> Take two copies of K_4. Connect them with a single edge. The minimum cut is the single connector edge while your proposed strategy would delete the three edges out of one of the vertices not incident to the connector edge.


On that note, here is the currently fastest implementation for the problem: https://arxiv.org/abs/1808.05458 and here it is on GitHub: https://github.com/viecut/viecut


Question: is this algorithm practical to implement?


What is it used for?


This is a bit like asking what matrix diagonalization is used for. The applications are far too numerous, spanning many different fields of science and engineering. "Graph partition" problems generally search for ways to separate a graph into subgraphs such that some measure is optimized, e.g. resilience of network connections, immunization strategies, some error-correction algorithms, optimizing flow of stuff through nodes, logistics. This list probably barely scratches the surface.

Edit: my comment sounds a bit rude on second read. My apologies, it is the limitations of this medium.


There are tons of applications, but many of them come down to variations on breaking up some universe of things to minimize the downsides of breaking them up.

If you have a bunch of chat users and a bunch of chat servers, and you want to figure out how to assign users to servers in a way that minimizes the number of messages going between your chat servers, then you're solving a min cut problem on a graph where users are nodes, edges are communications, and edge weights are communications volumes. Any links that you cut in order to partition the graph end up being cross-server traffic.
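
As a toy sketch of the two-server case (Python with networkx; the user names and message counts are made up), it's just a global min cut:

    import networkx as nx

    # Toy example: users are nodes, edge weight = messages exchanged.
    G = nx.Graph()
    G.add_weighted_edges_from([
        ("alice", "bob",   500),
        ("bob",   "carol", 450),
        ("alice", "carol", 300),
        ("dave",  "erin",  600),
        ("erin",  "frank", 550),
        ("dave",  "frank", 400),
        ("carol", "dave",   20),   # the only cross-group chatter
    ])

    cross_traffic, (server_1, server_2) = nx.stoer_wagner(G)
    print(cross_traffic)        # 20 -- messages that must cross servers
    print(server_1, server_2)   # {alice, bob, carol} vs {dave, erin, frank}

In practice you'd also want balance constraints so one server doesn't end up with nearly everyone, which turns this into the harder balanced graph partitioning problem.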


Thank you. That's really an excellent example. I recall the early days of IRC networks being plagued by frequent "splits" as different chunks of the graph became disconnected, which was largely a result of the architecture of the servers and is probably the worst-case scenario for this sort of thing, but your example is immediately graspable.


Thanks. I wasn't being snarky. I really didn't know what such algorithms are used for, and the paper didn't mention examples either.


This problem is assumed to be common knowledge among graph theory specialists.


Not everyone on Hacker News can be assumed to be a graph theory specialist.


That's assumed common knowledge for non-graph theory specialists.


I would say it's common knowledge for anyone with a CS degree.


It's a bit advanced. Many university curricula just skim through max-flow min-cut algorithms in a week or two as part of the algorithms class.


Isn’t graph partition a form of diagonalization?


What is the relationship? I guess most of us are not spectral graph theory experts. Are you looking at the adjacency matrix or the Laplacian matrix?



Okay yeah, Cheeger's inequality and all those are straight up graduate-level theoretical computer science. Definitely not something you can just throw into a discussion assuming people know what you're talking about!


It is used to calculate the maximum flow of a flow network. https://en.wikipedia.org/wiki/Max-flow_min-cut_theorem#Appli...


Well, if you have a designated source (s) and sink (t) in a directed network, then yes, finding the max s-t flow and the min s-t cut are dual problems. The paper here seems to deal with undirected graphs and all cuts in them, not necessarily cuts between a given pair of vertices, so it is slightly different.
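
To illustrate the s-t duality part (a small sketch with networkx; the network and capacities are made up):

    import networkx as nx

    # Small directed network with edge capacities.
    G = nx.DiGraph()
    G.add_edge("s", "a", capacity=3)
    G.add_edge("s", "b", capacity=2)
    G.add_edge("a", "b", capacity=1)
    G.add_edge("a", "t", capacity=2)
    G.add_edge("b", "t", capacity=3)

    flow_value, _ = nx.maximum_flow(G, "s", "t")
    cut_value, _ = nx.minimum_cut(G, "s", "t")
    print(flow_value, cut_value)   # 5 5 -- max s-t flow equals min s-t cut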


Max flow is the primal problem and min-cut its dual. But you don't use min-cut to find the max flow.


I don't understand the distinction you're drawing. Can you explain? Max flow and min cut are entirely dual; which one you use is more a matter of convenience.


What I meant was that if you wish to find the maximum flow, you'd use push-relabel or another max-flow algorithm instead of finding the min-cut first and then building the max-flow solution from there.


It is a fundamental graph problem with many applications in different fields, such as network reliability, where assuming equal failure probability edges, the smallest edge cut in the network has the highest chance to disconnect the network; in VLSI design, where the minimum cut can be used to minimize the number of connections between microprocessor blocks; and as a subproblem in branch-and-cut algorithms for the Traveling Salesman Problem (TSP) and other combinatorial problems.

(Copy paste from viecut github repo)


Thanks.


Graph cut is useful in computer vision.


Still used extensively for background-foreground segmentation in e.g. Photoshop, I am guessing.

Research these days mostly focuses on deep learning though.


Ban half of Twitter accounts so that as little people as possible notice.


"Eschew flamebait. Don't introduce flamewar topics unless you have something genuinely new to say. Avoid unrelated controversies and generic tangents."

https://news.ycombinator.com/newsguidelines.html


I remember a CS prof saying he moved from hardware to software because the speedups possible in algorithm design dwarf anything you can achieve by improving the bare metal.


It's a nice sentiment, but it's not always true.

Case in point: none of the algorithmic improvements to matrix multiplication has had any real effect on deep learning etc. (except Winograd convolution).


There’s this example of your point, which claims that “Compiler Advances Double Computing Power Every 18 Years”

http://proebsting.cs.arizona.edu/law.html


AlexNet software has improved 44x in the past 8 years; hardware, something like 11x.

https://openai.com/blog/ai-and-efficiency/


I don't think that's actually the case. Hardware is just harder to implement in a cost-effective way: the tricks that make it fast for a given problem may not lend themselves to solving other problems, and hardware is insanely expensive to build, stemming from how long and labor-intensive it is (design, validation, and tape-out, plus the cost of fixing bugs).


But that’s a selective way to think about cost, isn’t it? If you ignore NRE, almost anything is cheaper in hardware. But you can't ignore NRE.


The same algorithm in software is drastically cheaper to implement, even before you get to everything else you need for hardware. Everything beyond that is gravy on top.


It takes both.

The advantage of speeding up hardware is that it scales to most present and future programs. Though there's a bit of an economic trade-off: as the hardware gets faster, software developers de-prioritize algorithmic speed in favour of faster product development or research.


That is misleading. In some fields that may hold. But in other fields nothing can trump specializing the hardware to the computation and the algorithm.


> nothing can trump specializing the hardware to the computation and the algorithm

Is implementing an algorithm in hardware ever done before all of the obvious software optimizations have been exhausted? If not, how would you know you're implementing the best known version of the algorithm?

It seems to me that hardware is a route to improvement, but only after all of the software improvements have been done. If you want to spend your career making computers do stuff faster, there'll always be more work to do in software, simply because if you can make something fast enough there, you don't need to implement it in hardware.


One would assume that hardware implementations are attempted only for tried-and-true algorithms that have seen no improvements in decades. Developing software is usually easier than developing hardware, and does not require the overhead of insane manufacturing pipelines and the associated logistics. Furthermore, FPGAs often make it possible to do product development and establish oneself on the market _now_. Even if improvements on the original algorithms are found, competitors would have trouble gaining a foothold in the market if the gains from the new algorithms are their only unique selling points.

Then again, there is this crypto-mining boom, where custom ASICs and GPUs literally pay for themselves. There, the associated algorithms were specified to be slow, and no improvements on them can reasonably be expected unless quantum computing takes off for real and opens up new avenues.


To state it another way: a less-than-optimal algorithm implemented in hardware can be completely dominated by an algorithmic improvement. If the improved algorithm is then implemented in hardware, it becomes the new fastest implementation.

That said, when you implement specialized algorithms in hardware, it's likely because it's good enough for the task it's designed for.


I don't specialize in hardware, but I'd guess this is very situation-dependent. Algorithms that can easily be implemented on FPGAs and parallelized could probably see easy gains. There are also real-world considerations that big-O notation doesn't cover: sorting algorithm A may be faster than algorithm B due to better cache usage even if B is algorithmically better.


Did your CS prof work for Intel?


Depends. At some point your buffer no longer fits in L1D and there’s nothing software can do about that…


Algorithm design can lower the buffer size needed.


Depends. Many algorithms are “solved” to the point where keeping all the data in memory is a requirement, which means you can’t do much.


[2019]


Added. Thanks!



