Talk:Hopcroft–Karp algorithm
This needs an explanation of the "run really fast" step (http://www.fysh.org/~katie/computing/methodologies.txt)
Delta
Is Δ the only standard terminology we can use? I would prefer to leave Δ to the maximum degree of the graph. Adking80 (talk) 22:06, 18 July 2008 (UTC)
- I have seen ⊕ (symmetric difference) used in several papers. It seems a much better notation to me. Also, the meaning of Δ is not explained on the page. I have updated the page accordingly. —Preceding unsigned comment added by Michael Veksler (talk • contribs) 22:29, 9 October 2008 (UTC)
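For later readers, a minimal sketch (my own, with made-up vertex names) of why a symmetric-difference symbol is natural here: augmenting a matching M along an augmenting path P is exactly the symmetric difference of their edge sets.
 # Augmenting path u2 - v1 - u1 - v2: its unmatched edges enter M,
 # its one matched edge leaves M.
 M = {frozenset({"u1", "v1"})}
 P = [frozenset({"u2", "v1"}), frozenset({"u1", "v1"}), frozenset({"u1", "v2"})]
 M = M.symmetric_difference(P)
 print(M)   # {frozenset({'u2', 'v1'}), frozenset({'u1', 'v2'})} -- one edge larger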
Shortestest
Is shortestest a word? ;D User:Ozzie (talk) 20:37, 7 March 2008 (UTC)
Maximum vs. maximal
Changed the incorrect "maximum" to "maximal". —Preceding unsigned comment added by 80.145.225.62 (talk • contribs) 22:44, 3 October 2006 (UTC)
- I agree. Thanks for persisting after the incorrect revert. —David Eppstein 05:55, 4 October 2006 (UTC)
- I am sorry for the incorrect revert. I now think that I was wrong. I started this article because the algorithm was mentioned in the text of another article. The only reference I had was CLRS, where it said maximum. So when I saw this edit from an unregistered user, I checked in CLRS, saw that it said maximum, and assumed that the editor was either in error or mischievous. Sorry about this. Let me check that we agree on terminology:
- If we go through a one-dimensional state-space (x-axis below), where each state has a value (y-axis below),
             B
             /\
     A      /  \
     /\    /    \
    /  \  /      \
   /    \/        \
- then here both A and B are maximal points, while only B is a maximum. This is the terminology used in the article Matching.
- The description in CLRS is actually just given as an exercise. I tried to solve it, but could not see how to find a "maximum set of vertex-disjoint shortest augmenting paths" in O(E), so I just wrote the pseudocode for the algorithm as described, and put it on my ever-growing TODO list to figure out the missing part. I have now found a description of the pseudo-code which still says maximum in the comments, but the code clearly gives only something maximal. So the problem was simpler than I thought :) Thanks for the correction, and sorry about my error. Klem fra Nils Grimsmo 09:16, 4 October 2006 (UTC)
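To make the maximal/maximum distinction concrete, a small sketch (my own made-up example): a greedy pass always yields a maximal matching, but it can be smaller than a maximum one.
 def greedy_maximal_matching(edges):
     matched, matching = set(), []
     for u, v in edges:
         if u not in matched and v not in matched:
             matching.append((u, v))
             matched.update((u, v))
     return matching

 # On the path a - b - c - d, taking the middle edge first blocks both others:
 print(greedy_maximal_matching([("b", "c"), ("a", "b"), ("c", "d")]))
 # [('b', 'c')] is maximal, yet the maximum matching {(a, b), (c, d)} has size 2.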
Difference from Dinic algorithm (Dinits' algorithm)?
What's the difference between this algorithm and Dinic's algorithm? To me, it just looks like Dinic's algorithm, only run on a specific input, so it has a better time complexity than the general-case Dinic's. --Lukas Mach 22:27, 27 December 2006 (UTC)
- There is none... H-K is a 1973 algorithm and Dinic's is a 1976 algorithm. --Lukas Mach 01:55, 26 February 2007 (UTC)
Other estimates
Other estimates are O((V+E)·V^(1/2)) and O(V^(5/2)): Lipski W., Kombinatoryka dla Programistów [Combinatorics for Programmers], Wydawnictwa Naukowo-Techniczne, Warszawa, 1982. --Tim32 (talk) 10:58, 19 November 2007 (UTC)
- Both bounds are worse. We can assume that every vertex is incident to at least one edge, so V = O(E). We also know that E = O(V^2). Adking80 (talk) 22:04, 18 July 2008 (UTC)
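Spelling out the comparison with the article's O(E·V^(1/2)) bound (my reading of the two inequalities above):
 V = O(E)    =>  (V+E)·V^(1/2) = O(E·V^(1/2))
 E = O(V^2)  =>  E·V^(1/2) = O(V^(5/2))
so the first bound is never better than the article's, and O(V^(5/2)) is never better than the first.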
Pseudocode
Is it just me, or is the pseudocode rather hard to read? For example, the first loop tripped me up a bit, because it is some awkward mixture of high-level functions ((u,v) in E) and low-level implementation (for node u, node v). Furthermore, the code checks for nodes being in M. I do not understand this, as M is a matching, and thus a set of edges? --Tetha (talk) 05:49, 11 June 2008 (UTC)
- M is a matching. When the code asks whether a vertex is in M, it means, is it an endpoint of one of the edges in M? I don't think it's any shorter or clearer than what's here, but if it would help you to see actual code for this algorithm, I have some here. (You can ignore the "imperfections" routine in that file; it solves a different and somewhat more complicated problem.) —David Eppstein (talk) 06:32, 11 June 2008 (UTC)
- At least for me, the Python code helps a lot. My definition of a matching came from graph theory, where a matching is just a set of edges that encodes the matching (that is, an edge (u,v) in M means: u is matched with v). Your matching is a mapping that maps a node u to the node v with which u is matched. In your world, it makes a lot of sense to ask "node in matching"; in mine, it did not (I would rather ask: does there exist an x with (x, u) in M?). Thank you. --Tetha (talk) 09:39, 11 June 2008 (UTC)
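For later readers, a small sketch of the two conventions discussed above (my own made-up example):
 # Edge-set convention: M is a set of edges.
 M_edges = {("u1", "v1"), ("u2", "v2")}
 def matched_in_edge_set(x):
     return any(x in e for e in M_edges)   # scans over M

 # Mapping convention (as in the linked Python code): each matched vertex
 # maps to its partner, so "x in M" is a constant-time lookup.
 M_map = {"u1": "v1", "v1": "u1", "u2": "v2", "v2": "u2"}
 def matched_in_mapping(x):
     return x in M_map

 print(matched_in_edge_set("u1"), matched_in_mapping("u1"))   # True True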
LIFO queue?
Page says:
We also maintain a queue(last in first out datastructure)
Last in, first out is a stack, no? —Preceding unsigned comment added by Talex (talk • contribs) 11:40, 20 July 2008 (UTC)
A queue is FIFO, a stack is LIFO... it seems to be strange. 210.122.65.2 (talk) 07:50, 18 August 2008 (UTC)
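For reference, a two-line Python illustration of the difference (my own example):
 import collections
 q = collections.deque([1, 2]); q.append(3); print(q.popleft())   # queue is FIFO: prints 1
 s = [1, 2]; s.append(3); print(s.pop())                          # stack is LIFO: prints 3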
Incorrect pseudo-code
I am sorry that I was trigger-happy with the main article. Only after I had completed my edit did I realize that it should have been discussed on this forum first. (This is my first time editing a Wikipedia page.)
I did not touch the pseudo-code itself, only the part that explains it. It may take me much longer to write a readable and correct version of the pseudo-code.
The pseudo-code is not complicated without reason: the previous version was incorrect. I realized this after I had implemented the algorithm, only to see that in some cases it requires almost |V| iterations (rather than the promised O(√|V|)).
The main problem with that pseudo-code (other than clarity) is that the BFS stops immediately when it discovers an augmenting path. This is against the way the original algorithm works, which generates a maximal set of augmenting paths in each phase.
The proof that only O(√|V|) iterations are needed is based on the premise that each BFS iteration returns strictly longer augmenting paths. The current pseudo-code does not work this way.
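For contrast, here is a compact sketch of the standard phase structure (my own Python illustration of the textbook description, not the article's pseudo-code): each phase runs one full BFS to layer the graph by shortest alternating-path distance, then one DFS pass that extracts a maximal set of vertex-disjoint shortest augmenting paths, so the shortest augmenting-path length grows strictly from phase to phase.
 import collections

 INF = float("inf")

 def hopcroft_karp(adj, U, V):
     """adj maps each u in U to its neighbours in V; returns the matching size."""
     pair_u = dict.fromkeys(U)   # partner of each u, None = free
     pair_v = dict.fromkeys(V)   # partner of each v, None = free
     dist = {}

     def bfs():
         # Layer the graph: free U-vertices sit at distance 0, and exploration
         # stops at the depth where the first free V-vertex (dummy None) appears.
         queue = collections.deque()
         for u in U:
             if pair_u[u] is None:
                 dist[u] = 0
                 queue.append(u)
             else:
                 dist[u] = INF
         dist[None] = INF
         while queue:
             u = queue.popleft()
             if dist[u] < dist[None]:
                 for v in adj[u]:
                     w = pair_v[v]   # None if v is free
                     if dist[w] == INF:
                         dist[w] = dist[u] + 1
                         if w is not None:
                             queue.append(w)
         return dist[None] != INF   # is there any augmenting path left?

     def dfs(u):
         # Follow the layering only; the dist checks (and the INF marking of
         # dead ends) keep the paths found in one phase vertex-disjoint.
         if u is None:
             return True   # reached a free V-vertex: path complete
         for v in adj[u]:
             w = pair_v[v]
             if dist[w] == dist[u] + 1 and dfs(w):
                 pair_u[u], pair_v[v] = v, u
                 return True
         dist[u] = INF   # dead end for the rest of this phase
         return False

     size = 0
     while bfs():   # each phase strictly increases the shortest path length
         for u in U:
             if pair_u[u] is None and dfs(u):
                 size += 1
     return size

 # Example: hopcroft_karp({"u1": ["v1", "v2"], "u2": ["v1"]}, ["u1", "u2"], ["v1", "v2"]) returns 2.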
For example, consider a graph whose edges are divided into the following subsets:
- (u_01, v_01)* (u_01, v_02) (u_02, v_01)
- (u_11, v_11)* (u_11, v_12) (u_12, v_11)
...
- (u_n1, v_n1)* (u_n1, v_n2) (u_n2, v_n1)
With the initial matching marked with an asterisk (it was marked in bold in the original comment), the pseudo-code will require n iterations, while the real algorithm requires only one. Michael Veksler (talk) 23:09, 9 October 2008 (UTC)
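For later readers, a small sketch (my own, with made-up helper names) of why this family defeats an early-stopping BFS:
 # Builds the counterexample family above for a given n.
 def build_example(n):
     adj, matching = {}, {}
     for i in range(n + 1):
         u1, u2, v1, v2 = f"u_{i}1", f"u_{i}2", f"v_{i}1", f"v_{i}2"
         adj[u1] = [v1, v2]
         adj[u2] = [v1]
         matching[u1] = v1   # the starred initial matching
     return adj, matching

 # Each free u_{i}2 has the single length-3 augmenting path
 # u_{i}2 - v_{i}1 - u_{i}1 - v_{i}2, and these paths are pairwise
 # vertex-disjoint, so one proper Hopcroft-Karp phase augments all of them,
 # while a BFS that stops at the first path needs one phase per subset.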
Copied the suggested pseudo-code to the main page. Removed it from here to avoid duplicating a big piece of information. Michael Veksler (talk) 07:21, 11 October 2008 (UTC)
The pseudo-code was introduced by Michael Veksler (talk) 01:31, 10 October 2008 (UTC)
Minor fixes to the previous version of the algorithm (also by me) Michael Veksler (talk) 06:39, 10 October 2008 (UTC)
This pseudo-code is huge. Maybe it should be rewritten in Python, as Tetha (talk) has suggested. Unfortunately, I have never written more than a few lines of Python at a time.
Log: Using u.matching instead of M throughout the algorithm. This gives the right sense of the O(1) nature of operations on M (see the sketch after these log entries). Michael Veksler (talk) 08:30, 10 October 2008 (UTC)
Log: Put constraints on "for all" elements at the "for all" line, rather than on a separate if line.
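To illustrate the u.matching log entry above (my own sketch, made-up names): storing the partner on the vertex itself makes every matching operation an O(1) field access.
 class Node:
     def __init__(self, name):
         self.name = name
         self.matching = None   # partner Node, or None if free

 def match(u, v):
     u.matching, v.matching = v, u   # O(1); no search through an edge set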
I have implemented the pseudo-code and it does not seem to be working (Kursancew) —Preceding unsigned comment added by 189.95.58.234 (talk) 15:38, 25 November 2008 (UTC)
I regret that this pseudo-code came out as huge as it is. Maybe it will be possible to refine it into something saner than what I could come up with at the time. I have a Perl implementation of the algorithm that works, but its readability is poor. If anyone wants to have a look at that Perl implementation, I will be happy to mail it or publish it. Michael Veksler (talk) 09:34, 16 December 2008 (UTC)
Terminology: augmenting and alternating paths
It would be nice if the definitions of augmenting/alternating paths in Matching#Definition and Hopcroft–Karp algorithm#Alternating paths matched. — Miym (talk) 23:04, 5 March 2009 (UTC)
- It turns out that the Matching article's terminology matches the terminology in Ahuja, Magnanti, and Orlin. So I'll fix up this article to match. —David Eppstein (talk) 23:19, 5 March 2009 (UTC)
Weighted graphs?
I read some papers that, for a maximum weighted bipartite graph matching, claim they use Hopcroft-Karp instead of the Hungarian algorithm, since it's O(n^2.5) instead of O(n^3). See for example "Tracking Across Multiple Cameras With Disjoint Views" (Javed, Shah, 2003). Are these guys just clueless people trying to make it seem like they're using the new "hip" algorithm that is faster instead of the "old" Hungarian algorithm? Hopcroft-Karp has nothing to do with weights. Or am I missing something? This can be considered a widely referenced paper in Computer Vision. Jotaf (talk) 18:22, 31 August 2010 (UTC)
- If you're missing something, so am I; I don't know of a weighted version of Hopcroft-Karp. —David Eppstein (talk) 19:33, 31 August 2010 (UTC)
Error in the algorithm
I think there is a fundamental error both in the high-level description and in the pseudo-code implementation. The problem is that the algorithm may find "paths" from U to V having the same start and end point, so it may only find M-alternating cycles. (These cycles then automatically have odd length.) The cycles cannot be used to augment the matching.
In the worst case it may happen that all unmatched vertices in the last layer only give rise to cycles, so a phase may not yield an augmenting path because it stops too early.
An (awkward?) solution would be to store, during the breadth-first search, the following information for each node v: I(v), a list of all unmatched nodes u such that there exists a shortest M-alternating path from u to v.
Clearly it is possible to compute I(v) efficiently during the breadth-first search (staying within O(E) runtime).
When a layer contains an unmatched node v, the information I(v) enables us to decide whether we have found only a loop (iff I(v) contains only the node v) or whether we have really found an augmenting path (iff I(v) contains a node != v).
The depth-first search must also be slightly adapted. When starting from node u in the last layer, we only explore a vertex v if I(v) contains a node distinct from u. (This point is essential. Otherwise the subtree fails to produce a valid augmenting path for u, but it may still yield an augmenting path for other nodes, so we would have to explore the subtree again.)
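One way to keep this bookkeeping within O(E) (my own sketch, not taken from Blum's paper): store at most two distinct origins per node, since two representatives already answer both tests above.
 # Keep up to two distinct free origins per node; that is enough to decide
 # "is there an origin != start?" and "is there an origin != u?".
 def add_origin(origins, node, root):
     reps = origins.setdefault(node, [])
     if root not in reps and len(reps) < 2:
         reps.append(root)

 def has_origin_other_than(origins, node, u):
     return any(r != u for r in origins.get(node, []))

 # During the BFS, when an edge leads from x to a node y in the next layer,
 # propagate: for root in origins[x]: add_origin(origins, y, root).
 # An unmatched node v in the last layer yields a real augmenting path iff
 # has_origin_other_than(origins, v, v); the adapted DFS starting from u
 # explores a vertex y only if has_origin_other_than(origins, y, u).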
There may be easier solutions, but Blum ("A Simplified Realization of the Hopcroft-Karp Approach to Maximum Matching in General Graphs", in the references) also uses some additional information that he stores in the vertices.