Jump to content

Talk:Merge algorithm

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by 209.157.132.93 (talk) at 05:15, 8 December 2005. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

I would like to know if it is standard nomenclature to call "merge algorithms" the ones that follow.

  • given a set of current account balances and a set of transactions, both sorted by account number, produce the set of new account balances after the transactions are applied; this requires always advancing the "new transactions" pointer in preference to the "account number" pointer when the two have equal keys, and adding all the numbers on either tape with the same account number to produce the new balance.
  • produce a sorted list of records with keys present in all the lists (equijoin); this requires outputting a record whenever the keys of all the p0..n are equal.
  • similarly for finding the largest number on one tape smaller than each number on another tape (e.g. to figure out what tax bracket each person is in).
  • similarly for computing set differences: all the records in one list with no corresponding records in another.

I checked several books on algorithms and in all of them "merge algorithm" is taking two or more ordered lists and producing one ordered list in linear time. Maybe the above algorithms could be in another article.

Pablo.cl 02:04, 24 May 2004 (UTC)[reply]


The only reference I've found so far that calls set difference a merge algorithm is Kragen Sitaker. I think the nomenclature is misleading, and that the examples in italics above, and also the middle one I copy below, should be in a separate article, tentatively called Sorted list algorithms.

I copy form kragen-hacks

Some examples of merge algorithms:

  • produce a single sorted list from multiple sorted lists. (This is the kind of merge used in mergesort.)
  • produce the set union, set intersection, or set difference of two sorted lists. [Set intersection and set difference would be moved to Sorted list algorithms.]
  • given a master file and an update file sorted in the same order, produce a new master file with all the updates applied.

Pablo.cl 01:50, 26 May 2004 (UTC)[reply]

Minor change to python example

The example originally had a[0] < b[0] rather then a[0] <= b[0] as the basis for using the item from the a[] array. This would not result in a stable sort, since equal elements would have been selected from the b[] array before those from the a[] array. The change ensures stability.

It's a bit of a shame that one sees so many assertions that merge sort is stable. As the uncorrected example shows, it's easily possible to write an unstable, but otherwise valid, merge sort. -- jpl


Pablo may be correct; perhaps merging a list of updates with a master file is not a common meaning for the term "merge", and perhaps the Sort-merge join in Oracle and other RDBMSs' query evaluation isn't a common meaning for the term either.

Certainly when Knuth mentions "the idea of merging" on page 385 of volume 3 2ed, he's talking about merging two sorted lists to get a sorted list containing all the items of both lists. -- KragenSitaker