Jump to content

Talk:Selection algorithm/GA1

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by David Eppstein (talk | contribs) at 20:45, 5 August 2023 (Algorithms: r). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

GA Review

GA toolbox
Reviewing

Article (edit | visual edit | history) · Article talk (edit | history) · Watch

Reviewer: RoySmith (talk · contribs) 14:31, 5 August 2023 (UTC)[reply]

Starting review RoySmith (talk) 14:31, 5 August 2023 (UTC)[reply]

Problem statement

  • it should be possible to sort the values into an order from smallest to largest; for instance, they may be numbers... Maybe it's better to say "they may be real numbers", to explicitly exclude the non-orderable complex numbers.
  • some sources may assume that the values are all distinct from each other.... The word "source" could refer to "source code" or "a reference in the literature". I'm pretty sure you mean the later, but perhaps a different word would remove the ambiguity.

Algorithms

  • selection of the kth smallest value in a collection of values can be performed very simply... Delete "very". Or maybe "very simply" -> "trivially".
  • if the output of the sorting algorithm is an array, jump to its kth element... I assume the intent of "is an array" is that it's some data structure which can be indexed in constant time, not strictly an Array (data structure).
  • A careful design of these factories leads to an algorithm that, when applied to median-finding, uses at most 2.942 comparisons. I don't have access to the source; I assume it explains where 2.942 comes from. Is it feasible to give a short description here of how that mysterious number is derived?

Language support

  • Python ... A linear-time selection algorithm is not included. needs a better citation than a link to a code repository.
  • If a WP:RS is available, it would be interesting to explore why the STL and Python implementations differ (one provides O(1), the other doesn't). Is there something intrinsic to the languages which drove this? As an aside, I did a little hunting and was surprised to find that Java (or even guava) doesn't supply anything.

Approximate algorithms

As a thought for possible expansion, there's lots of places where you want "approximately the k-th largest". A search results page will want to return the k-th best matches but it's OK if that's approximate. If relaxing the exactness criteria gives to a significant speed improvement, that may be a good tradeoff. It would be interesting to explore variations on selection algorithms which allow this.

Specific GA criteria