Talk:Maximum subarray problem

This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.Computer scienceWikipedia:WikiProject Computer scienceTemplate:WikiProject Computer scienceComputer science

???

This article has not yet received a rating on Wikipedia's content assessment scale.

???

This article has not yet received a rating on the project's importance scale.

Things you can help WikiProject Computer science with:

Here are some tasks awaiting attention:

Article requests :
- Requested articles/Applied arts and sciences/Computer science, computing, and Internet
Cleanup :
- Computer science articles needing attention
- Computer science articles needing expert attention
Copyedit :
- Computing
Expand :
- Computer science
Infobox :
- Computer science articles without infoboxes
Maintain :
- Timeline of computing 2020–present
Photo :
- Find pictures for the biographies of computer scientists (see List of computer scientists)
- Computing articles needing images
Stubs :
- Computer science stubs
Unreferenced :
- WikiProject Computer science/Unreferenced BLPs
Project-related :
- Tag all relevant articles in Category:Computer science and sub-categories with {{WikiProject Computer science}}

Possibly Erroneous Initialization of s

In the O(n) algorithm, s (the largest calculated sum so far) is initialized as follows:

int s=1<<31;

First, this assumes that each integer is exactly 4 bytes wide, which is not the case for all architectures.

~~Second, it seems sufficient to initialize it to the first element of the array, A[0]~~

Debajit (talk) 23:29, 25 November 2007 (UTC)[reply]

I've struck out my second proposition, which would fail for sequences having exactly one element. For my first proposition, I think s is best initialized to -Infinity as (for C++)

int s = numeric_limits<int>::min()

or to MIN_INT in C

Debajit (talk) 23:45, 25 November 2007 (UTC)[reply]

Notability

I have seen a few publications in IEEE and JSTOR refering to this algorithm. I have no understanding of the algorithm itself, therefore I cannot verify the accuracy of the information here, but I would conclude that this article is based on a notable topic. --MattWatt 22:07, 10 April 2007 (UTC)[reply]

Incorrect solution

Kadane's algorithm is not a solution for general maximum subarray problem: it only finds maximum subarray starting from the brginning of array, not any subarray.

The correct algorithm is:

Kadane_s_Algorithm(arr[1..n])
begin
    (max, a, b) := (-INFINITY, 0, 0)
    curr := 0
    aa := 1
    for bb := 1 to n do
        curr := curr + arr[bb]
        if curr > max then
            (max, a, b) := (curr, aa, bb)
        endif

        if curr < 0 then
            curr := 0
            aa := bb + 1
        endif
    endfor

    return (max, a, b)
end

217.132.204.221 (talk) 05:47, 15 June 2010 (UTC)[reply]

No, it finds any subarray of the whole array. That's what the line "max_ending_here = max(0, max_ending_here + x)" is for: the points where the max is 0 are the ones where a new subarray starts. —David Eppstein (talk) 06:28, 15 June 2010 (UTC)[reply]

I confirm that the shown solution is incorrect. Given the following array: {5,-2,4,-6,1,3, -7 , 3, 10, -15, -4 ,3, -2, 16, -9, 10};

~~Applying the algorithm as shown in the page, I get a maximum subarray of 11, whereas the maximum subarray should be 13 (3 + 10)~~ — Preceding unsigned comment added by 187.194.146.244 (talk) 17:31, 16 November 2013 (UTC) (Scratch previous comment, the algorithm is correct)[reply]

Clarification of maximum sum when maximum sum is < 0.

If I pass in an array of entirely negative numbers, e.g. (-1,-1,-1,-1), why wouldn't the maximum sum be -1 instead of 0? —Preceding unsigned comment added by 76.247.17.199 (talk) 22:47, 31 July 2010 (UTC)[reply]

Because the empty sum has a larger total than sums of one or more elements. —David Eppstein (talk) 23:44, 31 July 2010 (UTC)[reply]

That's not necessarily true. If you want to include negative sums, first increase max_ending_here, then if max_ending_here > max_so_far, replace max_so_far, THEN if max_ending_here < 0, set it to 0. In that way the max will contain the closest negative member to 0 for an all negative array. (Alternatively, you could just find the max member if the resulting max == 0 and it will still be O(n)) -April 9, 2012 — Preceding unsigned comment added by 128.138.45.66 (talk) 04:21, 10 April 2012 (UTC)[reply]

Kadane's algorithm code

I am finding it difficult to believe that we should just scrap the C++ code in this section. I think that before we do that we should implement the ability to track indices in the Python version as well. The C++ code did this, but the Python code does not. Mrcsmcln (talk) 11:45, 2 October 2014 (UTC)[reply]

Wikipedia is not a code repository. We should not have 20 different implementations of the same algorithm in 20 different programming languages, or even 2. Multiple implementations are ok only when they show significant variations in the algorithms, and tracking indices is not a significant variation. Pseudocode might be a better choice than any specific language, but Python is closer to pseudocode than C++ is. —David Eppstein (talk) 15:04, 2 October 2014 (UTC)[reply]

I added an external link to the corresponding Rosetta Code page (where it is called greatest subsequential sum algorithm), there are implementations in many other languages as well Andre.holzner (talk) 12:28, 27 August 2016 (UTC)[reply]

Erroneous (or confusing) divide and conquer version

The divide and conquer version of the maximum subarray is notoriously O(n lg n) (see CLRM for instance as a reference.) I have trouble understanding the solution, for instance, what sum[f - 1] and sum[f]? the only array defined clearly in the problem is a, I don't know what sum is supposed to contain. An additional explanation would be great. — Preceding unsigned comment added by Poitevinpm (talk • contribs) 21:25, 7 January 2018 (UTC)[reply]

For my own part, I don't understand why other editors insist on devoting so much of the space in the article (and in textbooks like CLRS) to bad algorithms like the brute force and divide and conquer ones, when Kadane's algorithm is so simple (simpler than the divide and conquer) and in all respects better. —David Eppstein (talk) 21:47, 7 January 2018 (UTC)[reply]

The pseudo code for Divide and Conquer has pretty bad notation and mainly wrong complexity, it is surely O(n * log(n)). The equation at the end is wrong, time complexity can be described as T(n) = 2T(n/2) + O(n), T(1) = O(1) (see the CLRS book - Introduction to Algorithms, the algorithm is described from p. 68 in 3rd ed.). The O(n) term is coming from searching for overlapping sums in each iteration. Apart from that, I agree that Kadane's algorithm is really simple and the most efficient, so I would rather have it mentioned first. The value of other two solutions is only for showing different approaches and comparing complexities. Protivinsky (talk) 19:18, 25 February 2018 (UTC)[reply]

The pseudocode given in this article is clearly 2T(n/2)+O(1) (where do you think it takes more than constant time to combine solutions)? If CLRS gives an algorithm with different complexity then it is surely a different algorithm. —David Eppstein (talk) 19:46, 25 February 2018 (UTC)[reply]

You are right indeed, I got confused by the pseudocode and thought it is the same implementation as in CLRS (where the computation of center problem has linear cost). I didn't realize that the code uses array of cumulative sums without bothering to define it or explain it. I put together python code for the divide and conquer algorithm instead of that confusing pseudocode and I will replace it. Protivinsky (talk) 23:23, 27 February 2018 (UTC)[reply]

Discussion: Pharlan35244

@Pharlan35244:: I moved this discussion to the talk page, the proper page for things like this. Hdjensofjfnen (If you want to trout me, go ahead!) 03:47, 20 September 2018 (UTC)[reply]

Pharlan35244 said:

"(actually theta(n) is a better answer because both the lower bound and the upper bound has the same runtime. By saying the runtime is , you do not clarify that Kadane's algorithm has the same lower and upper bound, thus it is not completely correct for the run time to be )"

the solution for keeping track of the indices for the subarray is wrong

here is a working script

max_ending_here = A[0]

startOld = start = end = max_so_far = 0

for i, x in (zip (range (1, len (A)), A[1:])):

if max_ending_here + x > x: max_ending_here = max_ending_here + x

else: max_ending_here = x; start = i

if max_so_far < max_ending_here: max_so_far = max_ending_here; end = i

print (max_so_far , start, end)