
Duplicate code

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Derek farn (talk | contribs) at 17:37, 28 January 2006. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Duplicate code is a computer programming term for a sequence of code that occurs more than once in a program's source code. A minimum length is usually required before a repeated sequence is considered a duplicate rather than a coincidental similarity.

Sequences of duplicate code are sometimes known as clones.

The following are some of the ways in which two code sequences can be duplicates of each other:

  • character-for-character identical
  • character-for-character identical when white space characters and comments are ignored
  • token-for-token identical
  • functionally identical
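The second kind of match in the list above can be sketched in C. This is a minimal illustration, not a production comparison: the helper names skip_space and equal_ignoring_space are hypothetical, comments are not handled, and ignoring every white space character is a simplification (it would also treat "int x" and "intx" as equal, which a token-based comparison would not).

```c
#include <ctype.h>
#include <stdbool.h>

/* Advance past any white space characters. */
static const char *skip_space(const char *s)
{
    while (*s != '\0' && isspace((unsigned char)*s)) {
        s++;
    }
    return s;
}

/* Hypothetical helper: returns true if a and b are character-for-character
   identical once all white space is ignored. Comments are NOT stripped. */
bool equal_ignoring_space(const char *a, const char *b)
{
    for (;;) {
        a = skip_space(a);
        b = skip_space(b);
        if (*a != *b) {
            return false;   /* first differing non-space character */
        }
        if (*a == '\0') {
            return true;    /* both strings ended together */
        }
        a++;
        b++;
    }
}
```

Under this sketch, "sum += a[i];" and "sum+=a[i] ;" compare as equal, while "sum1 += a[i];" and "sum2 += a[i];" do not.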

A number of different algorithms have been proposed to detect duplicate code:

  • Baker's paper
  • Swiss work on visual clone detection
  • other people
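As a point of reference, the simplest possible detector can be sketched as a brute-force search for the longest run of identical lines, with a minimum length applied to filter out coincidental matches. This is not any of the algorithms listed above; it is a quadratic-time illustration only, and the names match_length and longest_clone are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Length of the run of identical lines starting at lines[i] and lines[j],
   where i < j. The first run is not allowed to overlap the second. */
static size_t match_length(const char *const lines[], size_t n,
                           size_t i, size_t j)
{
    size_t len = 0;
    while (j + len < n && i + len < j &&
           strcmp(lines[i + len], lines[j + len]) == 0) {
        len++;
    }
    return len;
}

/* Returns the length of the longest duplicated run of lines, or 0 if it is
   shorter than min_len. *first and *second receive the starting indices of
   the two copies and are only meaningful when the result is nonzero. */
size_t longest_clone(const char *const lines[], size_t n, size_t min_len,
                     size_t *first, size_t *second)
{
    size_t best = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            size_t len = match_length(lines, n, i, j);
            if (len > best) {
                best = len;
                *first = i;
                *second = j;
            }
        }
    }
    return best >= min_len ? best : 0;
}
```

Comparing whole lines with strcmp corresponds to the character-for-character notion of a duplicate; a practical tool would normalize white space or compare tokens, and would use a faster matching strategy than comparing every pair of positions.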

Example of functionally duplicate code

Consider the following code snippet, which calculates the average of each of two arrays of integers:

int array1[4] = {3, 5, 7, 9};
int array2[3] = {2, 4, 6};

int sum1 = 0;
int sum2 = 0;
int average1 = 0;
int average2 = 0;

/* Sum and average the first array. */
for (int i = 0; i < 4; i++)
{
   sum1 += array1[i];
}
average1 = sum1/4;

/* The same logic repeated for the second array. */
for (int i = 0; i < 3; i++)
{
   sum2 += array2[i];
}
average2 = sum2/3;

The two loops differ only in the array they read and its length, so they can be replaced by calls to a single function:

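/* Returns the integer average of the first length elements of myArray.
   length must be greater than zero. */
int calcAverage (const int* myArray, int length)
{
   int sum = 0;
   for (int i = 0; i < length; i++)
   {
       sum += myArray[i];
   }
   return sum/length;
}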
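The call sites that replace the two original loops might look like the following sketch. The ARRAY_LEN macro, the computeAverages wrapper and the assert guard against an empty array are assumptions added for illustration; they are not part of the original example.

```c
#include <assert.h>

/* Same logic as calcAverage above; the assert is an added precondition
   check that avoids dividing by zero. */
int calcAverage(const int *myArray, int length)
{
    assert(length > 0);
    int sum = 0;
    for (int i = 0; i < length; i++) {
        sum += myArray[i];
    }
    return sum / length;   /* integer division, as in the original */
}

/* Element count of a true array (must not be used on a pointer). */
#define ARRAY_LEN(a) ((int)(sizeof(a) / sizeof((a)[0])))

/* Hypothetical wrapper: each duplicated loop becomes a single call. */
void computeAverages(int *average1, int *average2)
{
    int array1[4] = {3, 5, 7, 9};
    int array2[3] = {2, 4, 6};

    *average1 = calcAverage(array1, ARRAY_LEN(array1));
    *average2 = calcAverage(array2, ARRAY_LEN(array2));
}
```

Because the summing logic now lives in one place, a mistake such as dividing the wrong sum by the wrong length can only be made, and fixed, once.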