Duplicate code

Duplicate code is a computer programming term for a sequence of code that occurs more than once in a program's source code. A minimum length is usually required before a sequence is considered a duplicate rather than a coincidental similarity.

Sequences of duplicate code are sometimes known as clones.

The following are some of the ways in which two code sequences can be duplicates of each other:

character-for-character identical
character-for-character identical when white space characters and comments are ignored
token-for-token identical
functionally identical
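
As a rough illustration of the second kind of match, the sketch below compares two code fragments while skipping white space. The function name sameIgnoringWhitespace is hypothetical, and comment stripping is left out to keep the example short. Because it drops white space entirely, it would also treat "int x" and "intx" as equal, which is one reason a token-for-token comparison is a distinct category.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: returns true when the two strings are identical
   once every white space character is skipped. Comments are NOT handled. */
bool sameIgnoringWhitespace(const char *a, const char *b)
{
    for (;;)
    {
        /* Skip any run of white space in each string. */
        while (*a != '\0' && isspace((unsigned char)*a)) a++;
        while (*b != '\0' && isspace((unsigned char)*b)) b++;

        if (*a != *b) return false;   /* first real difference */
        if (*a == '\0') return true;  /* both strings ended together */
        a++;
        b++;
    }
}
```

Under this comparison, "sum += a[i];" and "sum+=a[i] ;" count as duplicates, while "sum1 += a[i];" and "sum2 += a[i];" do not.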

A number of different algorithms have been proposed to detect duplicate code:

Baker's paper
Swiss work on visual clone detection
other people
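
To make the idea of detection concrete, here is a deliberately naive sketch. It is not any of the approaches listed above; it simply compares every pair of starting lines and reports the longest run of identical lines, subject to a minimum length. The names matchLength, longestDuplicate and minLines are assumptions made for this example, and the running time grows quickly with the number of lines, so it only suits small inputs.

```c
#include <string.h>

/* Length of the run of identical lines starting at lines[i] and lines[j],
   with i < j. The run is cut off before the two copies would overlap. */
static int matchLength(const char *lines[], int count, int i, int j)
{
    int n = 0;
    while (i + n < j && j + n < count &&
           strcmp(lines[i + n], lines[j + n]) == 0)
    {
        n++;
    }
    return n;
}

/* Returns the length of the longest duplicated run of lines, or 0 if no
   run reaches minLines (the minimum size for a run to count as a clone). */
int longestDuplicate(const char *lines[], int count, int minLines)
{
    int best = 0;
    for (int i = 0; i < count; i++)
    {
        for (int j = i + 1; j < count; j++)
        {
            int n = matchLength(lines, count, i, j);
            if (n > best) best = n;
        }
    }
    return best >= minLines ? best : 0;
}
```

Raising minLines filters out coincidental similarities such as a lone closing brace appearing twice.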

Example of functionally duplicate code

Consider the following code snippet, which calculates the average of each of two integer arrays:

int array1[4] = {3,5,7,9};
int array2[3] = {2,4,6};

int sum1 = 0;
int sum2 = 0;
int average1 = 0;
int average2 = 0;

for (int i = 0; i < 4; i++)
{
   sum1 += array1[i];
}
average1 = sum1/4;

for (int i = 0; i < 3; i++)
{
   sum2 += array2[i];
}
average2 = sum2/3;

The two loops can be replaced by calls to a single function:

int calcAverage (int* myArray, int length)
{
   int sum = 0;
   for (int i = 0; i < length; i++)
   {
       sum += myArray[i];
   }
   return sum/length;
}
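
As a usage sketch, each of the two original loops then reduces to one call. The driver printAverages is a hypothetical name added for illustration. The example assumes length is greater than zero, and the result is truncated because the function uses integer division.

```c
#include <stdio.h>

/* The shared function, as given in the article. */
int calcAverage (int* myArray, int length)
{
   int sum = 0;
   for (int i = 0; i < length; i++)
   {
       sum += myArray[i];
   }
   return sum/length;   /* integer division; assumes length > 0 */
}

/* Hypothetical driver: one call per array replaces each duplicated loop. */
void printAverages (void)
{
   int array1[4] = {3,5,7,9};
   int array2[3] = {2,4,6};

   int average1 = calcAverage(array1, 4);
   int average2 = calcAverage(array2, 3);

   printf("%d %d\n", average1, average2);   /* prints "6 4" */
}
```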


External links