Duplicate code
Duplicate code is a computer programming term for a sequence of code that occurs more than once in a program's source code. A minimum length is usually required before a repeated sequence is considered a duplicate rather than a coincidental similarity.
Sequences of duplicate code are sometimes known as clones.
The following are some of the ways in which two code sequences can be duplicates of each other:
- character for character identical
- character for character identical with white space characters and comments being ignored
- token for token identical
- functionally identical
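As a rough illustration of the second kind of match, the sketch below compares two code fragments while skipping white space. It is a minimal example under stated assumptions, not a complete implementation: the function name same_ignoring_whitespace is hypothetical, and comments are not stripped, so it covers only the white space half of that rule.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when a and b are identical once every
   white space character is skipped. Comments are NOT stripped here. */
bool same_ignoring_whitespace(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    for (;;) {
        /* Skip white space on both sides before comparing. */
        while (isspace((unsigned char)*a)) a++;
        while (isspace((unsigned char)*b)) b++;
        if (*a != *b) return false;   /* mismatch, or one string ended early */
        if (*a == '\0') return true;  /* both strings ended together */
        a++;
        b++;
    }
}
```

Because all white space is dropped, this comparison would also treat "int x" and "intx" as equal. A token for token comparison avoids that by splitting the text into tokens first.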
A number of different algorithms have been proposed to detect duplicate code:
- Baker's paper
- Swiss work on visual clone detection
- work by other researchers
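To make the idea of detection concrete, here is a deliberately naive sketch. It is not any of the algorithms listed above. It compares every line of one file against every line of another and reports the longest run of identical consecutive lines, then applies a minimum length so that short coincidental matches are ignored. The names longest_common_run, has_clone and min_lines are hypothetical, and the brute-force search is assumed to be acceptable only for small inputs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the longest run of identical consecutive
   lines shared by a[0..na) and b[0..nb), found by brute-force comparison. */
size_t longest_common_run(const char *const a[], size_t na,
                          const char *const b[], size_t nb)
{
    size_t best = 0;
    for (size_t i = 0; i < na; i++) {
        for (size_t j = 0; j < nb; j++) {
            /* Extend the match starting at a[i] and b[j] as far as it goes. */
            size_t k = 0;
            while (i + k < na && j + k < nb &&
                   strcmp(a[i + k], b[j + k]) == 0) {
                k++;
            }
            if (k > best) best = k;
        }
    }
    return best;
}

/* Reports a clone only when the shared run reaches min_lines lines. */
int has_clone(const char *const a[], size_t na,
              const char *const b[], size_t nb, size_t min_lines)
{
    assert(min_lines > 0);
    return longest_common_run(a, na, b, nb) >= min_lines;
}
```

The running time grows with the product of the two file lengths, which is one reason more specialised algorithms have been proposed.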
Example of functionally duplicate code
Consider the following code snippet for calculating the averages of two arrays of integers:
int array1[4] = {3, 5, 7, 9};
int array2[3] = {2, 4, 6};
int sum1 = 0;
int sum2 = 0;
int average1 = 0;
int average2 = 0;

for (int i = 0; i < 4; i++) {
    sum1 += array1[i];
}
average1 = sum1 / 4;

for (int i = 0; i < 3; i++) {
    sum2 += array2[i];
}
average2 = sum2 / 3;
The two loops can be replaced by calls to a single function:
int calcAverage(int* myArray, int length) {
    int sum = 0;
    for (int i = 0; i < length; i++) {
        sum += myArray[i];
    }
    return sum / length;
}
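The following sketch shows how the two original loops might collapse into two calls to that function. The const qualifier, the assert guard against an empty array, and the driver function computeAverages are assumptions added for illustration and are not part of the example above.

```c
#include <assert.h>

/* Same logic as calcAverage above; const and the assert are added guards. */
int calcAverage(const int *myArray, int length)
{
    assert(length > 0);          /* assumption: empty arrays are rejected */
    int sum = 0;
    for (int i = 0; i < length; i++) {
        sum += myArray[i];
    }
    return sum / length;         /* integer division truncates */
}

/* Hypothetical driver: each former loop becomes a single call. */
void computeAverages(int *average1, int *average2)
{
    int array1[4] = {3, 5, 7, 9};
    int array2[3] = {2, 4, 6};
    *average1 = calcAverage(array1, 4);
    *average2 = calcAverage(array2, 3);
}
```

With the summing logic in one place, a mistake such as dividing the wrong sum can no longer be made in one copy of the loop and missed in the other.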