Shift Or Algorithm
The Shift Or Algorithm is a string searching algorithm that seeks a pattern, i.e. a substring, within a text using bitwise techniques.
Explanation
Let R be a bit array of size m, Vector Rj is the value of the array R after text character y[j] has been processed. It contains informations about all matches of prefixes of x that end at position j in the text.
For 0 <= i <= m-1:
Rj[i] = 0 if x[0..i] = y[j-i..j]
Rj[i] = 1 otherwise
The vector Rj+1 can be computer after Rj as follows:
For each Rj[i] = 0:
Rj+1[i+1] = 0 if x[i+1] = y[j+1]
Rj+1[i+1] = 1 otherwise
and
Rj+1[0] = 0 if x[0]=y[j+1]
Rj+1[0] = 0 otherwise
If Rj+1[m-1] = 0 then a complete match can be reported.
Shift Or Algorithm implementation
The Shift Or Algorithm, in full generality, looks like this when implemented in C:
int preSo( char *x, int m, unsigned int S[]) { unsigned int j, lim; int i; for( i=0; i < ASIZE; ++i ) S[i] = ~0; for( lim=i=0, j=1; i<m; ++i, j<<=1 ) { S[x[i]] &= ~j; lim |= j; } lim = ~(lim>>1); return (lim); } void StringOrSearch( char *x, int m, char *y, int n) { unsigned int lim, state; unsigned int S[ASIZE]; int j; /* Preprocessing */ lim = preSo( x, m, S ); /* Searching */ for( state = ~0, j=0; j<n; ++j ) { state = (state<<1) | S[y[j]]; if( state < lim ) OUTPUT( j-m+1 ); } }
Some assumptions regarding the algorithm. The size of the string that is being search for the pattern should be less than the "word" size of the computer running the algorithm. This is not an exact implementation and may not compile. This, however will give a good idea of the algorithm.