Shift Or Algorithm
The Shift Or Algorithm is a string searching algorithm that seeks a pattern, i.e. a substring, within a text string using bitwise techniques.
Explanation
Let R be a bit array of size m, Vector Rj is the value of the array R after text character y[j] has been processed. It contains information about all matches of prefixes of x that end at position j in the text.
For 0 <= i <= m-1:
Rj[i] = 0 if x[0..i] = y[j-i..j]
Rj[i] = 1 otherwise
The vector Rj+1 can be computer after Rj as follows:
For each Rj[i] = 0:
Rj+1[i+1] = 0 if x[i+1] = y[j+1]
Rj+1[i+1] = 1 otherwise
and
Rj+1[0] = 0 if x[0]=y[j+1]
Rj+1[0] = 1 otherwise
If Rj+1[m-1] = 0 then a complete match can be reported.
Shift Or Algorithm implementation
The Shift Or Algorithm looks like this when implemented in C:
#define ASIZE (CHAR_MAX + 1) int preSo( char *x, int m, unsigned long S[]) { unsigned long j, lim; int i; for( i=0; i < ASIZE; ++i ) S[i] = ~0; for( i=0, lim=0, j=1; i<m; ++i, j<<=1 ) { S[(unsigned char) x[i]] &= ~j; lim |= j; } lim = ~(lim>>1); return (lim); } void StringOrSearch( char *x, int m, char *y, int n) { unsigned long lim, state; unsigned long S[ASIZE]; int j; /* Preprocessing */ lim = preSo( x, m, S ); /* Searching */ for( state = ~0, j=0; j<n; ++j ) { state = (state<<1) | S[(unsigned char) y[j]]; if( state < lim ) OUTPUT( j-m+1 ); } }
Some assumptions regarding the algorithm. The size of the string that is being search for the pattern should be less than the "word" size of the computer running the algorithm. This is not a complete implementation; however it will give a good idea of the algorithm.