Jump to content

Shift Or Algorithm

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Canley (talk | contribs) at 10:15, 1 August 2006 (added category, some cleanup). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The Shift Or Algorithm is a string searching algorithm that seeks a pattern, i.e. a substring, within a text string using bitwise techniques.

Explanation

Let R be a bit array of size m, Vector Rj is the value of the array R after text character y[j] has been processed. It contains informations about all matches of prefixes of x that end at position j in the text.


For 0 <= i <= m-1:


Rj[i] = 0 if x[0..i] = y[j-i..j]

Rj[i] = 1 otherwise


The vector Rj+1 can be computer after Rj as follows:


For each Rj[i] = 0:


Rj+1[i+1] = 0 if x[i+1] = y[j+1]

Rj+1[i+1] = 1 otherwise


and


Rj+1[0] = 0 if x[0]=y[j+1]

Rj+1[0] = 1 otherwise


If Rj+1[m-1] = 0 then a complete match can be reported.

Shift Or Algorithm implementation


The Shift Or Algorithm, in full generality, looks like this when implemented in C:

int preSo( char *x, int m, unsigned int S[])
{
	unsigned int j, lim;
	int i;

	for( i=0; i < ASIZE; ++i )
		S[i] = ~0;
	for( lim=i=0, j=1; i<m; ++i, j<<=1 )
	{
		S[x[i]] &= ~j;
		lim |= j;
	}
	lim = ~(lim>>1);
	return (lim);
}

void StringOrSearch( char *x, int m, char *y, int n)
{
	unsigned int lim, state;
	unsigned int S[ASIZE];
	int j;

	/* Preprocessing */
	lim = preSo( x, m, S );

	/* Searching */
	for( state = ~0, j=0; j<n; ++j )
	{
		state = (state<<1) | S[y[j]];
		if( state < lim )
			OUTPUT( j-m+1 );
	}
}

Some assumptions regarding the algorithm. The size of the string that is being search for the pattern should be less than the "word" size of the computer running the algorithm. This is not an exact implementation and may not compile. This, however will give a good idea of the algorithm.