Scale factor (computer science)
A scale factor is used in computer science when a real-world set of numbers needs to be represented on a different scale in order to fit a specific number format. For instance, a 16-bit unsigned integer (uint16) can only hold a value as large as 65,535₁₀. If uint16 values are to represent values from 0 to 131,070₁₀, a scale factor of 1⁄2 is introduced. Although the scale factor extends the range, it also decreases the precision. In this example, for instance, the number 3 cannot be represented, because a stored 1 represents a real-world 2, and a stored 2 represents a real-world 4.
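The uint16 example above can be sketched in a few lines of Python (the names `store` and `load`, and the use of integer division for scaling, are illustrative assumptions, not part of any particular library):

```python
SCALE = 2  # real value = stored value * 2, i.e. a scale factor of 1/2

def store(real_value):
    # Apply the scale factor: halve the value so 0..131070 fits in 16 bits.
    stored = real_value // SCALE
    assert 0 <= stored <= 0xFFFF  # still fits in a uint16
    return stored

def load(stored_value):
    # Undo the scale factor to recover the real-world value.
    return stored_value * SCALE

print(load(store(131070)))  # 131070 -- round-trips exactly
print(load(store(3)))       # 2 -- odd values lose precision
```

Note that `store(2)` and `store(3)` produce the same bit pattern, which is exactly the precision loss described above.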
Uses
Certain number formats may be chosen for an application for convenience in programming, or because of certain advantages offered by the hardware for that number format. For instance, early processors did not natively support the IEEE floating-point standard for representing fractional values, so integers were used to store representations of real-world values by applying a scale factor to the real value. Similarly, because hardware arithmetic has a fixed width (commonly 16, 32, or 64 bits), scale factors allow representation of larger numbers (by manually multiplying or dividing by the specified scale factor), though at the expense of precision.[1] By necessity, this was done in software, since the hardware did not support fractional values. Scale factors are also used in floating-point numbers, where they are always powers of two. For example, the double-precision format sets aside 11 bits for the scaling factor (a binary exponent) and 53 bits for the significand, allowing various degrees of precision for representing different ranges of numbers, and expanding the range of representable numbers beyond what could be represented using 64 explicit bits (though at the cost of precision).[2]
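The power-of-two scale factor inside a floating-point number can be inspected directly; for instance, Python's standard `math.frexp` splits a float into a significand and a binary exponent (a minimal illustration, not tied to any particular hardware):

```python
import math

# frexp returns (m, e) with value == m * 2**e and 0.5 <= m < 1.
m, e = math.frexp(6.5)
print(m, e)                     # 0.8125 3, since 6.5 = 0.8125 * 2**3
assert math.ldexp(m, e) == 6.5  # ldexp reapplies the binary scale factor
```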
Operations on scaled values
Once the scaled representation of a real value is stored, the scaling can often be ignored until the value needs to come back into the "real world". For instance, adding two scaled values is just as valid as unscaling the values, adding the real values, and then scaling the result, and the former is much easier and faster. In either approach, though, the two added numbers must be scaled the same.[3] For other operations, the scaling is very important.
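The equivalence for addition can be checked with a short sketch (the values and the scale factor 1⁄2 below are arbitrary choices for illustration):

```python
Z = 0.5                  # scale factor
A, B = 10.0, 24.0        # real-world values

AZ, BZ = A * Z, B * Z    # scaled representations

# Adding the scaled values directly gives the same stored result as
# unscaling, adding the real values, and rescaling the sum.
assert AZ + BZ == (A + B) * Z
print(AZ + BZ)  # 17.0, the scaled representation of 34.0
```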
Multiplication, for instance, needs to take into account that both numbers are scaled. As an example, consider two real world values A and B. The real world multiplication of these real world values is:
A * B = P
If they are instead represented with a scale factor of Z, and these scaled representations are subsequently multiplied, the result is the following:
AZ * BZ = Q
AZ is the scaled real world value of A, or simply the product of A * Z, and likewise, BZ is the scaled representation of B. After the scaled multiplication, the answer is not written PZ, because the value stored in PZ is not the answer. This can be seen by rearranging the statement, where each line in the following is equivalent:
AZ * BZ = Q
A * Z * B * Z = Q
(A * B) * Z * Z = Q
P * Z * Z = Q
PZ * Z = Q
In line 4, P substitutes A * B. It follows that the result of AZ * BZ (which is Q) is not PZ, but rather PZ * Z. If PZ were the answer, it could be stored directly since it has the scale factor built in, as is the case with addition and subtraction. For multiplication, however, the product of two scaled values has an extra scaling built in. As long as this is taken into account, there is still no need to convert AZ and BZ into A and B before performing the operation; the result must be divided by Z before storing it back. After this, PZ will be stored as the result of the multiplication, which is indeed the scaled representation of the result of A * B (the desired answer) rather than the result of AZ * BZ (which is still scaled).
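The extra factor of Z can be demonstrated with arbitrary example values (A, B, and Z below are illustrative):

```python
Z = 0.5
A, B = 6.0, 7.0          # real-world values; P = A * B = 42
AZ, BZ = A * Z, B * Z    # scaled representations

Q = AZ * BZ              # scaled multiplication carries an extra factor of Z
PZ = Q / Z               # divide once by Z before storing the result

assert Q == (A * B) * Z * Z   # Q is PZ * Z, not the answer
assert PZ == (A * B) * Z      # PZ is the correctly scaled product
```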
Common scaling scenarios
Fractional values scaled to integers
As previously described, many older processors (and possibly some current ones) do not natively support fractional mathematics. In this case, fractional values can be scaled into integers by multiplying them by ten to the power of whatever decimal precision is desired. In other words, to preserve n digits to the right of the decimal point, it is necessary to multiply the entire number by 10ⁿ. In computers, which perform calculations in binary, the real number is multiplied by 2ᵐ to preserve m digits to the right of the binary point; alternatively, one can bit shift the value m places to the left. For example, in the following set of real-world fractional values, all have three digits to the right of the decimal point:
15.400, 0.133, 4.650, 1.000, 8.001
To save all of that information (in other words, not lose any precision), these numbers must be multiplied by 10³ (1,000), giving integer values of:
15400, 133, 4650, 1000, 8001
Because of the magnitude of the scaled numbers, they cannot be stored in 8-bit integers; they require at least 14 unsigned bits or, more realistically, 16.
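This decimal scaling can be written out briefly (using `round` to absorb binary floating-point noise; the variable names are illustrative):

```python
values = [15.400, 0.133, 4.650, 1.000, 8.001]

# Preserve 3 decimal digits by multiplying by 10**3.
scaled = [round(v * 10**3) for v in values]
print(scaled)  # [15400, 133, 4650, 1000, 8001]

# The largest value, 15400, needs 14 bits; a uint16 holds all of them.
assert max(scaled).bit_length() == 14
assert max(scaled) < 2**16
```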
Integer values to fixed point
Certain processors, particularly digital signal processors (DSPs) common in the embedded-system industry, have built-in support for fixed-point arithmetic, such as the Q and IQ formats.
Since the fractional part of a number takes up some bits in the field, the range of values that a fixed-point field can hold is smaller than that of an integer field of the same width.[4] For instance, in an 8-bit field, an unsigned integer can store values from [0, 255], but an unsigned fixed-point number with 5 bits allocated to the fractional part has only 3 bits left for the integer part, and so can only store integer values from [0, 7]. (The number of distinct values that the two fields can store is the same, 2⁸ = 256, because the fixed-point field can also store 2⁵ = 32 fractional values for each integer value.) It is therefore common to use a scale factor to store real-world values that may be larger than the maximum value of the fixed-point format.
As an example, when using an unsigned 8-bit fixed-point format (which has 4 integer bits and 4 fractional bits), the highest representable integer value is 15, and the highest representable mixed value is 15.9375 (0xF.F or 1111.1111b). If the desired real world values are in the range [0,160], they must be scaled to fit within this fixed-point representation. A scale factor of 1⁄10 cannot be used here, because scaling 160 by 1⁄10 gives 16, which is greater than the greatest value that can be stored in this fixed-point format. However, 1⁄11 will work as a scale factor, because the maximum scaled value, 160/11 = 14.54, fits within this range. Given this set:
154, 101, 54, 3, 0, 160
Scaling these with the scale factor 1⁄11 gives the following values:
154/11 = 14
101/11 = 9.1818...
54/11 = 4.9090...
3/11 = 0.2727...
0/11 = 0
160/11 = 14.5454...
Many of these values have been truncated because they contain repeating decimals, a consequence of the chosen scale factor (elevenths do not terminate in decimal). When these are stored in the fixed-point format, some precision is lost (in contrast to the exact values of the original integers). This is also a problem because an 8-bit format can store 256 distinct values, while the numbers in this set come from a range with only 161 possible values (0 through 160). As it turns out, the problem is the scale factor, 1⁄11, which introduced unnecessary precision requirements and rounding error (when approximating a real value with the nearest representable value).[5] To avoid or resolve this problem, a better scale factor must be chosen.
Picking a scale factor
The example above illustrated how certain scale factors can cause unnecessary precision loss. This section revisits that example to explore the situation further.
Representations of the real data are stored in 8-bit unsigned fixed-point fields with 4 integer bits and 4 fractional bits. This gives a range of [0, 15.9375] in decimal, or [0x0.0, 0xF.F] in hexadecimal. The real-world data are all integers in the range [0, 160] in decimal. There are only 161 unique values to store, so the 8-bit field should suffice, since 8 bits allow 256 unique configurations.
In the example above, a scale factor of 1⁄11 was picked so that all the numbers would be small enough to fit in the range. However, problems emerged when scaling the following real-world data:
154, 101, 54, 3, 0, 160
The precision of these fractions turned out to be a problem. The following shows the original data, the scaled decimal values, and the binary equivalent of each scaled value:
154/11 = 14 = 1110.0
101/11 = 9.1818... = 1001.00101110...
54/11 = 4.9090... = 100.111010...
3/11 = 0.2727... = 0.010001...
0/11 = 0 = 0.0
160/11 = 14.5454... = 1110.10001011...
Several of the binary fractions require more than the 4 fractional bits provided by the fixed-point format. To fit them into the fields, the remaining bits are simply truncated, giving the following stored representations:
1110.0000
1001.0010
0100.1110
0000.0100
0000.0000
1110.1000
Or in decimal:
14.0
9.125
4.875
0.25
0.0
14.5
When these values need to be brought back into the real world, they are divided by the scale factor, 1⁄11, giving the following "real world" values:
154.0
100.375
53.625
2.75
0
159.5
These values have changed: they are no longer all integers, which immediately indicates that error was introduced by the storage, due to a poor choice of scale factor.
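The whole round trip with the 1⁄11 scale factor can be reproduced with exact integer arithmetic (the helper names are illustrative; the integer division truncates bits exactly as described above):

```python
FRAC_BITS = 4  # unsigned 4.4 fixed point: 4 integer bits, 4 fractional bits

def to_fixed(value, denominator):
    # Scale by 1/denominator while keeping FRAC_BITS fractional bits;
    # integer division truncates the excess fractional bits.
    return (value << FRAC_BITS) // denominator

def from_fixed(stored, denominator):
    # Shift the binary point back and undo the scale factor.
    return stored / 2**FRAC_BITS * denominator

data = [154, 101, 54, 3, 0, 160]
stored = [to_fixed(v, 11) for v in data]
recovered = [from_fixed(s, 11) for s in stored]
print(recovered)  # [154.0, 100.375, 53.625, 2.75, 0.0, 159.5]
```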
Picking a better scale factor
Most data sets will not have a perfect scale factor; some error will usually be introduced by the scaling process. However, it may well be possible to pick a better scale factor. Dividing a number by a power of two is the same as shifting all its bits one place to the right for each power of two (just as, in decimal, dividing by 10 shifts every digit one place to the right, and dividing by 100 shifts them two places). The pattern of bits does not change; it only moves. Dividing by a number that is not an integer power of two, on the other hand, changes the bit pattern, and is likely to produce a pattern with even more bits to the right of the binary point, artificially increasing the precision required. It is therefore almost always preferable to use a scale factor that is a power of two. Bits shifted off the right end of the field may still be lost, but at least no new bits are introduced only to be shifted off the end.
To illustrate the use of a power of two in the scale factor, consider a factor of 1⁄16 with the above data set. The binary values of the original data are given below:
154 = 1001 1010
101 = 0110 0101
54 = 0011 0110
3 = 0000 0011
0 = 0000 0000
160 = 1010 0000
All of these values fit in 8 bits. Scaling them by 1⁄16 is the same as dividing by 16, which is the same as shifting the bits 4 places to the right. All that really means is inserting a binary point between the first four and last four bits of each number, which is conveniently the exact format of the fixed-point fields. Since none of these numbers requires more than 8 bits to represent as an integer, scaling them down and fitting them into the fixed-point format also takes no more than 8 bits.
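With the same truncating conversion as before, the power-of-two factor 1⁄16 is visibly lossless (the helper name is again illustrative):

```python
FRAC_BITS = 4

def to_fixed(value, denominator):
    # Truncating fixed-point conversion: scale by 1/denominator,
    # keeping FRAC_BITS fractional bits.
    return (value << FRAC_BITS) // denominator

data = [154, 101, 54, 3, 0, 160]
stored = [to_fixed(v, 16) for v in data]

# Dividing by 16 and keeping 4 fractional bits is a pure shift: the stored
# 8-bit patterns equal the original integers, so no bits are truncated.
assert stored == data
recovered = [s / 2**FRAC_BITS * 16 for s in stored]
assert recovered == [float(v) for v in data]  # every value round-trips exactly
```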
References
- ^ Linz 2003, pp. 12–13.
- ^ Linz 2003, pp. 14–15.
- ^ Yates 2013, p. 6.
- ^ Yates 2013, pp. 4–5.
- ^ Linz 2003, p. 18.
- Yates, R. (2013). "Fixed-Point Arithmetic: An Introduction" (PDF). Digital Signal Labs. Archived (PDF) from the original on 12 September 2015.
- Linz, P.; Wang, R. L. C. (2003). Exploring Numerical Methods: An Introduction to Scientific Computing Using MATLAB. Jones and Bartlett Publishers. ISBN 0-7637-1499-2.