Format-preserving encryption
In cryptography, format-preserving encryption (FPE) is a mode of operation of a symmetric encryption algorithm that keeps the format of the encrypted data the same as the format of the unencrypted data. There are ways to implement FPE that are provably as secure as the underlying symmetric encryption algorithm, and three ways to do this are described in a paper by cryptographers John Black and Phillip Rogaway[1]. One of these techniques has been accepted by NIST for consideration as an approved mode of the AES algorithm under the name "Feistel Finite Set Encryption Mode (FFSEM)"[2].
The motivation for FPE
The motivation for using FPE comes from the problems associated with integrating encryption into existing applications. Suppose that you want to encrypt the 16-digit credit card number 1234567812345670
. Using other modes of the AES algorithm like ECB or CBC will transform a credit card number into a very different format.
Using AES-128-ECB, this credit card number might get encrypted to the hexadecimal value 0x96a45cbcf9c2a9425cde9e274948cb67
, which contains many bytes that are invalid characters. If a credit card number is stored in a column of a database whose entries are char
or varchar
data, then the encrypted data cannot be stored in same column without changing the format of the column. If the encryped data is Base64 encoded to ensure that it only contains valid characters, the size of the encrypted credit card number increases from 16 bytes to 24 bytes, changing the encrypted credit card number to lqRcvPnCqUJc3p4nSUjLZw==
, for example. In either case, applications that process the credit number may similarly be unable to handle an encrypted value without some modification.
Using AES-128-CBC, this credit card number might get encrypted to the hexadecimal value 0xde015724b081ea7003de4593d792fd8b695b39e095c98f3a220ff43522a2df02
, which both contains invalid characters and is longer than the unencrypted data. In addition to the problems caused by creating invalid characters and increasing the size of the data, data encrypted using the CBC mode of an encryption algorithm will also change its value when it is decrypted and encrypted again. This happens because the random seed value that is used to initialize the encryption algorithm and is included as part of the encrypted value is different for each encryption operation. This means that it is impossible to use encrypted data as a unique key to identify a row in a database.
The three FPE constructions of Black and Rogaway
Black and Rogaway described three ways to construct an FPE algorithm and proved that each of these techniques is as secure as the symmetric algorithm that is used to construct it. This means that if the AES algorithm is used to create an FPE algorithm, then the resulting FPE algorithm is as secure as AES because an adversary of defeating the FPE algorithm can also defeat the AES algorithm. So if we assume that AES is secure, then we also need to believe that the FPE algorithms constucted from it are secure. In all of the following, we use E to denote the symmetric algorithm that is used to construct an FPE algorithm.
FPE from a prefix cipher
One easy way to create an FPE algorithm is to build a lookup table that implements the encryption. Suppose that we want to be able to encrypt the n values 0, 1, ..., n-1. To do this, we can create the table I = (E(0),E(1),...E(n-1)). From this table, we create a second table J from replacing each element of the table I with its ordinal position, starting at 0. Then we encrypt an integer m as the mth entry in the table J.
For example, suppose that we want to create a way of encrypting the integers 0 through 3. We first build the table I by encrypting 0 through 7 to get I = (E(0),...,E(3)). Suppose that we get the following:
E(0)= E(1)= E(2)= E(3)=