Locality of reference
One common pattern in computing is that memory locations are read in batches: a program that has just read one location is much more likely to read a nearby location soon afterwards. This tendency is known as locality of reference, and it opens the door to performance gains whenever the memory system can serve accesses to nearby locations faster than accesses to arbitrary ones.
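As an illustration (not part of the original text), the C sketch below shows the kind of access pattern that benefits from spatial locality. Traversing a two-dimensional array row by row touches adjacent memory addresses, while traversing it column by column jumps across memory; on most hardware the first loop is noticeably faster because each chunk of memory fetched into the cache is fully used before it is evicted.

    #include <stddef.h>

    #define ROWS 1024
    #define COLS 1024

    static int grid[ROWS][COLS];

    /* Row-major traversal: consecutive iterations touch adjacent
       addresses, so each cache line loaded from RAM is fully used. */
    long sum_row_major(void)
    {
        long sum = 0;
        for (size_t r = 0; r < ROWS; r++)
            for (size_t c = 0; c < COLS; c++)
                sum += grid[r][c];
        return sum;
    }

    /* Column-major traversal: consecutive iterations are COLS ints
       apart in memory, defeating spatial locality and causing far
       more cache misses for the same amount of work. */
    long sum_col_major(void)
    {
        long sum = 0;
        for (size_t c = 0; c < COLS; c++)
            for (size_t r = 0; r < ROWS; r++)
                sum += grid[r][c];
        return sum;
    }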
To take advantage of this pattern, caches have long been used to improve performance. Instead of fetching data one location at a time, a cache assumes that nearby locations will be read soon and loads an entire "page" of memory into a much smaller but faster pool. When the program asks for the next piece of data, there is a good chance it is already in that pool because of locality of reference.
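A minimal sketch of this idea, assuming a hypothetical slow backing device that can only be read one whole block at a time: on a miss, the cache copies the entire block into a fast buffer, so subsequent reads of neighbouring addresses are served from the buffer rather than the slow device. The device stub and buffer layout here are illustrative, not a description of any particular hardware.

    #include <stddef.h>
    #include <stdint.h>

    #define BLOCK_SIZE 4096              /* size of one cached "page" in bytes */

    /* Hypothetical slow device (stand-in for a disk): fills dst with a
       deterministic pattern so the example is self-contained. */
    static void slow_read_block(size_t block_number, uint8_t *dst)
    {
        for (size_t i = 0; i < BLOCK_SIZE; i++)
            dst[i] = (uint8_t)(block_number + i);
    }

    static uint8_t cache_block[BLOCK_SIZE];      /* the fast pool (one block) */
    static size_t  cached_block = (size_t)-1;    /* which block it currently holds */

    /* Read one byte at a given address, going to the slow device only
       when the containing block is not already in the fast pool. */
    uint8_t cached_read(size_t address)
    {
        size_t block  = address / BLOCK_SIZE;
        size_t offset = address % BLOCK_SIZE;

        if (block != cached_block) {             /* cache miss: load whole block */
            slow_read_block(block, cache_block);
            cached_block = block;
        }
        return cache_block[offset];              /* hit: served from fast memory */
    }

Sequential reads through cached_read hit the slow device only once per 4096 bytes; a purely random access pattern would gain almost nothing, which is exactly why locality of reference matters.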
Historically, caching typically referred to staging data from disk drives and other slow devices into faster memory such as core. The same technique has since moved up the memory hierarchy: a modern CPU uses a pool of very fast on-chip memory to cache data read from RAM, and RAM in turn can deliver whole pages of data in bursts, specifically so that the CPU cache can be filled more quickly.