Memory latency

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by 70.35.227.160 (talk) at 16:21, 24 July 2006.

In computing, memory latency is the time between initiating a request for a byte or word in memory and its retrieval. If the data are not in the processor's cache, retrieval takes longer, as the processor must communicate with the external memory cells. Latency is therefore a fundamental measure of the speed of memory: the lower the latency, the faster the memory.
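One common way to observe memory latency empirically is a pointer-chasing loop, in which each load depends on the result of the previous one, so the processor cannot overlap requests. The following Python sketch is illustrative only (the function name and sizes are arbitrary, and interpreter overhead inflates the absolute numbers); in a compiled language the contrast between a cache-resident array and a DRAM-resident one is much starker:

```python
import random
import time

def chase(n, trials=200_000):
    # Build one random cycle through n slots so every access
    # depends on the value just loaded (a "pointer chase").
    order = list(range(n))
    random.shuffle(order)
    nxt = [0] * n
    for i in range(n):
        nxt[order[i]] = order[(i + 1) % n]
    i = 0
    start = time.perf_counter()
    for _ in range(trials):
        i = nxt[i]  # serial dependent loads: latency-bound, not bandwidth-bound
    elapsed = time.perf_counter() - start
    return elapsed / trials * 1e9  # average nanoseconds per access

# A small array fits in cache; a much larger one forces trips to main memory.
small = chase(1_000)
large = chase(2_000_000)
```

Because each iteration must wait for the previous load to complete, the loop's running time reflects latency rather than bandwidth.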

However, memory latency should not be confused with memory bandwidth, which measures the throughput of memory. An advance in memory technology can increase bandwidth (an apparent increase in performance) while also increasing latency (an apparent decrease in performance). For example, DDR memory has been superseded by DDR2, yet DDR2 has significantly greater latency than DDR when both run at the same clock frequency. DDR2 can be clocked faster, however, increasing its bandwidth; only when its clock is significantly greater than that of DDR does DDR2 achieve lower latency than DDR.
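The trade-off between clock frequency and cycle counts can be made concrete with a back-of-the-envelope calculation. The CAS-latency figures below are typical illustrative values, not a definitive specification:

```python
def cas_latency_ns(cycles, bus_mhz):
    """Convert a CAS latency in clock cycles to nanoseconds."""
    return cycles / (bus_mhz * 1e6) * 1e9

# Illustrative numbers: DDR at 200 MHz with CL3 vs DDR2 at the
# same 200 MHz clock with CL5.
ddr = cas_latency_ns(3, 200)        # 15.0 ns
ddr2 = cas_latency_ns(5, 200)       # 25.0 ns: higher latency at equal clock
# DDR2 clocked twice as fast (400 MHz, still CL5):
ddr2_fast = cas_latency_ns(5, 400)  # 12.5 ns: now lower than DDR
```

At equal clocks DDR2's extra cycles cost it latency; only the higher clock rate lets it pull ahead.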

Some modern microprocessors, including most of AMD's range, have the memory controller on the CPU die to reduce memory latency.

Memory latency is also the time between initiating a request for data and the beginning of the actual data transfer. On a disk, latency is the time it takes for the selected sector to come around and be positioned under the read/write head.
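For a rotating disk, the average rotational latency follows directly from the spindle speed: on average the target sector is half a revolution away from the head. A short worked example (the 7200 RPM figure is a common drive speed used here for illustration):

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: half a revolution, in milliseconds."""
    seconds_per_revolution = 60.0 / rpm
    return seconds_per_revolution / 2 * 1000

# A 7200 RPM drive completes one revolution in about 8.33 ms,
# so the average wait for the sector to come around is about 4.17 ms.
latency = avg_rotational_latency_ms(7200)
```

Note that this figure is in milliseconds, several orders of magnitude larger than the nanosecond-scale latencies of main memory.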

Overview of the different kinds of memory latency