Fragmentation (computing)
External fragmentation =
External fragmentation is the phenomenon in which the free space in a piece of storage, that is, the space still available for use, becomes divided into many small pieces. It is caused over time by allocating and deallocating ("freeing") pieces of the storage space of many different sizes. The result is that, although plenty of free space may exist in total, it may not all be usable, or at least not as efficiently as one would like.
For example, in dynamic memory allocation, a block of 1000 bytes might be requested, but the largest contiguous block of free space, or memory hole, has only 300 bytes. Even if there are ten blocks of 300 bytes of free space, separated by allocated regions, the requested block of 1000 bytes still cannot be allocated.
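The arithmetic of this example can be sketched in a few lines of Python. This is only an illustration, not a real allocator: the list `holes` and the helper `can_allocate` are hypothetical names introduced here, and each hole is represented only by its size.

```python
def can_allocate(holes, request):
    """Return True if any single contiguous hole can hold the request."""
    return any(size >= request for size in holes)

# Ten free holes of 300 bytes each, separated by allocated regions.
holes = [300] * 10
total_free = sum(holes)    # 3000 bytes free in total
largest_hole = max(holes)  # but no single hole exceeds 300 bytes

print(total_free, largest_hole, can_allocate(holes, 1000))
```

The total free space is three times the size of the request, yet the request fails because no single hole is large enough.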
External fragmentation also occurs in file systems as many files of different sizes are created, change size, and are deleted. The effect is even worse if a file which is divided into many small pieces is deleted, because this leaves similarly small regions of free space.
External fragmentation can be eliminated through a process called compaction, in which all existing objects are moved into one large contiguous block, leaving all of the remaining free space in another single block. Moving garbage collectors use compaction to improve dynamic memory allocation performance, and tools that defragment disk drives also perform a compaction step. It is often possible to perform a partial but still useful form of compaction more efficiently, or to compact continually in an incremental fashion so that external fragmentation is always kept low.
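A minimal sketch of compaction follows, assuming allocated regions are described by hypothetical `(name, start, size)` tuples. It only computes the new layout; a real compactor must also copy the data and update every reference to the moved objects.

```python
def compact(blocks, capacity):
    """Slide allocated blocks to the start of storage, preserving their order.

    blocks: list of (name, start, size) tuples for allocated regions.
    Returns (new_blocks, free_start, free_size).
    """
    new_blocks = []
    cursor = 0  # next free address in the compacted layout
    for name, _start, size in sorted(blocks, key=lambda b: b[1]):
        new_blocks.append((name, cursor, size))
        cursor += size
    return new_blocks, cursor, capacity - cursor

# Three allocated regions with holes between them in a 1000-byte space.
blocks = [("a", 0, 100), ("b", 400, 200), ("c", 900, 50)]
compacted, free_start, free_size = compact(blocks, capacity=1000)
print(compacted, free_start, free_size)
```

After compaction the free space, previously split into holes of 300, 300, and 50 bytes, forms a single 650-byte region starting at address 350.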
Data fragmentation =
Data fragmentation occurs when a piece of data in memory is broken up into many pieces that are not close together. It is typically the result of attempting to insert a large object into storage that has already suffered external fragmentation.
For example, files in a file system are often broken up into pieces called blocks. When a disk is new, there is space to store the blocks of a file all together in one place. This allows for rapid sequential file reads and writes. However, as files are added, removed, and changed in size, the disk becomes externally fragmented, leaving only small holes in which to place new data. When a new file is written, or when an existing file is extended, the new data blocks are scattered across the disk, slowing access because of the seek time of the read/write head and the rotational delay of the disk.
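One simple way to quantify this kind of fragmentation is to count the runs of consecutive blocks that make up a file. The sketch below assumes a file is described by a plain list of block numbers in logical order; the function name `count_fragments` is hypothetical and does not correspond to any particular file system's interface.

```python
def count_fragments(block_numbers):
    """Count runs of consecutive block numbers in a file's block list."""
    if not block_numbers:
        return 0
    fragments = 1
    for prev, cur in zip(block_numbers, block_numbers[1:]):
        if cur != prev + 1:  # a jump means a new fragment starts here
            fragments += 1
    return fragments

contiguous_file = [10, 11, 12, 13]
scattered_file = [10, 11, 57, 58, 3, 90]

print(count_fragments(contiguous_file), count_fragments(scattered_file))
```

The contiguous file consists of a single fragment, while the scattered file consists of four, each of which may cost a separate seek on a rotating disk.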
As another example, if the nodes of a linked list are allocated consecutively in memory, this improves locality of reference and enhances data cache performance during traversal of the list. If the memory pool's free space has become fragmented, however, the linked list nodes will be spread throughout memory, increasing the number of cache misses.
Just as compaction can eliminate external fragmentation, data fragmentation can be eliminated by rearranging pieces of data so that related pieces are close together. For example, the primary job of a defragmentation tool is to rearrange blocks on disk so that the blocks of each file are contiguous and in order. Some moving garbage collectors will also move related objects close together to improve cache performance.
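The rearrangement performed by a defragmentation tool can be sketched as follows, under the simplifying assumptions that every file fits on the disk and that blocks can be relocated freely. The dictionary layout and the name `defragment` are hypothetical; real tools must move data in place and cope with limited scratch space.

```python
def defragment(files):
    """Assign each file a contiguous, in-order run of blocks.

    files: dict mapping file name -> list of block numbers currently in use.
    Returns a dict mapping file name -> new list of block numbers.
    """
    layout = {}
    next_block = 0  # first block not yet assigned in the new layout
    for name, blocks in files.items():
        layout[name] = list(range(next_block, next_block + len(blocks)))
        next_block += len(blocks)
    return layout

files = {"a.txt": [7, 2, 9], "b.txt": [0, 5]}
layout = defragment(files)
print(layout)
```

In the resulting layout each file occupies consecutive blocks in order, so it can be read sequentially.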
Internal fragmentation =
Internal fragmentation refers to the result of reserving a piece of space without ever intending to use it. This space is wasted. While this seems foolish, it is often accepted in return for increased efficiency or simplicity.
For example, in many file systems, files always start at the beginning of a sector, because this simplifies organization and makes it easier to grow files. Any space left over between the last byte of the file and the first byte of the next sector is internal fragmentation. Similarly, a program which allocates a single byte of data is often allocated many additional bytes for metadata and alignment. This extra space is also internal fragmentation.
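The space lost at the end of a file can be computed directly. The sketch below assumes a sector size of 512 bytes, which is only an illustrative default, and the function name `slack_bytes` is hypothetical.

```python
def slack_bytes(file_size, sector_size=512):
    """Bytes left unused in the last sector occupied by a file."""
    remainder = file_size % sector_size
    return 0 if remainder == 0 else sector_size - remainder

# A 1000-byte file occupies two 512-byte sectors and wastes 24 bytes.
print(slack_bytes(1000), slack_bytes(1024), slack_bytes(1))
```

A file whose size is an exact multiple of the sector size wastes nothing, whereas a one-byte file wastes almost the entire sector.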
Another common example: characters are often stored in 8-bit bytes even though in standard ASCII strings the 8th bit of each byte is always zero. The "wasted" bits are internal fragmentation.
Similar problems with leaving reserved resources unused appear in many other areas. For example, IP addresses can only be reserved in blocks of certain sizes, resulting in many IP addresses that are reserved but not actively used. This contributes to the IPv4 address shortage.
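As a rough illustration, assume that address blocks come only in power-of-two sizes, as is the case with CIDR prefixes, and ignore any addresses reserved within a block for network or broadcast use. The function name `smallest_block` is hypothetical.

```python
def smallest_block(needed):
    """Smallest power-of-two block size that holds `needed` addresses."""
    size = 1
    while size < needed:
        size *= 2
    return size

needed = 300
block = smallest_block(needed)
unused = block - needed
print(block, unused)
```

An organization needing 300 addresses must reserve a block of 512, leaving 212 addresses reserved but unused.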
Unlike other types of fragmentation, internal fragmentation is difficult to reclaim; usually the best way to remove it is with a design change. For example, in dynamic memory allocation, memory pools drastically cut internal fragmentation by spreading the space overhead over a larger number of objects.
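The effect of a memory pool on space overhead can be sketched with simple arithmetic. The figures below are assumptions chosen for illustration: a 16-byte bookkeeping header per allocation, 8-byte objects, and pools of 64 objects. Actual values depend on the allocator and platform.

```python
HEADER_BYTES = 16  # assumed bookkeeping overhead per allocation
OBJECT_BYTES = 8   # assumed size of each object

def individual_total(count):
    """Bytes used when every object is allocated separately."""
    return count * (HEADER_BYTES + OBJECT_BYTES)

def pooled_total(count, pool_capacity):
    """Bytes used when objects are carved out of fixed-size pools."""
    pools = -(-count // pool_capacity)  # ceiling division
    return pools * (HEADER_BYTES + pool_capacity * OBJECT_BYTES)

count = 128
useful = count * OBJECT_BYTES
print(individual_total(count) - useful, pooled_total(count, 64) - useful)
```

Under these assumptions, 128 objects carry 2048 bytes of overhead when allocated individually but only 32 bytes when drawn from two pools. A partially filled pool would reintroduce some internal fragmentation in the form of unused slots.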