Log-structured File System (BSD)
The Log-Structured Filesystem (or LFS) is an implementation of a log-structured file system for NetBSD. As of NetBSD 2.0.2, it appears to be no longer completely functional, having not been maintained for many years.
Design
Most of the on-disk format of LFS is borrowed from UFS. The indirect block, inode and directory formats are almost identical. This allows well-tested UFS file system code to be re-used.
LFS divides the disk into segments (usually between 384KB and 1MB in size), which will be filled bottom-up, one at a time. Each segment has a header called a summary block; pointers in the summary block together already-written segments into one long continuous log. Whenever a file is changed, LFS writes to the log head:
- Raw file data (or directory blocks, if a directory is being updated).
- Indirect blocks updated to point to (1).
- Inodes updated to point to (2).
- Inode map blocks updated to point at (3).
Unlike UFS, inodes in LFS do not have fixed locations, so an inode map—a flat list of inode block locations—is used to track them. As with everything else, inode map blocks are also written to the log when they are changed.
At a checkpoint (usually scheduled about once every 30 seconds), LFS writes the number of the current segment and the last known block locations of the inode map to a checkpoint region at a fixed place on disk. There are two such regions; LFS alternates between them each checkpoint. Once written, a checkpoint represents the last consistent snapshot of the file system. Recovery after a crash and normal mounting work the same way—the file system simply reconstructs its state from the last checkpoint and resumes logging from there.
When the current segment is filled, LFS goes onto to fill the next free (clean) one. Segments are said to be dirty if they contain live blocks, or blocks for which no newer copies exist further ahead in the log. The LFS garbage collector turns dirty segments into clean ones by copying live blocks from the dirty segment into the current segment and skipping the rest. Live blocks are tracked with a map in the summary block.
Generally, garbage collection is delayed until there are no clean segments left; it can also be deferred for when the system is idle. Even then, only the least-dirty segments are picked for collection. This is intended to avoid the penalty of cleaning full segments when I/O bandwidth is most needed.
Disadvantages
- There can be severe file fragmentation in LFS, especially for slowly-growing files or multiple simultaenous large writes. This inflicts a severe performance penalty, even though the design rationale for log-structured file systems assumes disk reads will mostly be cached away.
- LFS becomes progressively less efficient as it nears maximum capacity, when the garbage collector has to run almost constantly to make clean segments available.
References
- Rosenblum, Mendel and Ousterhout, John K. (June 1990). "The LFS Storage Manager." Proceedings of the 1990 Summer Usenix. 315-324.
- Rosenblum, Mendel and Ousterhout, John K. (February 1992). "The Design and Implementation of a Log-Structured Filesystem." ACM Transactions on Computer Systems. 10(1). 26-52.
- Seltzer, Margo, Bostic, K., McKusick, M. and Staelin, C. (January 1993). "The Design and Implementation of the 4.4BSD Log-structured File System". Proceedings of the 1993 Winter Usenix.
- McKusick, M. et al (1996). The Design and Implementation of the 4.4BSD Operating System. Addison Wesley. ISBN 0-201-54979-4.
- J. Matthews, D. Roselli, A. Costello, R. Wang, and T. Anderson (October 1997). "Improving the Performance of Log-Structured File Systems with Adaptive Methods", Proceedings of the Sixteenth ACM SOSP.