Clustered file system

A clustered file system is a file system that is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system and instead rely only on direct-attached storage for each node. Clustered file systems can provide features such as location-independent addressing and redundancy, which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spreads data across multiple storage nodes, usually for redundancy or performance.^[1]

Shared-disk file system

A shared-disk file system uses a storage area network (SAN) to allow multiple computers to gain direct disk access at the block level. Access control, and the translation from the file-level operations that applications use to the block-level operations used by the SAN, must take place on the client node. The shared-disk file system is the most common type of clustered file system. By adding mechanisms for concurrency control, it provides a consistent and serializable view of the file system, avoiding corruption and unintended data loss even when multiple clients try to access the same files at the same time. Shared-disk file systems commonly employ some sort of fencing mechanism to prevent data corruption in case of node failure, because an unfenced node that loses communication with its sister nodes can corrupt data if it keeps accessing the same information that the other nodes are accessing.
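The role of fencing can be illustrated with a minimal sketch. It is a toy model rather than the mechanism of any particular product: the `SharedDisk` class, its methods, and the node names are hypothetical, and real systems typically fence at the storage or power level.

```python
class SharedDisk:
    """Toy shared block device that refuses I/O from fenced nodes (hypothetical)."""

    def __init__(self):
        self.blocks = {}     # block number -> bytes last written
        self.fenced = set()  # node ids that have been cut off from the device

    def fence(self, node_id):
        """Cut a node off, e.g. after it stops answering cluster heartbeats."""
        self.fenced.add(node_id)

    def write(self, node_id, block_no, data):
        """Write a block on behalf of a node unless that node is fenced."""
        if node_id in self.fenced:
            raise PermissionError(f"node {node_id} is fenced")
        self.blocks[block_no] = data


def demo():
    disk = SharedDisk()
    disk.write("node-a", 0, b"journal v1")
    # node-b loses contact with its sister nodes, so the survivors fence it.
    disk.fence("node-b")
    try:
        disk.write("node-b", 0, b"stale data")
        rejected = False
    except PermissionError:
        rejected = True
    # Block 0 still holds node-a's data because the stale write was refused.
    return disk.blocks[0], rejected
```

In this sketch the isolated node's write is rejected at the device, so it cannot overwrite blocks that the remaining nodes are still using.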

The underlying storage area network may use any of a number of block-level protocols, including SCSI, iSCSI, HyperSCSI, ATA over Ethernet (AoE), Fibre Channel, network block device, and InfiniBand.

There are different architectural approaches to a shared-disk file system. Some distribute file information across all the servers in a cluster (fully distributed).^[2]

Examples

Blue Whale Clustered file system (BWFS)
Silicon Graphics (SGI) clustered file system (CXFS)
Veritas Cluster File System
Microsoft Cluster Shared Volumes (CSV)
DataPlow Nasan File System
IBM General Parallel File System (GPFS)
Oracle Cluster File System (OCFS)
OpenVMS Files-11 File System
PolyServe storage solutions
Quantum StorNext File System (SNFS), formerly ADIC, formerly CentraVision File System (CVFS)
Red Hat Global File System (GFS2)
Sun QFS
TerraScale Technologies TerraFS
Versity VSM
VMware VMFS
WekaFS
Apple Xsan

Design considerations

Avoiding single point of failure

Disk hardware or a given storage node in a cluster can be a single point of failure, so that its failure results in data loss or unavailability. Fault tolerance and high availability can be provided through data replication of one sort or another, so that data remains intact and available despite the failure of any single piece of equipment. For examples, see the lists of distributed fault-tolerant file systems and distributed parallel fault-tolerant file systems.
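As a rough illustration of replication, the sketch below keeps every item on two storage nodes so that a read still succeeds after any one node fails. The `ReplicatedStore` class, the placement rule, and the node names are assumptions made for this example and do not describe a specific file system.

```python
class ReplicatedStore:
    """Toy store that keeps each item on `copies` distinct nodes (hypothetical)."""

    def __init__(self, node_names, copies=2):
        if copies > len(node_names):
            raise ValueError("need at least as many nodes as copies")
        self.order = list(node_names)
        self.nodes = {name: {} for name in self.order}  # per-node contents
        self.alive = set(self.order)
        self.copies = copies

    def _replicas(self, key):
        # Deterministic placement: start at a node derived from the key's
        # bytes, then take the next `copies` nodes in ring order.
        start = sum(key.encode()) % len(self.order)
        return [self.order[(start + i) % len(self.order)]
                for i in range(self.copies)]

    def put(self, key, value):
        """Store the value on every live replica node for this key."""
        for name in self._replicas(key):
            if name in self.alive:
                self.nodes[name][key] = value

    def fail(self, name):
        """Simulate the loss of one storage node."""
        self.alive.discard(name)

    def get(self, key):
        """Read from the first live replica that holds the key."""
        for name in self._replicas(key):
            if name in self.alive and key in self.nodes[name]:
                return self.nodes[name][key]
        raise KeyError(key)
```

With two copies on distinct nodes, losing any single node leaves at least one readable replica. Surviving a second failure would require re-replicating the data onto another node, which this sketch does not attempt.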

Performance

A common performance measurement of a clustered file system is the amount of time needed to satisfy service requests. In conventional systems, this time consists of a disk-access time and a small amount of CPU-processing time. In a clustered file system, however, a remote access has additional overhead due to the distributed structure. This includes the time to deliver the request to a server, the time to deliver the response to the client, and, in each direction, the CPU overhead of running the communication protocol software.
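The components listed above can be added up in a simple model. The function and parameter names below are hypothetical, the model assumes the components do not overlap in time, and the figures in the usage note are illustrative rather than measured.

```python
def remote_access_time(disk, cpu, request_delivery, response_delivery,
                       protocol_cpu_per_direction):
    """Estimate the time to satisfy one remote request (all values in one unit).

    Simplified additive model: it ignores caching, queuing, and any
    overlap between network transfer and disk access.
    """
    local = disk + cpu                               # cost on a conventional system
    network = request_delivery + response_delivery   # delivery in both directions
    protocol = 2 * protocol_cpu_per_direction        # protocol CPU cost, each way
    return local + network + protocol
```

For instance, with illustrative values in microseconds, `remote_access_time(5000, 100, 200, 200, 50)` adds 500 of distributed overhead to the 5100 that the same access would cost locally.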

Concurrency

Concurrency control becomes an issue when more than one person or client accesses the same file or block and wants to update it. Updates to the file from one client should not interfere with access and updates from other clients. This problem is more complex with file systems because of concurrent overlapping writes, where different writers write to overlapping regions of the file at the same time.^[3] It is usually handled by concurrency control or locking, which may either be built into the file system or provided by an add-on protocol.
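One way to picture locking against overlapping writes is a byte-range lock table, sketched below. This is an in-memory illustration under simplifying assumptions (exclusive locks only, a single lock manager, no waiting or deadlock handling); the `RangeLockTable` class and the client names are hypothetical.

```python
class RangeLockTable:
    """Toy exclusive byte-range lock table for one file (hypothetical)."""

    def __init__(self):
        self.held = []  # list of (client, start, end), end exclusive

    def try_lock(self, client, start, end):
        """Grant [start, end) unless it overlaps a range held by another client."""
        if start >= end:
            raise ValueError("empty or inverted range")
        for owner, s, e in self.held:
            # Two half-open ranges overlap when each starts before the other ends.
            if owner != client and start < e and s < end:
                return False
        self.held.append((client, start, end))
        return True

    def unlock(self, client, start, end):
        """Release a range previously granted to this client."""
        self.held.remove((client, start, end))
```

A client that is refused a range would normally wait or retry. Writers to disjoint regions of the same file can proceed in parallel, which is the main reason to lock ranges rather than whole files.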

History

IBM mainframes in the 1970s could share physical disks and file systems if each machine had its own channel connection to the drives' control units. In the 1980s, Digital Equipment Corporation's TOPS-20 and OpenVMS clusters (VAX/Alpha/IA-64) included shared-disk file systems.^[4]

References

[1] Saify, Amina; Kochhar, Garima; Hsieh, Jenwei; Celebioglu, Onur (May 2005). "Enhancing High-Performance Computing Clusters with Parallel File Systems" (PDF). Dell Power Solutions. Dell Inc. Retrieved 6 March 2019.
[2] Mokadem, Riad; Litwin, Witold; Schwarz, Thomas (2006). "Disk Backup Through Algebraic Signatures in Scalable Distributed Data Structures" (PDF). DEXA 2006 Springer. Retrieved 8 June 2006.
[3] Pessach, Yaniv (2013). Distributed Storage: Concepts, Algorithms, and Implementations. ISBN 978-1482561043.
[4] Murphy, Dan (1996). "Origins and Development of TOPS-20". Dan Murphy. Ambitious Plans for Jupiter. Retrieved 6 March 2019. "Ultimately, both VMS and TOPS-20 shipped this kind of capability."