Grid file system
A Grid File System is a computer file system that aims to improve reliability and availability by spreading data across many smaller file storage areas.
Components
Current file systems contain up to three components:
- File Table (FAT, MFT, etc.)
- File Data
- MetaData (user permissions, etc.)
A Grid File System would have similar needs:
- File Table (or search index)
- File Data
- MetaData
Comparisons
Because current file systems are designed to appear as a single disk managed entirely by a single computer, many new challenges arise in a grid scenario, in which any single disk within the grid should be capable of handling requests for any data contained in the grid.
Features
Most file storage uses layers of redundancy to achieve a high level of data protection (that is, to avoid losing data). Current means of redundancy include replication and parity checks. Such redundancy can be implemented via a RAID array, in which multiple physical disks appear to a local computer as a single disk and data may be replicated and/or striped across the disks. Similarly, a Grid File System would include some level of redundancy (either at the logical file level or at the block level, possibly including some sort of parity check) across the various disks present in the "Grid".
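As an illustration of block-level redundancy with a parity check, the following is a minimal sketch in Python. It assumes equally sized blocks and a single XOR parity block (in the style of simple RAID parity); the function name `xor_blocks` and the sample data are hypothetical and not part of any particular Grid File System.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks, each imagined to live on a different disk in the grid.
data_blocks = [b"grid", b"file", b"syst"]

# The parity block would be stored on a fourth disk.
parity = xor_blocks(data_blocks)

# Simulate losing the disk that held block 1, then rebuild that block
# by XOR-ing the surviving data blocks with the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
recovered = xor_blocks(survivors)

print(recovered)
```

A single parity block of this kind can rebuild only one missing block at a time; tolerating more simultaneous failures would require additional replicas or a stronger code.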
Framework
First and foremost, a File Table mechanism is necessary, and it must include a way of locating the target file within the grid. Secondly, a mechanism for working with File Data must exist. This mechanism is responsible for making File Data available to requests.
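The two mechanisms described above can be sketched roughly as follows. This is an in-memory toy model, not a real implementation: the class names `GridFileTable` and `GridNode`, the function `read_file`, and the node identifiers are all hypothetical, and real nodes would be reached over a network rather than through a Python dictionary.

```python
class GridFileTable:
    """File Table: maps a file path to the nodes that hold a copy."""

    def __init__(self):
        self._locations = {}  # path -> list of node ids

    def register(self, path, node_id):
        holders = self._locations.setdefault(path, [])
        if node_id not in holders:
            holders.append(node_id)

    def locate(self, path):
        return list(self._locations.get(path, []))


class GridNode:
    """File Data: one storage area in the grid."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.online = True
        self._store = {}  # path -> bytes

    def put(self, path, data):
        self._store[path] = data

    def get(self, path):
        return self._store[path]


def read_file(path, table, nodes):
    """Serve a request from the first reachable node listed in the table."""
    for node_id in table.locate(path):
        node = nodes[node_id]
        if node.online:
            return node.get(path)
    raise FileNotFoundError(path)


# Usage: store one file on two nodes, then lose one of them.
nodes = {"n1": GridNode("n1"), "n2": GridNode("n2")}
table = GridFileTable()
for node_id in ("n1", "n2"):
    nodes[node_id].put("/docs/readme.txt", b"hello grid")
    table.register("/docs/readme.txt", node_id)

nodes["n1"].online = False
content = read_file("/docs/readme.txt", table, nodes)
```

In this sketch the File Table is a single object; in a grid it would itself have to be replicated or distributed so that it does not become a single point of failure.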
Implementation
With the recent advent of Torrent technology, a parallel can be drawn to a Grid File System, in that a torrent tracker (and search engine) would be the "File Table", and the torrent applications (transmitting the files) would be the "File Data" component.
A file system that incorporates Torrent technology (distributed replication, distributed data request/fulfillment) would likely be a good starting point for a Grid File System.
Availability
Assuming there exists some method of managing data replication (assigning quotas, etc.) autonomously within the grid, data could be configured for high availability, regardless of the loss or outage of individual nodes.
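One way such autonomous replication management might work is sketched below. The sketch assumes a fixed target number of replicas per file and a simple "least-loaded node first" placement rule; the function name `plan_repairs`, the node names, and the placement policy are assumptions made for illustration only.

```python
def plan_repairs(placement, online_nodes, target_replicas):
    """Return {file: [nodes that should receive a new copy]}.

    placement maps each file name to the nodes that held a copy before
    the outage; online_nodes lists the nodes that are still reachable.
    """
    # Count how many files each online node already holds.
    load = {node: 0 for node in online_nodes}
    for holders in placement.values():
        for node in holders:
            if node in load:
                load[node] += 1

    repairs = {}
    for name in sorted(placement):
        live = [node for node in placement[name] if node in load]
        missing = target_replicas - len(live)
        # A file with no surviving copy cannot be re-replicated here.
        if not live or missing <= 0:
            continue
        # Prefer the least-loaded nodes that do not already hold the file.
        candidates = sorted(
            (node for node in online_nodes if node not in live),
            key=lambda node: (load[node], node),
        )
        chosen = candidates[:missing]
        for node in chosen:
            load[node] += 1
        repairs[name] = chosen
    return repairs


# Usage: node n2 has failed; every file should keep two replicas.
placement = {"a.txt": ["n1", "n2"], "b.txt": ["n2", "n3"]}
repairs = plan_repairs(placement, ["n1", "n3", "n4"], target_replicas=2)
print(repairs)
```

A real grid would also need to enforce per-node storage quotas and to detect outages in the first place; both are left out of this sketch.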
Troubles
The largest problem currently revolves around distributing data updates. Torrents support minimal hierarchy (currently implemented either as metadata in the torrent tracker, or strictly as UI and basic categorization). Updating multiple nodes concurrently (assuming atomic transactions are required) introduces latency during updates and additions, usually to the point of not being feasible. Additionally, a grid (network-based) file system breaks traditional TCP/IP paradigms, in that a file system (generally low-level, ring 0 type operations) requires complicated TCP/IP implementations, introducing layers of abstraction and complication to the process of creating such a grid file system.
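The latency cost of atomic updates can be illustrated with a rough back-of-the-envelope model. The sketch below assumes a two-phase scheme (a prepare round followed by a commit round) in which every replica is contacted in parallel and each round waits for the slowest reply; the function name and the round-trip figures are hypothetical, and real protocols add further overhead.

```python
def atomic_update_latency_ms(round_trip_ms, phases=2):
    """Estimate the latency of one atomic update across all replicas.

    Each phase is sent to every replica in parallel, so a phase takes
    as long as the slowest round trip; phases run one after another.
    """
    if not round_trip_ms:
        return 0
    return phases * max(round_trip_ms)


# Usage: two nearby replicas and one reached over a slow link.
estimate = atomic_update_latency_ms([5, 20, 180])
print(estimate)
```

Under these assumptions the slowest replica dominates every update, which is why adding distant nodes to the grid makes synchronous, atomic updates increasingly expensive.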