TernFS
| Linux File system TernFS | |
|---|---|
| Developer | XTX Markets |
| Initial release | 2025 |
| Repository | GitHub |
| Written in | |
| Operating system | Linux |
| Type | Distributed file system |
| License | GPL-2.0-or-later (core); Apache-2.0 with LLVM exception (protocol and client libraries) |
TernFS is an open-source distributed file system developed by the quantitative trading firm XTX Markets.[1][2] It is designed for large-scale machine learning workloads that require efficient handling of immutable files across exabyte-scale storage.[3] TernFS aims to provide high performance, robustness, and data integrity across multiple data centers and regions. XTX Markets open-sourced the system in 2025 under a combination of licenses: GPL-2.0-or-later and Apache-2.0 with LLVM exception.[4]
Overview
TernFS was designed to meet the data requirements of machine learning pipelines at XTX Markets, which involve reading and writing very large immutable files.[5] In this context, "immutable" refers to files that are never modified after their creation, and "large" refers to files that are typically several megabytes or larger.
The system targets deployments of up to:
- 10 exabytes (EB) of logical file storage,
- 1 trillion files (average size ~10 MB),
- 100 billion directories (average 10 files per directory), and
- 1 million clients simultaneously connected.
TernFS is intended to run on commodity hardware connected through standard Ethernet networking.[6]
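The deployment targets listed above are mutually consistent, which a few lines of arithmetic can confirm. The following is only an illustrative check, assuming decimal (SI) units, where 1 EB is 10^18 bytes and 1 MB is 10^6 bytes; the variable names are chosen for this example and do not come from TernFS.

```python
# Illustrative arithmetic only. Decimal (SI) units are an assumption:
# 1 EB = 10**18 bytes, 1 MB = 10**6 bytes.
EB = 10**18
MB = 10**6

logical_bytes = 10 * EB            # 10 EB of logical file storage
file_count = 10**12                # 1 trillion files
directory_count = 100 * 10**9      # 100 billion directories

# Average file size implied by the storage and file-count targets.
avg_file_size_mb = logical_bytes / file_count / MB

# Average directory fan-out implied by the file and directory targets.
avg_files_per_directory = file_count / directory_count

print(f"average file size: {avg_file_size_mb:.0f} MB")
print(f"average files per directory: {avg_files_per_directory:.0f}")
```

Both derived values come out to 10, matching the averages quoted in the list.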
Design and architecture
TernFS emphasizes durability, consistency, and recoverability:[7]
- Files are either fully written or not visible to other clients, preventing "half-written" states.
- Power loss or node failure cannot corrupt metadata or file data.
- Bit-rot and silent data corruption are mitigated through redundancy and scrubbing processes.
- Data loss is considered highly unlikely except in catastrophic events (e.g. fire or flooding).
- The system can continue operating during node maintenance or partial failures.
- Deleted files can be restored depending on retention policy.
- Multi-region replication provides geo-redundancy and allows the system to scale across data centers.
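The general idea behind scrubbing, as mentioned in the list above, can be sketched in a few lines. This is a generic illustration and not TernFS code: the article does not describe which checksum or redundancy scheme TernFS uses, so CRC32 from Python's `zlib` module and plain three-way replication are assumptions, and the helpers `store_block`, `scrub`, and `repair` are hypothetical names.

```python
import zlib


def store_block(data: bytes) -> dict:
    """Keep a block together with the CRC32 computed when it was written."""
    return {"data": data, "crc": zlib.crc32(data)}


def scrub(replicas: list) -> list:
    """Return the indices of replicas whose data no longer matches its checksum."""
    return [i for i, r in enumerate(replicas) if zlib.crc32(r["data"]) != r["crc"]]


def repair(replicas: list, bad: list) -> None:
    """Overwrite damaged replicas with a copy of any replica that still verifies.

    Raises StopIteration if every replica is damaged (data loss in this toy model).
    """
    good = next(r for i, r in enumerate(replicas) if i not in bad)
    for i in bad:
        replicas[i] = dict(good)


payload = b"immutable file contents"
replicas = [store_block(payload) for _ in range(3)]  # assumed 3-way replication

# Simulate bit-rot: flip a single bit in the second replica.
rotted = bytearray(replicas[1]["data"])
rotted[0] ^= 0x01
replicas[1]["data"] = bytes(rotted)

damaged = scrub(replicas)      # detects the replica whose checksum fails
repair(replicas, damaged)      # restores it from a healthy copy
remaining = scrub(replicas)    # a second pass finds nothing left to fix

print("damaged replicas:", damaged)
print("after repair:", remaining)
```

A periodic pass of this kind catches corruption while healthy copies still exist, which is why scrubbing is normally paired with redundancy rather than used on its own.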
History
TernFS was first developed internally at XTX Markets in 2022 to handle the firm’s rapidly growing machine learning workloads. It entered production use during 2023 and was officially released as open-source software in September 2025.[4][8]
References
1. "XTX Markets Open-Sources TernFS: Exabyte-Scale Filesystem for ML Trading". 22 September 2025. Retrieved 9 October 2025.
2. "High-Frequency Trading Firm XTX Markets Open Sources TernFS File System". 7 October 2025. Retrieved 9 October 2025.
3. "XTX Markets Open-Sources TernFS: Exabyte-Scale Filesystem for ML Trading". 22 September 2025. Retrieved 9 October 2025.
4. "TernFS: A distributed filesystem for exabyte-scale ML workloads". XTX Markets Tech Blog. 2025. Retrieved 9 October 2025.
5. "TernFS — an exabyte scale, multi-region distributed filesystem". Retrieved 9 October 2025.
6. "A Major Trading Firm Has Open-Sourced The Latest Linux File-System: TernFS". Retrieved 9 October 2025.
7. "XTX Markets Open-Sources TernFS, Its Exabyte-Scale Filesystem". 23 September 2025. Retrieved 9 October 2025.
8. "XTX Markets Open-Sources TernFS: Exabyte-Scale Filesystem for ML Trading". 22 September 2025. Retrieved 9 October 2025.