Talk:Record-oriented filesystem


I'm confused about this article: is it about a FILESYSTEM (like FAT, ext2, ReiserFS) or a FILE-FORMAT (magic numbers, file extension) system? Improfane

A: It's not about a specific file system, but rather the whole class of filesystems that support record-oriented operation. The key point is that the system calls used to access files are designed to access records, rather than chunks of data read or written in application-specific formats. Most mainframe operating systems support a rich variety of record formats. Most commonly, either all records in a given file have the same fixed length, or the file has variable-length records. Unlike the stream-oriented file systems found on systems like Unix, PC-DOS, Windows, and Mac, the data in the file is accessed strictly in terms of records. Variable-length records are preceded by a (usually) binary byte-count, and may contain any coded bytes at all, both binary and characters. There is no concept of an "end of line" delimiter, such as a carriage-return character.
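To make the length-prefixed layout described above concrete, here is a minimal Python sketch. The 2-byte big-endian count and the function names are assumptions chosen for illustration; they do not reproduce the on-disk format of any particular mainframe system.

```python
import io
import struct

# Assumption for illustration: each record is preceded by a 2-byte
# big-endian byte count. Real systems use their own descriptor layouts.
PREFIX = struct.Struct(">H")

def write_record(stream, payload: bytes) -> None:
    """Write one variable-length record: byte count, then the raw bytes."""
    stream.write(PREFIX.pack(len(payload)))
    stream.write(payload)

def read_record(stream):
    """Return the next record, or None at end of file."""
    header = stream.read(PREFIX.size)
    if header == b"":
        return None
    if len(header) < PREFIX.size:
        raise ValueError("truncated length prefix")
    (length,) = PREFIX.unpack(header)
    payload = stream.read(length)
    if len(payload) != length:
        raise ValueError("truncated record")
    return payload

def read_all(stream):
    """Read records until end of file."""
    records = []
    while (record := read_record(stream)) is not None:
        records.append(record)
    return records

buf = io.BytesIO()
# The payloads may contain newlines, carriage returns, or any other byte,
# because record boundaries come from the count, not from a delimiter.
for payload in (b"HELLO", b"line1\nline2\r\n", b"\x00\xff\x0a"):
    write_record(buf, payload)
buf.seek(0)
records = read_all(buf)
```

Each read returns exactly one record as written, whatever bytes it contains.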

Some people, particularly Unix advocates, dismiss record-oriented file systems as being based on punched-card technology, and therefore presumably "old-fashioned." The Unix-like stream-oriented approach is modelled after another 19th-century technology, the paper tape used by the printing telegraph to mechanize the transmission of telegrams. Paper tape came into use with computers through the Teletype machines that served as inexpensive input devices for the minicomputers of the '60s and '70s.

For its part, the Hollerith punched card was at least originally conceived for computational purposes.

This article, it seems to me, was written by a Unix advocate who wished to diminish the advantages of record-oriented file access methods. It is clearly not NPOV. I plan to fix it, when I find time to address the matter properly.

--RussHolsclaw 04:23, 12 February 2006 (UTC)[reply]

I agree! Moreover, terminology: does an IBM mainframe OS even use a filesystem? You have VTOCs, catalogs, data sets, but file system? Never heard of that. Source needed. --Kubanczyk 09:05, 7 October 2007 (UTC)[reply]

IBM calls file systems Access Methods.

Paper tape as used for text message transmission actually contains individual records, which are delimited by various control characters. Each line of text (aka record) is terminated by a carriage-return character (which sends the print head to the left) and a line-feed character, which rolls the paper platen up a line in position for the next line.

A better example of a data stream on punched tape is that used by numerically controlled (NC) machine tools. These use a stream of commands to define which cutting tool to use, the starting position, subsequent points along the cutting path and other control information.

A record oriented file has several advantages. After a program writes a collection of data as a record the program that reads that record has the understanding of that data as a collection. Although it is permitted to read only the beginning of a record, the next sequential read returns the next collection of data (record) that the writer intended to be grouped together. Another advantage is that the record has a length and there is no restriction on the bit patterns composing the data record, i.e. there is no delimiter character.

There is a cost associated with record orientation. The length definition takes up space. On a magnetic tape that definition takes the form of an inter-record gap. On a disk a metadata area must be allocated. This is minimal in a file where all the records are the same length. In a file composed of varying-length records, a maximum record length is defined to determine the size of the length metadata associated with each record.

DGerman (talk) 01:09, 7 February 2008 (UTC)[reply]

After adding all this information in the discussion page, I decided it best to basically rewrite the article. I have saved the original article if anyone wants it. It is also available in the wiki history. Tired now. In the future I may locate and include some references. DGerman (talk) 02:15, 7 February 2008 (UTC)[reply]

Too specific

While it is true that current IBM mainframe operating systems have record-oriented file systems that do not use delimiter characters, that is not universally true. Even IBM used record delimiters on the 14xx/7010, and RCA used them on several different product lines. Shmuel (Seymour J.) Metz (talk) 19:04, 1 June 2010 (UTC)[reply]

"Advantages and costs" section needs work

The first problem is that the section doesn't clearly indicate with what a record-oriented file system is being compared.

Is it being compared to a byte-stream file system such as those offered by UN*Xes and Windows, where the lowest-level file system operations are "read N bytes from the current location and advance the current location pointer by N bytes", "write N bytes to the current location and advance the current location pointer by N bytes", "move the current location pointer to byte N", "adjust the current location pointer by N bytes, N being positive or negative", and "move the current location pointer to the current end of the file and then adjust it by N bytes, N being positive or negative" (possibly with an additional operation to set the file size to a specified number of bytes)?

Is it being compared with a block-array file system, where the lowest-level file system operations are "read from N blocks starting from block M" and "write to N blocks starting from M", "block" here referring to some fixed physical block size, such as a "block" being a single disk sector? As I remember, the usual file system APIs of RSX-11M and VMS were record-oriented, but the layer offered by the file system code was more like a block array, with QIO calls to read from or write to a file, with, at least on VMS, user-mode code being required to go through RMS, but RMS, running in a more privileged mode, doing file I/O in response to requests by making those QIO calls?

Or is it being compared to other file system types, or to more than one file system type?

If it's being compared to byte-stream file systems (as used on most desktop/notebook computers, most smartphones and tablets, and a lot of servers), then note that a byte-stream file can be structured as a sequence of records, and there are frequently libraries for OSes with byte-stream files that do so (sometimes called, for example, "ISAM packages"), so it's not clear how some of the points apply.
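As a rough illustration of how such a library can layer records on a byte-stream file, the sketch below implements fixed-length records using only seek, read, and write. The 16-byte record length, the space padding, and the function names are assumptions made for the example, not the behaviour of any specific ISAM package.

```python
import io

RECORD_LENGTH = 16  # assumed fixed record length, chosen for illustration

def write_fixed(stream, index: int, payload: bytes) -> None:
    """Store payload as record number `index`, padded with spaces."""
    if len(payload) > RECORD_LENGTH:
        raise ValueError("payload longer than the record length")
    stream.seek(index * RECORD_LENGTH)
    stream.write(payload.ljust(RECORD_LENGTH, b" "))

def read_fixed(stream, index: int) -> bytes:
    """Fetch record number `index` by computing its byte offset."""
    stream.seek(index * RECORD_LENGTH)
    data = stream.read(RECORD_LENGTH)
    if len(data) != RECORD_LENGTH:
        raise IndexError("no such record")
    return data

f = io.BytesIO()  # stands in for any seekable byte-stream file
write_fixed(f, 0, b"ALPHA")
write_fixed(f, 1, b"BRAVO")
write_fixed(f, 2, b"CHARLIE")

third = read_fixed(f, 2)   # direct access: no need to read records 0 and 1
first = read_fixed(f, 0)
```

The application sees only record numbers; the offset arithmetic stays inside the two helper functions.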

A record oriented file has several advantages. After a program writes a collection of data as a record the program that reads that record has the understanding of those data as a collection.

What does it mean to "[have] the understanding of those data as a collection"? And, if the program that reads that record is doing so through a library that implements a record structure on a byte-stream file system, would that program also "[have] the understanding of those data as a collection"?

Often a file will contain several related records in sequence; after the program reads the beginning of the sequence, the next sequential read returns the next collection of data (record) that the writer intended to be grouped together.

That's the definition of a sequential read. Again, how is this different from a program using a record-oriented library for a byte-stream file?

Another advantage is that the record has a length and there is usually no restriction on the bit patterns composing the data record, i.e. there is no delimiter character.

Not all files in a byte-stream file system have delimiter characters. Text files typically do, but object files, executable image files, library files, database files, and many other file types do not. Many of them have structures in the file that are, in effect, records with a record length field in the record.

There is usually a cost associated with record oriented files. For fixed length records, some records may have unused space, while for variable length records the delimiter or length field takes up space. Variable length blocks may have overhead due to delimiters or length fields.

That would also apply to record structures atop a byte-oriented file system.

In addition, there is overhead imposed by the device. On a magnetic tape overhead typically takes the form of an inter-record gap.

That's a characteristic of a magnetic tape, not of a record-oriented file system; the only way to reduce that would be to accumulate many logical records in a physical record/block on the tape.

On a direct access device with fixed length sectors, there may be unused space in the last sector of a block.

That's true only if records aren't allowed to begin in the middle of a sector. Record-oriented file systems may choose to do so, so that they don't need to do the sort of buffering that byte-stream file systems do, but a library implementing records atop a byte-stream file system might also do so in order to avoid some of the buffering overhead.

On a direct access device with variable length physical records, that overhead typically takes the form of metadata and inter-record gaps.

True, although multiple logical records might be packed into a single physical record/block, just as on tape. This may be somewhat specific to S/360 and successors (and compatibles); minicomputers tended to use direct access devices with fixed block/sector sizes, as do personal computing devices and UN*X/Windows-based servers.

A major advantage of record-oriented file systems is that they abstract files kept on paper in earlier times. A record might contain data associated with a particular, e.g., building, contact, employee, part, venue.

Again, that's just a question of which software abstracts files; again, it's quite possible to implement records atop a byte-oriented - or block-array - file system.

A second motivator for the idea of record orientation is that it is in some sense the more natural orientation for persistent storage on a non-volatile but slow physical storage device. Most physical storage devices can communicate only in units of a block. Significant portions of modern operating system kernels and associated device drivers are devoted to hiding the naturally structured and delimited (and in some sense a block is just a physical record) nature of physical storage devices.

Operating system kernels, yes; the buffer caches of UN*Xes and Windows, and the per-open-file OS data structures that maintain the aforementioned current location pointer, do hide the block structure. However, given that records don't necessarily directly correspond to blocks, some code will have to hide the blocks, to some degree, from applications reading or writing records.

Associated device drivers, not really; they generally get "read from N sectors, starting at sector M of the disk" and "write to N sectors, starting at sector M of the disk" commands, with the - byte-stream, block-array, or record-oriented - file system code, some or all of which may be running in some privileged-mode section of the OS, translating block offsets within the file to physical sector numbers on the disk. (Or logical block numbers, if the disk itself maps logical block numbers to physical sector numbers; there may be further mapping with virtualized storage, etc.) Guy Harris (talk) 10:17, 23 September 2023 (UTC)[reply]
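The translation from offsets within a file to sector numbers on the disk, mentioned in the comment above, can be sketched as follows. This is a simplified model under stated assumptions: a 512-byte sector, a file described by a list of contiguous extents, and the hypothetical names `Extent` and `sector_for_offset`. Real file systems use more elaborate structures.

```python
from typing import List, NamedTuple, Tuple

SECTOR_SIZE = 512  # assumed sector size in bytes

class Extent(NamedTuple):
    """A contiguous run of sectors belonging to one file (hypothetical)."""
    file_block: int   # first block number, within the file, that it covers
    disk_sector: int  # sector on the disk where that block is stored
    count: int        # number of consecutive blocks in the run

def sector_for_offset(extents: List[Extent], byte_offset: int) -> Tuple[int, int]:
    """Map a byte offset in the file to (disk sector, offset in sector)."""
    block, within = divmod(byte_offset, SECTOR_SIZE)
    for ext in extents:
        if ext.file_block <= block < ext.file_block + ext.count:
            return ext.disk_sector + (block - ext.file_block), within
    raise ValueError("offset lies beyond the allocated extents")

# A file of 12 blocks stored in two separate runs on the disk.
extents = [Extent(0, 1000, 4), Extent(4, 5000, 8)]

location = sector_for_offset(extents, 2100)  # block 4, byte 52 of that block
```

The device driver only ever sees the resulting sector numbers; whether the caller thinks in bytes, blocks, or records is decided above this mapping.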

I've always read it as comparing record-oriented file systems with byte oriented file systems; both are abstractions from the underlying hardware.

The reference to overhead is generic and the reference to tape is a specific example; on other devices the overhead takes other forms.

I believe that QIO in VMS is a low level interface used by higher levels of RMS, not an interface for normal applications.

It might be helpful to post a separate section for each issue. — Preceding unsigned comment added by Chatul (talk • contribs) 03:23, 26 September 2023 (UTC)[reply]

S/3x0 and z/Architecture, and the (non-UN*X) OSes running on them, are special cases, given CKD drives, as I think the record structures were designed around those drives. (Although I have the impression that there are no physical CKD drives any more, just a CKD drive abstraction implemented by firmware/software atop fixed-block drives.)

For systems with fixed-block drives, including DEC systems and the hardware atop which UN*X systems and Windows run, there's a block-array file system abstraction atop which the byte-stream or record abstraction is built. Record-oriented file systems support certain forms of structuring of the data in those blocks, while byte-stream file systems can write arbitrarily-structured binary data to those blocks - that includes structuring as fixed-length records, variable-length records, variable-length records with fixed control fields, and indexed versions of those structures.

So I see the main difference being that OSes with byte-stream file APIs provide a lower-level abstraction atop which record-oriented files can be implemented, whereas OSes with record-oriented APIs don't provide that lower-level abstraction. It's not as if the latter systems can provide facilities that the former can't also provide. I.e., it's not a question of the on-disk file system layers being different, except to the extent that the on-disk file system may provide a way to associate a record format and, for fixed-length records, a record size with a file, even though the on-disk file system provides a block-array abstraction. Instead, it's a question of what file access APIs are available; the notion of "record-oriented" vs. "byte-stream" is really above the file system when the file system is viewed as providing an abstraction of a file as an array of bytes with metadata.

An OS with byte-stream oriented APIs may, or may not, provide a library that provides a record-oriented abstraction. A system with record-oriented APIs has the advantage that programmers who want to use record-structured files don't have to get a third-party library or write their own library, but "a system with record-oriented APIs" could be a system in which the lowest-level APIs available to programmers are byte-stream APIs and that includes a record-oriented library.

Yes, on VMS direct QIO access to files from user-mode code is, as far as I know, not allowed. On RSX-11M, a program can probably issue those QIOs (RSX-11M has to run on machines that have only kernel and user mode, and, as far as I know, didn't stuff RMS into supervisor mode if running on a machine with supervisor mode; RSX-11M Plus might have supported only machines with supervisor mode and may have put RMS in supervisor-mode code). However, I have the impression they're not recommended and not documented for use on files.

RMS does, according to VMS Software's RMS reference manual, support "block I/O", which appears to provide access to the block-array layer. However, if you do block writes to a file, RMS will mark the hints it maintains for the number of records in the file and the number of user data bytes in the file as being invalid. So VMS, at least, doesn't seem to offer a "pure" record-oriented abstraction.

So I'd say that "record-oriented" vs. "byte-stream" are, if you don't have hardware/firmware that enforces a certain organization of data more complicated than "fixed-size blocks", largely differences between APIs rather than between file systems, except to the extent that a file system provides metadata that implementations of record-oriented APIs might use.

For example, an ODS-5 VFS could probably be written for some UN*X, in which case byte streams would be read from or written to files. If the UN*X in question has an "extended attributes" API, it could allow reading and writing per-file metadata, and an RMS implementation could be written for that UN*X.

Similarly, it might be possible to write a VMS XQP or ACP (pluggable file system) that supported some UN*X file system that supported extended attributes, and RMS might be able to use that file system (UFS, HFS+, APFS, ZFS, ${pick_your_linux_file_system}, etc.) mostly or completely transparently. Guy Harris (talk) 05:28, 26 September 2023 (UTC)[reply]

It seems I have some gaps in my knowledge. I believe CKD first appeared with S/360 and previous disks were sectorized. I don’t understand the reasoning that led to this decision, but I think a record-oriented file system is a natural fit, although I don’t know much about filesystems on the 707x or other early IBM computers, to say nothing of non-IBM systems. I started my career together with S/360, so record-oriented systems felt natural and byte-oriented filesystems very unnatural. Trying to emulate IBM’s record-oriented system in Linux for Iron-Spring PL/I I frequently found myself wishing for CKD disks to simplify other-than-sequential access to variable length records. There’s a natural fit with blocks determined by hardware. Peter Flass (talk) 13:42, 26 September 2023 (UTC)[reply]

Yes, CKD first appeared on the S/360, and most previous disks were sectorized. On the 1301 and 1302 each cylinder had a format track that controlled block sizes. You could have a mix of block sizes on a track, but each track in the cylinder had the same mix.

And, no, the underlying hardware of contemporary DASD for z is no longer CKD; the subsystem simulates CKD or FBA on SCSI drives with a different geometry.

While CKD simplified some things, it had an overhead cost. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:53, 26 September 2023 (UTC)[reply]

IBM had record oriented file systems before CKD. GE never had CKD, yet GEFRC in GECOS was record oriented. AFAIK, DEC never had CKD, yet RMS is record oriented.

Both byte oriented and record oriented are abstractions, and IBM has implemented byte oriented on top of record oriented.

Access methods and file systems are closely related, but not identical. In OS/360, there were multiple access methods for Physical Sequential (PS) files and in z/OS there were (HFS is dead) multiple file systems for byte-oriented access. In Linux there are still multiple file systems. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:53, 26 September 2023 (UTC)[reply]

"Trying to emulate IBM’s record-oriented system in Linux for Iron-Spring PL/I I frequently found myself wishing for CKD disks to simplify other-than-sequential access to variable length records." The advantage of doing so on VMS would have been that somebody at DEC may already have wished for something to simplify other-than-sequential access to variable length records, so you wouldn't have had to solve that problem; DEC already did it for you.

I.e., this isn't a problem of a byte-stream-oriented file system, it's a problem with the disk hardware not supporting variable-length physical records.

"IBM had record oriented file systems before CKD. GE never had CKD, yet GEFRC in GECOS was record oriented. AFAIK, DEC never had CKD, yet RMS is record oriented." Yes, that's why I said "S/3x0 is different" and discussed DEC systems, where the service the hardware provides is "a disk is an array of fixed-length physical blocks"; the UNIX byte-stream APIs were originally implemented on DEC hardware, and the other hardware on which UN*Xes run includes, with the exception(?) of S/3x0 and z/Architecture, secondary storage providing the "a disk is an array of fixed-length physical blocks" service. (I don't know what mechanism the drivers used and use on various UN*Xes for S/3x0 and z/Architecture, including Linux; I suspect they just write fixed-length physical records.)

"Both byte oriented and record oriented are abstractions, and IBM has implemented byte oriented on top of record oriented." And libraries for UN*Xes and, presumably, Windows have implemented record-oriented on top of byte-oriented. Byte-oriented and record-oriented are characteristics of APIs, not of file systems. Some operating systems provide a byte-oriented API atop which record-oriented APIs can be implemented (UN*Xes, Windows); other operating systems provide an API that includes block-oriented and record-oriented APIs and that support byte-stream text files using the record-oriented APIs (VMS).

"Access methods and file systems are closely related, but not identical." Yes. For the file systems (in the "on-disk data structures and lowest-level OS code" sense) on UN*Xes, Windows, and VMS, the file system code provides either block-oriented (VMS) or byte-stream oriented (UN*Xes, Windows) access.

For VMS, directly using file system code requires special privileges, so user-mode code has to pass through what might be considered an access method, namely RMS. One access method is "block I/O", which appears to provide direct access to the block-oriented layer.

For UN*Xes and Windows, access methods can be built atop the byte-stream API.

At least in the case of VMS, ODS-5 (and ODS-2) may store metadata such as the record format (fixed-length, variable-length, variable-length with fixed control) and record length information, for the benefit of access methods, but that's the only connection between the file system and access method.

So perhaps this page should be renamed "record-oriented access method".

"In Linux there are still multiple file systems." In UN*Xes since at least SunOS 2.0, there have been mechanisms into which various file system implementations can be plugged; not all of those implementations even work atop local storage devices. The non-specialized ones (for example, not /proc or /sys or other oddball ones) provide a byte-stream service; access methods rarely, if ever, have to care which particular file system they're using. This is likely to continue to be the case, for a variety of reasons, for the near-term future; I don't see "multiple file systems" going away soon. Guy Harris (talk) 19:53, 26 September 2023 (UTC)[reply]

I like using access methods in the title, but some might view it as IBM-centric. What other vendors used the term?

A file system is associated with an API, although the API between physical file system and logical file system might only be accessible by the access method. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 12:56, 27 September 2023 (UTC)[reply]

The term is used at least once in the RMS manual:

The record access mode (RAC) field indicates the method of retrieving or inserting records in the file; that is, whether records are read (or written) sequentially, directly, or by record file address. Only one access method can be specified for any single record operation, but you can change the record access mode between record operations.

but (if text search in PDFs can be trusted) that appears to be the only occurrence of that phrase, and they also refer to it as an access mode.

I can't speak for other OS vendors who provided record access methods or vendors who provided record-oriented access libraries for OSes with lower-level file access APIs such as UN*Xes.

Another possibility would be "Record-oriented file"; Record-oriented file is currently a redirect to this page. This page has relatively few references, and none that appear to establish the notion that record orientation is a characteristic of a file system rather than of a file or of an API.

The various programming interfaces for files are rather different from OS to OS:

- I have the impression that OS/360's access methods constructed channel programs, making EXCP and XDAP the lowest-level API, but, in OS/360 and its successors, were the access methods trusted not to go outside the bounds of a data set, did either of those SVCs(?) reject channel programs that went outside the bounds of the data set, or was something else done?
- In RSX-11M and VMS, QIO was the lowest-level file API, with QIOs being used for "open file", "create file", "rename file", etc.
- In Windows NT, the interface that individual file system operations offer to the generic file system layer is somewhat QIO-like (which isn't surprising), but, as far as I know, the kernel-trap-level API isn't "submit an I/O Request Packet", it's calls such as NtCreateFile() and byte-stream-oriented NtReadFile() and NtWriteFile() calls, atop which the byte-stream-oriented Windows file access API runs. I'm not sure where in the privileged-mode code the byte-stream abstraction is implemented; it probably involves the privileged-mode cache manager, as is the case in UN*Xes.
- In UN*Xes, the interface that individual file system operations offer to the generic file system layer is OS-dependent, but they are generally byte-stream-oriented, and procedural; the user-mode APIs are implemented atop that. Guy Harris (talk) 21:10, 27 September 2023 (UTC)[reply]

"I have the impression that OS/360's access methods constructed channel programs, making EXCP and XDAP the lowest-level API, but, in OS/360 and its successors, were the access methods trusted not to go outside the bounds of a data set, did either of those SVCs(?) reject channel programs that went outside the bounds of the data set, or was something else done?"

Somewhere in the system - I would assume IOS rather than the access methods - "When an EXCP request is submitted for a DASD device MVS gets the seek address from the IOB, validates it against the DEB to verify it is contained within an extent of the dataset. If the seek address is not contained within a valid dataset extent the request is rejected. If the seek address is valid then MVS builds a SEEK CCW using the IOB seek address. A SET FILE MASK CCW is then chained on to the SEEK CCW. The SET FILE MASK specifies what types of CCW commands may follow....There may only be one SET FILE MASK command in a CCW sequence. This keeps the user CCW program from accessing tracks outside the dataset extents. It also enforces read-only for datasets opened for input. "^[1]
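The extent check described in the quoted passage can be pictured with a small sketch. This is only a simplified model: the `Extent` type, the (cylinder, head) tuples, and the function name are hypothetical, and the real DEB and IOB control blocks carry considerably more information.

```python
from typing import List, NamedTuple, Tuple

Track = Tuple[int, int]  # (cylinder, head); a simplified track address

class Extent(NamedTuple):
    """One contiguous range of tracks allocated to a data set (hypothetical)."""
    start: Track
    end: Track  # inclusive

def seek_is_valid(extents: List[Extent], cylinder: int, head: int) -> bool:
    """Accept a seek address only if it falls inside one of the extents."""
    # Tuples compare cylinder first, then head, which matches track order.
    return any(e.start <= (cylinder, head) <= e.end for e in extents)

# A data set with two extents: cylinders 10-12, and part of cylinder 40.
data_set = [Extent((10, 0), (12, 14)), Extent((40, 5), (40, 9))]

accepted = seek_is_valid(data_set, 11, 3)   # inside the first extent
rejected = seek_is_valid(data_set, 13, 0)   # outside both extents
```

In this model a request whose seek address fails the check would simply be refused before any channel program runs.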

Presumably you meant http://tommysprinkle.com/mvssp/category/excp-io/excp-io-introduction/ for the reference. Guy Harris (talk) 09:07, 28 September 2023 (UTC)[reply]

Or http://tommysprinkle.com/mvssp/category/excp-io/, for the entire sequence.

So "EXCP" is short for "take this channel program and insert it after some channel commands you construct to position the device to the address I gave you, as long as that address is valid, and then limit what this program can do, and execute the resulting channel program", rather than just "run this channel program". So something in between "run this channel program as is" and "do whatever is necessary to read N blocks starting at this logical block number within this file", or the byte-stream equivalent, the latter being the sort of I/O interface provided in the other OSes I mentioned. Guy Harris (talk) 09:17, 28 September 2023 (UTC)[reply]

No, EXCP is short for EXecute Channel Program and is an access method in its own right, as is the later EXCPVR. XDAP is just a front end for EXCP. Even in OS/360 EXCP does more than validate and run the channel program. Relevant to this discussion is that for both DASD and tape it prepends a CCW that limits access. For DASD it also prepends a seek (Block MX only) or does a stand-alone seek (selector only). In MVS with newer DASD it has additional functions.

MVS does have a service called STARTIO that is closer to the bare metal, but it still does a lot more than run the channel program. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:07, 28 September 2023 (UTC)[reply]

References

^ Sprinkle, Tommy. "EXCP I/O – Introduction". MVS/SYSPROG.

Terminology

I've been sort of following the discussion over the name of this article. It seems to me that a "filesystem" has two parts - how the data is stored on disk, and the API offered to the users. IMO "record-oriented" vs. "byte-oriented" (or stream) is determined by the API. In OS/360, before the term "filesystem" was invented, the API was structured to read "records", of various types. You could certainly treat the data as a stream, but it would be a lot more work. Likewise, in Linux systems you can impose a record structure on top of an unstructured byte stream, but it requires effort. RO vs. BO is a data structuring convention. Peter Flass (talk) 14:50, 28 September 2023 (UTC)[reply]
