Skip to main content

Aside Reads Don't Access Allocation Structures

This page is a generated reference surface for selective reading. It exists to keep the learner apps guide-first while still preserving source access.

Learning objectives

  • Explain the main ideas and vocabulary in Aside Reads Don't Access Allocation Structures.
  • Work through the source examples for Aside Reads Don't Access Allocation Structures without depending on raw chunk order.
  • Use Aside Reads Don't Access Allocation Structures as selective reference when learner modules point back to Ostep.

Prerequisites

  • None curated yet.

Module targets

  • module-04-file-systems-io

AI companion modes

  • Explain simply
  • Socratic tutor
  • Quiz me
  • Challenge my understanding
  • Diagnose my confusion
  • Generate extra practice
  • Revision mode
  • Connect forward / backward

Source-of-truth note

This unit is anchored to Ostep and the source chapter "Aside Reads Don't Access Allocation Structures". Use external resources only to clarify, extend, or modernize details without replacing the chapter's conceptual spine.

External enrichment

No chapter-specific enrichment resources are curated yet. Add them in the unit manifest when a source clearly improves learning.

Source provenance

  • Primary source: Ostep
  • Source chapter: Aside Reads Don't Access Allocation Structures
  • Raw source file: 199-aside-readsdon-taccessallocationstructures.md

Merged source

Aside Reads Don't Access Allocation Structures

ASIDE: READSDON'TACCESSALLOCATIONSTRUCTURES

We've seen many students get confused by allocation structures such as bitmaps. In particular, many often think that when you are simply reading a file, and not allocating any new blocks, that the bitmap will still be consulted. This is not true! Allocation structures, such as bitmaps, are only accessed when allocation is needed. The inodes, directories, and indirect blocks have all the information they need to complete a read request; there is no need to make sure a block is allocated when the inode already points to it.

At some point, the file will be closed. There is much less work to be done here; clearly, the file descriptor should be deallocated, but for now, that is all the FS really needs to do. No disk I/Os take place.

A depiction of this entire process is found in Figure 40.3 (time increases downward). In the figure, the open causes numerous reads to take place in order to finally locate the inode of the file. Afterwards, reading each block requires the file system to first consult the inode, then read the block, and then update the inode's last-accessed-time field with a write.

Spend some time and try to understand what is going on.

Also note that the amount of I/O generated by the open is proportional to the length of the pathname. For each additional directory in the path, we have to read its inode as well as its data. Making this worse would be the presence of large directories; here, we only have to read one block to get the contents of a directory, whereas with a large directory, we might have to read many data blocks to find the desired entry. Yes, life can get pretty bad when reading a file; as you're about to find out, writing out a file (and especially, creating a new one) is even worse.

Writing to Disk

Writing to a file is a similar process. First, the file must be opened (as above). Then, the application can issuewrite()calls to update the file with new contents. Finally, the file is closed.

Unlike reading, writing to the file may also allocate a block (unless the block is being overwritten, for example). When writing out a new file, each write not only has to write data to disk but has to first decide which block to allocate to the file and thus update other structures of the disk accordingly (e.g., the data bitmap and inode). Thus, each write to a file logically generates five I/Os: one to read the data bitmap (which is then updated to mark the newly-allocated block as used), one to write the bitmap (to reflect its new state to disk), two more to read and then write the inode (which is updated with the new block's location), and finally one to write the actual block itself.

The amount of write traffic is even worse when one considers a simple and common operation such as file creation. To create a file, the file system must not only allocate an inode, but also allocate space within read read read read create read (/foo/bar) write

write read write write read read

write() write

write write read read

write() write

write write read read

write() write

write write

Figure 40.4:File Creation Timeline (Time Increasing Downward) the directory containing the new file. The total amount of I/O traffic to do so is quite high: one read to the inode bitmap (to find a free inode), one write to the inode bitmap (to mark it allocated), one write to the new inode itself (to initialize it), one to the data of the directory (to link the high-level name of the file to its inode number), and one read and write to the directory inode to update it. If the directory needs to grow to accommodate the new entry, additional I/Os (i.e., to the data bitmap, and the new directory block) will be needed too. All that just to create a file!

Let's look at a specific example, where the file/foo/baris created, and three blocks are written to it. Figure 40.4 shows what happens during theopen()(which creates the file) and during each of three 4KB writes.

In the figure, reads and writes to the disk are grouped under which system call caused them to occur, and the rough ordering they might take place in goes from top to bottom of the figure. You can see how much work it is to create the file: 10 I/Os in this case, to walk the pathname and then finally create the file. You can also see that each allocating write costs 5 I/Os: a pair to read and update the inode, another pair to read and update the data bitmap, and then finally the write of the data itself.

How can a file system accomplish any of this with reasonable efficiency?

THECRUX: HOWTOREDUCEFILESYSTEMI/O COSTS

Even the simplest of operations like opening, reading, or writing a file incurs a huge number of I/O operations, scattered over the disk. What can a file system do to reduce the high costs of doing so many I/Os?