Native Linux File Systems
Every native Linux filesystem implements a basic set of common concepts that were derived from those originally developed for Unix. (Native means that the filesystems were either developed originally for Linux or were first developed for other operating systems and then rewritten so that they would have functions and performance on Linux comparable or superior to those of filesystems originally developed for Linux.)
Several Linux native filesystems are currently in widespread use, including ext2, ext3, ReiserFS, JFS and XFS. Additional native filesystems are in various stages of development.
These filesystems differ from the DOS/Windows filesystems in a number of ways, including:
- allowing important system directories to span multiple partitions and multiple hard drives
- storing additional information about files, including ownership and permissions
- establishing a number of standard directories for holding important components of the operating system.
Linux's first filesystem was minix, which was borrowed from the Minix OS. This filesystem was adopted because it was an efficient and relatively bug-free piece of existing software, which postponed the need to design a new filesystem from scratch.
However, minix was not well suited for use on Linux hard disks for several reasons, including its maximum partition size of only 64MB, its short filenames and its single timestamp. But minix can be useful for floppy disks and RAM disks because its low overhead can sometimes allow more files to be stored than is possible with other Linux filesystems.
Journaling Linux File Systems
The lack of a journaling filesystem was often cited as one of the major factors holding back the widespread adoption of Linux at the enterprise level. However, this objection is no longer valid, as there are now four such filesystems from which to choose.
Journaling filesystems offer several important advantages over non-journaling filesystems such as ext2. In particular, if the system is halted without a proper shutdown, they guarantee the consistency of the data and eliminate the need for a long and complex filesystem check during rebooting. The name derives from the fact that a special file called a journal is used to keep track of the data that has been written to the hard disk.
In the case of conventional filesystems, disk checks during rebooting after a power failure or other system crash can take many minutes, or even hours for large hard disk drives with capacities of hundreds of gigabytes. Moreover, if an inconsistency in the data is found, human intervention is sometimes necessary to answer complicated questions about how to fix certain filesystem problems. Such downtime can be very costly with big systems used by large organizations.
With a journaling filesystem, if power to the computer is suddenly interrupted, a given set of updates will be in one of two states. Either it has been fully committed to the filesystem (i.e., written to the hard disk), in which case there is no problem and the filesystem can be used immediately, or it has been marked as not yet fully committed, in which case the filesystem driver can read the journal and fix any inconsistencies that occurred.
This is far quicker than a scan of the entire hard disk, and it guarantees that the structure of the filesystem is always self-consistent. With a journaling filesystem, a computer can usually be rebooted in just a few seconds after a system crash, and although some data might be lost, at least it will not take many minutes or hours to discover this fact.
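The commit-then-replay logic described above can be sketched with a toy example. This is an illustration only, not how any real filesystem stores its journal: real journaling filesystems log metadata (or data) blocks, while the `JournaledStore` class, its key/value "disk", and its dictionary-based journal entries are all invented here for clarity.

```python
# A minimal sketch of write-ahead journaling (illustrative only; real
# filesystems journal disk blocks, not key/value pairs).

class JournaledStore:
    """Toy key/value store that journals each update before applying it."""

    def __init__(self):
        self.disk = {}      # stands in for the on-disk data area
        self.journal = []   # stands in for the on-disk journal file

    def write(self, key, value):
        # 1. Record the intended update in the journal first.
        entry = {"key": key, "value": value, "committed": False}
        self.journal.append(entry)
        # 2. Apply the update to the main data area.
        self.disk[key] = value
        # 3. Mark the journal entry as fully committed.
        entry["committed"] = True

    def recover(self):
        """After a crash, replay committed entries and discard the rest."""
        for entry in self.journal:
            if entry["committed"]:
                # Re-applying a committed entry is safe (idempotent).
                self.disk[entry["key"]] = entry["value"]
            # Uncommitted entries are simply ignored: as far as the
            # filesystem is concerned, the update never happened.
        self.journal.clear()
```

Recovery only has to walk the (small) journal rather than scan the whole disk, which is why reboot after a crash takes seconds instead of minutes or hours.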
Hard Disk Input / Output (I/O)
Examination of the data structures on a hard disk will show the following areas:
- Superblock (use the fstyp -v /dev/dsk/* command to report its contents)
- Inode list (use the newfs -i or mkfs command to change the number of inodes)
- Data blocks
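As an illustration of what the superblock holds, the sketch below models a few fields from an ext2-style superblock. The field names (s_inodes_count, s_blocks_count, s_log_block_size, s_magic) follow the ext2 on-disk names, but the values shown are made up, and real superblocks contain many more fields than this.

```python
# Illustrative sketch of a few ext2-style superblock fields.
# The values used below are invented for demonstration.

from dataclasses import dataclass

@dataclass
class Superblock:
    s_inodes_count: int    # total number of inodes in the filesystem
    s_blocks_count: int    # total number of blocks in the filesystem
    s_log_block_size: int  # block size is stored as a shift: 1024 << n
    s_magic: int           # 0xEF53 identifies an ext2/3/4 filesystem

    @property
    def block_size(self):
        return 1024 << self.s_log_block_size

sb = Superblock(s_inodes_count=65536, s_blocks_count=262144,
                s_log_block_size=2, s_magic=0xEF53)
print(sb.block_size)  # 4096: a stored shift of 2 means 4 KB blocks
```

Storing the block size as a shift rather than a byte count keeps the on-disk field small and guarantees the size is a power of two.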
Each inode contains the following information:
- Type of file and its permissions, etc.
- Quantity of physical links to the file
- Byte size
- Array of block addresses
- The first few block addresses point directly at data blocks. The remaining addresses point at indirect blocks, which contain arrays of pointers to further data blocks. Each inode contains 12 direct block pointers and 3 indirect block pointers (single, double and triple indirect).
- Generation number (incremented each time the inode is re-used)
- Access time stamp
- Modification time stamp
- Change time stamp
- Number of sectors
- Shadow inode location: (used for ACLs [Access Control Lists]).