The purpose of this post is to give systems administrators an overview and comparison of the file systems available on Linux.
Linux File Systems
One of the most interesting features of Linux is its variety of file systems. File systems are defined and built per partition, so VFAT, ext2, ext3, ext4 and Reiser file systems can co-exist on the same Linux system, along with several other file systems and raw partitions.
The choice of which one to use then depends on supportability, reliability, security and performance. Oracle generally certifies its products against operating systems rather than file systems. However, for some Linux distributions, Oracle may certify specific file systems. Depending on the version, a Linux distribution may include ext2, ext3, ext4 and btrfs, as well as support for NFS-based storage systems (e.g. NetApp).
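To see which file systems are actually in use on a given host, the kernel's mount table can be grouped by type. The following is a minimal sketch, assuming the usual /proc/mounts layout (device, mount point, file system type, options, two numeric fields); the function name filesystems_by_type is hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def filesystems_by_type(mounts_text):
    """Group mount points by file system type.

    mounts_text is expected to look like the contents of /proc/mounts:
    one mount per line, whitespace-separated, with the file system type
    in the third field. Lines with fewer than three fields are skipped.
    """
    by_type = defaultdict(list)
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        _device, mount_point, fs_type = fields[:3]
        by_type[fs_type].append(mount_point)
    return dict(by_type)
```

On a Linux host this could be called as filesystems_by_type(open("/proc/mounts").read()); the result maps each type, such as ext4 or xfs, to the list of mount points using it.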
Recommended File Systems
There are various file systems available for Linux OS:
- The ext2, ext3, ext4 file systems are robust. ext2 was the default file system under the 2.2 kernel. ext3 is simply the enhanced ext2 filesystem with a journaling feature. ext3 is the default filesystem for RHEL 3 and 4. ext4 was developed as the successor of ext3. It provides features for large filesystems, performance, increased limits, and reliability.
- Oracle Cluster File System (OCFS) is a shared file system designed specifically for Oracle Real Application Clusters (RAC). OCFS eliminates the requirement for Oracle database files to be linked to logical drives. OCFS volumes can span one shared disk or multiple shared disks for redundancy and performance enhancements.
- OCFS2 is the next generation of the Oracle Cluster File System for Linux. It is an extent-based, POSIX-compliant file system. Unlike the previous release (OCFS), OCFS2 is a general-purpose file system that can be used for shared Oracle home installations, making management of Oracle Real Application Clusters (RAC) installations even easier.
- XFS is designed for high scalability and provides near-native I/O performance even when the file system spans multiple storage devices.
In summary, the recommended filesystems are:
- Single node: Any filesystem that is supported by the Linux vendor. Note that any filesystem issues need to be resolved by the Linux vendor.
- Multi-node (RAC): OCFS, raw, NFS-based storage systems (e.g. NetApp).
File System Characteristics
When choosing a file system, performance is not the most important point. For example, if there is a risk that data can be corrupted, lost or compromised, even a fast file system should not be used. Oracle does not support files on file systems that do not have a write-through-cache capability: the file system must acknowledge write operations. For example, standard NFS is UDP based, and UDP is a network protocol that does not include an acknowledgment mechanism. One vendor that supplies a supported network file system is Network Appliance (NetApp), which uses a modified NFS protocol.
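The idea of an acknowledged write can be illustrated from the application side. The sketch below is not Oracle's I/O path; it only shows, under the assumption of a POSIX system, how a program can refuse to treat data as written until the operating system has reported the flush as complete. The function name durable_write is hypothetical.

```python
import os


def durable_write(path, data):
    """Write bytes to path and return only after fsync() has succeeded.

    os.fsync() asks the OS to push the file's data to the storage
    device. Whether the device or a network file system really honors
    that request is exactly the property the file system must provide.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        # os.write() may write fewer bytes than requested, so loop.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # raises OSError if the flush is not acknowledged
    finally:
        os.close(fd)
```

If the flush fails, os.fsync() raises an exception instead of returning, so the caller never mistakes an unacknowledged write for a completed one.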
There are security requirements as well. Oracle RDBMS and the database files require special file permissions, which are not available on some file systems (such as VFAT). If these file permissions are not set properly, the Oracle RDBMS does not function properly. Data files should be accessible only to the database owner. The database server should be able to control all other file and data access.
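A check along these lines can be scripted. The following is a minimal sketch that takes the rule above literally (owned by the database owner, no permission bits for group or others); real installations may deliberately grant group access, so treat the mask as an assumption to adjust. The function name only_owner_can_access is hypothetical.

```python
import os
import stat


def only_owner_can_access(path, owner_uid):
    """Return True if path belongs to owner_uid and has no group/other bits.

    This inspects only the classic Unix mode bits; ACLs and other
    mechanisms are outside the scope of this sketch.
    """
    info = os.stat(path)
    group_or_other_bits = stat.S_IRWXG | stat.S_IRWXO
    return (
        info.st_uid == owner_uid
        and (info.st_mode & group_or_other_bits) == 0
    )
```

On a file system such as VFAT, which does not store Unix ownership and mode bits per file, a check like this cannot be satisfied on a per-file basis.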
Journaling is a popular characteristic. The major benefit is that changes to the file system are recorded in a journal file. If the server crashes or shuts down without synchronizing the disk, the journal file can be applied to the file system. Integrity checks and recovery for such file systems are very fast. This is quite noticeable during system boot. The fsck command checks journaled file systems more quickly than non-journaled file systems.
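The journal-then-apply idea can be shown with a toy model. The sketch below is a greatly simplified in-memory illustration, not how ext3 or any real file system is implemented; the class JournaledStore and its crash_before_apply flag are hypothetical and exist only to simulate a crash.

```python
class JournaledStore:
    """Toy key/value store that logs each change before applying it."""

    def __init__(self):
        self.journal = []  # stands in for the on-disk journal
        self.data = {}     # stands in for the main file system structures

    def write(self, key, value, crash_before_apply=False):
        # Step 1: record the intended change in the journal.
        self.journal.append((key, value))
        if crash_before_apply:
            return  # simulate a crash between journaling and applying
        # Step 2: apply the change to the main structures.
        self.data[key] = value

    def recover(self):
        # Replay every journaled change. Re-applying a change that
        # already landed is harmless, so replay is safe to repeat.
        for key, value in self.journal:
            self.data[key] = value
        self.journal.clear()
```

Recovery only has to walk the journal rather than scan every structure, which is why the check after an unclean shutdown is fast.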
Single Node (local) filesystems
ext2, ext3, ext4
– The ext2, ext3 and ext4 file systems are closely related.
– ext2 can be converted to ext3.
– ext3 can be mounted as an ext2 file system.
– ext3 is a journaled file system.
– ext3 has several performance enhancements to ext2.
– ext3 can be mounted as ext4.
– ext4 has all features provided by ext3 and adds support for larger filesystems, increased limits, and improved performance and reliability.
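Because the three formats share one on-disk lineage, the feature list of a volume gives a rough idea of which generation it belongs to. The sketch below is a heuristic only, assuming the "Filesystem features:" line printed in the header output of dumpe2fs -h; the set of features treated as ext4 indicators is an assumption and is not exhaustive, and both function names are hypothetical.

```python
# Assumption: these features indicate an ext4-era volume. Not exhaustive.
EXT4_ONLY_FEATURES = {"extent", "flex_bg", "huge_file", "64bit", "metadata_csum"}


def features_from_dumpe2fs(header_text):
    """Extract feature names from dumpe2fs-style header text."""
    for line in header_text.splitlines():
        if line.startswith("Filesystem features:"):
            return line.split(":", 1)[1].split()
    return []


def classify_ext_features(features):
    """Guess ext2, ext3 or ext4 from a list of feature names."""
    features = set(features)
    if features & EXT4_ONLY_FEATURES:
        return "ext4"
    if "has_journal" in features:
        return "ext3"  # ext2 layout plus a journal
    return "ext2"
```

The order of the checks mirrors the relationship described above: a journal distinguishes ext3 from ext2, and the newer features distinguish ext4 from ext3.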
ReiserFS
– It is the default file system for Novell/SuSE Linux. On Red Hat distributions it is not installed by default, but the necessary packages are on the distribution media.
– ReiserFS is currently in maintenance mode with SuSE/Novell.
btrfs
– From the Btrfs wiki: Btrfs is a new copy on write (CoW) filesystem for Linux aimed at implementing advanced features while focusing on fault tolerance, repair and easy administration.
– Jointly developed at Oracle, Red Hat, Fujitsu, Intel, SUSE, STRATO and many others, Btrfs is licensed under the GPL and open for contribution from anyone.
See https://btrfs.wiki.kernel.org for further information.
XFS
XFS is designed for high scalability and provides near-native I/O performance even when the file system spans multiple storage devices. See XFS Filesystem on Oracle Linux for more details.
Multi-node (shared / clustered) filesystems
Raw Partitions
– Raw partitions have traditionally been considered the high-performance solution.
– Raw reads and writes do not use the OS buffer cache.
– Raw reads and writes can move larger buffers than file system I/Os.
– Raw requires more experienced administration.
Oracle Cluster File System (OCFS)
– Oracle Cluster File System is designed for use with RAC. Oracle supports OCFS for use with database files. OCFS is not a journaled file system, but does have very good performance metrics.
– Its performance is less than 5% slower than raw devices, and in most tests only 2% slower.
– Starting with OCFS version 1.0.14-1, OCFS supports asynchronous I/O.
OCFS2
OCFS2 is the next generation of the Oracle Cluster File System for Linux. It is an extent-based, POSIX-compliant file system. Unlike the previous release (OCFS), OCFS2 is a general-purpose file system that can be used for shared Oracle home installations, making management of Oracle Real Application Clusters (RAC) installations even easier. Among the new features and benefits are:
- Node and architecture local files using Context Dependent Symbolic Links (CDSL).
- Network based pluggable DLM.
- Improved journaling/node recovery using the Linux kernel "JBD" subsystem.
- Improved performance of metadata operations (space allocation, locking, etc.).
- Improved data caching/locking (for files such as Oracle binaries, libraries, etc.).
See https://oss.oracle.com/projects/ocfs2/ for more information.