Before you start
Objectives: Familiarize yourself with the typical directory structure on Linux systems and with how Linux treats everything as a file.
Prerequisites: no prerequisites.
Key terms: file, system, directory, folder, inside, linux, root, hard, link, bin, applications, lib, sbin
Concept of Files on Linux System
In general, UNIX is designed to treat everything as a file, and Linux follows the same design. This means that devices are also represented as files on the file system. For example, the video card, sound card, parallel ports and hard drives are all treated as files. This concept simplifies a lot of things, because the operating system already knows how to work with files: it knows how to open a file, read from it or write to it, and then close it.
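To make the idea concrete, here is a minimal sketch that reads from and writes to two device files with the same calls we would use for an ordinary file. It assumes a Linux system where /dev/urandom and /dev/null exist; the function names are only illustrative.

```python
def read_random_bytes(count: int) -> bytes:
    """Read `count` bytes from the kernel's random number device."""
    # /dev/urandom is a device, yet it is opened and read like any file.
    with open("/dev/urandom", "rb") as device:
        return device.read(count)


def discard(data: bytes) -> int:
    """Write data to /dev/null, which accepts it and throws it away."""
    with open("/dev/null", "wb") as device:
        return device.write(data)  # number of bytes accepted


if __name__ == "__main__":
    sample = read_random_bytes(8)
    print(sample.hex())
    print(discard(sample), "bytes written to /dev/null")
```

The program never talks to a driver directly. It only opens, reads, writes and closes, which is exactly the set of operations described above.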
Because UNIX treats everything as a file, there are more file types than we are used to seeing on Windows operating systems. In Windows environments there are only two general types of files: directories and normal files. A directory is a special type of file that holds pointers to other files or other directories, and it helps us organize the files on our file system. A normal file is anything that is not a directory. For example, text files, executables, images and other binaries are all normal files.
Because UNIX treats everything as a file, including devices, we need a way for different files to accept input and give output in different formats. That’s why Linux systems have additional file types, which we refer to as special files.
The first special file in Linux that we will talk about is the link. A link points to either another entry or another piece of the file system. Links can be hard links or symbolic links (sym links). A hard link points directly to a piece of data on the file system. Basically, it is an additional directory entry that points to the same data. We can have multiple hard links that all point to the same data on our file system. For example, we can have one hard link which points to some file on our file system, and then add another hard link which points to the same file.
Hard links that point to the same piece of data are all equal. If we remove one hard link, the other hard link is just as valid as the original was. The data is not deleted until all of the hard links to it are deleted. Compared to sym links, hard links are not that commonly used. In fact, we will typically use sym links on Linux systems. A sym link is an entry in the file system that points to another entry in the file system, and in that way points to some file.
With sym links we can point to a piece of data that resides on a different volume or on a different file system, which hard links cannot do. Symbolic links are similar to shortcuts in Windows.
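The difference between the two kinds of link can be sketched with a short experiment. The file names below are made up for the example, and everything happens inside a temporary directory that is removed afterwards.

```python
import os
import tempfile


def demo_links() -> dict:
    """Create a file, a hard link and a sym link, then remove the original."""
    results = {}
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "original.txt")
        hard = os.path.join(workdir, "hard.txt")
        soft = os.path.join(workdir, "soft.txt")

        with open(original, "w") as handle:
            handle.write("hello")

        os.link(original, hard)     # second directory entry for the same data
        os.symlink(original, soft)  # entry that stores the path of the target

        results["same_inode"] = os.stat(original).st_ino == os.stat(hard).st_ino
        results["link_count"] = os.stat(original).st_nlink

        os.remove(original)  # remove one of the two hard links

        # The data is still reachable through the remaining hard link.
        with open(hard) as handle:
            results["hard_after_remove"] = handle.read()
        # The sym link still exists, but the name it points to is gone.
        results["symlink_dangling"] = (
            os.path.islink(soft) and not os.path.exists(soft)
        )
    return results


if __name__ == "__main__":
    for key, value in demo_links().items():
        print(key, "=", value)
```

After the original name is removed, the hard link still returns the data, while the sym link is left pointing at a name that no longer exists.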
Another special type of file is the character file. A character file accepts input one character at a time (one byte at a time). Examples of devices that character files point to are the keyboard, video card, sound card, serial port, etc.
Another type of file is the block file. Block files point to devices that accept input one block at a time (a group of bytes at a time). Examples of block devices are storage devices, like hard drives, CD-ROM drives, DVD drives, etc.
Another type of file is the pipe, often called a named pipe. Pipes are files that allow us to send information between applications. One application connects to one end of the pipe and writes data, and another application connects to the other end and reads it.
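A named pipe can be tried out without two separate applications. In the sketch below a background thread plays the role of the writing application and the main thread plays the reader. The pipe name demo_pipe and the function name are assumptions made for the example.

```python
import os
import tempfile
import threading


def send_through_fifo(message: str) -> str:
    """Send `message` through a named pipe and return what the reader got."""
    with tempfile.TemporaryDirectory() as workdir:
        fifo_path = os.path.join(workdir, "demo_pipe")
        os.mkfifo(fifo_path)  # creates the special pipe file

        def writer() -> None:
            # Opening for writing blocks until a reader opens the other end.
            with open(fifo_path, "w") as pipe:
                pipe.write(message)

        thread = threading.Thread(target=writer)
        thread.start()

        # Opening for reading blocks until the writer has connected.
        with open(fifo_path) as pipe:
            received = pipe.read()  # returns once the writer closes its end

        thread.join()
        return received


if __name__ == "__main__":
    print(send_through_fifo("hello through the pipe"))
```

Nothing is stored in the pipe file itself. It only gives the two sides a name on the file system where they can meet.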
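Another type of file on Linux systems is the socket. A socket is similar to a pipe, but it allows two-way communication and uses the same programming interface as network connections. Socket files on the file system connect applications that run on the same machine.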
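The file types described so far can be told apart from a program. The following sketch uses Python's os and stat modules to name the type of a given path. The function name is illustrative, and some of the sample paths in the demo part (/etc/hostname, /dev/sda, /dev/log) are assumptions that may not exist on every system, so they are checked first.

```python
import os
import stat


def file_type(path: str) -> str:
    """Return a human-readable name for the type of file at `path`."""
    # lstat does not follow sym links, so a link is reported as a link.
    mode = os.lstat(path).st_mode
    checks = [
        (stat.S_ISLNK, "symbolic link"),
        (stat.S_ISDIR, "directory"),
        (stat.S_ISCHR, "character device"),
        (stat.S_ISBLK, "block device"),
        (stat.S_ISFIFO, "named pipe"),
        (stat.S_ISSOCK, "socket"),
        (stat.S_ISREG, "regular file"),
    ]
    for predicate, name in checks:
        if predicate(mode):
            return name
    return "unknown"


if __name__ == "__main__":
    for candidate in ("/", "/etc/hostname", "/dev/null", "/dev/sda", "/dev/log"):
        if os.path.lexists(candidate):
            print(candidate, "->", file_type(candidate))
```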
When working with files in Linux, remember that all file names are case sensitive. For example, notes.txt and Notes.txt are two different files.
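Case sensitivity is easy to verify. This small sketch creates three files whose names differ only in letter case and then lists the directory. It assumes a typical case-sensitive Linux file system. On a case-insensitive file system the three names would refer to a single file.

```python
import os
import tempfile


def distinct_names() -> list:
    """Create files whose names differ only in case and list the result."""
    with tempfile.TemporaryDirectory() as workdir:
        for name in ("notes.txt", "Notes.txt", "NOTES.TXT"):
            with open(os.path.join(workdir, name), "w") as handle:
                handle.write(name)
        return sorted(os.listdir(workdir))


if __name__ == "__main__":
    print(distinct_names())
```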
The file system on Linux operating systems works a bit differently compared to Windows. The first thing we need to know is that it has only one root directory. On Windows systems every volume has its own root directory. For example, let’s say that we have two partitions on our Windows system. In that case we will have two root folders, one for every partition, each marked with a backslash (\). So, if we have a C: drive, we will have the C:\ root folder on it, and if we have a D: drive, we will have the D:\ root folder on it. On Linux systems we use the / character to represent the root directory.
Linux systems use a unified directory hierarchy, which means that we have one root directory. All other directories are placed inside that one root directory. If we have multiple volumes, partitions or DVD-ROM drives, we simply mount them to some folder in that unified directory structure.
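To see which volumes are mounted where, we can read the mount table that the kernel exposes as a file. The sketch below assumes the usual Linux layout of /proc/self/mounts, where each line starts with the device, the mount point and the file system type. The function names are illustrative.

```python
def parse_mounts(text: str) -> list:
    """Turn mount table text into (device, mount_point, fs_type) tuples."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Spaces inside a mount point appear escaped (as \040) in this
            # file, so splitting on whitespace keeps the columns intact.
            device, mount_point, fs_type = fields[:3]
            mounts.append((device, mount_point, fs_type))
    return mounts


def current_mounts(path: str = "/proc/self/mounts") -> list:
    """Read and parse the mount table of the running system."""
    with open(path) as handle:
        return parse_mounts(handle.read())


if __name__ == "__main__":
    for device, mount_point, fs_type in current_mounts():
        print(f"{mount_point:<30} {fs_type:<12} {device}")
```

Every mount point in the output is a path below /, which is the unified hierarchy in practice: a second partition or a DVD shows up as just another directory.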
Filesystem Hierarchy Standard
All this is specified by the Filesystem Hierarchy Standard, or FHS. This standard defines what goes where in the Linux file system. At the time of writing this article, the current version of FHS is 2.3, which was announced on January 29, 2004. To check for a new version, go to http://www.pathname.com/fhs/.
Linux Directory Structure
FHS defines several things. First, it defines that in our root directory we have “bin”, “sbin” and “lib” directories. “bin” is intended for executables, so binaries and applications, or links to applications, will be here. Applications themselves can reside in some other folder on the file system, but they will be executed from bin. Users will typically always have access to /bin, so all users on our computer can run the applications located in that folder.
“sbin”, or system bin, is intended for system administration applications. Normal users will typically not have /sbin in their search path, and most of the tools in it require root privileges. The root user has full access to that path.
“lib”, or libraries, is intended for code which gets linked to by binaries when they are executed from /bin, /sbin or some other location. These are shared program libraries. In Windows environments the equivalent libraries are DLLs. Besides libraries, inside of lib we can also have applications themselves. We can install them in separate directories inside of lib and then link to them from the bin or sbin folder. One important directory inside of “lib” is the “modules” directory, which contains loadable modules for the kernel itself. /bin, /sbin and /lib will typically contain things from the operating system itself. Those things are core utilities that everybody on the system will use.
Another important directory is “usr”, which is not intended for user files. Instead it is intended for programs and applications, specifically the kind of programs that common users are going to be using. Inside of “usr” we will also have “bin”, “sbin” and “lib” folders. Another folder that we will find inside of “usr” is the “local” folder, and inside of “local” we will again have “bin”, “sbin” and “lib” folders. The “bin”, “sbin” and “lib” folders have the same function in every location, but the kind of files and applications stored in them depends on where they are located. The /bin, /sbin and /lib folders contain core system utilities. The /usr/bin, /usr/sbin and /usr/lib directories are intended for end user applications and binaries. The “local” folder inside of /usr is intended for software that the system administrator installs locally on this machine. That keeps it separate from the software provided by the distribution, so it is not overwritten when the system software is updated.
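We can check how these directories look on a particular machine with a short script. The sketch below reports whether each directory is present and whether it is a real directory or a sym link, and it uses shutil.which to show from which directory a command would be executed. The list of directories follows the text above. The helper names are only illustrative.

```python
import os
import shutil

DIRECTORIES = [
    "/bin", "/sbin", "/lib",
    "/usr/bin", "/usr/sbin", "/usr/lib",
    "/usr/local/bin", "/usr/local/sbin", "/usr/local/lib",
]


def describe(path: str) -> str:
    """Describe one directory: sym link, real directory or missing."""
    if os.path.islink(path):
        return f"{path} -> {os.readlink(path)} (symbolic link)"
    if os.path.isdir(path):
        return f"{path} (directory, {len(os.listdir(path))} entries)"
    return f"{path} (not present)"


def locate(command: str):
    """Return the full path the shell would run for `command`, or None."""
    return shutil.which(command)


if __name__ == "__main__":
    for directory in DIRECTORIES:
        print(describe(directory))
    print("ls is executed from:", locate("ls"))
```

On some distributions /bin, /sbin and /lib turn out to be sym links into /usr, which is the link mechanism described earlier applied to whole directories.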
Another folder that we should mention is called “opt”, which is intended for optional, add-on software that is not packaged as part of the specific Linux distribution. The “opt” folder is also intended to be used locally. Some implementations might also include the /usr/doc subdirectory (or the /usr/share/doc subdirectory) to hold documentation accessible to all users.
Other directories that we will typically see are “root” and “home”. Both hold home directories for users. /root is the home directory of the root administrative user. Don’t confuse /root with the root of the system (/). Inside the “home” directory we will have the home folders of all our other users. By default users will only have read and write access to their own home directory.
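The home directory of each account is recorded in the user database, so we can look it up instead of guessing. This is a minimal sketch using Python's pwd module. It assumes a Unix-like system with a root account, and the function names are illustrative.

```python
import os
import pwd


def home_of(username: str) -> str:
    """Return the home directory recorded for `username`."""
    return pwd.getpwnam(username).pw_dir


def current_home() -> str:
    """Return the home directory of the user running the script."""
    return os.path.expanduser("~")


if __name__ == "__main__":
    print("root's home:", home_of("root"))
    print("my home:    ", current_home())
```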
Another folder that we will see is the “boot” directory, which contains everything that is required to boot the operating system. That includes the kernel and boot loader files.
Another folder which is required during boot is “initrd”. It holds the initial RAM disk image.
Other directories which we will find are “media” and “mnt”. “media” is intended for removable media, like CD-ROM, DVD-ROM, floppy, etc. When we insert a DVD, it is mounted inside the “media” folder in the form of a folder. That way we simply navigate to /media/dvdrom/ and we can access the content on the DVD. Originally removable media were mounted inside the “mnt” directory, but as of FHS version 2.3 this is changed to “media”. “mnt” is still there and is used to hold mount points for other, temporarily mounted volumes.
Another directory that we have to know about is the “srv” directory. This folder can contain data for services that the system provides, for example an FTP server or a Web server.
We also have a “var” folder, which was designed for variable data, that is, data that changes a lot. Inside “var” we will find folders like log, cache, mail, spool, etc. Originally, ftp and www used to be inside the “var” folder.
Another important directory is “tmp”, which is intended for temporary files. The operating system and applications write temporary files in that folder. Typically it is safe to delete files from that folder if they are not currently used by some application.
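Applications usually do not build paths inside the temporary folder by hand. As a sketch of the common pattern, Python's tempfile module picks the temporary directory for us (normally /tmp on Linux, unless the TMPDIR environment variable says otherwise) and removes the file when we are done. The function name is illustrative.

```python
import os
import tempfile


def scratch_roundtrip(data: str):
    """Write data to a temporary file, read it back, then let it vanish."""
    with tempfile.NamedTemporaryFile("w+") as scratch:
        path = scratch.name
        scratch.write(data)
        scratch.seek(0)  # go back to the start before reading
        read_back = scratch.read()
    # Leaving the with-block closes the file, which also deletes it.
    return read_back, os.path.exists(path)


if __name__ == "__main__":
    print("temporary directory:", tempfile.gettempdir())
    print(scratch_roundtrip("scratch data"))
```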
Another folder is “etc” which holds configuration files specific to the system.
We should also mention some special directories on our file system. These are different from those that we have talked about until now. The first one is the “dev” directory, which holds all of the device files and special files like block files, character files, etc. We also have two more folders which are more virtual than real. They don’t exist on disk and they do not contain real files. They are generated dynamically by the kernel and they report what is going on in the system. Those two folders are “proc” and “sys”. As you know, Linux sees everything as a file, and this works in the same way. “proc” contains real-time system state and process state information. This includes information about CPU and memory usage. Because this folder reflects the real system state, we should be very careful when editing files inside it. If we don’t know what we are doing, we shouldn’t edit files inside of that folder.
The “sys” directory was added with kernel 2.6, and it shows more specific system information. The “proc” folder contains more process specific information, while “sys” contains specific system information. Both of these folders are virtual, and they don’t actually exist on disk.
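Since the entries in “proc” behave like text files, reading the system state is an ordinary file read. The sketch below parses memory information, assuming the usual "Name: value kB" layout of /proc/meminfo on Linux. The function names are illustrative, and the script only reads, it never writes to “proc”.

```python
def parse_meminfo(text: str) -> dict:
    """Parse 'Name:   value kB' lines into a dict of integers."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            # Most entries are in kB; a few (such as page counts) are
            # plain numbers without a unit.
            values[key.strip()] = int(fields[0])
    return values


def read_meminfo(path: str = "/proc/meminfo") -> dict:
    """Read the memory information the kernel generates on request."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


if __name__ == "__main__":
    info = read_meminfo()
    for name in ("MemTotal", "MemFree"):
        print(name, info.get(name), "kB")
```

Running the script twice will usually give slightly different MemFree values, because the content is generated at the moment the file is read.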
Linux has a unified file system. It has one root directory, and everything exists inside of it. We can mount volumes to empty directories inside the unified file system. The file system has a certain structure, which is determined by the FHS, or Filesystem Hierarchy Standard. Linux sees everything as a file. File names are case sensitive.