Introduction to Storage Area Network (SAN)

Before you start

Objectives: Learn what a SAN is, the typical terms used with SANs, and what clustering is.

Prerequisites: You should know what Fiber Channel is.

Key terms: SAN, Initiator, Target, storage device, Fiber Channel, Clustering


What is SAN

A SAN is very different from classic PATA or SATA storage devices. Instead of installing the hard disk drive inside the system, we connect the storage to the system over an external connection through a special type of network. SANs are expensive to implement, but they are commonly used in large enterprises, typically in server environments. The benefit of a SAN is that it allows us to transfer data at a very high rate and to use various clustering technologies to make our data highly available.

How SAN Works

A SAN allows our computer (typically a server) to connect to various types of remote storage devices, typically called targets. These include hard disk drives, RAID arrays, tape drives, etc. Although we connect to those devices through a network connection, we are not talking about Network Attached Storage (NAS) here. A SAN works differently from NAS.

SANs usually use Fiber Channel to connect servers to storage devices. This can be accomplished in different ways. For example, we can buy a Fiber Channel card, install it in our server, and then connect the server with a Fiber Channel cable directly to a storage device, such as a Fiber Channel RAID array. However, with SANs we typically use a Fiber Channel switch (FCS) and connect the Fiber Channel devices to it. This way we can have multiple RAID arrays, as well as multiple servers, all connected to the same Fiber Channel switch.

The biggest difference between SAN and NAS is the way in which we connect our system to the storage devices. With NAS, we attach the storage system to the network with an Ethernet cable, and it functions as a server on the network. When we want to connect our client to the NAS server, we use a protocol like Server Message Block (SMB) or Network File System (NFS). Basically, we use a network file-sharing protocol to attach to NAS.

This is different from what we do with SAN. With SAN, we connect our servers to the storage devices using fiber optic cabling and a Fiber Channel switch, and the servers see the storage devices as if they were directly attached to the system. In other words, with SAN, our servers will “think” that the RAID array that we have connected to the Fiber Channel switch is installed right in the system. That’s because Fiber Channel Storage Area Networks use block-level I/O. This is the same way SCSI hard disk drives work when they are installed directly in the server. In fact, when our server uses SAN storage, it sends SCSI-type commands, just as it would if the drive were attached locally.
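
The difference between file-level access (NAS) and block-level access (SAN) can be illustrated with a small sketch. This is only a toy model in Python, not a real SAN or NAS client: the class names BlockTarget and FileServer, the 512-byte block size, and the example path are all assumptions made for illustration. The point is that the SAN side is addressed by block number, with no notion of files, while the NAS side is addressed by file name.

```python
BLOCK_SIZE = 512  # bytes per block; an assumed size chosen for illustration


class BlockTarget:
    """Hypothetical stand-in for a SAN target: it only knows numbered blocks."""

    def __init__(self, num_blocks):
        # Every block starts out filled with zero bytes.
        self._blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def read_block(self, lba):
        # lba = logical block address, i.e. the block's number on the device.
        return self._blocks[lba]

    def write_block(self, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        self._blocks[lba] = bytes(data)


class FileServer:
    """Hypothetical stand-in for a NAS server: it only knows named files."""

    def __init__(self):
        self._files = {}

    def write_file(self, path, data):
        self._files[path] = bytes(data)

    def read_file(self, path):
        return self._files[path]


# Block-level I/O: the server addresses the storage by block number,
# just as it would address a locally attached disk.
san_target = BlockTarget(num_blocks=8)
san_target.write_block(3, b"A" * BLOCK_SIZE)
first_bytes = san_target.read_block(3)[:4]

# File-level I/O: the client asks the NAS server for a file by name.
nas = FileServer()
nas.write_file("/share/report.txt", b"hello")
contents = nas.read_file("/share/report.txt")
```

In the block-level case it is the server itself that decides how files are laid out on those blocks, which is why the storage looks like a local disk to it.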

When dealing with SAN, we typically say that the server is an Initiator and that the SAN storage device is the Target. The Initiator establishes the connection with the Target. With SAN, we typically have multiple Initiators and multiple Targets. If multiple Initiators want to write to the same file on the same Target at the same time, we have a problem. That’s why with SAN we need to have access control. By default, with SAN only one Initiator can access one Target at a time. Usually, we don’t want to set up SAN that way, because it is not very efficient. Instead, we can set up clustering software that controls multi-initiator access to the same device. This allows us to have two or more servers connected to the same Target. The clustering software controls which Initiator is accessing which Target at any given time, and makes sure that two Initiators are not trying to communicate with the same Target at the same time.
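
The access control described above can be sketched as a simple ownership table. This is a minimal illustration in Python under stated assumptions, not how any particular clustering product works: the class ClusterAccessControl, its methods, and the initiator and target names are hypothetical.

```python
class ClusterAccessControl:
    """Hypothetical sketch: lets only one Initiator use a Target at a time."""

    def __init__(self):
        # Maps each Target name to the Initiator currently using it.
        self._owners = {}

    def acquire(self, initiator, target):
        """Try to give `initiator` access to `target`; return True on success."""
        owner = self._owners.get(target)
        if owner is None or owner == initiator:
            self._owners[target] = initiator
            return True
        return False  # another Initiator is already using this Target

    def release(self, initiator, target):
        """Give up access, but only if `initiator` actually holds the Target."""
        if self._owners.get(target) == initiator:
            del self._owners[target]


control = ClusterAccessControl()

# server-a gets the RAID array first, so server-b has to wait.
a_first = control.acquire("server-a", "raid-array-1")
b_blocked = control.acquire("server-b", "raid-array-1")

# Once server-a is done, server-b can access the same Target.
control.release("server-a", "raid-array-1")
b_after_release = control.acquire("server-b", "raid-array-1")
```

Both servers stay connected to the same Target the whole time; the table only decides whose turn it is.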

The biggest benefit of clustering is that if one server becomes inoperative, the other systems in the cluster can take over for the one that went down. Users won’t see any difference in their access. The storage (data) remains intact, since it was not directly attached to the server that went down. Also, the server that went down can be replaced without service interruption.
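
The failover idea can be sketched in a few lines as well. Again, this is only a toy model in Python: the Cluster class, the server names, and the dictionary standing in for shared SAN storage are assumptions made for illustration, and real clustering software does much more (health checks, fencing, and so on).

```python
class Cluster:
    """Hypothetical sketch of servers sharing storage that lives on the SAN."""

    def __init__(self, servers, shared_storage):
        self.healthy = list(servers)          # servers that are still running
        self.shared_storage = shared_storage  # lives on the SAN, not in a server
        self.active = self.healthy[0]         # server currently answering users

    def mark_failed(self, server):
        """Remove a failed server; if it was active, another one takes over."""
        self.healthy.remove(server)
        if server == self.active:
            if not self.healthy:
                raise RuntimeError("no healthy servers left in the cluster")
            self.active = self.healthy[0]

    def read(self, key):
        """Return which server answered, together with the data it read."""
        return self.active, self.shared_storage[key]


storage = {"orders.db": b"order data"}
cluster = Cluster(["server-a", "server-b"], storage)

before = cluster.read("orders.db")   # answered by server-a
cluster.mark_failed("server-a")      # server-a goes down
after = cluster.read("orders.db")    # answered by server-b, same data
```

The data returned before and after the failure is identical, because it was never stored inside the server that went down.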