Sunday, October 3, 2010

OCFS Cluster File System

OCFS stands for Oracle Cluster File System. It is a shared disk file system developed by Oracle Corporation and released under the GNU General Public License.

The first version of OCFS was developed mainly to hold Oracle database files for clustered databases. Because of that, it was not a POSIX-compliant file system. POSIX features were added in version 2.

OCFS2 (version 2) was integrated into version 2.6.16 of the Linux kernel. Initially, it was marked as “experimental” (alpha-test) code. This restriction was removed in Linux version 2.6.19. With kernel version 2.6.29, more features were added to OCFS2, notably access control lists and quotas.[2]

OCFS2 uses a distributed lock manager which resembles the OpenVMS DLM but is much simpler.

Hardware Requirements

* Shared Storage, accessible over SAN between cluster nodes.

* HBA for Fibre Channel SAN on each node.

* Network connectivity for heartbeat between servers.

OS Installation

A regular installation of 64-bit RHEL 5.x with the following configuration:

1. SELinux must be disabled.

2. Time zone: Saudi time (GMT+3).
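Whether SELinux is persistently disabled can be checked from its configuration file. The sketch below assumes the usual RHEL location /etc/selinux/config; the helper name selinux_config_mode is made up for illustration and is not part of any package.

```shell
#!/bin/sh
# selinux_config_mode [FILE]: print the SELINUX= value from an SELinux config file.
# Hypothetical helper; FILE defaults to /etc/selinux/config (assumed RHEL default).
selinux_config_mode() {
    # Print what follows "SELINUX=" on uncommented lines and keep the last match.
    # "SELINUXTYPE=" lines do not match this pattern.
    sed -n 's/^SELINUX=[[:space:]]*//p' "${1:-/etc/selinux/config}" | tail -n 1
}

# Example use on a node:
#   [ "$(selinux_config_mode)" = "disabled" ] || echo "SELinux is not disabled in the config file"
```

A change to this file takes effect only after a reboot; getenforce reports the state of the running system.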

Packages

1. GNOME for the graphical desktop.

2. Development libraries

3. Internet tools - GUI and text based

4. Editors - GUI and text based

Partitioning the HDD

1. 200 MB for /boot partition

2. 5 GB for /var partition

3. 5 GB for /tmp partition

4. Rest of the space to / partition.

The server should be up to date with the latest patches from Red Hat Network.

Installation of OCFS2 Kernel Module and Tools

OCFS2 Kernel modules and tools can be downloaded from the Oracle web sites.

OCFS2 Kernel Module:

http://oss.oracle.com/projects/ocfs2/files/RedHat/RHEL5/x86_64/1.4.1-1/2.6.18-128.1.1.el5

http://oss.oracle.com/projects/ocfs2/files/RedHat/RHEL5/x86_64/1.4.1-1/2.6.18-128.1.10.el5/

Note that the kernel version in the package path (2.6.18-128.1.1.el5 here) must match the kernel currently running on the server, as reported by uname -r. A new OCFS2 kernel module package must be downloaded and installed each time the kernel is updated to a new version.
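Because the module package is tied to one kernel build, it can help to derive the expected file name from the running kernel instead of typing it by hand. This is a minimal sketch: ocfs2_module_rpm is a hypothetical helper, and its defaults (1.4.1-1.el5, x86_64) are simply the versions used in this post.

```shell
#!/bin/sh
# ocfs2_module_rpm KERNEL [OCFS2_VERSION] [ARCH]: print the expected RPM file name.
# Hypothetical helper; the defaults are the versions used in this post.
ocfs2_module_rpm() {
    kernel=$1
    version=${2:-1.4.1-1.el5}
    arch=${3:-x86_64}
    printf 'ocfs2-%s-%s.%s.rpm\n' "$kernel" "$version" "$arch"
}

# Example use on a node: name the package for the running kernel.
#   ocfs2_module_rpm "$(uname -r)"
```

If the printed name does not exist in the download directory for your kernel, no module has been published for that kernel build yet.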

OCFS2 Tools

http://oss.oracle.com/projects/ocfs2-tools/dist/files/RedHat/RHEL5/x86_64/1.4.1-1/ocfs2-tools-1.4.1-1.el5.x86_64.rpm

OCFS2 Console

http://oss.oracle.com/projects/ocfs2-tools/dist/files/RedHat/RHEL5/x86_64/1.4.1-1/ocfs2console-1.4.1-1.el5.x86_64.rpm

OCFS2 Tools and Console depend on several other packages, which are normally available on a default Red Hat Linux installation, except for the VTE (terminal emulator) package.

To satisfy the OCFS2 dependencies, install the vte package using:

yum install vte

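After completing the VTE installation, install the OCFS2 packages using the regular RPM procedure. Install ocfs2-tools first, because the console package depends on it. The kernel module package name must contain the version of the running kernel.

rpm -ivh ocfs2-tools-1.4.1-1.el5.x86_64.rpm

rpm -ivh ocfs2-2.6.18-128.1.1.el5-1.4.1-1.el5.x86_64.rpm

rpm -ivh ocfs2console-1.4.1-1.el5.x86_64.rpm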

This will copy the necessary files to their corresponding locations.

The following tools and files are used frequently:

* /etc/init.d/o2cb

* /sbin/mkfs.ocfs2

* /etc/ocfs2/cluster.conf (this folder and file must be created manually)

OCFS2 Configuration

It is assumed that the shared SAN storage is connected to the cluster nodes and is available as /dev/sdb on each of them. This document covers the installation of a two-node (node1 and node2) OCFS2 cluster only.

The following steps are required to configure the cluster nodes.

Create the folder /etc/ocfs2

mkdir /etc/ocfs2

Create the cluster configuration file /etc/ocfs2/cluster.conf and add the following contents:

cluster:
	node_count = 2
	name = vmsanstorage

node:
	ip_port = 7777
	ip_address = 172.x.x.x
	number = 1
	name = mc1.ocfs
	cluster = vmsanstorage

node:
	ip_port = 7777
	ip_address = 172.x.x.x
	number = 2
	name = mc2.ocfs
	cluster = vmsanstorage

Note the following:

* Node name should match the “hostname” of corresponding server.

* Node number should be unique for each member.

* Cluster name for each node should match the “name” field in “cluster:” section.

* “node_count” field in “cluster:” section should match the number of nodes.
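Most of these rules can be checked mechanically before the cluster is started. The following is a rough sketch, not an official tool: check_cluster_conf is a hypothetical helper that verifies the node count, the uniqueness of node numbers, and the cluster name references. It does not compare node names against hostnames.

```shell
#!/bin/sh
# check_cluster_conf FILE: sanity-check an O2CB cluster.conf (hypothetical helper).
# Exits 0 when the checks pass; otherwise prints the problems and exits 1.
check_cluster_conf() {
    awk '
        /^cluster:/ { section = "cluster"; next }
        /^node:/    { section = "node"; nodes++; next }
        /=/ {
            # Split "key = value" and strip all spaces and tabs.
            key = $0; sub(/=.*/, "", key);   gsub(/[ \t]/, "", key)
            val = $0; sub(/[^=]*=/, "", val); gsub(/[ \t]/, "", val)
            if (section == "cluster" && key == "name")       cname = val
            if (section == "cluster" && key == "node_count") count = val
            if (section == "node" && key == "cluster")       refs[val]++
            if (section == "node" && key == "number" && seen[val]++) dup = 1
        }
        END {
            bad = 0
            if (count + 0 != nodes + 0) { print "node_count does not match the number of node stanzas"; bad = 1 }
            if (dup) { print "duplicate node number"; bad = 1 }
            for (r in refs) if (r != cname) { print "node refers to unknown cluster " r; bad = 1 }
            exit bad
        }
    ' "$1"
}

# Example use on a node:
#   check_cluster_conf /etc/ocfs2/cluster.conf && echo "cluster.conf looks consistent"
```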

O2CB cluster service configuration

The o2cb cluster service can be configured with the following command, which starts an interactive dialog:

/etc/init.d/o2cb configure

Configuring the O2CB driver

This will configure the on-boot properties of the O2CB driver.

The following questions will determine whether the driver is loaded on boot.

The current values will be shown in brackets ('[]').

Hitting <ENTER> without typing an answer will keep that current value.

Ctrl-C will abort.

Load O2CB driver on boot (y/n) [n]: y

Cluster stack backing O2CB [o2cb]:

Cluster to start on boot (Enter "none" to clear) [ocfs2]: ocfs2

Specify heartbeat dead threshold (>=7) [31]:

Specify network idle timeout in ms (>=5000) [30000]:

Specify network keepalive delay in ms (>=1000) [2000]:

Specify network reconnect delay in ms (>=2000) [2000]:

Writing O2CB configuration: OK

Loading filesystem "ocfs2_dlmfs": OK

Mounting ocfs2_dlmfs filesystem at /dlm: OK

Starting O2CB cluster ocfs2: OK

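Note that the driver should be loaded at boot and that the “Cluster to start” value must match the “name” field in the “cluster:” section of /etc/ocfs2/cluster.conf. The dialog above shows the default name “ocfs2”; if cluster.conf names the cluster “vmsanstorage”, enter “vmsanstorage” here instead.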
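The heartbeat dead threshold in the dialog above is a count of heartbeat iterations, not a time. As far as I understand the OCFS2 documentation, a node is declared dead after (threshold - 1) * 2 seconds, so the default of 31 corresponds to 60 seconds. The two helper functions below are hypothetical names used only to show the arithmetic.

```shell
#!/bin/sh
# Convert between the O2CB heartbeat dead threshold and a timeout in seconds,
# assuming the documented relation: seconds = (threshold - 1) * 2.
hb_threshold_to_seconds() {
    echo $(( ($1 - 1) * 2 ))
}

hb_seconds_to_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# Example: the threshold to enter for a 120 second timeout.
#   hb_seconds_to_threshold 120
```

All nodes in the cluster should use the same threshold.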

As a best practice, it is advised to reboot the server after successfully completing the above configuration.

Formatting and Mounting the shared file system

Before we can start using the shared filesystem, we have to format the shared device with the OCFS2 filesystem.

The following command formats the device with OCFS2 and sets some additional features.

mkfs.ocfs2 -T mail -L ocfs-mnt --fs-features=backup-super,sparse,unwritten -M cluster /dev/sdb

Where:

-T mail

Specifies how the filesystem is going to be used, so that mkfs.ocfs2 can choose optimal filesystem parameters for that use.

The “mail” option is appropriate for file systems that will have many metadata updates. It creates a larger journal.

-L ocfs-mnt

Sets the volume label for the filesystem. It will be used instead of the device name to identify the block device in /etc/fstab.

--fs-features=backup-super,sparse,unwritten

Turns specific file system features on or off.

backup-super

Create backup super blocks for this volume

sparse

Enable support for sparse files. With this, OCFS2 can avoid allocating (and zeroing) data to fill holes

unwritten

Enable unwritten extents support. With this turned on, an application can request that a range of clusters be preallocated within a file.

-M cluster

Defines if the filesystem is local or clustered. Cluster is used by default.

/dev/sdb

The block device to be formatted.

Note: With the default options, mkfs.ocfs2 creates a filesystem for a cluster of only 4 nodes. If you have more nodes, specify the number of node slots using -N number-of-node-slots.
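For a larger cluster, the -N option is added to the same command line. Since formatting destroys any data on the device, the sketch below only prints the command for review instead of running it; print_mkfs_cmd is a hypothetical helper, and the value 8 in the example is just an illustration.

```shell
#!/bin/sh
# print_mkfs_cmd DEVICE LABEL SLOTS: print (not run) an mkfs.ocfs2 command line.
# Hypothetical dry-run helper; review the output before running it by hand.
print_mkfs_cmd() {
    device=$1
    label=$2
    slots=$3
    echo "mkfs.ocfs2 -T mail -L $label -N $slots" \
         "--fs-features=backup-super,sparse,unwritten -M cluster $device"
}

# Example: a volume that up to 8 nodes can mount at the same time.
#   print_mkfs_cmd /dev/sdb ocfs-mnt 8
```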

We are ready to mount the new filesystem once the format operation is completed successfully.

You may mount the new filesystem using the following command. It is assumed that the mount point (/mnt) exists already.

mount /dev/sdb /mnt

If the mount operation was successfully completed you can add the following entry to /etc/fstab for automatic mounting during the bootup process.

LABEL=ocfs-mnt /mnt ocfs2 rw,_netdev,heartbeat=local 0 0

Test the newly added fstab entry by rebooting the server. The server should mount /dev/sdb automatically on /mnt. You can verify this using the “df” command after the reboot.
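The same check can be scripted by looking at the kernel's mount table. This is a minimal sketch assuming the mount point /mnt from this post; is_ocfs2_mounted is a hypothetical helper that reads /proc/mounts unless another file is given.

```shell
#!/bin/sh
# is_ocfs2_mounted MOUNTPOINT [MOUNTS_FILE]: succeed if MOUNTPOINT is an ocfs2 mount.
# Hypothetical helper; MOUNTS_FILE defaults to /proc/mounts.
is_ocfs2_mounted() {
    # Field 2 is the mount point and field 3 the filesystem type.
    awk -v mp="$1" '$2 == mp && $3 == "ocfs2" { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Example use after the reboot:
#   is_ocfs2_mounted /mnt && echo "/mnt is mounted as ocfs2"
```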
