Linux SAN Multipathing
There are a lot of SAN multipathing solutions for Linux at the moment. Two of them are discussed in this blog. The first one is Device Mapper Multipathing, a failover and load balancing solution with a lot of configuration options. The second one (mdadm multipathing) is just a failover solution that requires manually re-enabling a failed path. The advantage of mdadm multipathing is that it is very easy to configure.
Before using a multipathing solution in a production environment on Linux it is also important to determine whether the chosen solution is supported with the hardware in use. For example, HP does not support the Device Mapper Multipathing solution on their servers yet.
Device Mapper Multipathing
Procedure for configuring the system with DM-Multipath:
- Install device-mapper-multipath rpm
- Edit the multipath.conf configuration file:
- comment out the default blacklist
- change any of the existing defaults as needed
- Start the multipath daemons
- Create the multipath devices with the multipath command
Install Device Mapper Multipath
# rpm -ivh device-mapper-multipath-0.4.7-8.el5.i386.rpm
warning: device-mapper-multipath-0.4.7-8.el5.i386.rpm: Header V3 DSA signature:
Preparing...                ########################################### [100%]
   1:device-mapper-multipath########################################### [100%]
Initial Configuration
Set user_friendly_names. The devices will be created as /dev/mapper/mpath[n]. Comment out the default blacklist.
# vim /etc/multipath.conf

#blacklist {
#        devnode "*"
#}

defaults {
        user_friendly_names yes
        path_grouping_policy multibus
}
Load the needed module and start the service.
# modprobe dm-multipath
# /etc/init.d/multipathd start
# chkconfig multipathd on
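To verify that the kernel module is loaded and that the daemon is configured to start at boot, a quick check (not part of the original steps) can be run:

# lsmod | grep dm_multipath
# chkconfig --list multipathd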
Print out the multipathed device.
# multipath -v2
or
# multipath -v3
Configuration
Configure device type in config file.
# cat /sys/block/sda/device/vendor
HP
# cat /sys/block/sda/device/model
HSV200

# vim /etc/multipath.conf

devices {
        device {
                vendor "HP"
                product "HSV200"
                path_grouping_policy multibus
                no_path_retry "5"
        }
}
Configure multipath device in config file.
# cat /var/lib/multipath/bindings
# Format:
# alias wwid
#
mpath0 3600508b400070aac0000900000080000

# vim /etc/multipath.conf

multipaths {
        multipath {
                wwid 3600508b400070aac0000900000080000
                alias mpath0
                path_grouping_policy multibus
                path_checker readsector0
                path_selector "round-robin 0"
                failback "5"
                rr_weight priorities
                no_path_retry "5"
        }
}
Add devices that should not be multipathed to the blacklist (e.g. local RAID devices, volume groups).
# vim /etc/multipath.conf

devnode_blacklist {
        devnode "^cciss!c[0-9]d[0-9]*"
        devnode "^vg*"
}
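After changing the blacklist, the existing maps can be flushed and rebuilt so the new configuration takes effect. A minimal sketch:

# multipath -F
# multipath -v2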
Show Configured Multipaths.
# dmsetup ls --target=multipath
mpath0  (253, 1)

# multipath -ll
mpath0 (3600508b400070aac0000900000080000) dm-1 HP,HSV200
[size=10G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=4][active]
 \_ 0:0:0:1 sda 8:0  [active][ready]
 \_ 0:0:1:1 sdb 8:16 [active][ready]
 \_ 1:0:0:1 sdc 8:32 [active][ready]
 \_ 1:0:1:1 sdd 8:48 [active][ready]
Format and mount Device
fdisk cannot be used directly on /dev/mapper/[dev_name] devices. Partition the underlying disk with fdisk instead and then run kpartx against the multipath map, which creates a /dev/mapper/mpath[n] device for each partition.
# fdisk /dev/sda
# kpartx -a /dev/mapper/mpath0
# ls /dev/mapper/*
mpath0  mpath0p1
# mkfs.ext3 /dev/mapper/mpath0p1
# mount /dev/mapper/mpath0p1 /mnt/san
After that /dev/mapper/mpath0p1 is the first partition on the multipathed device.
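To mount the filesystem persistently across reboots, an /etc/fstab entry can be added. A minimal sketch, assuming the mount point /mnt/san used above:

# vim /etc/fstab
/dev/mapper/mpath0p1   /mnt/san   ext3   defaults   1 2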
Multipathing with mdadm on Linux
The md multipathing solution is a failover-only solution, which means that only one path is used at a time and no load balancing is done.
Start the MD Multipathing Service
# chkconfig mdmpd on
# /etc/init.d/mdmpd start
On the first Node (if it is a shared device)
Make Label on Disk
# fdisk /dev/sdt

Disk /dev/sdt: 42.9 GB, 42949672960 bytes
64 heads, 32 sectors/track, 40960 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdt1               1       40960    41943024   fd  Linux raid autodetect

# partprobe
Bind multiple paths together
# mdadm --create /dev/md4 --level=multipath --raid-devices=4 /dev/sdq1 /dev/sdr1 /dev/sds1 /dev/sdt1
Get UUID
# mdadm --detail /dev/md4
UUID : b13031b5:64c5868f:1e68b273:cb36724e
Set md configuration in config file
# vim /etc/mdadm.conf

# Multiple Paths to RAC SAN
DEVICE /dev/sd[qrst]1
ARRAY /dev/md4 uuid=b13031b5:64c5868f:1e68b273:cb36724e

# cat /proc/mdstat
On the second Node (Copy the /etc/mdadm.conf from the first node)
# mdadm -As
# cat /proc/mdstat
Restore a failed path
# mdadm /dev/md1 -f /dev/sdt1 -r /dev/sdt1 -a /dev/sdt1
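The combined command above marks the path as faulty (-f), removes it from the md device (-r) and hot-adds it again (-a). The steps can also be run one at a time and verified in between:

# mdadm /dev/md1 -f /dev/sdt1
# mdadm /dev/md1 -r /dev/sdt1
# mdadm /dev/md1 -a /dev/sdt1
# cat /proc/mdstat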
ASM Disk not shown in Oracle Universal Installer (OUI) or DBCA
Terms:
Operating System: Enterprise Linux 4 U5 (RHEL4 U5)
Oracle: 10.2.0.1
Problem:
While installing the ASM instance with the Oracle Universal Installer, the ASM disk created with oracleasm createdisk is not shown.
Solution:
Define the scan order in the /etc/sysconfig/oracleasm config file. For example, if the multipathing device in use is /dev/md1, you have to force ASMLib to scan the /dev/md* paths before the /dev/sd* paths.
# vim /etc/sysconfig/oracleasm

# ORACLEASM_SCANORDER: Matching patterns to order disk scanning
ORACLEASM_SCANORDER="md sd"
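After changing the scan order, ASMLib can rescan the disks to check whether the disk created with oracleasm createdisk is now visible (listdisks should print the disk label created earlier):

# /etc/init.d/oracleasm scandisks
# /etc/init.d/oracleasm listdisks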
Also make sure that the packages needed for using ASM with ASMLib are installed; a quick check follows the list below.
- oracleasmlib-2.0 – the ASM libraries
- oracleasm-support-2.0 – utilities needed to administer ASMLib
- oracleasm – a kernel module for the ASM library
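A quick check that all three packages are present (the exact versions will differ per system):

# rpm -qa | grep oracleasm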
More Infos:
Metalink Note:394956.1
Grid Control Error: Agent unreachable
Terms:
Operating System: Enterprise Linux 4 U5 (RHEL4 U5)
Oracle: 10.2.0.2
Problem:
Communication from the Oracle Management Service host to the Agent host failed. The agent crashes after a few hours of correct operation.
$ cd /u01/app/oracle/product/10.2.0/agent10g/xen1.pool/sysman/log/
$ more emagent.trc
2007-11-15 06:06:56 Thread-4112513952 ERROR util.files: ERROR: nmeufos_new: failed in lfiopn on file: /u01/app/oracle/product/10.2.0/agent10g/xen1.pool/sysman/emd/agntstmp.txt.error = 24 (Too many open files)
2007-11-15 06:06:56 Thread-4112513952 ERROR pingManager: Error in updating the agent time stamp file
2007-11-15 06:06:59 Thread-4112513952 ERROR fetchlets.healthCheck: GIM-00104: file not found
LEM-00031: file not found; arguments: [lempgmh] [lmserr]
LEM-00033: file not found; arguments: [lempgfm] [Couldn't open message file]
LEM-00031: file not found; arguments: [lempgmh] [lmserr]
Solution:
The important part is the message about too many open files. It is caused by the operating system running into its open file handle limit (the default on Linux is 1024).
You need to increase the file descriptor soft limit per shell from the default of 1024 to 4096.
# vim /etc/security/limits.conf

#oracle soft nofile 1024
oracle soft nofile 4096
oracle hard nofile 65536
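The new limits take effect with the next login of the oracle user, where the active soft limit can be verified; a quick sanity check:

$ ulimit -n
4096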
There is also some information about this problem from Oracle:
http://download-west.oracle.com/docs/cd/B16240_01/doc/relnotes.102/b31594/toc.htm
Oracle Clusterware Installation – Timed out waiting for the CRS stack to start
Terms:
Operating System: Enterprise Linux 4 U5 (RHEL4 U5)
Oracle: 10.2.0.1
Clusterware: 10.2.0.1
Cluster Interconnect: Ethernet (private connection)
Problem:
While running the root.sh script on the last cluster node during the Clusterware installation, the following error message occurs: "Timed out waiting for the CRS stack to start".
Clean remove Oracle Clusterware (CRS) 10GR2 from a RHEL4
The problem with Oracle Clusterware (also known as Cluster Ready Services, CRS) is that there is no built-in mechanism from Oracle to cleanly remove the Clusterware and all of its files distributed over the OS filesystem. The following example script removes the Oracle Clusterware completely. The operating system is RHEL4 U5.
This script has to be edited for personal use. In this example $ORA_CRS_HOME is located under /opt/oracle.
########### script ###############
#!/bin/bash
echo
echo "Remove the Oracle Clusterware Service ?"
echo
echo "Enter y[yes] or n[no] to exit"
read comit
if [ "$comit" == "n" ]; then
   echo "Exit from Script without any change..."
   exit 1
else
   echo "Start to Shutdown and Remove Oracle Clusterware ..."
   echo
   /etc/init.d/init.evmd stop
   /etc/init.d/init.evmd disable
   /etc/init.d/init.cssd stop
   /etc/init.d/init.cssd disable
   /etc/init.d/init.crsd stop
   /etc/init.d/init.crsd disable
   /etc/init.d/init.crs stop
   /etc/init.d/init.crs disable
   rm -rf /etc/oracle /etc/oraInst.loc /etc/oratab
   rm -rf /etc/init.d/init.crsd /etc/init.d/init.crs /etc/init.d/init.cssd /etc/init.d/init.evmd
   rm -rf /etc/rc2.d/K96init.crs /etc/rc2.d/S96init.crs /etc/rc3.d/K96init.crs \
          /etc/rc3.d/S96init.crs /etc/rc4.d/K96init.crs /etc/rc4.d/S96init.crs \
          /etc/rc5.d/K96init.crs /etc/rc5.d/S96init.crs /etc/rc.d/rc0.d/K96init.crs \
          /etc/rc.d/rc1.d/K96init.crs /etc/rc.d/rc6.d/K96init.crs /etc/rc.d/rc4.d/K96init.crs
   cp /etc/inittab.orig /etc/inittab
   rm -rf /etc/inittab.crs /etc/inittab.no_crs
   rm -rf /tmp/*
   rm -rf /tmp/.oracle
   rm -rf /usr/local/bin/dbhome /usr/local/bin/oraenv /usr/local/bin/coraenv
   rm -rf /var/tmp/.oracle
   rm -rf /opt/oracle/*
   echo
   echo "Remove on one Node the Shared Devices"
   echo "rm -rf /u03/oracrs/*"
   echo
fi
########### end script ###############
After running that script on a system it should be possible to reinstall Oracle Clusterware without any problems.
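A simple sanity check after running the script (not part of the original procedure) is to confirm that no Clusterware daemons are left running and that no CRS entries remain in /etc/inittab; both commands should return nothing:

# ps -ef | grep -E 'crsd|cssd|evmd' | grep -v grep
# grep -i crs /etc/inittab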