Thomas Vogt’s IT Blog

knowledge is power …

Grid Control Error: Agent unreachable

Terms:

Operating System: Enterprise Linux 4 U5 (RHEL4 U5)

Oracle: 10.2.0.2

Problem:

Communication from the Oracle Management Service host to the Agent host fails. The Agent crashes after running correctly for a few hours.

$ cd /u01/app/oracle/product/10.2.0/agent10g/xen1.pool/sysman/log/
$ more emagent.trc

2007-11-15 06:06:56 Thread-4112513952 ERROR util.files: ERROR: nmeufos_new: failed in lfiopn on file:
/u01/app/oracle/product/10.2.0/agent10g/xen1.pool/sysman/emd/agntstmp.txt.error = 24 (Too many open files)
2007-11-15 06:06:56 Thread-4112513952 ERROR pingManager: Error in updating the agent time stamp file
2007-11-15 06:06:59 Thread-4112513952 ERROR fetchlets.healthCheck: GIM-00104: file not found
LEM-00031: file not found; arguments: [lempgmh] [lmserr]
LEM-00033: file not found; arguments: [lempgfm] [Couldn't open message file]
LEM-00031: file not found; arguments: [lempgmh] [lmserr]

Solution:

The important message is "Too many open files" (error 24). The agent process has reached its limit on open file descriptors; the default soft limit per process on Linux is 1024.
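To see how often the agent has already hit the limit, the trace file can be searched for that message. This is only a minimal sketch; the function name count_emfile is made up for this example and is not part of any Oracle tool.

```shell
#!/bin/bash
# Hypothetical helper: count the lines of a trace file that report
# "Too many open files". grep -c prints the number of matching lines.
count_emfile() {
    grep -c "Too many open files" "$1"
}

# Usage (run in the agent log directory):
#   count_emfile emagent.trc
```

A count that keeps growing between two calls suggests the agent is still running into the descriptor limit.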

You need to raise the soft limit on open file descriptors for the oracle user from the default 1024 to 4096.

# vim /etc/security/limits.conf

#oracle              soft    nofile  1024
oracle              soft    nofile  4096
oracle              hard    nofile  65536

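Note that limits.conf is read by pam_limits when a session is opened, so the new values only apply to new logins of the oracle user (sysctl -p reloads /etc/sysctl.conf and has no effect on these limits). Log in again as oracle and restart the agent.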
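To confirm that the raised limit is in effect, the limits of a fresh oracle shell can be compared with the number of descriptors a process actually holds. The sketch below assumes a Linux /proc filesystem; the function names are made up for this example, and the process name emagent used in the usage note is an assumption based on the trace file name.

```shell
#!/bin/bash
# Print the soft and hard open-file limits of the current shell.
show_nofile_limits() {
    echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
}

# Count the open file descriptors of a process via /proc (Linux only).
# Prints 0 if the process does not exist or is not readable.
count_open_fds() {
    local pid="$1"
    ls "/proc/$pid/fd" 2>/dev/null | wc -l
}

show_nofile_limits
echo "open descriptors of this shell: $(count_open_fds $$)"
```

Run as the oracle user after a new login, the first line should report soft=4096. For the agent itself, the PID could be passed in instead of $$, for example the one found with pgrep -u oracle emagent (assuming that is the process name on your system).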

Oracle also provides some information about this problem in its release notes:

http://download-west.oracle.com/docs/cd/B16240_01/doc/relnotes.102/b31594/toc.htm

November 27, 2007 Posted by | Linux, Oracle, RAC | 1 Comment

Oracle Clusterware Installation – Timed out waiting for the CRS stack to start

Terms:

Operating System: Enterprise Linux 4 U5 (RHEL4 U5)

Oracle: 10.2.0.1

Clusterware: 10.2.0.1

Cluster Interconnect: Ethernet (private connection)

Problem:

While running the root.sh script on the last cluster node during the Clusterware installation, the following error message occurs.

November 27, 2007 Posted by | Clusterware, Linux, Oracle, RAC | 1 Comment

Cleanly remove Oracle Clusterware (CRS) 10gR2 from RHEL4

The problem with Oracle Clusterware (also known as Cluster Ready Services – CRS) is that Oracle provides no built-in mechanism to cleanly remove the Clusterware and all of its files, which are distributed over the OS filesystem. The following example script removes Oracle Clusterware completely. The operating system is RHEL4 U5.

This script has to be adapted to your own environment before use. Here, $ORA_CRS_HOME is located under /opt/oracle.

########### script ###############

#!/bin/bash
# Stops Oracle Clusterware on this node and removes its files. Run as root.

echo
echo "Remove the Oracle Clusterware Service ?"
echo
echo "Enter y[yes] or n[no] to exit"
read comit

# Any answer other than "n" continues with the removal.
if [ "$comit" = "n" ]; then
    echo "Exit from Script without any change..."
    exit 1
else
    echo "Start to Shutdown and Remove Oracle Clusterware ..."
    echo

    # Stop and disable the Clusterware daemons
    /etc/init.d/init.evmd stop
    /etc/init.d/init.evmd disable
    /etc/init.d/init.cssd stop
    /etc/init.d/init.cssd disable
    /etc/init.d/init.crsd stop
    /etc/init.d/init.crsd disable
    /etc/init.d/init.crs stop
    /etc/init.d/init.crs disable

    # Remove the Oracle configuration, init scripts and runlevel links
    rm -rf /etc/oracle /etc/oraInst.loc /etc/oratab
    rm -rf /etc/init.d/init.crsd /etc/init.d/init.crs /etc/init.d/init.cssd /etc/init.d/init.evmd
    rm -rf /etc/rc2.d/K96init.crs /etc/rc2.d/S96init.crs /etc/rc3.d/K96init.crs \
        /etc/rc3.d/S96init.crs /etc/rc4.d/K96init.crs /etc/rc4.d/S96init.crs \
        /etc/rc5.d/K96init.crs /etc/rc5.d/S96init.crs /etc/rc.d/rc0.d/K96init.crs \
        /etc/rc.d/rc1.d/K96init.crs /etc/rc.d/rc6.d/K96init.crs /etc/rc.d/rc4.d/K96init.crs

    # Restore the original inittab and drop the Clusterware variants
    cp /etc/inittab.orig /etc/inittab
    rm -rf /etc/inittab.crs /etc/inittab.no_crs

    # Remove temporary files, environment helpers and the Clusterware home.
    # Note: rm -rf /tmp/* deletes all non-hidden files in /tmp, not only Oracle's.
    rm -rf /tmp/*
    rm -rf /tmp/.oracle
    rm -rf /usr/local/bin/dbhome /usr/local/bin/oraenv /usr/local/bin/coraenv
    rm -rf /var/tmp/.oracle
    rm -rf /opt/oracle/*

    # The shared devices have to be cleaned manually, on one node only
    echo
    echo "Remove on one Node the Shared Devices"
    echo "rm -rf /u03/oracrs/*"
    echo
fi

########### end script ###############

After running that script on a system it should be possible to reinstall Oracle Clusterware without any problems.
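Before reinstalling, it can be worth checking that nothing was left behind. The following is a minimal sketch, not part of the original script: the function name report_leftovers is made up for this example, and the path list is simply copied from the removal script above, so it needs the same adjustments for your environment.

```shell
#!/bin/bash
# Print every given path that still exists (including dangling symlinks).
# Returns 0 if none exists, 1 if at least one was found.
report_leftovers() {
    local found=0 p
    for p in "$@"; do
        if [ -e "$p" ] || [ -L "$p" ]; then
            echo "still present: $p"
            found=1
        fi
    done
    return $found
}

# Paths taken from the removal script above
CRS_PATHS="/etc/oracle /etc/oraInst.loc /etc/init.d/init.crs /etc/init.d/init.crsd
/etc/init.d/init.cssd /etc/init.d/init.evmd /etc/inittab.crs /etc/inittab.no_crs
/tmp/.oracle /var/tmp/.oracle"

# CRS_PATHS is left unquoted on purpose so that it splits into single paths
if report_leftovers $CRS_PATHS; then
    echo "No Clusterware leftovers found."
fi
```

Any path reported as still present should be removed on that node before the installer is started again.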

November 21, 2007 Posted by | Clusterware, Linux, Oracle, RAC | 4 Comments