Thomas Vogt’s IT Blog

knowledge is power …

Grid Control Error: Agent unreachable

Environment:

Operating System: Enterprise Linux 4 U5 (RHEL4 U5)

Oracle: 10.2.0.2

Problem:

Communication from the Oracle Management Service host to the Agent host failed. The agent crashes after a few hours of working correctly.

$ cd /u01/app/oracle/product/10.2.0/agent10g/xen1.pool/sysman/log/
$ more emagent.trc

2007-11-15 06:06:56 Thread-4112513952 ERROR util.files: ERROR: nmeufos_new: failed in lfiopn on file: /u01/app/oracle/product/10.2.0/agent10g/xen1.pool/sysman/emd/agntstmp.txt.error = 24 (Too many open files)
2007-11-15 06:06:56 Thread-4112513952 ERROR pingManager: Error in updating the agent time stamp file
2007-11-15 06:06:59 Thread-4112513952 ERROR fetchlets.healthCheck: GIM-00104: file not found
LEM-00031: file not found; arguments: [lempgmh] [lmserr]
LEM-00033: file not found; arguments: [lempgfm] [Couldn't open message file]
LEM-00031: file not found; arguments: [lempgmh] [lmserr]
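
The trace file can be searched for this error directly. The following is a minimal sketch that assumes the trace path shown above; the function name count_emfile_errors is hypothetical and not part of any Oracle tooling.

```shell
# Hypothetical helper: count trace lines reporting "Too many open files".
count_emfile_errors() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when that number is 0, so the status is masked with "|| true".
    grep -c 'Too many open files' "$1" || true
}

# Path taken from the listing above; adjust for other agent homes.
TRC=/u01/app/oracle/product/10.2.0/agent10g/xen1.pool/sysman/log/emagent.trc
if [ -r "$TRC" ]; then
    echo "errno 24 occurrences: $(count_emfile_errors "$TRC")"
fi
```

A count that keeps growing between agent restarts suggests the descriptor limit is being hit repeatedly rather than once.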

Solution:

The important message is "Too many open files" (error 24). The agent process has reached its limit on open file descriptors; the default soft limit per process on Linux is 1024.
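
To see how close the agent is to that limit, the descriptors it currently holds can be counted through /proc. This is only a sketch: the helper name count_open_fds is hypothetical, the process name emagent is an assumption about how the agent appears in the process list, and reading another user's /proc entries requires running as that user or as root.

```shell
# Count the file descriptors a process currently holds open (Linux /proc).
count_open_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Assumption: the agent process is named "emagent".
for pid in $(pgrep -x emagent); do
    echo "emagent pid $pid: $(count_open_fds "$pid") open files"
done

# Soft limit of the shell running this check, for comparison.
echo "soft limit for this shell: $(ulimit -n)"
```

If the count for the agent stays near 1024 shortly before it stops responding, the limit is the likely cause.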

Raise the soft limit on open file descriptors for the oracle user from the default of 1024 to 4096.

# vim /etc/security/limits.conf

#oracle              soft    nofile  1024
oracle              soft    nofile  4096
oracle              hard    nofile  65536

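The new limits are applied by pam_limits at login, so log in again as oracle and restart the agent from that session. Running sysctl -p is not needed here: it only reloads /etc/sysctl.conf and does not read limits.conf.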
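
A quick way to confirm the change is to print the soft and hard limits from a fresh login shell of the oracle user. The sketch below uses a hypothetical helper name, show_nofile_limits; with the limits.conf entries above, the expected values are 4096 and 65536.

```shell
# Print the soft and hard open-file limits of the current shell.
show_nofile_limits() {
    echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
}

show_nofile_limits
```

If the old value of 1024 still appears, the session probably predates the change, or pam_limits is not enabled for that login path. An agent that is already running keeps the limits it was started with, so it has to be restarted from the new session (for example with emctl stop agent followed by emctl start agent).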

Oracle also provides some information about this problem:

http://download-west.oracle.com/docs/cd/B16240_01/doc/relnotes.102/b31594/toc.htm

November 27, 2007 - Posted by thomasvogt | Linux, Oracle, RAC