Saturday, January 5, 2019

ORA-29740: evicted by instance number | Linux rp_filter

Issue encountered on Red Hat 7.5 with Oracle 12.2 RAC.
Red Hat Enterprise Linux Server release 7.5 (Maipo)

After installing the software and creating the Oracle database, only one instance could be running at any given time.

Red Hat now ships with stricter default protection against IP spoofing, a technique used in distributed denial-of-service (DDoS) attacks. This protection (the rp_filter setting) was blocking KSXPPING (the LMS cross-instance ping activity).


The following errors appeared in the alert log:

KSXPPING: KSXP selected for Ping
2019-01-03T17:58:11.401836+01:00
Errors in file /u02/base/diag/rdbms/ikeja/ikeja1/trace/ikeja1_lmon_365438.trc  (incident=57777) (PDBNAME=CDB$ROOT):
ORA-29740: evicted by instance number 2, group incarnation 12
Incident details in: /u02/base/diag/rdbms/ikeja/ikeja1/incident/incdir_57777/ikeja1_lmon_365438_i57777.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2019-01-03T17:58:11.955366+01:00
Errors in file /u02/base/diag/rdbms/ikeja/ikeja1/trace/ikeja1_lmon_365438.trc:
ORA-29740: evicted by instance number 2, group incarnation 12
Errors in file /u02/base/diag/rdbms/ikeja/ikeja1/trace/ikeja1_lmon_365438.trc  (incident=57778) (PDBNAME=CDB$ROOT):
ORA-29740 [] [] [] [] [] [] [] [] [] [] [] []
Incident details in: /u02/base/diag/rdbms/ikeja/ikeja1/incident/incdir_57778/ikeja1_lmon_365438_i57778.trc
2019-01-03T17:58:12.007832+01:00
IPC Send timeout to 2.3 inc 10 for msg type 65522 from opid 26
2019-01-03T17:58:12.011091+01:00
IPC Send timeout to 2.4 inc 10 for msg type 65522 from opid 27
2019-01-03T17:58:12.026169+01:00
IPC Send timeout to 2.6 inc 10 for msg type 65522 from opid 29
2019-01-03T17:58:12.031554+01:00
IPC Send timeout to 2.2 inc 10 for msg type 65522 from opid 25
2019-01-03T17:58:12.178083+01:00
LCK0 (ospid: 366513): terminating the instance due to error 481
2019-01-03T17:58:23.195079+01:00
Instance terminated by LCK0, pid = 366513

ikea1[](/u02/base/diag/rdbms/ikeja/ikeja1/trace)$ oerr ora 29740
29740, 00000, "evicted by instance number %s, group incarnation %s"
// *Cause: This instance was evicted from the group by another instance of the
//         cluster database group for one of several reasons, which may
//         include a communications error in the cluster and failure to issue
//         a heartbeat to the control file.
// *Action: Check the trace files of other active instances in the cluster
//          group for indications of errors that caused a reconfiguration.
ikea1[](/u02/base/diag/rdbms/ikeja/ikeja1/trace)$


Solution: reduce the level of security by setting rp_filter=2 (loose mode) on the HAIP interfaces.
Current values are:
$ sysctl -a|grep rp_filter|grep 'haip'
..
net.ipv4.conf.eth-haip_70.arp_filter = 0
net.ipv4.conf.eth-haip_70.rp_filter = 1
net.ipv4.conf.eth-haip_71.arp_filter = 0
net.ipv4.conf.eth-haip_71.rp_filter = 1

Change the value online with the -w option of sysctl:
sudo sysctl -w net.ipv4.conf.eth-haip_70.rp_filter=2
sudo sysctl -w net.ipv4.conf.eth-haip_71.rp_filter=2
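
To confirm which mode is actually in force, it helps to know that the kernel applies the larger of the conf/all value and the per-interface value when validating a packet's source. The following is a minimal sketch of that check, not an official Oracle or Red Hat procedure. The function name effective_rp_filter is made up for illustration, and the interface names are the ones from the sysctl output in this post.

```shell
# Hypothetical helper: the kernel uses the larger of conf/all/rp_filter
# and conf/<interface>/rp_filter, so report that maximum.
effective_rp_filter() {
    all_value=$1   # value of net.ipv4.conf.all.rp_filter
    if_value=$2    # value of net.ipv4.conf.<interface>.rp_filter
    if [ "$all_value" -gt "$if_value" ]; then
        echo "$all_value"
    else
        echo "$if_value"
    fi
}

# Interface names taken from this post; adjust them for your own cluster.
for ifname in eth-haip_70 eth-haip_71; do
    if_file=/proc/sys/net/ipv4/conf/$ifname/rp_filter
    if [ -r "$if_file" ]; then
        all_value=$(cat /proc/sys/net/ipv4/conf/all/rp_filter)
        if_value=$(cat "$if_file")
        echo "$ifname: effective rp_filter = $(effective_rp_filter "$all_value" "$if_value")"
    else
        echo "$ifname: interface not present on this host"
    fi
done
```

A result of 2 means loose mode is in effect on that interface.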

To make the change permanent, set it using tuned, a file under /etc/sysctl.d, or the /etc/sysctl.conf file.
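
As one way to do this, here is a minimal sketch of the /etc/sysctl.d approach. The helper name rp_filter_conf and the file name 98-oracle-rp_filter.conf are assumptions chosen for illustration, and the interface names are the ones from this post. The block only prints the settings; the install commands are left commented out so nothing is changed until you review them.

```shell
# Hypothetical helper: print one "rp_filter = 2" line per interface name.
rp_filter_conf() {
    for ifname in "$@"; do
        printf 'net.ipv4.conf.%s.rp_filter = 2\n' "$ifname"
    done
}

# Preview the settings for the two HAIP interfaces from this post.
rp_filter_conf eth-haip_70 eth-haip_71

# To install (file name is an assumption), uncomment the lines below:
# rp_filter_conf eth-haip_70 eth-haip_71 | sudo tee /etc/sysctl.d/98-oracle-rp_filter.conf
# sudo sysctl --system   # reload all sysctl configuration files
```

After installing the file, rerun the sysctl check above to confirm that both interfaces report rp_filter = 2.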

HTH