ORA-00700: soft internal error, arguments: [main_6a], [3], [Invalid IP addresses in cellinit.ora file]

Last weekend I successfully upgraded the Exadata software on a quarter rack from 11.2.1.2.6 to 11.2.2.4.2.
Before upgrade:

#imagehistory
Version : 11.2.1.2.6
Image activation date : 2010-07-19 15:01:17 +0300
Imaging mode : patch
Imaging status : success

After upgrade:
Active image version: 11.2.2.4.2.111221
Active image activated: 2012-04-07 12:52:12 +0300
Active image status: success
Active system partition on device: /dev/md6
Active software partition on device: /dev/md8

On the cell side, however, celld could not start successfully.
The MS and RS services started, but CELLSRV did not.

# service celld status/restart
Getting the state of RS services…
running
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful. Error: Start Failed
Meanwhile, the log file showed:

Incident 9 created, dump file: /opt/oracle/cell11.2.2.4.2_LINUX.X64_111221/log/diag/asm/cell/exa1cel03/incident/incdir_9/svtrc_9688_0_i9.trc
ORA-00700: soft internal error, arguments: [main_6a], [3], [Invalid IP addresses in cellinit.ora file], [], [], [], [], [], [], [], [], []

Each storage cell also sent this ORA-00700 error by email.

I checked the configuration on each cell:

CellCLI>ALTER CELL VALIDATE CONFIGURATION

CELL-02653: Cell configuration check encountered the following issues:
Check Exadata configuration via ipconf utility
Verifying of Exadata configuration file /opt/oracle.cellos/cell.conf
Loopback rds ping for bondib0(192.168.6.73) : FAILED
Error. Overall status of verification of Exadata configuration file: FAILED
[INFO] The ipconf check may generate a failure for temporary inability to reach NTP or DNS server. You may ignore this alert, if the NTP or DNS servers are valid and available.
[INFO] You may ignore this alert, if the NTP or DNS servers are valid and available.
[INFO] As root user run /usr/local/bin/ipconf -verify -semantic to verify consistent network configurations.

[root@exa1cel02 config]# /usr/local/bin/ipconf -verify -semantic
Verifying of Exadata configuration file /opt/oracle.cellos/cell.conf
Loopback rds ping for bondib0(192.168.6.73) : FAILED
Error. Overall status of verification of Exadata configuration file: FAILED

Something was wrong with the cell configuration.
I compared the cell IP addresses in cellinit.ora with the output of ifconfig -a.
The IP addresses were consistent.
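
The comparison above can be scripted when there are many cells to check. The following is only a sketch: it assumes cellinit.ora lists its addresses as `ipaddressN=a.b.c.d/prefix` lines and that the ifconfig output uses the `inet addr:` (or newer `inet `) form. The function names and the sample inputs are made up for illustration; only the address 192.168.6.73 and the interface name bondib0 come from the output shown earlier.

```python
import re

def cellinit_ips(cellinit_text):
    """Collect the addresses from ipaddressN=... lines, dropping any /prefix."""
    ips = set()
    for line in cellinit_text.splitlines():
        m = re.match(r"\s*ipaddress\d+\s*=\s*(\d+\.\d+\.\d+\.\d+)", line)
        if m:
            ips.add(m.group(1))
    return ips

def interface_ips(ifconfig_text):
    """Collect IPv4 addresses from ifconfig output ('inet addr:x' or 'inet x')."""
    return set(re.findall(r"inet (?:addr:)?(\d+\.\d+\.\d+\.\d+)", ifconfig_text))

def unconfigured_ips(cellinit_text, ifconfig_text):
    """Return addresses named in cellinit.ora that no interface carries."""
    return sorted(cellinit_ips(cellinit_text) - interface_ips(ifconfig_text))

# Illustrative inputs (assumed file and output format, not copied from a real cell).
SAMPLE_CELLINIT = "ipaddress1=192.168.6.73/24\n"
SAMPLE_IFCONFIG = (
    "bondib0   Link encap:InfiniBand\n"
    "          inet addr:192.168.6.73  Mask:255.255.255.0\n"
)

print(unconfigured_ips(SAMPLE_CELLINIT, SAMPLE_IFCONFIG))
```

An empty list means every address in cellinit.ora is present on some interface, which was the situation here: the addresses matched even though CELLSRV complained about them.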

I suspected that the InfiniBand (IB) switches were not working properly, so I verified the IB switch topology:

./verify-topology -t quarterrack

[ DB Machine Infiniband Cabling Topology Verification Tool ]
[Version 11.2.2.4.2 ]

--------------- Quarter Rack Exadata V2 Cabling Check---------

Check if all hosts have 2 CAs to different switches..................[SUCCESS]
Leaf switch check: cardinality and even distribution.................[SUCCESS]
Check if each rack has an valid internal ring........................[SUCCESS]

Everything looked fine.

So where was the problem? Why did the cells raise the "Invalid IP addresses" error?

Before the upgrade, the IB switch software version was:

# nm2version
NM2-36p version: 1.0.1-1
Build time: Sep 14 2009 12:52:51
ComExpress info:
Manufacturing Date: 2009.05.05
Serial Number: “NCD3X0178”
Hardware Revision: 0x0006
Firmware Revision: 0x0102

However, firmware 1.0.1 cannot be upgraded directly to 1.3.3:
"Users of FW version 1.0.1 will need to upgrade to 1.1.3 or 1.1.4 before upgrading to 1.3.3."
So I first upgraded to 1.1.3 and then to 1.3.3, both successfully.
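
The two-step rule is easy to overlook, so here is a small sketch that reads the version out of the nm2version output and lists the upgrade steps. It encodes only the single rule quoted above (1.0.1 must pass through 1.1.3 or 1.1.4 on the way to 1.3.3). The function names are made up, and any other version pair is simply treated as a direct upgrade, which is an assumption rather than something the firmware notes confirm.

```python
import re

def firmware_version(nm2version_output):
    """Extract the dotted version from a line like 'NM2-36p version: 1.0.1-1'."""
    m = re.search(r"version:\s*(\d+(?:\.\d+)*)", nm2version_output)
    return m.group(1) if m else None

def upgrade_steps(current, target="1.3.3", intermediate="1.1.3"):
    """List the versions to install, in order, to get from current to target."""
    if current == target:
        return []
    # The only rule taken from the post: 1.0.1 needs an intermediate step.
    if current == "1.0.1" and target == "1.3.3":
        return [intermediate, target]
    # Assumption: every other combination is a direct upgrade.
    return [target]

SAMPLE_NM2VERSION = "NM2-36p version: 1.0.1-1\nBuild time: Sep 14 2009 12:52:51\n"

print(upgrade_steps(firmware_version(SAMPLE_NM2VERSION)))
```

For the switch in this post the result is the path I actually followed: 1.1.3 first, then 1.3.3.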

Then I checked which switch was the master Subnet Manager (SM):

[root@exa1sw-ib2 ~]# getmaster
Local SM not enabled
20120407 20:41:52 No Master SubnetManager seen in the system

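This was the problem: no master Subnet Manager was running on the InfiniBand fabric, which would explain why the loopback rds ping on bondib0 failed.

So I reconfigured the SM on the two IB switches:
[root@exa1sw-ib2 ~]# setsmpriority 0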

Current SM settings:
smpriority 0
controlled_handover TRUE
subnet_prefix 0xfe80000000000000
[root@exa1sw-ib2 ~]#
[root@exa1sw-ib2 ~]# disablesm
Stopping partitiond daemon.
/usr/local/util/partitiond is already stopped
Stopping IB Subnet Manager.[FAILED]
[root@exa1sw-ib2 ~]# enableasm
-bash: enableasm: command not found
[root@exa1sw-ib2 ~]# enablesm
Starting IB Subnet Manager.[ OK ]
Starting partitiond daemon.[ OK ]
[root@exa1sw-ib2 ~]# getmaster
Local SM enabled and running
20120407 21:44:19 Master SubnetManager on sm lid 0 sm guid 0x2128469ea1a0a0 :
[root@exa1sw-ib2 ~]#

This solved the problem.

CellCLI> alter cell validate configuration;
Cell exa1cel02 successfully altered

CellCLI> alter cell validate configuration;
Cell exa1cel01 successfully altered
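
Since the cell-side error gave no hint about the Subnet Manager, it may be worth checking the getmaster output on the switches right after a firmware upgrade. Below is a minimal sketch that classifies the two getmaster outputs shown in this post. The function name and the returned dictionary keys are my own, and the parsing relies only on the exact wording seen above, so other firmware versions may print something different.

```python
import re

def master_sm_status(getmaster_output):
    """Summarize getmaster output: is the local SM on, and is a master visible?"""
    master = re.search(
        r"Master SubnetManager on sm lid (\d+) sm guid (0x[0-9a-fA-F]+)",
        getmaster_output,
    )
    return {
        # 'Local SM not enabled' does not contain this substring.
        "local_sm_enabled": "Local SM enabled" in getmaster_output,
        "master_seen": master is not None,
        "master_guid": master.group(2) if master else None,
    }

# The two outputs from this post, before and after enablesm.
BEFORE = (
    "Local SM not enabled\n"
    "20120407 20:41:52 No Master SubnetManager seen in the system\n"
)
AFTER = (
    "Local SM enabled and running\n"
    "20120407 21:44:19 Master SubnetManager on sm lid 0 sm guid 0x2128469ea1a0a0 :\n"
)

for label, text in (("before", BEFORE), ("after", AFTER)):
    print(label, master_sm_status(text))
```

If `master_seen` is false on every switch, the fabric has no master SM, and the cell validation is likely to fail the way it did here.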

Unfortunately, this turned out to be a bug.
New Bug 13937466 – ORA-00700: SOFT INTERNAL ERROR, ARGUMENTS: [MAIN_6A], [3], [INVALID IP ADDRESSES has been created for the issue.

Doc ID 1341062.1 also helped me.

ugurcan
