Oracle 12c RAC: Solving ‘Grid Infrastructure Management Repository connection error’

While trying to query the management repository path using the OCLUMON utility, I received the following error.

---//
---// error while accessing management repository //---
---//
myracserver1 {/home/oracle}: $GRID_HOME/bin/oclumon manage -get reppath
Connection Error. Could not get RepPath.
myracserver1 {/home/oracle}:

Issuing the dumpnodeview command generated the following errors.

---//
---// error while running dumpnodeview //---
---//
myracserver1 {/home/oracle}: $GRID_HOME/bin/oclumon dumpnodeview -allnodes
CRS-9118-Grid Infrastructure Management Repository connection error
 ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

From the error, it is evident that the OCLUMON utility is not able to establish a connection with the management repository database. However, the error messages do not contain enough information to draw any conclusion, so we need to find more details about this error.

We know that the Cluster Logger Service (ologgerd) is the component that communicates with the management repository, which means we can find additional details about this error in the ologgerd trace file. To find that trace file, we first need to identify the node on which the ologgerd service is running (we can have one ologgerd service per 32 cluster nodes). We can do that using the OCLUMON utility as shown below.

---//
---// locating node hosting the logger service //---
---//
myracserver1 {/home/oracle}: $GRID_HOME/bin/oclumon manage -get mylogger

Logger = myracserver2

The ologgerd service is running on node myracserver2, so we need to log in to that node to view the trace logs related to the connection error. Let’s log in to the cluster node myracserver2.
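The lookup of the logger node can also be scripted. The following is a minimal sketch, assuming the output keeps the `Logger = <node>` format shown above; the function name `parse_logger_node` is hypothetical and not part of any Oracle tool.

```shell
#!/bin/sh
# parse_logger_node: read the output of `oclumon manage -get mylogger`
# on stdin and print the node name from the "Logger = <node>" line.
# Assumption: the output format matches the one shown in this post.
parse_logger_node() {
    awk -F'=' '/^Logger *=/ { gsub(/[ \t\r]/, "", $2); print $2; exit }'
}

# Usage on a cluster node (commented out, needs a Grid Infrastructure home):
# logger_node=$("$GRID_HOME/bin/oclumon" manage -get mylogger | parse_logger_node)
# ssh "$logger_node"
```

Reading from stdin keeps the parsing separate from the OCLUMON call, so the function can be tried against saved output first.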

---//
---// login to node hosting ologgerd service //---
---//
myracserver1 {/home/oracle}: ssh myracserver2
Last login: Tue Feb 23 03:06:33 2016 from myracserver1
myracserver2 {/home/oracle}:

---//
---// verify ologgerd service is running there //---
---//
myracserver2 {/home/oracle}: ps -ef | grep ologgerd
root      2814     1  1 Jan26 ?        08:09:35 /app/grid/12.1.0.2/bin/ologgerd -M -d /app/grid/12.1.0.2/crf/db/myracserver2
oracle   32564 32235  0 09:37 pts/0    00:00:00 grep ologgerd

With Oracle 12c, all the Clusterware logs are centralized under the $ADR_BASE/diag/crs/`hostname`/crs location, so we can find the ologgerd service trace files under this centralized log directory. Let’s find the ologgerd service trace file.

---//
---// querying adrci to find the trace home //---
---//
myracserver2 {/home/oracle}: adrci

ADRCI: Release 12.1.0.2.0 - Production on Fri Feb 26 09:39:39 2016

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

ADR base = "/app/oracle"
adrci> show homes
ADR Homes:
diag/rdbms/_mgmtdb/-MGMTDB
diag/tnslsnr/myracserver2/mgmtlsnr
diag/crs/myracserver2/crs
adrci> exit

---//
---// moving to clusterware trace directory //--- 
---//
myracserver2 {/home/oracle}: cd /app/oracle/diag/crs/myracserver2/crs/trace
myracserver2 {/app/oracle/diag/crs/myracserver2/crs/trace}:

---//
---// locating ologgerd service trace file //---
---//
myracserver2 {/app/oracle/diag/crs/myracserver2/crs/trace}: ls -lrt ologgerd.trc
-rw-rw---- 1 root oinstall 804345 Feb 26 09:36 ologgerd.trc
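The trace directory used above follows the pattern `<ADR base>/diag/crs/<host>/crs/trace`, so it can be built directly once the ADR base is known. A small sketch, assuming the directory is named after the short host name as it is on this cluster; `crs_trace_dir` is a hypothetical helper name.

```shell
#!/bin/sh
# crs_trace_dir: print the Clusterware trace directory for a given
# ADR base ($1) and host name ($2), following the layout seen above:
#   <ADR base>/diag/crs/<host>/crs/trace
crs_trace_dir() {
    printf '%s/diag/crs/%s/crs/trace\n' "$1" "$2"
}

# Usage (commented out; assumes ADR base /app/oracle as in this post):
# ls -l "$(crs_trace_dir /app/oracle "$(hostname -s)")/ologgerd.trc"
```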

We have now located the ologgerd service trace file. Let’s find the error details in this trace file.

---//
---// viewing additional error details from ologgerd.trc //---
---//
2016-02-26 09:47:36.795608 :  CRFOCI:2608355072: crfoci_conn_create: Connection string discoverd
2016-02-26 09:47:36.820389 :  CRFOCI:2608355072: crfoci_conn_create: Trying to connect to (DESCRIPTION = (ADDRESS = (PROTOCOL = tcp)(HOST = 192.168.230.10)(PORT = 1521))(CONNECT_DATA = (SERVICE_NAME = my_rac_cluster))).
2016-02-26 09:51:10.616175 :  CRFOCI:2608355072: crfoci_conn_create: Connection string discoverd
2016-02-26 09:51:10.659797 :  CRFOCI:2608355072: crfoci_conn_create: Trying to connect to (DESCRIPTION = (ADDRESS = (PROTOCOL = tcp)(HOST = 192.168.230.10)(PORT = 1521))(CONNECT_DATA = (SERVICE_NAME = my_rac_cluster))).
2016-02-26 09:51:10.807416 :  CRFOCI:2608355072: crfoci_checkerr: Error - OCI_ERROR ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

Now we have more details about the connection error. It looks like the ologgerd service is not able to establish a database connection to the management repository because the listener is not aware of the service name (my_rac_cluster) used to establish the connection.
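In a large trace file, the service name that ologgerd is trying to reach can be pulled out of the logged connect descriptors instead of being read by eye. A minimal sketch, assuming the descriptors are logged in the `(SERVICE_NAME = ...)` form shown above; `extract_service_names` is a hypothetical helper name.

```shell
#!/bin/sh
# extract_service_names: read trace text on stdin and print each distinct
# SERVICE_NAME value found in the logged connect descriptors.
# Assumption: descriptors look like "(SERVICE_NAME = my_rac_cluster)".
extract_service_names() {
    sed -n 's/.*SERVICE_NAME *= *\([^)]*\)).*/\1/p' | sort -u
}

# Usage (commented out; run from the Clusterware trace directory):
# extract_service_names < ologgerd.trc
```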

Oracle uses the management repository pluggable database name (named after the cluster name) as the service name for establishing the repository database connection. This can be verified by viewing the repository database configuration as shown below.

---//
---// checking repository database configuration //---
---//
myracserver2 {/home/oracle}: srvctl config mgmtdb
Database unique name: _mgmtdb
Database name:
Oracle home: 
Oracle user: oracle
Spfile: /data/clusterfiles/_mgmtdb/spfile-MGMTDB.ora
Password file:
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Type: Management
PDB name: my_rac_cluster		---> repository pluggable database name
PDB service: my_rac_cluster		---> service_name used for database connection
Cluster name: my-rac-cluster
Database instance: -MGMTDB

We need to make sure that the PDB service (my_rac_cluster) is up and running and is registered with the repository listener (MGMTLSNR). Let’s check whether the service is registered with the management listener.

---//
---// check if PDB service is registered with MGMTLSNR //---
---//
myracserver1 {/home/oracle}: lsnrctl status MGMTLSNR | grep -i MY_RAC_CLUSTER
myracserver1 {/home/oracle}: 

It looks like the PDB service is not registered with the management listener MGMTLSNR, which is why the ologgerd service was not able to establish a connection with the management repository database. We can also query the management repository database to check whether the service exists in the database, as shown below.

---//
---// checking repository database for PDB service existence //---
---//
SQL> show con_name

CON_NAME
------------------------------
CDB$ROOT
SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 MY_RAC_CLUSTER                 READ WRITE NO

---//		 
---// switch to repository database //---		 
---//
SQL> alter session set container=MY_RAC_CLUSTER;

Session altered.

---//
---// check if the PDB service exists //---
---//
SQL> select name from dba_services;

no rows selected

It looks like the PDB service doesn’t exist in the repository database. Let’s create the missing PDB service in the repository database as shown below.

Note: In general, the PDB service is created by default when we install the Clusterware stack. However, if we manually relocate the repository to its own dedicated location, the PDB service may not be created automatically.

---//
---// make sure we are logged in to the repository pluggable database //---
---//
SQL> show con_name

CON_NAME
------------------------------
MY_RAC_CLUSTER

---//
---// create the missing PDB service //---
---//
SQL> exec dbms_service.create_service('my_rac_cluster','my_rac_cluster');

PL/SQL procedure successfully completed.

SQL> select name from dba_services;

NAME
----------------------------------------------------------------
my_rac_cluster

---//
---// start the PDB service //---
---//
SQL> exec dbms_service.start_service('my_rac_cluster');

PL/SQL procedure successfully completed.

We have created and started the missing PDB service. Let’s validate if the service is registered with management listener MGMTLSNR.

---//
---// check if PDB service is now registered with MGMTLSNR //---
---//
myracserver1 {/home/oracle}: lsnrctl status MGMTLSNR | grep -i MY_RAC_CLUSTER
Service "my_rac_cluster" has 1 instance(s).
myracserver1 {/home/oracle}:
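
The listener check run before and after the fix can be wrapped in a small function that returns a proper exit status, which is handy in scripts. A sketch, assuming the `Service "<name>" has N instance(s).` line format shown above; `service_registered` is a hypothetical helper name.

```shell
#!/bin/sh
# service_registered: read `lsnrctl status <listener>` output on stdin and
# succeed (exit status 0) if the service named in $1 is listed.
# The match is case-insensitive and treats the name as a fixed string.
service_registered() {
    grep -qiF "Service \"$1\" has "
}

# Usage (commented out; needs a running MGMTLSNR):
# if lsnrctl status MGMTLSNR | service_registered my_rac_cluster; then
#     echo "PDB service is registered"
# else
#     echo "PDB service is NOT registered"
# fi
```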

The PDB service is now started and registered with the management listener, so we should be able to query the repository database. Let’s check whether we can query the repository database.

---//
---// validate if repository can be queried //---
---//
myracserver1 {/home/oracle}: $GRID_HOME/bin/oclumon manage -get reppath
Connection Error. Could not get RepPath.

We are still receiving the same error. Let’s again check the ologgerd service trace file to find out more details about this error.

---//
---// viewing additional error details from ologgerd.trc //---
---//
myracserver2 {/app/oracle/diag/crs/myracserver2/crs/trace}:
2016-02-26 10:36:35.375007 :  CRFOCI:2608355072: crfoci_conn_create: OCISessionGetfailed
2016-02-26 10:36:35.440982 : default:2608355072: crflogdbora_getdbconn:crfoci_conn_create failed
2016-02-26 10:36:35.440991 : default:2608355072: crflogdbora_getdbconn: errbuf ORA-28002: the password will expire within 7 days

2016-02-26 10:36:35.440999 : CRFLOGD:2608355072: crflogdb_getRepPath: Unable to connect to managment database
2016-02-26 10:36:35.441002 : CRFLOGD:2608355072: crflogdb_getRepPath: Error :
2016-02-26 10:36:35.441008 : CRFLOGD:2608355072: crflogui_domanage: get Rep Path Failed

We have a different issue now. This one relates to the expiry of the repository user’s password. By default, the repository user’s password expires after 180 days and is managed automatically by the ologgerd service, which monitors the expiry and resets the password once it has expired. In our case, the password has not expired yet; Oracle is only warning that it will expire within 7 days. We can either wait for ologgerd to reset the password automatically or reset it manually using the mgmtca tool, as shown below.

---//
---// resetting management repository password //---
---//
myracserver1 {/home/oracle}: $GRID_HOME/bin/mgmtca

Let’s check if we are able to query the management repository now.

---//
---// querying management repository //---
---//
myracserver1 {/home/oracle}: $GRID_HOME/bin/oclumon manage -get reppath

CHM Repository Path = /data/clusterfiles/_MGMTDB/datafile/o1_mf_sysmgmtd__374325064041_.dbf
myracserver1 {/home/oracle}:

Yes, we are now able to query the repository database. The problem is resolved!

Footnote: If we receive a connection error while querying the management repository, first make sure that MGMTDB and MGMTLSNR are up and running. Then check the status of the repository PDB service (which is used to establish the connection with the repository database) and of the repository user’s password (ideally managed automatically by the Cluster Logger Service).
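
The two causes met in this post can be told apart from the ORA codes in the ologgerd trace. The following is a rough triage sketch, not an Oracle utility: `diagnose_trace` is a hypothetical name, and the hints only cover the two errors seen here.

```shell
#!/bin/sh
# diagnose_trace: read ologgerd trace text on stdin, collect the distinct
# ORA-nnnnn codes, and print a one-line hint for each.
# Only the two errors encountered in this post are mapped to a cause.
diagnose_trace() {
    grep -o 'ORA-[0-9][0-9]*' | sort -u | while read -r code; do
        case "$code" in
            ORA-12514) echo "$code: PDB service not registered with the listener" ;;
            ORA-28002) echo "$code: repository user password is about to expire" ;;
            *)         echo "$code: not covered in this post" ;;
        esac
    done
}

# Usage (commented out; run from the Clusterware trace directory):
# tail -n 200 ologgerd.trc | diagnose_trace
```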

Hope you find this information useful.
