I recently ran into this error after a full RAC reboot. Unlike the case I had already documented here, restarting each RAC service individually did not help: the placement error came back for every RAC-related service. This time I consulted Metalink Note 726925.1, "srvctl start instance fails with PRKP-1001; srvctl trace shows error connecting to CRSD".
First I checked the alert.log file for the cluster. It reported nothing unusual; everything seemed normal.
Next I took a look at the output of the command:
crsctl check crs
This command also reported that everything was working normally. So it definitely had to do with synchronization at startup time. That is pretty weird, since a normal node reboot should not lead to such an inconsistency. I must point out that the environment was 10gR2 (10.2.0.1.0) on RHEL4 (Red Hat Enterprise Linux AS release 4 (Nahant Update 3) 2.6.16 xenU (32-bit)), and that I hit this scenario while teaching the RAC 10g course for Oracle. Since this environment is not patched at the start of the course, I would not be surprised to find out that this is due to an already filed bug.
The fix was to kill (as root) all crsd.bin processes on all participating nodes; after that, a simple crs_stop -all followed by crs_start -all was enough to bring everything back to normal.
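The procedure above can be sketched as a small shell script. This is only an illustration under several assumptions, not the exact commands I typed: the node names in NODES, the DRY_RUN switch, the run and recover_crs helpers, the use of root ssh access between nodes, and the choice of pkill -9 -x to kill crsd.bin are all mine. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch of the recovery steps. NODES and DRY_RUN are hypothetical
# variables introduced for this example; adjust them to your cluster.
NODES="${NODES:-node1 node2}"   # assumed node names
DRY_RUN="${DRY_RUN:-1}"         # 1 = only print the commands (default)

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

recover_crs() {
    # Kill crsd.bin as root on every participating node.
    # Assumes root ssh access to each node; pkill -x matches the exact process name.
    for node in $NODES; do
        run ssh "root@$node" pkill -9 -x crsd.bin
    done
    # Stop and restart all registered resources, then re-check the stack.
    run crs_stop -all
    run crs_start -all
    run crsctl check crs
}

recover_crs
```

Running it with DRY_RUN=0 executes the commands for real, so review the printed list first. The final crsctl check crs is the same check used earlier, repeated to confirm the state after the restart.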