Oracle by Madrid: 2008

Tuesday, November 11, 2008

What Happens during a hot backup?

There are a couple of myths around the hot backup or online backup process:

Myth #1: The hot backup generates "a lot" of redo information
Myth #2: The archivelog mode "dramatically slows down" the database
Myth #3: When a hot backup is in progress the target datafile is frozen.

There are two ways to generate a hot backup, the first one is by a user managed backup and the second one with recovery manager. The database is required to be in archivelog mode for it to be able to perform an online backup. Both ways to perform an online backup work in a similar way, as I will further explain later, rman is more efficient than the user managed backup.

User managed backup

alter tablespace ts_name begin backup;

When the command is issued, at this point in time a checkpoint is performed against the target tablespace; then the datafile header is frozen, so no more updates are allowed on it (the datafile header), this is for the database to know which was the last time the tablespace had a consistent image of the data.

The datafiles with the backup in progress will still allow read/write operations just as a regular datafile, I/O activity is not frozen.

Each time a row is modified, not only the row, but the complete block is recorded to the redo log file, this will only happen the first time the block is modified, subsequent transactions on the block will only record the transaction just as normal.

During the user managed backup process the "fractured block" event may be present. Let's remember that the oracle block is the minimum IO unit, and an oracle block is made out of several OS blocks; let's assume a block size of 8K and an OS block of 512b, this will give 16 OS blocks. If during the backup process of a block there is a write operation on the block then the backup will contain a before image and an after image of the oracle block, the complete block in the backup media will be corrupt. This is normal, consistency is not guaranteed on the backup, that is why the header must be frozen to mark the point where the recovery process will have to start, and that is why oracle record a complete block image on the redo log file.

At the time the alter tablespace ts_name end backup; command is issued then the backup process is finished and the datafile header resumes its regular IO activity.

Recovery manager backup
The same process happens when a rman backup takes place, the only difference is that rman better handles the fractured block issue, it doesn't write block fragments or partial blocks to the backup, it writes the complete consistent block image to the backup media. So recovery manager doesn't need to record the complete block to the redo log file.

Some further comments on the rman case, rman doesn't freeze the datafile header, it continues to checkpoint just as regular, but it does perform a checkpoint to the tablespace.

From my perspective, the user managed backup (UMB) is a backup method that is less frequently seen on production environments, since Oracle 9i Rel 2 most DBA's considered rman as part of the regular backup/recover strategy, it performs better that the UMB, it is able to perform a block level backup, meanwhile in the UMB the whole datafile must be backup even if a lot of clean blocks are present.

Some advices for the people who still use UMB, don't let the BEGIN BACKUP run for long periods of time, it is very likely that the more time it takes to perform the backup, the more blocks are likely to change, which may generate more blocks contents to be written to the redo log files.

Monday, November 10, 2008

libawt.so: libXp.so.6: cannot open shared object file

During the installation of Oracle 10g Rel 2 on Linux this error may appear because the required library is not installed by default. Before installing Oracle, you must make sure all required packages according to the installation guide are present.

There are two libraries which are not installed by default in OEL5, libXp-1.0.0 and compat-db, these files are found on Disk3 in OEL 5.2.

The command used to install the required RPM's on OEL 5.2 is:

rpm -ivh compat-db-4.2.52-5.1.i386.rpm compat-libstdc++-296-2.96-138.i386.rpm compat-libstdc++-33-3.2.3-61.i386.rpm gcc-4.1.2-42.el5.i386.rpm gcc-c++-4.1.2-42.el5.i386.rpm glibc-devel-2.5-24.i386.rpm glibc-headers-2.5-24.i386.rpm libgomp-4.1.2-42.el5.i386.rpm libstdc++-devel-4.1.2-42.el5.i386.rpm libXpm-devel-3.5.5-3.i386.rpm
done libXp-1.0.0-8.1.el5.i386.rpm

This assumes all RPM's are present on a stage area.

Thursday, August 28, 2008

Oracle Forums Down After Upgrade

It was about three months ago when the OTN launched a better release. It had to be rolled back with all the implied consequences. People who had participated for the new release saw their work simply disappear.

For the second time a new attempt to release the forum was made, it fairly worked for one week. Afterwards the forum had to be shutdown once again. What is going on with the oracle team who had assembled the forum? Is this really an escalable application?

The attempted version was pretty nice. The editor functionality was improved, there were reward points for the people who really aported valuable knowledge to the forum.

However it happenend that this solution simply doesn't work. It is very nice, and it would be even nicer if this solution worked, but ... the big BUT ...

"Oracle forums is experiencing technical difficulty. We are aware of the issue and are working as quick as possible to correct the issue. Please try again in a few moments.
We apologize for any inconvenience this may have caused.
To speak with an Oracle sales representative: 1.800.ORACLE1.
To contact Oracle Corporate Headquarters from anywhere in the world: 1.650.506.7000.
To get technical support in the United States: 1.800.633.0738."

I like the philosophy of this forum, this way people like Sybrand "senior RTFM answers" Bakker can be easily spotted and automatically self-banned. This goes for all the similar inflated egos posters who are just playing a bullying games with junior level posters.

I wish Oracle Corporation really invested time, effort and money in its flag ship forum, forums.oracle.com

Wednesday, August 13, 2008

ORA-01460: Unimplemeted conversion requested

When performing a SQL*Loader of a BLOB file an error showed up:

Column Name Position Len Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
ID FIRST 5 INTEGER
NAME NEXT 50 , CHARACTER
IMAGE DERIVED * EOF CHARACTER
Dynamic LOBFILE. Filename in field NAME

Record 1: Rejected - Error on table IMAGES.
ORA-01460: unimplemented or unreasonable conversion requested

Record 2: Rejected - Error on table IMAGES.
ORA-01460: unimplemented or unreasonable conversion requested

Record 3: Rejected - Error on table IMAGES.
ORA-01460: unimplemented or unreasonable conversion requested

The original controlfile used had this format:

LOAD DATA
INFILE *
INTO TABLE IMAGES
REPLACE
FIELDS TERMINATED BY ','
(
id INTEGER(5),
name CHAR(50),
image LOBFILE (name) TERMINATED BY EOF
)
BEGINDATA
001,OracleHQ1.jpg
002,oracleHQ2.jpg
003,OracleHQ3.jpg

The problem was because a data format was forced from the controlfile, just by changing the data type definition and letting SQL*Loader to have it define solved the problem, so the control file after the modification was:

LOAD DATA
INFILE *
INTO TABLE image_table
REPLACE
FIELDS TERMINATED BY ','
(
image_id,
file_name,
image_data LOBFILE (file_name) TERMINATED BY EOF
)
BEGINDATA
001,OracleHQ1.jpg
002,oracleHQ2.jpg
003,OracleHQ3.jpg

Tuesday, July 29, 2008

ORA-06544 when performing network expdp between 11g and 10g

When I was trying to perform a network export datapump between 11g and 10g I found an ORA-06544 error.

[oracle@alpha admin]$ impdp SCOTT/TIGER tables=emp_10g,dept_10g network_link=db10g

Import: Release 11.1.0.6.0 - Production on Tuesday, 29 July, 2008 22:37:01

Copyright (c) 2003, 2007, Oracle. All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORA-39006: internal error
ORA-39113: Unable to determine database version
ORA-04052: error occurred when looking up remote object SYS.DBMS_UTILITY@DB10G
ORA-00604: error occurred at recursive SQL level 3
ORA-06544: PL/SQL: internal error, arguments: [55916], [], [], [], [], [], [], []
ORA-06553: PLS-801: internal error [55916]
ORA-02063: preceding 2 lines from DB10G

ORA-39097: Data Pump job encountered unexpected error -4052

This error is an interoperability error due to the bug 4511371, which is fixed by applying the 10.1.0.5.0 patchset for 10gR1 and 10.2.0.2.0 patchset for 10gR2.

Ref. Metalink note 4511371.8

Thursday, July 24, 2008

ORA-39001 When executing Export DataPump on RAC

While I was performing a data pump test export from one of the demo accounts a very weired error showed up. I have looked for it at the web, metalink, forums, and it seems to be the first case.

Scenario
A simple expdp proces trying to export whatever data shows up this:

-bash-3.00$ expdp sh/sh dumpfile=shSchema directory='default_dp_dest' schemas=sh
Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
ORA-39001: invalid argument value
ORA-39000: bad dump file specification
ORA-31641: unable to create dump file "/home/oracle/default_dp_dest/shSchema.dmp"
ORA-27041: unable to open file
Linux Error: 22: Invalid argument
Additional information: 2

I have already made some expdp testing, and it has worked, the only difference during this testing is that it was made on a RAC environment.

When taking a look at the background dump dest trace files I found this:

-bash-3.00$ more rdbkdi1_dm00_31645.trc
/u01/app/admin/RDBKDI/bdump/rdbkdi1_dm00_31645.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1
System name: Linux
Node name: eg6886
Release: 2.6.16-xenU
Version: #2 SMP Tue Oct 24 16:33:50 CDT 2006
Machine: i686
Instance name: RDBKDI1
Redo thread mounted by this instance: 1
Oracle process number: 39
Unix process pid: 31645, image: oracle@eg6886 (DM00)
*** ACTION NAME:(EXPHR01) 2008-07-24 19:38:22.781
*** MODULE NAME:(Data Pump Master) 2008-07-24 19:38:22.781
*** SERVICE NAME:(SYS$USERS) 2008-07-24 19:38:22.781
*** SESSION ID:(125.291) 2008-07-24 19:38:22.781
kswsdlaqsub: unexpected exception err=604, err2=24010
kswsdlaqsub: unexpected exception err=604, err2=24010

If ever someone finds this error solution or reference to it I'll be thankful to hear about it.

Tuesday, July 22, 2008

OMS version not checked yet ...

When a grid control agent was installed during a RAC setup an error showed up at the end of the agent configuration assistant.

The agent was started successfully, but it was not possible to force the agent to have the XML files uploaded to the grid server, it was when the dreaded error "EMD upload error: uploadXMLFiles skipped :: OMS version not checked yet.." showed up.

The problem basically was due to a password typing error during the configuration phase.

The Troubleshooting Procedure Outline was:

Setup the Agent Home environment
export AGENT_HOME=/u01/app/oracle/pruduct/10.2.0/agent10g
export PATH=$AGENT_HOME/bin:$PATH

Startup the Agent
emctl start agent

Clear the Agent status
emctl clearstate agent

Reset Credentials
emctl secure agent

Retry and verify synchronization operations
emctl upload agent
emctl status agent

###
### EM Agent Troubleshooting log
###

$ export AGENT_HOME=/u01/app/oracle/product/10.2.0/agent10g
$ export PATH=$AGENT_HOME/bin:$PATH

$ # Start agent
$ emctl start agent
Oracle Enterprise Manager 10g Release 10.2.0.1.0.
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
Starting agent ......... started.
$
$
$ # Trying to force XML files uploading
mct$ emctl upload agent
Oracle Enterprise Manager 10g Release 10.2.0.1.0.
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD upload error: uploadXMLFiles skipped :: OMS version not checked yet..

$ # Clear EM Agent status
$ emctl clearstate agent
Oracle Enterprise Manager 10g Release 10.2.0.1.0.
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
EMD clearstate completed successfully

$ # Configuring OMS Credentials
$ emctl secure agent
Oracle Enterprise Manager 10g Release 10.2.0.1.0.
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
Enter Agent Registration password : *******
Agent successfully stopped... Done.
Securing agent... Started.
Requesting an HTTPS Upload URL from the OMS... Done.
Requesting an Oracle Wallet and Agent Key from the OMS... Done.
Check if HTTPS Upload URL is accessible from the agent... Done.
Configuring Agent for HTTPS in CENTRAL_AGENT mode... Done.
EMD_URL set in /u01/app/oracle/product/10.2.0/agent10g/eg6881.us.oracle.com/sysm
an/config/emd.properties
Securing agent... Successful.
Agent successfully restarted... Done.

$ # Retrying operations
$ emctl status agent
Oracle Enterprise Manager 10g Release 10.2.0.1.0.
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 10.2.0.1.0
OMS Version : 10.2.0.1.0
Protocol Version : 10.2.0.0.0
Agent Home : /u01/app/oracle/product/10.2.0/agent10g/eg6881.us.oracle.com
Agent binaries : /u01/app/oracle/product/10.2.0/agent10g
Agent Process ID : 5307
Parent Process ID : 5260
Agent URL : https://eg6881.us.oracle.com:3872/emd/main
Repository URL : https://eg6876.us.oracle.com:1159/em/upload
Started at : 2008-07-22 15:40:37
Started by user : oracle
Last Reload : 2008-07-22 15:40:37
Last successful upload : 2008-07-22 15:40:58
Total Megabytes of XML files uploaded so far : 0.79
Number of XML files pending upload : 18
Size of XML files pending upload(MB) : 1.86
Available disk space on upload filesystem : 27.54%
Last successful heartbeat to OMS : 2008-07-22 15:40:43
---------------------------------------------------------------
Agent is Running and Ready

$ emctl upload agent
Oracle Enterprise Manager 10g Release 10.2.0.1.0.
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD upload completed successfully

$ emctl status agent
Oracle Enterprise Manager 10g Release 10.2.0.1.0.
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 10.2.0.1.0
OMS Version : 10.2.0.1.0
Protocol Version : 10.2.0.0.0
Agent Home : /u01/app/oracle/product/10.2.0/agent10g/eg6881.us.oracle.com
Agent binaries : /u01/app/oracle/product/10.2.0/agent10g
Agent Process ID : 5307
Parent Process ID : 5260
Agent URL : https://eg6881.us.oracle.com:3872/emd/main
Repository URL : https://eg6876.us.oracle.com:1159/em/upload
Started at : 2008-07-22 15:40:37
Started by user : oracle
Last Reload : 2008-07-22 15:40:37
Last successful upload : 2008-07-22 15:41:13
Total Megabytes of XML files uploaded so far : 2.65
Number of XML files pending upload : 1
Size of XML files pending upload(MB) : 0.03
Available disk space on upload filesystem : 27.54%
Last successful heartbeat to OMS : 2008-07-22 15:40:43
---------------------------------------------------------------
Agent is Running and Ready
$

Friday, May 23, 2008

Blog not properly displayed in Firefox

Recently, without a warning, my blog site stopped being properly displayed, when drilling down on the circumstances that prevented it from properly displaying at least, so far, I have found that when it is displayed with Firefox browsers the left navigation bar 'disappears'. If displayed with Internet Explorer it is properly displayed.

I had not realized about this fact, but when Eduardo Legatti asked me what happened with the left navigation bar it was when I entered as a regular non authenticated user, I saw something happened with my Blog Archive and my Topics menus, those were gone. Eduardo told me he uses also Firefox.

In my personal opinion my default choice is Firefox, just like 24% of the blog users, not Internet Explorer, unless due to VPN restrictions or bank access requirement. I have tried to change my layout, and unsuccessfully it behaves the same. So the workaround in the mean time is to click on the item menu or use Internet Explorer :(

Some people have reported differences between firefox and internet explorer [Resolving Browser Display Discrepancies Between IE and Firefox].

... After a couple of hours after publishing this post the left navigation bar is back again, it looks like there was some issue wiht the cache. I have removed the cache from firefox, but I am not sure this was the actual solution.

Thursday, May 22, 2008

Missing libraries when installing EM Grid Control

When I was trying to install and setup an Enterprise Manager Grid Control 10g R2 (10.2.0.1.0) on an Oracle Enterprise Linux 4 (OEL4) platform, I surprisingly found I didn't meet the package list install requirement, it stated there were three libraries missing, I did install and reinstall them from the OEL4 disks, unsuccessfully, since the error still persist. I should have STWF (Searched The Web First) for further cases where something curious like this had happened to somebody( Linux OEM Grid Control 10.2.0.1.0 installation issue), and yes, it happened to this individual who was very kind in documenting this at the Oracle forums.

The error is found with these three missing RPM's:

compat-libstdc++-296-2.96-132.7.2 (failed)
libstdc++devel-3.4.3-22.1 (failed)
openmotif-21-2.1.30-11 (failed)

It can be seen that openmotif-21 is misspelled, it should read openmotif21, without the hyphen.
I did further search at Metalink [Note:343364.1, Prerequisites and Install Information for EM 10g Grid Control Components on Red Hat EL 4.0 Update 2 (and higher updates) Platforms] and I found that when installing EM Grid Control with a fresh database on 10g, it always shows this library error. Experience always comes when it is no longer required. The installer has a flaw, so it will show an error even if the three missing libraries are installed on the system, it is because a bad list of RPM's in the installer. This error can be user checked and proceed with installation, it will succeed. Just, this error is ever found, I suggest to proceed with caution, and make sure the list of RPM's meet the minimum requirements, if this manual check is successful, the missing RPM's error can be skipped.

rpm -q glibc glibc-2.3.4-2.13
rpm -q make make-3.80-5
rpm -q binutils binutils-2.15.92.0.2-15
rpm -q gcc gcc-3.4.4-2
rpm -q libaio libaio-0.3.103-3
rpm -q glibc-common glibc-common-2.3.4-2.13
rpm -q setarch setarch-1.6-1
rpm -q pdksh pdksh-5.2.14-30.3
rpm -q openmotif21 openmotif21-2.1.30-11.RHEL4.4
rpm -q sysstat sysstat-5.0.5-1
rpm -q gnome-libs gnome-libs-1.4.1.2.90-44.1
rpm -q libstdc++ libstdc++-3.4.4-2
rpm -q libstdc++-devel libstdc++-devel-3.4.4-2
rpm -q compat-libstdc++-296 compat-libstdc++-296-2.96-132.7.2
rpm -q compat-db compat-db-4.1.25-9
rpm -q control-center control-center-2.8.0-12.rhel4.2
rpm -q xscreensaver xscreensaver-4.18-5.rhel4.9

Enjoy!

Thursday, May 15, 2008

CRS-0223: Resource 'ResourceName' has placement error

While I was attempting to configure a RAC environment, I noticed some services were started and some other weren't, when I tried to start the services, the 'placement' error showed up, this was due because when trying to start the crs with the crsctl start crs command on both nodes, there was not enough time between the start process in node 1 and node2, so the VIP address has switched its control from node 1 to node2, and when I tried to bring back the service to node 1 it by explicitly starting it on node 1 the error showed up.

The Oracle error manual states:

Oracle Error : CRS-0223: Resource '%s' has placement error.
Cause: There was no host available on which to failover/start the resource based on the Placement Policy for the resource.
Action: Check the target host for the resource and restart the resource using the crs_start command.

$ ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....A1.inst application ONLINE OFFLINE
ora....A2.inst application ONLINE ONLINE eg3704
ora.RDBHRA.db application ONLINE ONLINE eg3704
ora....SM1.asm application ONLINE OFFLINE
ora....03.lsnr application ONLINE OFFLINE
ora.eg3703.gsd application ONLINE ONLINE eg3703
ora.eg3703.ons application ONLINE ONLINE eg3703
ora.eg3703.vip application ONLINE ONLINE eg3704
ora....SM2.asm application ONLINE ONLINE eg3704
ora....04.lsnr application ONLINE ONLINE eg3704
ora.eg3704.gsd application ONLINE ONLINE eg3704
ora.eg3704.ons application ONLINE ONLINE eg3704
ora.eg3704.vip application ONLINE ONLINE eg3704

[oracle@eg3703 bin]$ ./crs_start ora.eg3703.LISTENER_EG3703.lsnr
eg3703 : CRS-1018: Resource ora.eg3703.vip (application) is already running on eg3704
eg3704 : CRS-1019: Resource ora.eg3703.LISTENER_EG3703.lsnr (application) cannot run on eg3704

CRS-0223: Resource 'ora.eg3703.LISTENER_EG3703.lsnr' has placement error.
[oracle@eg3703 bin]$ ./crs_start ora.eg3703.ASM1.asm
eg3703 : CRS-1018: Resource ora.eg3703.vip (application) is already running on eg3704
eg3704 : CRS-1019: Resource ora.eg3703.ASM1.asm (application) cannot run on eg3704

CRS-0223: Resource 'ora.eg3703.ASM1.asm' has placement error.

At first hand, the procedure was to shutdown services and start them in order, so when that each service could take its place.

Name Type Target State Host
------------------------------------------------------------
ora....A1.inst application ONLINE ONLINE eg3703
ora....A2.inst application ONLINE ONLINE eg3704
ora.RDBHRA.db application ONLINE ONLINE eg3703
ora....SM1.asm application ONLINE ONLINE eg3703
ora....03.lsnr application ONLINE ONLINE eg3703
ora.eg3703.gsd application ONLINE ONLINE eg3703
ora.eg3703.ons application ONLINE ONLINE eg3703
ora.eg3703.vip application ONLINE ONLINE eg3703
ora....SM2.asm application ONLINE ONLINE eg3704
ora....04.lsnr application ONLINE ONLINE eg3704
ora.eg3704.gsd application ONLINE ONLINE eg3704
ora.eg3704.ons application ONLINE ONLINE eg3704
ora.eg3704.vip application ONLINE ONLINE eg3704

Adding a voting disk online fails

This happened on a RHEL4/Ora10gR2 Ver. 10.2.0.1.0.
In a cluster environment it is important to keep RAC availability at maximum and avoid unnecessary either planned or unplanned down times. That is why the online commands and functionality are for.

When attempting to add a Voting disk to my RAC configuration an error, "Cluster is not in a ready state for online disk addition" after issuing this command:

$ sudo /u01/crs1020/bin/crsctl add css votedisk /dev/raw/raw8

It was not possible to perform this task, even though the manual states it is possible, after a bit of research in metalink I found this metalink note:"ADDING VOTING DISK ONLINE CRASH THE CRS" This is due to the bug 4898020 and the platforms affected are, so far, Linux at the 10.2.0.1.0 release, One more reason to keep updated with the latest patchset available.The only workaround for this is to shutdown the crs and retry the operation with the force flag

So in my case the complete procedure to perform this task that could have been directly performed online was:

1. Stop all application processes

$ /u01/crs1020/bin/srvctl stop nodeapps -n eg7910
$ /u01/crs1020/bin/srvctl stop nodeapps -n eg7909

2. Verify services were down

$ /u01/crs1020/bin/crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....H1.inst application OFFLINE OFFLINE
ora....H2.inst application OFFLINE OFFLINE
ora.RDBHRH.db application OFFLINE OFFLINE
ora....SM1.asm application OFFLINE OFFLINE
ora....09.lsnr application OFFLINE OFFLINE
ora.eg7909.gsd application OFFLINE OFFLINE
ora.eg7909.ons application OFFLINE OFFLINE
ora.eg7909.vip application OFFLINE OFFLINE
ora....SM2.asm application OFFLINE OFFLINE
ora....10.lsnr application OFFLINE OFFLINE
ora.eg7910.gsd application OFFLINE OFFLINE
ora.eg7910.ons application OFFLINE OFFLINE
ora.eg7910.vip application OFFLINE OFFLINE

3. Stop CRS on both nodes

$ sudo /u01/crs1020/bin/crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.

$ ssh eg7910 sudo /u01/crs1020/bin/crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.

4. Add the votedisks with the force flag.

$ sudo /u01/crs1020/bin/crsctl add css votedisk /dev/raw/raw8 -force
Now formatting voting disk: /dev/raw/raw8
successful addition of votedisk /dev/raw/raw8.

$ sudo /u01/crs1020/bin/crsctl add css votedisk /dev/raw/raw9 -force
Now formatting voting disk: /dev/raw/raw9
successful addition of votedisk /dev/raw/raw9.

5. Verify the voting disks were added

$ /u01/crs1020/bin/crsctl query css votedisk
0. 0 /dev/raw/raw7
1. 0 /dev/raw/raw8
2. 0 /dev/raw/raw9
located 3 votedisk(s).

6. Restart CRS and applications.

$ sudo /u01/crs1020/bin/crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly
$ ssh eg7910 sudo /u01/crs1020/bin/crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly

This should have happened on line, and this could mean a momentarily loss of RAC availability.

Thursday, May 08, 2008

How to manually remove a service

When performing an Oracle deinstall, sometimes services are left behind, such as the TNSListener Service. In order for you to get rid of it, there are two options, first one, remove it from the regedit, this choice must be carefully executed, you don't want to remove other service.
1. Navigate to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services
2. Locate the service name, right click it and select Delete

It is always advisable to have a regedit backup prior to performing this task.

The second choice is the sc command
sc delete

This command is also used to manually create services
sc create

The complete syntax options for the sc command line utility is:
DESCRIPTION:
SC is a command line program used for communicating with the
NT Service Controller and services.
USAGE:
sc

Wednesday, May 07, 2008

Why is Varchar2 oversizing bad?

Apparently the same space consumption would be the same if a Varchar2(4) is used versus a Varchar2(4000), if you require one byte, then one byte will be allocated, and performance should be the same, but when taking a look at the performance statistics there are some surprises that arise.
When fetching information from Oracle, buffers and fetching structures are being prepared beforehand to be able to manage the maximum column lenght, applications such as toad may run out of memory trying to allocate resources to retrieve the information from an unnecessarily oversized varchar2 column.

Let's analyze this case:

Table 1: Varchar2LenghtDemo

SQL> desc Varchar2LenghtDemo
Name                                      Null?    Type
----------------------------------------- -------- ----------------------
ID                                                 NUMBER(10)
VCCOL                                              VARCHAR2(4000)
VCCOL2                                             VARCHAR2(4000)
VCCOL3                                             VARCHAR2(4000)
VCCOL4                                             VARCHAR2(4000)
VCCOL5                                             VARCHAR2(4000)
VCCOL6                                             VARCHAR2(4000)
VCCOL7                                             VARCHAR2(4000)
VCCOL8                                             VARCHAR2(4000)
VCCOL9                                             VARCHAR2(4000)
VCCOL10                                            VARCHAR2(4000)
VCCOL11                                            VARCHAR2(4000)
VCCOL12                                            VARCHAR2(4000)
VCCOL13                                            VARCHAR2(4000)
VCCOL14                                            VARCHAR2(4000)
VCCOL15                                            VARCHAR2(4000)
VCCOL16                                            VARCHAR2(4000)
VCCOL17                                            VARCHAR2(4000)
VCCOL18                                            VARCHAR2(4000)
VCCOL19                                            VARCHAR2(4000)
VCCOL20                                            VARCHAR2(4000)

23:39:55 SQL> select count(*) from Varchar2LenghtDemo;

COUNT(*)
----------
 8388608

Elapsed: 00:00:10.18

Statistics
----------------------------------------------------------
        5  recursive calls
        0  db block gets
    55719  consistent gets
    55669  physical reads
        0  redo size
      414  bytes sent via SQL*Net to client
      385  bytes received via SQL*Net from client
        2  SQL*Net roundtrips to/from client
        0  sorts (memory)
        0  sorts (disk)
        1  rows processed

If we compare the statistics from one table and the other:

SQL> desc Varchar2LenghtDemoShort
Name                                      Null?    Type
----------------------------------------- -------- --------------------

ID                                                 NUMBER(10)
VCCOL                                              VARCHAR2(4)
VCCOL2                                             VARCHAR2(4)
VCCOL3                                             VARCHAR2(4)
VCCOL4                                             VARCHAR2(4)
VCCOL5                                             VARCHAR2(4)
VCCOL6                                             VARCHAR2(4)
VCCOL7                                             VARCHAR2(4)
VCCOL8                                             VARCHAR2(4)
VCCOL9                                             VARCHAR2(4)
VCCOL10                                            VARCHAR2(4)
VCCOL11                                            VARCHAR2(4)
VCCOL12                                            VARCHAR2(4)
VCCOL13                                            VARCHAR2(4)
VCCOL14                                            VARCHAR2(4)
VCCOL15                                            VARCHAR2(4)
VCCOL16                                            VARCHAR2(4)
VCCOL17                                            VARCHAR2(4)
VCCOL18                                            VARCHAR2(4)
VCCOL19                                            VARCHAR2(4)
VCCOL20                                            VARCHAR2(4)

23:43:02 SQL> r
1* select count(*) from Varchar2LenghtDemoShort

COUNT(*)
----------
 8388608

Elapsed: 00:00:05.50

Statistics
----------------------------------------------------------
        0  recursive calls
        0  db block gets
    55756  consistent gets
    22077  physical reads
        0  redo size
      414  bytes sent via SQL*Net to client
      385  bytes received via SQL*Net from client
        2  SQL*Net roundtrips to/from client
        0  sorts (memory)
        0  sorts (disk)
        1  rows processed

Almost all statistics are the same, except for these two:

Operation: Select count(*) from table

Table          Rows      Time Elapsed  Phy Rds
Varchar(4)     8,388,608  00:00:05.50   22,077
Varchar(4000)  8,388,608  00:00:10.18   55,669

Since the number of physical reads is meaningful different a question arises:
How many rows per block? Is the row density the same?

Table          Avg Rows per Block
Varchar(4)         150.99
Varchar(4000)      150.99

Quite surprising, Row density is the same, but physical blocks processed are doubled.

Conclusion

Even though Varchar2 columns allow a variable number of characters and it will just use the number of bytes required, it is absolutely better to properly size the actual requirements, otherwise a meaningful performance issue unnecessarily will be created.

Thursday, May 01, 2008

Dump an Oracle Block to the trace

This is the script I use to dump an Oracle block.

set pagesize 0
set def =
set def &
column header_block new_value m_block
column header_file new_value m_file

select
header_file,
header_block + &m_offset header_block
from
dba_segments
where
segment_name = upper('&m_segment')
and nvl(partition_name,'xxx') = nvl(upper('&m_partition'),'xxx')
and owner = upper('&m_owner')
;

alter system dump datafile &m_file block min &m_block block max &m_block;

Wednesday, April 02, 2008

Oracle Index Cluster Storage

A Cluster Index is an object oriented to gain performance when frequent master/detail relation ships are found between two tables. This is the fastest way to perform a join between two tables, but there are storage issues when this storage structure is used. The purpose of this article is to analyze the way Oracle reserves storage for a Cluster Index

Create cluster and verify Cluster storage

SQL> l
1 create cluster cluster_emp
2 (department_id NUMBER(4))
3* index

SQL> create index i_cluster_emp on cluster cluster_emp;

Index created.

SQL> r
1 create table emp cluster cluster_emp(department_id)
2 as
3* select employee_id, first_name, last_name, hire_date, salary, department_id from employees

Table created.

SQL> create table dept
2 cluster cluster_emp(department_id)
3 as
4 select * from departments;

Table created.

Verify where data is stored

SQL> r
1 select owner, segment_name, file_id, block_id, blocks, bytes
2 from dba_extents
3 where segment_name in ('I_CLUSTER_EMP','CLUSTER_EMP')
4* order by segment_name, extent_id

OWNER SEGMENT_NAME FILE_ID BLOCK_ID BLOCKS BYTES
-------- ------------- ---------- ---------- ---------- ----------
HR CLUSTER_EMP 4 3825 8 65536
HR CLUSTER_EMP 4 1817 8 65536
HR CLUSTER_EMP 4 3817 8 65536
HR CLUSTER_EMP 4 3809 8 65536
HR I_CLUSTER_EMP 4 3833 8 65536

Determine how many rows are retrieved for each department Id when join is performed

SQL> R
1 SELECT D.DEPARTMENT_ID, COUNT(*)
2 FROM EMP E, DEPT D
3 WHERE E.DEPARTMENT_ID = D.DEPARTMENT_ID
4 GROUP BY D.DEPARTMENT_ID
5* ORDER BY D.DEPARTMENT_ID

DEPARTMENT_ID COUNT(*)
------------- ----------
10 1
20 2
30 6
40 1
50 45
60 5
70 1
80 34
90 3
100 6
110 2

11 rows selected.

Display where specific rows are stored.

SQL> R
1 SELECT D.DEPARTMENT_ID,
2 DBMS_ROWID.ROWID_RELATIVE_FNO(D.ROWID)||':'||DBMS_ROWID.ROWID_BLOCK_NUMBER(D.ROWID) DEPT_ROW,
3 DBMS_ROWID.ROWID_RELATIVE_FNO(E.ROWID)||':'||DBMS_ROWID.ROWID_BLOCK_NUMBER(E.ROWID) EMP_ROW,
4 COUNT(*)
5 FROM DEPT D, EMP E
6 WHERE D.DEPARTMENT_ID = E.DEPARTMENT_ID
7 GROUP BY
8 D.DEPARTMENT_ID,
9 DBMS_ROWID.ROWID_RELATIVE_FNO(D.ROWID)||':'||DBMS_ROWID.ROWID_BLOCK_NUMBER(D.ROWID),
10 DBMS_ROWID.ROWID_RELATIVE_FNO(E.ROWID)||':'||DBMS_ROWID.ROWID_BLOCK_NUMBER(E.ROWID)
11* ORDER BY DEPARTMENT_ID

DEPARTMENT_ID DEPT_ROW EMP_ROW COUNT(*)
------------- -------- -------- ----------
10 4:1821 4:1821 1
20 4:1822 4:1822 2
30 4:3829 4:3829 6
40 4:1823 4:1823 1
50 4:3830 4:3830 45
60 4:3832 4:3832 5
70 4:1824 4:1824 1
80 4:1819 4:1819 34
90 4:3831 4:3831 3
100 4:3828 4:3828 6
110 4:1817 4:1817 2

11 rows selected.

There it can be seen that for each department Id one block is assigned, this will make Oracle cluster to require more storage than that required to store individual independent tables. On the other hand if few employee rows are assigned to a department, a lot of block space will be wasted. Transactional operations over the individual cluster participant tables will make rows to be physically reallocated, thus making Oracle clusters suitable for fast join operations, but slow for transactional tables.

Block dumped to verify physical row storage from the first two cluster rows.

C:\Oracle\product\10.1.0\admin\alpha\udump>more alpha_ora_6088.trc
Dump file c:\oracle\product\10.1.0\admin\alpha\udump\alpha_ora_6088.trc
Wed Apr 02 01:12:30 2008
ORACLE V10.1.0.5.0 - Production vsnsta=0
vsnsql=13 vsnxtr=3
Oracle Database 10g Enterprise Edition Release 10.1.0.5.0 - Production
With the Partitioning, OLAP and Data Mining options
Windows XP Version V5.1 Service Pack 2
CPU : 2 - type 586
Process Affinity : 0x00000000
Memory (Avail/Total): Ph:1077M/2046M, Ph+PgF:3110M/3938M, VA:1701M/2047M
Instance name: alpha

Redo thread mounted by this instance: 1

Oracle process number: 13

Windows thread id: 6088, image: ORACLE.EXE (SHAD)

*** 2008-04-02 01:12:30.265
*** ACTION NAME:() 2008-04-02 01:12:30.250
*** MODULE NAME:(SQL*Plus) 2008-04-02 01:12:30.250
*** SERVICE NAME:(SYS$USERS) 2008-04-02 01:12:30.250
*** SESSION ID:(152.1209) 2008-04-02 01:12:30.250
Start dump data blocks tsn: 4 file#: 4 minblk 1821 maxblk 1822
buffer tsn: 4 rdba: 0x0100071d (4/1821)
scn: 0x0000.001deadc seq: 0x01 flg: 0x06 tail: 0xeadc0601
frmt: 0x02 chkval: 0x1b2c type: 0x06=trans data
Block header dump: 0x0100071d
Object id on Block? Y
seg/obj: 0xc312 csc: 0x00.1dea96 itc: 2 flg: E typ: 1 - DATA
brn: 1 bdba: 0x1000ef1 ver: 0x01 opc: 0
inc: 0 exflg: 0

Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x0001.021.0000051a 0x008005bb.02be.34 C--- 0 scn 0x0000.001dea72
0x02 0x0003.00a.000008e7 0x00800526.04cd.12 --U- 1 fsc 0x0000.001deadc

data_block_dump,data header at 0x6051264
===============
tsiz: 0x1f98
hsiz: 0x20
pbl: 0x06051264
bdba: 0x0100071d
76543210
flag=--------
ntab=3
nrow=3
frre=-1
fsbo=0x20
fseo=0x1f47
avsp=0x1f27
tosp=0x1f27
0xe:pti[0] nrow=1 offs=0
0x12:pti[1] nrow=1 offs=1
0x16:pti[2] nrow=1 offs=2
0x1a:pri[0] offs=0x1f82
0x1c:pri[1] offs=0x1f60
0x1e:pri[2] offs=0x1f47
block_row_dump:
tab 0, row 0, @0x1f82
tl: 22 fb: K-H-FL-- lb: 0x0 cc: 1
curc: 2 comc: 2 pk: 0x0100071d.0 nk: 0x0100071d.0
col 0: [ 2] c1 0b
tab 1, row 0, @0x1f60
tl: 34 fb: -CH-FL-- lb: 0x0 cc: 5 cki: 0
col 0: [ 2] c2 03
col 1: [ 8] 4a 65 6e 6e 69 66 65 72
col 2: [ 6] 57 68 61 6c 65 6e
col 3: [ 7] 77 bb 09 11 01 01 01
col 4: [ 2] c2 2d
tab 2, row 0, @0x1f47
tl: 25 fb: -CH-FL-- lb: 0x2 cc: 3 cki: 0
col 0: [14] 41 64 6d 69 6e 69 73 74 72 61 74 69 6f 6e
col 1: [ 2] c2 03
col 2: [ 2] c2 12
end_of_block_dump
buffer tsn: 4 rdba: 0x0100071e (4/1822)
scn: 0x0000.001deadc seq: 0x01 flg: 0x06 tail: 0xeadc0601
frmt: 0x02 chkval: 0x028c type: 0x06=trans data
Block header dump: 0x0100071e
Object id on Block? Y
seg/obj: 0xc312 csc: 0x00.1dea96 itc: 2 flg: E typ: 1 - DATA
brn: 1 bdba: 0x1000ef1 ver: 0x01 opc: 0
inc: 0 exflg: 0

Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x0001.02a.0000051a 0x008005bb.02be.36 C--- 0 scn 0x0000.001dea73
0x02 0x0003.00a.000008e7 0x00800526.04cd.13 --U- 1 fsc 0x0000.001deadc

data_block_dump,data header at 0x6051264
===============
tsiz: 0x1f98
hsiz: 0x22
pbl: 0x06051264
bdba: 0x0100071e
76543210
flag=--------
ntab=3
nrow=4
frre=-1
fsbo=0x22
fseo=0x1f2c
avsp=0x1f0a
tosp=0x1f0a
0xe:pti[0] nrow=1 offs=0
0x12:pti[1] nrow=2 offs=1
0x16:pti[2] nrow=1 offs=3
0x1a:pri[0] offs=0x1f82
0x1c:pri[1] offs=0x1f5c
0x1e:pri[2] offs=0x1f41
0x20:pri[3] offs=0x1f2c
block_row_dump:
tab 0, row 0, @0x1f82
tl: 22 fb: K-H-FL-- lb: 0x0 cc: 1
curc: 3 comc: 3 pk: 0x0100071e.0 nk: 0x0100071e.0
col 0: [ 2] c1 15
tab 1, row 0, @0x1f5c
tl: 38 fb: -CH-FL-- lb: 0x0 cc: 5 cki: 0
col 0: [ 3] c2 03 02
col 1: [ 7] 4d 69 63 68 61 65 6c
col 2: [ 9] 48 61 72 74 73 74 65 69 6e
col 3: [ 7] 77 c4 02 11 01 01 01
col 4: [ 3] c3 02 1f
tab 1, row 1, @0x1f41
tl: 27 fb: -CH-FL-- lb: 0x0 cc: 5 cki: 0
col 0: [ 3] c2 03 03
col 1: [ 3] 50 61 74
col 2: [ 3] 46 61 79
col 3: [ 7] 77 c5 08 11 01 01 01
col 4: [ 2] c2 3d
tab 2, row 0, @0x1f2c
tl: 21 fb: -CH-FL-- lb: 0x2 cc: 3 cki: 0
col 0: [ 9] 4d 61 72 6b 65 74 69 6e 67
col 1: [ 3] c2 03 02
col 2: [ 2] c2 13
end_of_block_dump
End dump data blocks tsn: 4 file#: 4 minblk 1821 maxblk 1822

C:\Oracle\product\10.1.0\admin\alpha\udump>

Saturday, March 22, 2008

Data Dictionary Statistics Gathering

Normally data dictionary statistics in 9i is not required unless performance issues are detected. Metalink note where data dictionary gathering is discouraged (245051.1) dates back from 16-APR-2004, almost four years ago. In 10g data dictionary statistics are collected just the same as any other schema object. There are official documents that suggest the statistics gathering in 9i (216550.1). It is commonly suggested to gather data dictionary statistics in the Oracle Applications environments. In fact, when performing a 9i database upgrade in an applications database, it is listed as a post upgrade step.

In 10g this is an official recommendation ML (457926.1).

Known issues when data dictionary statistics are present in sys schema.
It is suggested to gather data dictionary statistics in 9i under direct oracle support advice when performance problems are detected.
In catpach for 9.2.0.2, 9.2.0.3 and 9.2.0.4 there was an note documenting the issue that the catpatch process took too much time due to the existing of statistics
This document doesn't state that data dictionary statistics musn't be collected, it only suggest data dictionary statistics must be deleted prior to the upgrade process and once it finishes statistics can be collected once more.

In fact the official suggestion states this:
dbms_stats.delete_schema_stats('SYS'); -- prior to perform the upgrade process
dbms_stats.gather_schema_stats('SYS'); -- once the process has finished

Ref. Gathering Data Dictionary Statistics
Doc ID: Note:245051.1

There is anohter documented issue, listed back on 14-JAN-2005, which states that querying dba_free_space hangs if data dictionary statistics are collected, but the metalink note never states that statistics are not suggested, but it rather suggests a procedure to solve this issue.
Another documented issue dates back to 20-Aug-2003, and reported issues when working with portal 3.0.9.8.1 if data dictionary were present.

Identify if data dictionary statistics have been collected.
If there are performance problems due to large data dictionary access, it is advisable to check if data dictionary statistics have been collected. It can be easily displayed by querying data dictionary tables belonging to sys.

select table_name, last_analyzed
from dba_tables where owner='SYS'
and table_name='FET$';

This other query is useful to know how many sys tables have statistics:

select distinct trunc(last_analyzed), count(*)
from dba_tables
where owner='SYS'
group by trunc(last_analyzed)

Upgrade guide.

If collecting data dictionary statistics during the upgrade process is a time consuming task, it is advisable to collect them prior to perform the upgrade task. Both in Oracle 8i and 9i, the procedure that must be run is DBMS_STATS.GATHER_SCHEMA_STATS, in fact the procedure suggested is by means of EXEC DBMS_STATS.GATHER_DICTIONARY_STATS, which was added starting DB10gR1.

References and documentation.

Complete Checklist for Manual Upgrades to 10gR2
Doc ID: Note:316889.1

Oracle Applications Release 11i with Oracle9i Release 2 (9.2.0)
Doc ID: Note:216550.1

Select COUNT(*) from DBA_FREE_SPACE Hangs
Doc ID: Note:121729.1

Initial Portal Page Request Takes A Lot of Time
Doc ID: Note:198471.1

Gather Optimizer Statistics For Sys And System
Doc ID: Note:457926.1

Is ANALYZE on the Data Dictionary Supported (TABLES OWNED BY SYS)?
Doc ID: Note:35272.1

Ref. How to quickly verify whether data dictionary statistics has been collected
Doc ID: Note:333175.1

Thursday, February 28, 2008

java.lang.Exception: Exception in sending Request :: null

When accessing Enterprise Manager DB Control Console a couple of error messages display, right after providing the sys credentials, the console appears with a down arrow and at the header it says:

java.lang.Exception: Exception in sending Request :: null

If you try to move to another tab, such as the performance tab, it asks for credentials and responses:

Io exception: Unknown host specified

These are symptoms of a not properly configured Agent Time Zone.

Troubleshooting Agent Time Zone

This case happened to me on a Windows XP Professional in Mexican Spanish after a timezone change. I notice a similar error shows up on databases affected by the automatic time zone changes after synchronization with the time.windows.com, which recently has included two time zones for Mexico - one named old and the other named new-

Troubleshooting
0. Consider these files when performing the troubleshooting phase:
emd.properties
supportedtzs.lst

1. Stop the console
\bin\emctl stop dbconsole

2. Backup the file located at ORACLE_HOME/hostname_SID/sysman/config/emd.properties

3. Edit the emd.properties file and look for the agentTZRegion (generally appears at the end of the file)

At the file emd.properties, located at the ORACLE_HOME/hostname_instanceName/sysman/config

change the value

agentTZRegion=GMT To agentTZRegion=America/Mexico_City

Evidently, this is for the particular case of Mexico City, but similar issues could apply to other time zones.
For other timezones affected, such as the one corresponding to Egypt, change the value of this parameter from GMT+2 to Egypt which is included in the supportedtzs.lst file.
So the parameter will be agentTZRegion=Egypt

4. When agent is unable to find a proper time zone it will adopt GMT, so it could be the value registered so far. Change this value by the value corresponding to the OS time zone, this time zone should be one listed at the ORACLE_HOME/sysman/admin/nsupportedtzs.lst file.

5. Execute this command
emctl resetTZ agent
emctl config agent getTZ

After issuing the resetTZ command a similar issue like this one may appear:
To complete this process, you must either:

connect to the database served by this DBConsole as user 'sysman', and execute:

SQL> exec mgmt_target.set_agent_tzrgn('pc06.oracle.com:3938','America/Mexico_City')

-- or --

connect to the database served by this DBConsole as user 'sys', and execute:

SQL> alter session set current_schema = SYSMAN;
SQL> exec mgmt_target.set_agent_tzrgn('pc06.oracle.com:3938','America/Mexico_City')

In order for these commands to be successful, you are required the agent to have registered some values at the EM repository. Check this query, there should be similar information displayed when connected as sysman:
SQL> select target_name, target_type from mgmt_targets;

TARGET_NAME TARGET_TYPE
------------------------------------
pc06.oracle.com host
orcl.oracle.com oracle_database
pc06.oracle.com:3938 oracle_emd
Management Services and Repository oracle_emrep
LISTENER_pc06.oracle.com oracle_listener

6. The previous command will ask to perform some actions at the sysman repository level. Execute the reset at the repository level by:
SQL> alter session set current_schema = SYSMAN;

SQL> exec mgmt_target.set_agent_tzrgn('hostname:3938','TimeZone');

This command should be successful, otherwise it could be because the agent hasn't ever started and it has never registered, even with the wrong TZ at the repository.

In order for you to verify this has ever run and the agent is properly registered, issue this query as sysman:

SQL> select target_name, target_type from mgmt_targets;
The target with the default port 3938 is the target we are looking for.

The target related to the port 3938 should be listed. Otherwise try to start the agent so it can register this target. If agent is not starting, please veriy at the logs what could be a reason that prevents agent from starting. Most of the times it is because of a wrong specified

6. Try to login to the dbconsole and check if the error still exists.

7. Now start the EM Console, the problem should be fixed by now.

Notes.
The value of the time zone is a valid value listed at ORACLE_HOME/sysman/admin/nsupportedtzs.lst Make sure the time zone matches that of the host computer.

if when executing the emdctl command an error related to java appears, it is because the right java version is not being invoked. Make sure the path environment variable properly includes the jdk environment provided at the same Oracle Home where the database is related to.

path=%PATH%;C:\Oracle\product\10.2.0\db_1\jdk\jre\bin;C:\Oracle\product\10.2.0\db_1\jdk\bin
path=%path%;C:\Oracle\product\10.2.0\db_1\jdk\jre\bin\client

References.

Problem: Startup Agent: ORA-20233: Invalid agent name when running mgmt_target.set_agent_tzrgn procedure in repository
Doc ID: Note:388280.1 --> This note has particularly helped me to solve my particular issue.

Problem: Startup Agent: EM Agent will not start due to Timezone mismatch (Daylight Savings changes for Australia)
Doc ID: Note:362888.1 --> some useful ideas were taken from this note

Problem: Startup Agent: Agent Fails to Start due to Incorrect Timezone File Used as Pointed by ORA_TZFILE
Doc ID: Note:409121.1

Problem: Startup Agent: ORA-20233: Invalid agent name when running mgmt_target.set_agent_tzrgn procedure in repository
Doc ID: Note:388280.1

Saturday, February 09, 2008

Enable Automatic logon on Windows 2003

Verbatim from the Microsoft Article Id. 324737

Warning Serious problems might occur if you modify the registry incorrectly by using Registry Editor or by using another method. These problems might require that you reinstall your operating system. Microsoft cannot guarantee that these problems can be solved. Modify the registry at your own risk.
To use Registry Editor (Regedt32.exe) to turn on automatic logon, follow these steps:

Click Start, and then click Run.

In the Open box, type Regedt32.exe, and then press ENTER.

Locate the following subkey in the registry:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon

Double-click the DefaultUserName entry, type your user name, and then click OK.

Double-click the DefaultPassword entry, type your password, and then click OK.NOTE: If the DefaultPassword value does not exist, it must be added. To add the value, follow these steps:

a.	On the Edit menu, click New, and then point to String Value.
b.	Type DefaultPassword, and then press ENTER.
c.	Double-click DefaultPassword.
d.	In the Edit String dialog, type your password and then click OK.

NOTE: If no DefaultPassword string is specified, Windows automatically changes the value of the AutoAdminLogon key from 1 (true) to 0 (false), disabling the AutoAdminLogon feature.

On the Edit menu, click New, and then point to String Value.

Type AutoAdminLogon, and then press ENTER.

Double-click AutoAdminLogon.

In the Edit String dialog box, type 1 and then click OK.

10.

Quit Registry Editor.

11.

Click Start, click Shutdown, and then type a reason in the Comment text box.

12.

Click OK to turn off your computer.

13.

Restart your computer. You can now log on automatically.

Notes To bypass the AutoAdminLogon process and to log on as a different user, hold down the SHIFT key after you log off or after Windows restarts.

Registry change will not work if the “Logon Banner” is defined on the server either by a Group Policy object (GPO) or by a local policy. When policy is changed to not impact server, the feature works as expected.

An interactive console logon that has a different user on the server changes the DefaultUserName registry entry as the last logged on user indicator. AutoAdminLogon relies on the DefaultUserName entry to match the user and the password. Therefore, AutoAdminLogon may fail. You may configure a shutdown script to set the correct DefaultUserName entry for AutoAdminLogonAs. For more information, click the following article number to view the article in the Microsoft Knowledge Base:

119364 (http://support.microsoft.com/kb/119364/) AutoAdminLogon loses DefaultUserName

Tuesday, February 05, 2008

Issues Installing Patchset 10.2.0.3.0 on Windows XP

When attempting to install the Patchset 10.2.0.3.0 upgrade OUI complainted about a msvcr71.dll file in use. All Oracle services were shutdown, as directed by the Installation Guide. So this file shouldn't be in use by any Oracle executable. So where the culprit is? I found it by means of the Process Explorer, this is an utility that can be downloaded from the microsoft site, by navigating through the url: http://www.sysinternals.com/, the Microsoft Technet site can be reache, a couple of clicks and you're done, you can download it. A useful tool is the ListDLL too.

I found a svchost.exe process grabbing the msvcr71.dll file, I just killed it and the Installation process just went on to the end. I should point out that killing an OS process is a very serious matter, don't kill it unless you are absolutely sure what you are killing. DLL's shouldn't be locked by third party processes, so Oracle can just smoothly perform the upgrade. By means of the graphical tool previously mentioned I managed to know which process was locking the DLL.

Side Note, msvcr71.dll: is a module containing standard C library functions such asprintf, memcpy, and cos. It is a part of the Microsoft C RuntimeLibrary.

Thursday, January 31, 2008

Orastack

The orastack utility is a tool provided by oracle to set the stack size reserved or commited on a per thread basis in the Oracle Server on Windows Operating Systems. I had to use it because on a 10gR2 Winx32 database frequent ORA-04030 errors displayed along with TNS-12518.

The database must be shutdown prior to issue the command, otherwise this error will show up:

C:\Oracle\product\10.2.0\db_1\BIN>orastack oracle.exe 524288

Couldn't open file with CreateFile()

GetLastError() == 32

The default value is 1024K, it can be reduced to 512K, but it is not recommended to reduce the value below 300K, otherwise the ORA-01331 may start to show up.

Syntax:

orastack {no arguments} will display a useful README 1st.
orastack executableName will display the current settings
orastack executableName newValue will reset the current value to the specified newValue

ORA-04030 After 10gR2 Upgrade on Windows 2003

I have faced this exact situation, a large awful Error stack with errors such as:
ORA-04030, TNS-12518.

The particular environment where I faced these issues was:

Oracle 10gR2 (10.2.0.1.0) for Windows x32

Windows 2003 Enterprise Edition 32 bits

It was Right after migrating a user processes intensive application from 9iR2 (9.2.0.8.0) to the above referred 10gR2 release.
I had documented this issue with the metalik Note: 10gR2 Dedicated Connections Intermittently Fail with TNS-12518 Doc ID: Note:371983.1

One of the recommendations was to reduce memory consumption, and among one of the solutions was the use of orastack.exe to achieve this reduction. Right now I am documenting the issues and drawbacks of performing this procedure.

Aonother recommendation is to reduce the SGA consumption, this was achieved by reducing the values of the SGA_TARGET and SGA_MAX_SIZE instance parameters.

Update 11/Feb/2008

The orastack utility was helpful in reducing both, the Oracle and tnslsnr executables stack size. Additionally, a reduction in users' contention greatly helped in reducing the frequency of ORA-04030 errors, it doesn't mean this error was completely wiped out, I am still waiting for the 10.2.0.4.0 patchset to assess if I can completely get rid of this issue.

Wednesday, January 30, 2008

Forums Code of Conduct

Having actively participated at the Oracle forums, I have collected some useful recommendations for making life at forums easier for everybody.

1. Have a meaningful subject line. The title should properly summarize the problem you are facing, this will attract people who may already have had a similar issue more easily.

2. Give details about versions and technologies (DB, OS, working environment). Depending on the version combinations a suitable solution can or can't be found, if not properly specified the answer scope may be unnecessarily wide and blind guesses will start to be posted.

3. Give detailed error messages. Posting monosyllabic or partially specified error messages or being non specific won't bring you specific answers for sure.

4. If possible provide steps to reproduce the problem. A problem could be reproduced by the posters and configuration issues could be easily spotted. Lab environments could be easily set up.

5. Search before you post. Don't try to reinvent the wheel when someone else has already done it for you. A suitable answer can be found, a 10 minute goggling effort could find you an answer.

6. Get a name. Anonymous users are like ghosts without a name. user615770, for example, says nothing about a person. Everybody can take a few minutes to give themselves a real id, this will make posters more responsible for what they are postings, since they a real person with a real name and they are not hiding behind the mask of a generic and anonymous User Id.

7. If you solve it on your own, share the solution and mark problems with [Solved] in your thread title. There are people who not only feedback if the solutions provided helped in solving their issue, they never track their threads (specially seen when people mark their threads as Urgent). Thinking about other people who are looking for a similar solution, they never know if the provided steps lead to a successful or an unsuccessful result, or even if another workaround was found.

8. Never judge people because of their english level. Most of the people at the international forums aren't English native speakers (including myself), and they do their best to make themselves clear and understandable, let's help them instead of laughing at them, particularly if posting on a technical forum, which should not be used as an English language level assessment tool.

9. Never use discriminatory comments, nor references to ethnic groups or religions, nor make any comment that could be offensive to anybody, in a technical forum, strictly technical comments are never offensive to anybody.

10. Kill Uppercase and exclamation marks. This is considered as yelling and it could be offensive for some people, including me. 'Please' is always an advisable and suggestible keyword.

11. Think of your threads and contributions as if they were written on stone. More people than you could imagine is reading or will read the thread, and each time this thread is read people will read it as is.

12. RTFM is not an answer. I have seen several times self-qualified senior level whatever providing this kind of answers. RTFM is equal as saying "don't bother me, stop @%##~ me", and this is not an acceptable technical answer, specially for those who pretend to be self qualified as senior.

13. Address the community with respect. Self explanatory.

Thanks to Shay Shmeltzer's Blog out of which, this code of conduct proposal is based.

Issues found when upgrading 9.2.0 to 10.2.0

Note: I strongly encourage people to read the Upgrade guide, read metalink Notes, get involved with 10g new features and setup a testing environment prior to perform the actual upgrade on a production environment.

ORA-00093: _shared_pool_reserved_min_alloc must be between 4000 and 0

In my particular case, the reason why I faced this issue was because I enabled ASMM and I didn't declare a value for SGA_TARGET. All parameters managed by ASMM were set to zero, but I forgot to explicitly set SGA_TARGET and SGA_MAX_SIZE. After setting suitable values for these parameters I was able to start my database. _shared_pool_reserved_min_alloc is a hidden parameter and this error appears due to ASMM missconfiguration.

This was on a manually upgraded database. I followed the below mentioned documentation, which was of great help.

Upgrade Guide
Oracle 10g New Features

Metalink Notes:

Complete checklist for manual upgrades of Oracle databases: 421191.1

Heap size nK exceeds notification threshold (2048K) warning
An error similar to this one is frequently seen at the alert.log file. Several queries which used to work without any issue, suddenly start to be reported at the alert.log file without apparent reason.
Memory Notification: Library Cache Object loaded into SGA
Heap size 11369K exceeds notification threshold (2048K)
Details in trace file d:\oracle\admin\aonmxdbf\udump\aonmxdbf_ora_3624.trc
KGL object name :EXPLAIN PLAN SET STATEMENT_ID = 'TRACE156742' INTO DBO.IW_PLAN_TABLE FOR Select

This is actually not an error, and the database is not about to crash, it only means a Heap size is greater than the value registered as a threshold. This is an issue on 10.2.0.1.0, this issue has been fixed on later patchsets. If you want to get rid of this warning, then an internal parameter has to be modified to a value greated than 2M. This should be made from a sql plus prompt connected as sysdba:

SQL> alter system set "_kgl_large_heap_warning_threshold"=8388608 scope=spfile ;

ORA-04030: out of process memory
An error similar to the one shown below is written at the alert.log file, a process has run out of memory. The error manual is funny, according to it you don't have to do any thing.

ORA-04030: out of process memory when trying to allocate string bytes (string,string)
Cause: Operating system process private memory has been exhausted
Action: none

Errors in file d:\oracle\admin\aonmxdbf\udump\aonmxdbf_ora_3176.trc:
ORA-04030: out of process memory when trying to allocate 66172 bytes (pga heap,kco buffer)
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [unable_to_trans_pc] [PC:0x60504019] [ADDR:0x4]
[UNABLE_TO_WRITE] []

This error showed up when an end user tried to access an application with a view ranked as one of the top queries in the database. I have increased the PGA_AGGREGATE_TARGET instance parameter and I have enabled ASMM.

I should further clarify, this happened on a Windows 2003 environment, and I found out that the boot didn't have the /3G switch required to access more than 2G RAM memory. So I have set it too since the memory requirements were very close to the 2G limit.

multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Enterprise" /fastdetect /3GB

The ORA-04030 was accompanied by TNS-12518 (TNS:listener could not hand off client connection)

It happened intermitently and when workload increased. As detailed on my other post this was due to excesive resources allocated by all the new features for this release. It is documented at the metalink note 371983.1.

Instance parameters compatible and log_archive_format
Once the database was upgraded, according to the install guide, the compatible parameter remained at 9.2.0.0.0, once the upgrade process was stabilized the parameter changed to 10.2.0.1.0, this made instance parameter such as log_archive_format to have a mandatory different format: it must contain the archive format mask %s %r %t.