Friday, February 19, 2010

Error in Agent trying to install Grid Control

An error showed up when I tried to install Oracle Grid Control 10.2.0 on Windows 2003 SE x32. The error was pretty much the same as the one shown by the regular Enterprise Manager Console on this platform; it has to do with the time zone issue I reported some time ago in this blog.

Sometimes, because of the timeouts involved in the installation process, the Agent installs successfully on a second attempt. Even then, take a look at the log files afterwards; there may be a major issue that requires special attention.

I took a look at the \log directory and looked for the emdctl.trc file, where I found the following lines:

2010-02-19 15:30:52 Thread-1744 ERROR main: nmectla_agentctl: Error connecting to https://tango.oracle.com:3872/emd/main/. Returning status code 1
2010-02-19 15:30:53 Thread-2872 ERROR main: nmectl.c: nmectl_validateTZRegion, agentTZoffset =-360,and testTZoffset for GMT:0 do not match
2010-02-19 15:30:54 Thread-2872 ERROR main: nmectl.c: nmectl_validateTZRegion, agentTZoffset =-360,and testTZoffset for GMT:0 do not match

2010-02-19 15:49:34 Thread-2552 ERROR main: nmectla_agentctl: Error connecting to https://tango.oracle.com:3872/emd/main/. Returning status code 1
2010-02-19 15:49:44 Thread-2660 WARN http: snmehl_connect: connect failed to (tango.oracle.com:3872): No connection could be made because the target machine actively refused it. (error = 10061)
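
When a trace file is long, the time zone mismatches can be pulled out with a few lines of Python. This is only a minimal sketch based on the wording of the lines shown above; the function name find_tz_mismatches is my own, and I am assuming the offsets are expressed in minutes (-360 would then be UTC-6).

```python
import re

# Pattern built from the nmectl_validateTZRegion lines in the excerpt above.
# It is an assumption that other agent versions word the message the same way.
TZ_MISMATCH = re.compile(
    r"nmectl_validateTZRegion, agentTZoffset\s*=\s*(-?\d+),\s*"
    r"and testTZoffset for (\S+?):\s*(-?\d+) do not match"
)

def find_tz_mismatches(lines):
    """Return (agent_offset, region, region_offset) for each mismatch line."""
    found = []
    for line in lines:
        match = TZ_MISMATCH.search(line)
        if match:
            found.append((int(match.group(1)), match.group(2), int(match.group(3))))
    return found

# Two lines copied from the trace above: one unrelated error, one TZ mismatch.
sample = [
    "2010-02-19 15:30:52 Thread-1744 ERROR main: nmectla_agentctl: Error connecting to https://tango.oracle.com:3872/emd/main/. Returning status code 1",
    "2010-02-19 15:30:53 Thread-2872 ERROR main: nmectl.c: nmectl_validateTZRegion, agentTZoffset =-360,and testTZoffset for GMT:0 do not match",
]

for agent, region, tested in find_tz_mismatches(sample):
    print(f"agent offset {agent}, {region} offset {tested}")
```

To scan a real file, pass an open file handle for emdctl.trc instead of the sample list.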



By the way, tango.oracle.com is a fictitious server and does not have anything to do with Oracle corp.

In the log file a particular error caught my attention: the TZ error. This one has to do with the time zone changes that took place some time back. I will apply the latest patchset on top of this install, so I am not too concerned about fixing it right now, but since I want a 'clean' install I worked around it by editing the \config\emd.properties file (a routine backup is highly advisable): I commented out the last line and added the correct region. I also changed the time zone on the Windows machine, so that when the assistant is re-run it picks up the right time zone.

emd.properties

###HRM: agentTZRegion=GMT
agentTZRegion=America/Chicago
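
The same edit can be scripted. What follows is a minimal sketch, not an Oracle tool: set_agent_tz_region is a name I made up, the "###HRM: " prefix is just my own comment marker from the snippet above, and someOtherProperty is a placeholder line for illustration.

```python
def set_agent_tz_region(text, new_region, marker="###HRM: "):
    """Comment out active agentTZRegion lines and append the new value."""
    out = []
    for line in text.splitlines():
        if line.strip().startswith("agentTZRegion="):
            # Keep the old value around as a comment instead of deleting it.
            out.append(marker + line.strip())
        else:
            out.append(line)
    out.append(f"agentTZRegion={new_region}")
    return "\n".join(out) + "\n"

# Hypothetical file content; a real emd.properties has many more entries.
before = "someOtherProperty=value\nagentTZRegion=GMT\n"
after = set_agent_tz_region(before, "America/Chicago")
print(after)
```

Before writing the result back to the real file, copy the original somewhere safe (shutil.copy2 does the job), as advised above.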


That's it, my installation took place and I can proceed with the next tasks.

Wednesday, February 17, 2010

My recent experience with VMware


VMware ate my four CPU Cores

Today I noticed the high amount of CPU consumed by the vmware-authd.exe process. It raised CPU consumption to 100% and left no CPU resources for any other process on the system.

The environment is Windows 7 Professional 64-bit with 4 GB RAM and 4 cores (all of them pegged at 100%).

The most recent post I found so far by googling is from Oct 19th 2009, and the author of the post states that a bug was filed (483679).

vmware-authd.exe is the executable for the VMware Authorization and Authentication Service for starting and accessing virtual machines. This process is required if you are not logged in with administrative privileges (which by the way is my case).

I shut down this service. In VMware Server 2.0 the VMware Authorization service is linked by a dependency to the VMware Host Agent, which provides remote command and administrative control over this VMware Server host. After shutting down both services the CPU monitor was back to normal.

This process is required in VMware Server 2.0; otherwise it will not be possible to launch the console. Shutting down and restarting the services seems to fix the problem. Most probably it is a bug.


On the other hand, I don't have anything against the new Tomcat based console, but I miss the regular windows based console.


Host Unreachable


When trying to connect to my virtual machines through the network, they replied with a 'host unreachable' error, among other unpleasant related network errors.

The current environment I have:
  • DHCP on host real network adapter
  • Loopback adapter on real host
  • A fixed IP address through a bridged virtual network adapter

In this scenario the loopback adapter was required since I am installing Oracle 11g Rel. 1 / 2 on this platform. The virtual machines were either not visible at all or only partially and intermittently visible.

I set the virtual network adapter to get a dynamic address from the DHCP server and it began to work. However, I still need a fixed IP address in the virtual machine, since I am installing Oracle RDBMS 11g Rel. 1 on top of it, so I installed a loopback adapter inside the virtual machine.

I went through several Google references, and most of them talked about uninstalling and reinstalling the VM protocol on the real network adapter, but in this particular case that only led to wasted time and a server reboot. So far, everything is working properly.

When I pinged the host server I noticed long waits for a reply, and when I tried to access a shared path on it my local machine replied with a timeout. Two issues were involved here: first, the user at the host server is a domain user, not a local user; second, the hostname has to be resolved first, which took too long, so I added the host's address to the hosts file in the virtual machine.
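
A quick way to tell whether name resolution is the slow part is to time it on its own. This is only a sketch using Python's standard socket module; resolution_time is a name I made up, and "localhost" stands in for the real host name.

```python
import socket
import time

def resolution_time(hostname):
    """Return (resolved, seconds) for a single name lookup."""
    start = time.perf_counter()
    try:
        socket.getaddrinfo(hostname, None)
        resolved = True
    except socket.gaierror:
        # The name could not be resolved at all.
        resolved = False
    return resolved, time.perf_counter() - start

# Placeholder name; replace it with the host server's name.
resolved, elapsed = resolution_time("localhost")
print(f"resolved={resolved} in {elapsed:.3f} s")
```

If the lookup takes seconds before the hosts file entry is added and almost nothing afterwards, the delay was in name resolution and not in the network path itself.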

Network Communication to the VM was Deeeeeeadly Slooooooooooow

As if that were not enough, when I tried to transfer files from my host to the virtual machine the throughput was around 20 KB/s. Figure out what it meant to transfer the Oracle XE installer executable (200,000 KB): around 2:45 hrs for the whole file. I don't want to imagine how long it would take to transfer the more than 1 GB file needed to install Ora11gR1 on the VM. Google, you are the Geek's Nirvana: after googling for a while I found a reference stating that some network parameters had to be configured, namely the advanced properties of the network adapter on the physical host.
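
The transfer time estimate is simple arithmetic: size divided by rate. Here is a small sketch that reproduces the figure; transfer_time is a helper name of my own, and treating 1 GB as 1,048,576 KB is an assumption for the second call.

```python
def transfer_time(size_kb, rate_kb_per_s):
    """Format the time needed to move size_kb at rate_kb_per_s as h:mm:ss."""
    seconds = round(size_kb / rate_kb_per_s)
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"

# Oracle XE installer: 200,000 KB at 20 KB/s.
print(transfer_time(200_000, 20))    # 2:46:40

# A 1 GB file (taken as 1,048,576 KB) at the same rate.
print(transfer_time(1_048_576, 20))  # 14:33:49
```

So the 1 GB installer would have needed more than fourteen hours at that rate.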


In the advanced properties of the network adapter there is one named Large Send Offload v2 (IPv4); it must be turned off (disabled). Doing so boosted the network performance to the virtual machine. A definition of what this parameter does can be found in this Microsoft Tech Note.

"DisableTaskOffload
Disables offloading of processor tasks to the network adapter. Offloading is designed to optimize performance of Windows 2000.

Network Driver Interface Specification (NDIS) 5.0 lets TCP take full advantage of intelligence in network adapters by letting the adapter do some of the tasks that the processor normally performs. Offloading these tasks to the network adapter leaves the processor free for tasks that only it can perform."


Addendum
A document that may help with some additional performance hints can be found here "VMWare ESX Server Performance Tuning Best Practices".

Now back to VMware; this VMware session was very instructive for me.