Tuesday, November 11, 2008

What Happens during a hot backup?

There are a few myths surrounding the hot backup (online backup) process:

Myth #1: The hot backup generates "a lot" of redo information
Myth #2: The archivelog mode "dramatically slows down" the database
Myth #3: When a hot backup is in progress the target datafile is frozen.

There are two ways to take a hot backup: a user managed backup and a Recovery Manager (RMAN) backup. In both cases the database must be in archivelog mode to be able to perform an online backup. Both methods work in a similar way but, as I will explain later, RMAN is more efficient than the user managed backup.

User managed backup

alter tablespace ts_name begin backup;

When this command is issued, a checkpoint is performed against the target tablespace and then the datafile header is frozen: no more updates are allowed on it (the header, not the datafile). This lets the database know the last point in time at which the tablespace had a consistent image of the data.

Datafiles with a backup in progress still allow read/write operations just like regular datafiles; I/O activity is not frozen.

While the tablespace is in backup mode, the first time a block is modified the complete block, not only the changed row, is recorded in the redo log file. This happens only on that first modification; subsequent changes to the same block record only the change itself, as normal.
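To make the "first change only" rule concrete, here is a toy Python model of the redo accounting. It is only a sketch and not Oracle code: the 8K block size, the 200-byte change record and the function name redo_bytes are assumptions chosen for illustration.

```python
# Toy model (not Oracle code): redo volume with and without backup mode.
BLOCK_SIZE = 8192      # assumed Oracle block size in bytes
CHANGE_RECORD = 200    # assumed average redo bytes for one row change


def redo_bytes(changed_blocks, backup_mode):
    """Return the redo bytes generated by a sequence of changed block numbers."""
    logged_in_full = set()  # blocks whose full image is already in the redo
    total = 0
    for block_no in changed_blocks:
        if backup_mode and block_no not in logged_in_full:
            # First change to this block since BEGIN BACKUP: log the whole block.
            total += BLOCK_SIZE
            logged_in_full.add(block_no)
        # The change itself is logged in both modes.
        total += CHANGE_RECORD
    return total


changes = [7, 7, 7, 12, 7, 12]   # six row changes touching two distinct blocks
normal = redo_bytes(changes, backup_mode=False)
hot = redo_bytes(changes, backup_mode=True)
print(normal, hot, hot - normal)
```

In this model the extra redo depends on the number of distinct blocks touched (two here), not on the number of row changes (six).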

During a user managed backup, "fractured blocks" may appear. Remember that the Oracle block is the minimum I/O unit for the database, and that an Oracle block is made up of several OS blocks; assuming an 8K Oracle block and a 512-byte OS block, each Oracle block spans 16 OS blocks. If a write hits the block while the OS utility is copying it, the copy will contain part of the before image and part of the after image, so that block will be inconsistent on the backup media. This is normal: consistency is not guaranteed within the backup. That is why the header must be frozen, to mark the point where the recovery process has to start, and that is why Oracle records a complete block image in the redo log file.
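The fractured block case can be sketched with a small Python simulation. This is a toy model under stated assumptions, not how any real copy utility works: the copy proceeds one OS block at a time, the database rewrites the whole Oracle block at a chosen moment, and the name copy_with_concurrent_write is made up for the example.

```python
# Toy model (not Oracle code) of a fractured block in an OS-level copy.
ORACLE_BLOCK = 8192                  # Oracle block size from the example above
OS_BLOCK = 512                       # OS block size from the example above
PIECES = ORACLE_BLOCK // OS_BLOCK    # OS blocks per Oracle block


def copy_with_concurrent_write(write_after_piece):
    """Copy one Oracle block piece by piece; the database rewrites the
    whole block once `write_after_piece` pieces have been copied."""
    datafile = ["old"] * PIECES      # block version on disk when the copy starts
    backup = []
    for i in range(PIECES):
        if i == write_after_piece:
            datafile = ["new"] * PIECES   # the database writes the new version
        backup.append(datafile[i])
    return backup


copied = copy_with_concurrent_write(write_after_piece=6)
fractured = len(set(copied)) > 1     # mixed versions means a fractured block
print(PIECES, copied.count("old"), copied.count("new"), fractured)
```

The copied block ends up with some old pieces and some new pieces, which is exactly the image that the full block in the redo log has to repair at recovery time.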

When the alter tablespace ts_name end backup; command is issued, the backup process is finished and the datafile header resumes its regular updates.

Recovery manager backup
An RMAN backup follows the same general idea; the difference is that RMAN handles the fractured block issue better. It doesn't write block fragments or partial blocks to the backup; it writes a complete, consistent block image to the backup media. So Recovery Manager doesn't need the complete block to be recorded in the redo log file.
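One way to picture how a backup tool can guarantee a consistent block image is a "check and re-read" loop. The Python sketch below is a toy model, not RMAN code: it assumes each read returns a version stamp from the head and from the tail of the block, and that a mismatch means the read raced with a write. The names read_block and consistent_read are hypothetical.

```python
# Toy model (not RMAN code): re-read a block until head and tail agree.
def read_block(pending_reads):
    """Simulate one read: return (head_version, tail_version)."""
    return pending_reads.pop(0)


def consistent_read(pending_reads, max_retries=5):
    """Return (block_version, attempts), retrying while the read is fractured."""
    for attempt in range(1, max_retries + 1):
        head, tail = read_block(pending_reads)
        if head == tail:
            return head, attempt      # consistent image, safe to back up
    raise RuntimeError("block still fractured after retries")


# First read races with a write (head is new, tail is still old);
# the second read sees a consistent block.
simulated_reads = [(101, 100), (101, 101)]
version, attempts = consistent_read(simulated_reads)
print(version, attempts)
```

Because the fractured read is simply discarded and repeated, only whole blocks reach the backup media and no full block image is needed in the redo.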

One further comment on the RMAN case: RMAN doesn't freeze the datafile header, which continues to be updated by checkpoints as usual, but it does perform a checkpoint on the tablespace.

From my perspective, the user managed backup (UMB) is a method that is seen less and less in production environments. Since Oracle 9i Release 2, most DBAs have considered RMAN part of the regular backup and recovery strategy. It performs better than the UMB and it is able to perform a block level backup, whereas with the UMB the whole datafile must be backed up even if many of its blocks are unused.

Some advice for those who still use UMB: don't leave a tablespace in BEGIN BACKUP mode for long periods of time. The longer the backup takes, the more blocks are likely to change, and the more full block images will be written to the redo log files.


Tom said...

I agree with some of your points and they could just be written wrong.

As to Myth#1. The Hot Backup itself does not generate a lot of redo. But if you have a lot of blocks that are changed while the tablespace is in hot backup mode, each block will be copied the first time it is changed. This is what can generate a lot of redo.

Myth#2. This can go back to Myth #1. If a lot of blocks are copied into the redo log files, they can fill up faster. If they cannot be archived fast enough, it can cause things to "hang".

I agree with Myth #3. The data blocks are still marked dirty and flushed out by the DB Writer.

The SCN in the data file header is not updated until after the tablespace that those datafiles belong to comes out of hot backup mode. The current SCNs are still recorded in the redo. When you recover the tablespace's datafiles, it knows what SCN was "safe" and it looks for and recovers any blocks past that point that need recovery, using the archive log files that have the complete data blocks. You were mostly right and it could just be the way it was written.

Great blog by the way!

Dan Norris said...

I don't think I agree with your last comment about RMAN. It doesn't use the same mechanism as user-managed backup and the datafile header is not frozen and full blocks are not written to the redo logs during backup. You might want to verify that and post some sources (even if it is your own testing) so your claims can be validated.

Hector Rivera Madrid said...

Thank you Dan for taking the time to leave a comment.

That is right, the datafile header is not frozen during an RMAN backup; I will emphasize this in the original post. On the other hand, I didn't say that RMAN writes the block contents to the redo log, quite the opposite: "... So recovery manager doesn't need to record the complete block to the redo log file". RMAN handles the fractured block issue better: it writes a consistent block image to the backup media, so it doesn't need to write the complete block image to the redo log.

Hector Rivera Madrid said...

Thanks Tom for your comment.

Yes, the hot backup in a user managed backup will generate additional redo because of the block images it has to record in the redo log files. But the amount of redo depends on the number of blocks being modified, and since each block is only written in full once during the hot backup session, in a worst case scenario the number of blocks dumped to the redo will be the same as the number of used blocks in the tablespace. So, it depends ...
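As a back-of-the-envelope check of that bound, here is a short Python sketch. The 8K block size, the tablespace size and the number of changed blocks are made-up figures, and the function names are only for illustration.

```python
# Rough arithmetic only: extra redo from full block images in backup mode.
BLOCK_SIZE = 8192  # assumed Oracle block size in bytes


def worst_case_extra_redo(used_blocks, block_size=BLOCK_SIZE):
    """Upper bound: every used block is changed at least once."""
    return used_blocks * block_size


def extra_redo(distinct_blocks_changed, used_blocks, block_size=BLOCK_SIZE):
    """Each distinct changed block is logged in full only once."""
    return min(distinct_blocks_changed, used_blocks) * block_size


used = 131072          # hypothetical tablespace with 1 GiB of used 8K blocks
changed = 5000         # hypothetical distinct blocks changed during the backup
print(worst_case_extra_redo(used) / 2**30, "GiB worst case")
print(extra_redo(changed, used) / 2**20, "MiB for this workload")
```

With these figures the worst case is the full 1 GiB of used blocks, while a workload touching 5000 distinct blocks adds only about 39 MiB.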

Regarding Myth #2, I once met a DBA who was able to convince the company not to use archivelog mode because it caused a 'deadly slowdown' in database performance. When I took over the database I realized that the root problem was a badly tuned redo mechanism.

Dave said...

Hector Rivera Madrid is asking, What Happens during a hot backup? He is trying to debunk what he sees as common myths on this matter...

Arju said...

Good one. http://arjudba.blogspot.com/2009/08/what-happens-during-oracle-database-hot.html