Oracle DBA: Overview of Database Backup and Recovery Features

Overview of Database Backup and Recovery Features

In every database system, the possibility of a system or hardware failure always exists. If a failure occurs and affects the database, then the database must be recovered. The goals after a failure are to ensure that the effects of all committed transactions are reflected in the recovered database and to return to normal operation as quickly as possible while insulating users from problems caused by the failure.
Oracle provides various mechanisms for the following:

Database recovery required by different types of failures
Flexible recovery operations to suit any situation
Availability of data during backup and recovery operations so users of the system can continue to work

Types of Failures

Several circumstances can halt the operation of an Oracle database. The most common types of failure are described in the following table.

Failure	Description
User error	Requires a database to be recovered to a point in time before the error occurred. For example, a user could accidentally drop a table. To enable recovery from user errors and accommodate other unique recovery requirements, Oracle provides exact point-in-time recovery. For example, if a user accidentally drops a table, the database can be recovered to the instant in time before the table was dropped.
Statement failure	Occurs when there is a logical failure in the handling of a statement in an Oracle program. When statement failure occurs, any effects of the statement are automatically undone by Oracle and control is returned to the user.
Process failure	Results from a failure in a user process accessing Oracle, such as an abnormal disconnection or process termination. The background process PMON automatically detects the failed user process, rolls back the uncommitted transaction of the user process, and releases any resources that the process was using.
Instance failure	Occurs when a problem arises that prevents an instance from continuing work. Instance failure can result from a hardware problem such as a power outage, or a software problem such as an operating system failure. When an instance failure occurs, the data in the buffers of the system global area is not written to the datafiles. After an instance failure, Oracle automatically performs instance recovery. If one instance in a RAC environment fails, then another instance recovers the redo for the failed instance. In a single-instance database, or in a RAC database in which all instances fail, Oracle automatically applies all redo when you restart the database.
Media (disk) failure	An error can occur when trying to write or read a file on disk that is required to operate the database. A common example is a disk head failure, which causes the loss of all files on a disk drive. Different files can be affected by this type of disk failure, including the datafiles, the redo log files, and the control files. Also, because the database instance cannot continue to function properly, the data in the database buffers of the system global area cannot be permanently written to the datafiles. A disk failure requires you to restore lost files and then perform media recovery. Unlike instance recovery, media recovery must be initiated by the user. Media recovery updates restored datafiles so the information in them corresponds to the most recent time point before the disk failure, including the committed data in memory that was lost because of the failure.

Oracle provides for complete media recovery from all possible types of hardware failures, including disk failures. Options are provided so that a database can be completely recovered or partially recovered to a specific point in time.
If some datafiles are damaged in a disk failure but most of the database is intact and operational, the database can remain open while the required tablespaces are individually recovered. Therefore, undamaged portions of a database are available for normal use while damaged portions are being recovered.

Structures Used for Recovery

Oracle uses several structures to provide complete recovery from an instance or disk failure: the redo log, undo records, a control file, and database backups.

The Redo Log

The redo log is a set of files that protect altered database data in memory that has not been written to the datafiles. The redo log can consist of the online redo log and the archived redo log.

The online redo log is a set of two or more online redo log files that record all changes made to the database, including uncommitted and committed changes. Redo entries are temporarily stored in redo log buffers of the system global area, and the background process LGWR writes the redo entries sequentially to an online redo log file. LGWR writes redo entries continually, and it also writes a commit record every time a user process commits a transaction.
Optionally, filled online redo files can be manually or automatically archived before being reused, creating archived redo logs. To enable or disable archiving, set the database in one of the following modes:

ARCHIVELOG: The filled online redo log files are archived before they are reused in the cycle.
NOARCHIVELOG: The filled online redo log files are not archived.

In ARCHIVELOG mode, the database can be completely recovered from both instance and disk failure. The database can also be backed up while it is open and available for use. However, additional administrative operations are required to maintain the archived redo log.
If the database redo log operates in NOARCHIVELOG mode, then the database can be completely recovered from instance failure, but not from disk failure. Also, the database can be backed up only while it is completely closed. Because no archived redo log is created, no extra work is required by the database administrator.

Undo Records

Undo records are stored in undo tablespaces. Oracle uses the undo data for a variety of purposes, including accessing before-images of blocks changed in uncommitted transactions. During database recovery, Oracle applies all changes recorded in the redo log and then uses undo information to roll back any uncommitted transactions.

Control Files

The control files include information about the file structure of the database and the current log sequence number being written by LGWR. During normal recovery procedures, the information in a control file guides the automatic progression of the recovery operation.

Database Backups

Because one or more files can be physically damaged as the result of a disk failure, media recovery requires the restoration of the damaged files from the most recent operating system backup of a database. You can either back up the database files with Recovery Manager (RMAN), or use operating system utilities. RMAN is an Oracle utility that manages backup and recovery operations, creates backups of database files (datafiles, control files, and archived redo log files), and restores or recovers a database from backups.

Oracle DBA

Friday 14 November 2014

Overview of Database Backup and Recovery Features

Overview of Database Backup and Recovery Features

Types of Failures

Structures Used for Recovery

The Redo Log

Undo Records

Control Files

Database Backups

No comments:

Post a Comment