Overview of Database Backup and Recovery Features
In every database system, the possibility of a system or hardware failure always exists. If a failure occurs and affects the database, then the database must be recovered. The goals after a failure are to ensure that the effects of all committed transactions are reflected in the recovered database and to return to normal operation as quickly as possible while insulating users from problems caused by the failure.
Oracle provides various mechanisms for the following:
- Database recovery required by different types of failures
- Flexible recovery operations to suit any situation
- Availability of data during backup and recovery operations so users of the system can continue to work
Types of Failures
Several circumstances can halt the operation of an Oracle database. The most common types of failure are described in the following table.
Failure | Description |
---|---|
User error | Requires a database to be recovered to a point in time before the error occurred. To enable recovery from user errors and accommodate other unique recovery requirements, Oracle provides exact point-in-time recovery. For example, if a user accidentally drops a table, the database can be recovered to the instant before the table was dropped. |
Statement failure | Occurs when there is a logical failure in the handling of a statement in an Oracle program. When statement failure occurs, any effects of the statement are automatically undone by Oracle and control is returned to the user. |
Process failure | Results from a failure in a user process accessing Oracle, such as an abnormal disconnection or process termination. The background process PMON automatically detects the failed user process, rolls back the uncommitted transaction of the user process, and releases any resources that the process was using. |
Instance failure | Occurs when a problem arises that prevents an instance from continuing work. Instance failure can result from a hardware problem such as a power outage, or a software problem such as an operating system failure. When an instance failure occurs, the data in the buffers of the system global area is not written to the datafiles. After an instance failure, Oracle automatically performs instance recovery. If one instance in a RAC environment fails, then another instance recovers the redo for the failed instance. In a single-instance database, or in a RAC database in which all instances fail, Oracle automatically applies all redo when you restart the database. |
Media (disk) failure | An error can occur when trying to write or read a file on disk that is required to operate the database. A common example is a disk head failure, which causes the loss of all files on a disk drive. Different files can be affected by this type of disk failure, including the datafiles, the redo log files, and the control files. Also, because the database instance cannot continue to function properly, the data in the database buffers of the system global area cannot be permanently written to the datafiles. A disk failure requires you to restore lost files and then perform media recovery. Unlike instance recovery, media recovery must be initiated by the user. Media recovery updates restored datafiles so the information in them corresponds to the most recent time point before the disk failure, including the committed data in memory that was lost because of the failure. |
Oracle provides for complete media recovery from all possible types of hardware failures, including disk failures. Options are provided so that a database can be completely recovered or partially recovered to a specific point in time.
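The difference between complete and point-in-time recovery can be illustrated with a small toy model. This is only a sketch of the idea (restore a backup, then replay recorded changes in order), not how Oracle implements recovery; the `RedoRecord` class, the `recover` function, and the integer timestamps are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoRecord:
    """Hypothetical change record; not Oracle's real redo format."""
    time: int        # logical timestamp of the change
    key: str         # object affected by the change
    value: object    # new value; None models a dropped object


def recover(backup, redo, until=None):
    """Replay redo on a copy of the backup, stopping before `until` if given."""
    db = dict(backup)
    for rec in sorted(redo, key=lambda r: r.time):
        if until is not None and rec.time >= until:
            break  # point-in-time recovery: ignore later changes
        if rec.value is None:
            db.pop(rec.key, None)
        else:
            db[rec.key] = rec.value
    return db


backup = {"emp": "v1"}
redo = [
    RedoRecord(10, "emp", "v2"),
    RedoRecord(20, "dept", "v1"),
    RedoRecord(30, "emp", None),  # accidental drop at time 30
]

complete = recover(backup, redo)              # every change is replayed
point_in_time = recover(backup, redo, until=30)  # stop just before the drop
```

In this model, complete recovery faithfully reproduces the accidental drop, while recovery to time 30 keeps the `emp` entry together with all earlier changes.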
If some datafiles are damaged in a disk failure but most of the database is intact and operational, the database can remain open while the required tablespaces are individually recovered. Therefore, undamaged portions of a database are available for normal use while damaged portions are being recovered.
Structures Used for Recovery
Oracle uses several structures to provide complete recovery from an instance or disk failure: the redo log, undo records, a control file, and database backups.
The Redo Log
The redo log is a set of files that protect altered database data in memory that has not been written to the datafiles. The redo log can consist of the online redo log and the archived redo log.
The online redo log is a set of two or more online redo log files that record all changes made to the database, including uncommitted and committed changes. Redo entries are temporarily stored in redo log buffers of the system global area, and the background process LGWR writes the redo entries sequentially to an online redo log file. LGWR writes redo entries continually, and it also writes a commit record every time a user process commits a transaction.
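The role of the online redo log in instance recovery can be sketched with a toy model. The `ToyDatabase` class below and all of its method names are hypothetical. It also simplifies deliberately: it replays only the changes of committed transactions, whereas a real database applies all redo and then uses undo records to roll back uncommitted work.

```python
class ToyDatabase:
    """Toy model of redo logging; not Oracle's actual design."""

    def __init__(self):
        self.datafiles = {}     # durable data, survives a crash
        self.online_redo = []   # durable redo, survives a crash
        self.redo_buffer = []   # in memory, lost on instance failure
        self.buffer_cache = {}  # in memory, lost on instance failure

    def change(self, txn, key, value):
        # The change lives only in memory until redo is flushed.
        self.buffer_cache[key] = value
        self.redo_buffer.append(("change", txn, key, value))

    def commit(self, txn):
        # A commit record is written and redo is forced to disk.
        self.redo_buffer.append(("commit", txn))
        self.flush_redo()

    def flush_redo(self):
        # Models the log writer: buffered entries go to the online redo log.
        self.online_redo.extend(self.redo_buffer)
        self.redo_buffer.clear()

    def crash(self):
        # Instance failure: everything held in memory is gone.
        self.redo_buffer.clear()
        self.buffer_cache.clear()

    def instance_recovery(self):
        # Simplification: replay only changes whose transaction committed.
        committed = {e[1] for e in self.online_redo if e[0] == "commit"}
        for entry in self.online_redo:
            if entry[0] == "change" and entry[1] in committed:
                _, _, key, value = entry
                self.datafiles[key] = value


db = ToyDatabase()
db.change("t1", "a", 1)
db.commit("t1")
db.change("t2", "b", 2)
db.flush_redo()          # uncommitted redo can reach disk too
db.crash()
db.instance_recovery()
```

After recovery, the datafiles contain the committed change from `t1` even though it was never written to them before the crash, and the uncommitted change from `t2` is absent.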
Optionally, filled online redo files can be manually or automatically archived before being reused, creating archived redo logs. To enable or disable archiving, set the database in one of the following modes:
- ARCHIVELOG: The filled online redo log files are archived before they are reused in the cycle.
- NOARCHIVELOG: The filled online redo log files are not archived.
In ARCHIVELOG mode, the database can be completely recovered from both instance and disk failure. The database can also be backed up while it is open and available for use. However, additional administrative operations are required to maintain the archived redo log.
If the database redo log operates in NOARCHIVELOG mode, then the database can be completely recovered from instance failure, but not from disk failure. Also, the database can be backed up only while it is completely closed. Because no archived redo log is created, no extra work is required by the database administrator.
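Why the archiving mode determines what can be recovered may be easier to see in a toy model of a circular redo log. The `RedoLog` class, its parameters, and the tiny group sizes are hypothetical and chosen only for illustration. The sketch copies a filled group to the archive just before that group is reused, which is one simple way to honor the "archived before reuse" rule.

```python
class RedoLog:
    """Toy circular redo log; not Oracle's actual implementation."""

    def __init__(self, groups=2, group_size=2, archivelog=False):
        self.groups = [[] for _ in range(groups)]
        self.group_size = group_size
        self.archivelog = archivelog
        self.current = 0     # index of the group being written
        self.archived = []   # archived redo, in original order

    def write(self, entry):
        if len(self.groups[self.current]) == self.group_size:
            self._switch()
        self.groups[self.current].append(entry)

    def _switch(self):
        # Move to the next group in the cycle and reuse it.
        self.current = (self.current + 1) % len(self.groups)
        reused = self.groups[self.current]
        if self.archivelog:
            self.archived.extend(reused)  # keep a copy before overwriting
        reused.clear()

    def available(self):
        """All redo that still exists, oldest entry first."""
        n = len(self.groups)
        order = [(self.current + 1 + i) % n for i in range(n)]
        online = [e for g in order for e in self.groups[g]]
        return self.archived + online


arch = RedoLog(archivelog=True)
noarch = RedoLog(archivelog=False)
for entry in range(1, 7):
    arch.write(entry)
    noarch.write(entry)

arch_history = arch.available()
noarch_history = noarch.available()
```

With archiving enabled, the full change history from entry 1 onward is still available, so an old backup could be rolled forward. Without archiving, the oldest entries have been overwritten, which is why only the redo needed for instance recovery remains.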