
Saturday 8 October 2022

BASIC ORACLE ARCHITECTURE

What is an Oracle Database?

Basically, an Oracle database system has two main components –– the instance and the database itself. An instance consists of memory structures and background processes, whereas the database refers to the disk resources. Figure 1 shows the relationship.

Figure 1. Two main components of Oracle database

Instance

Database files by themselves are useless without the memory structures and processes that interact with the database. Oracle defines the term instance as the memory structures and background processes used to access data from a database; together, the memory structures and background processes constitute an instance. The memory structure consists of the System Global Area (SGA), the Program Global Area (PGA), and an optional area –– the Software Code Area. The mandatory background processes, on the other hand, are Database Writer (DBWn), Log Writer (LGWR), Checkpoint (CKPT), System Monitor (SMON), and Process Monitor (PMON). Optional background processes include Archiver (ARCn), Recoverer (RECO), etc. Figure 2 illustrates the relationship between the components of an instance.

Figure 2. The instance components



System Global Area

The SGA is the primary memory structure. When Oracle DBAs talk about memory, they usually mean the SGA. This area is broken into several parts –– the Buffer Cache, the Shared Pool, the Redo Log Buffer, the Large Pool, and the Java Pool.
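As a quick illustration (a standard dynamic performance view, available to privileged users), you can see how large each of these components currently is by querying V$SGAINFO from SQL*Plus:

SQL> select name, round(bytes/1024/1024) as size_mb from v$sgainfo order by bytes desc;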

Buffer Cache

The buffer cache stores copies of the data blocks retrieved from the datafiles. That is, when a user retrieves data from the database, the data is first read into the buffer cache. Its size can be set via the DB_CACHE_SIZE parameter in the init.ora initialization parameter file.
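Here is a minimal sketch of inspecting and resizing the buffer cache; the 512M value is arbitrary, and the alter system command assumes you are running with an spfile (note that when SGA_TARGET is set, DB_CACHE_SIZE acts as a minimum rather than a fixed size):

SQL> show parameter db_cache_size
SQL> alter system set db_cache_size = 512M scope=both;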

Shared Pool

The shared pool is divided into two smaller memory areas –– the Library Cache and the Dictionary Cache. The library cache stores information about the most commonly used SQL and PL/SQL statements and is managed by a Least Recently Used (LRU) algorithm; it also enables those statements to be shared among users. The dictionary cache, on the other hand, stores information about object definitions in the database, such as columns, tables, indexes, users, privileges, etc.

The shared pool size can be set via the SHARED_POOL_SIZE parameter in the init.ora initialization parameter file.
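A common (illustrative) health check for the library cache is its hit ratio; values close to 1 indicate that parsed statements are being reused rather than re-parsed:

SQL> select namespace, gethitratio from v$librarycache;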

Redo Log Buffer

Each DML statement (insert, update, and delete) executed by users generates a redo entry –– a record of all the data changes made by users. Redo entries are stored in the redo log buffer before they are written to the redo log files. To size the redo log buffer, use the LOG_BUFFER parameter in the init.ora initialization parameter file.
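A small sketch to check the configured buffer size and watch redo generation through the standard statistics in V$SYSSTAT:

SQL> show parameter log_buffer
SQL> select name, value from v$sysstat where name in ('redo entries', 'redo size');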

Large Pool

The large pool is an optional area of memory in the SGA. It relieves the burden placed on the shared pool and is also used for large I/O processes such as backup and restore operations. The large pool size can be set via the LARGE_POOL_SIZE parameter in the init.ora initialization parameter file.

Java Pool

As its name suggests, the Java pool services the parsing of Java commands. Its size can be set via the JAVA_POOL_SIZE parameter in the init.ora initialization parameter file.

Program Global Area

Although the result of SQL statement parsing is stored in the library cache, the values of bind variables are stored in the PGA. Why? Because they must remain private to each user rather than shared among users. The PGA is also used as a sort area.
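Assuming automatic PGA memory management is in use (the usual setup, driven by PGA_AGGREGATE_TARGET), you can observe overall PGA consumption like this:

SQL> show parameter pga_aggregate_target
SQL> select name, value, unit from v$pgastat where name like 'total PGA%';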

Software Code Area

The software code area is a location in memory where the Oracle application software itself resides.

Oracle processes

There are two categories of processes that run with an Oracle database:

  • User processes

  • System processes

The following figure illustrates the relationship between user processes, server processes, PGA, and session:


The first interaction with an Oracle-based application comes from the user's computer, which creates a user process. The user process then communicates with a server process on the host computer. Here, the PGA is used to store session-specific information.
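As an illustrative query, you can map each session to the server process that serves it by joining V$SESSION to V$PROCESS:

SQL> select s.sid, s.username, s.program, p.spid
     from v$session s join v$process p on s.paddr = p.addr
     where s.username is not null;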

Oracle Background Processes

Oracle background processes are the processes behind the scenes that work together with the memory structures.

DBWn

The database writer (DBWn) process writes data from the buffer cache to the datafiles. Historically, the database writer was named DBWR, but since later Oracle versions allow more than one database writer, the name was changed to DBWn, where n is a number from 0 to 9.

LGWR

Log writer (LGWR) process is similar to DBWn. It writes the redo entries from redo log buffer into the redo log files.

CKPT

Checkpoint (CKPT) is a process that signals DBWn to write the data in the buffer cache to the datafiles. It also updates the datafile and control file headers when a log file switch occurs.

SMON

The System Monitor (SMON) process recovers from a system crash or instance failure by applying the entries in the redo log files to the datafiles.

PMON

The Process Monitor (PMON) process cleans up after failed processes by rolling back their transactions and releasing the resources they held.

ARCH

The ARCH background process is invoked when your database is running in ARCHIVELOG mode. If you are archiving your redo logs, the redo logs are touched by several background processes. First, the LGWR process copies the log buffer contents to the online redo log files, and then the ARCH process copies the online redo log files to the archived redo log filesystem (on UNIX, for example). The ARCH process commonly offloads the most recent online redo log file whenever a log switch operation occurs in Oracle.
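You can verify whether the database is running in ARCHIVELOG mode with either of these standard commands:

SQL> archive log list
SQL> select log_mode from v$database;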

Figure 4 shows the various components of the SGA, the Oracle background processes, and their interactions with the control files, datafiles, redo log files, and archived redo logs.

Database

The database refers to the disk resources and is broken into two main structures –– logical structures and physical structures.

Logical Structures

An Oracle database is divided into smaller logical units to manage, store, and retrieve data efficiently. The logical units are tablespace, segment, extent, and data block. Figure 5 illustrates the relationships between those units.

Figure 5. The relationships between the Oracle logical structures

Tablespace

A tablespace is a logical grouping of database objects. A database must have one or more tablespaces. In Figure 5, we have three tablespaces –– the SYSTEM tablespace, Tablespace 1, and Tablespace 2. A tablespace is composed of one or more datafiles.

There are three types of tablespaces in Oracle (a creation example for each follows the list):

  • Permanent tablespaces
  • Undo tablespaces
  • Temporary tablespaces
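Below is one illustrative creation statement for each tablespace type; the names, paths, and sizes are assumptions, not values from this article:

SQL> create tablespace app_data datafile '/u01/oradata/orcl/app_data01.dbf' size 100m autoextend on;
SQL> create undo tablespace undotbs2 datafile '/u01/oradata/orcl/undotbs2_01.dbf' size 200m;
SQL> create temporary tablespace temp2 tempfile '/u01/oradata/orcl/temp2_01.dbf' size 100m;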

Segment

A tablespace is further broken into segments. A segment stores objects of the same type. That is, every table in the database is stored in its own segment (named a data segment), and every index in the database is likewise stored in its own segment (named an index segment). The other segment types are the temporary segment and the rollback segment.
A segment is a container for objects (such as tables, views, packages . . . indexes). A segment consists of extents.

There are 11 types of segments in Oracle 10g (a query to see them in your own database follows the list):

  1. Table

  2. Table Partition

  3. Index

  4. Index Partition

  5. Cluster

  6. Rollback

  7. Deferred Rollback

  8. Temporary

  9. Cache

  10. Lobsegment

  11. Lobindex
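To see which of these segment types actually exist in your database (a sketch; it requires access to the DBA_SEGMENTS view):

SQL> select segment_type, count(*) as segments, round(sum(bytes)/1024/1024) as size_mb
     from dba_segments group by segment_type order by size_mb desc;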

Extent

A segment is further broken into extents. An extent consists of one or more data blocks. When a database object grows, a new extent is allocated to it. Unlike a tablespace or a segment, an extent cannot be named. Space for an object on disk is allocated in extents.

Data Block

A data block is the smallest unit of storage in an Oracle database. Each data block corresponds to a specific number of bytes of physical space on disk; the block size is set by the DB_BLOCK_SIZE initialization parameter when the database is created.
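A short sketch tying blocks and extents together; SCOTT.EMP is just an example segment name used for illustration:

SQL> show parameter db_block_size
SQL> select extent_id, blocks, bytes from dba_extents
     where owner = 'SCOTT' and segment_name = 'EMP' order by extent_id;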

Physical Structures

The physical structures are the structures of an Oracle database (in this case, the disk files) that are not directly manipulated by users. The physical structure consists of datafiles, redo log files, and control files.

Datafiles

A datafile is a file that corresponds to a tablespace. One datafile can be used by only one tablespace, but one tablespace can have more than one datafile. An Oracle database consists of a number of physical files called datafiles.
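To list the datafiles behind each tablespace (a standard dictionary view):

SQL> select tablespace_name, file_name, round(bytes/1024/1024) as size_mb
     from dba_data_files order by tablespace_name;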

Redo Log Files

A redo log is a file that is part of an Oracle database. When a transaction is committed, the transaction's entries in the redo log buffer are written to a redo log file. These files contain information that helps with recovery in the event of a system failure.


Figure 6 shows three redo log groups. Each group consists of two members; the first member of each redo log group is stored in directory D1 and the second member in directory D2.
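You can confirm the group and member layout on your own database with the standard V$LOG and V$LOGFILE views:

SQL> select group#, thread#, sequence#, members, status from v$log;
SQL> select group#, member from v$logfile order by group#;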


Control Files

Control files store information about the physical structure of the database. The control file is absolutely crucial to database operation. It contains the following types of information (a query to locate your control files follows the list):

  1. Database Information

  2. Archive log history

  3. Tablespace and datafile records

  4. Redo threads

  5. Database creation date

  6. Database name

  7. Current Archive information

  8. Log records

  9. Database Id which is unique to each Database
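To see where your control files are located, you can use either of these standard queries:

SQL> select name from v$controlfile;
SQL> show parameter control_files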

RMAN: Basic RMAN Commands

Oracle Recovery Manager (RMAN) satisfies the most pressing demands of performant, manageable backup and recovery for all Oracle data formats.

RMAN  provides a common interface, via command line and Enterprise Manager, for backup tasks across different host operating systems.
Below are some of the commonly used RMAN commands which you can run through RMAN command line.

All the commands were tested on Oracle database version 11gR2 (11.2.0.4)
**SHOW COMMAND**
1) Shows all parameters.
RMAN> show all;
2) Shows the archivelog deletion policy.
RMAN> show archivelog deletion policy;
3) Shows the number of archivelog backup copies
RMAN> show archivelog backup copies;
4) Shows the auxiliary database information.
RMAN> show auxname;
5) Shows whether optimization is on or off.
RMAN> show backup optimization;
 6) Shows how the normal channel and auxiliary channel are configured.
RMAN> show [auxiliary] channel;
 7) Shows the characteristics of the channel
RMAN> show channel for device type [disk | <media device>];
8) Shows whether control file autobackup is on or off.
RMAN> show controlfile autobackup;
9) Shows the format of the autobackup control file
RMAN> show controlfile autobackup format;
10) Shows the number of datafile backup copies being kept.
RMAN> show datafile backup copies;
11) Shows the default type (disk or tape)
RMAN> show default device type;
12) Shows policy for datafile and control file backups and copies that RMAN marks as obsolete.
RMAN> show retention policy;
13) Shows the encryption algorithm currently in use.
RMAN> show encryption algorithm;
14) Shows the encryption for the database and every tablespace.
RMAN> show encryption for [database | tablespace];
15) Shows the tablespaces excluded from the backup.
RMAN> show exclude;
16) Shows the maximum size for backup sets. The default is unlimited.
RMAN> show maxsetsize;
17) Shows the snapshot control filename.
RMAN> show snapshot controlfile name;
18) Shows the compression algorithm in force. The default in 11gR2 is the BASIC algorithm.
RMAN> show compression algorithm;

**BACKUP COMMAND**

 

1) To perform a manual backup of the current control file

RMAN> backup current controlfile;
2) To back up the control file as part of a tablespace backup operation
RMAN> backup tablespace users include current controlfile;
3) To back up the server parameter file
RMAN> backup spfile;
4) To restart an RMAN backup that failed midway through a nightly backup.
RMAN> backup not backed up since time 'sysdate-1' database plus archivelog;
5) To force RMAN to back up a file regardless of whether it’s identical to a previously backed up file by specifying the force option
RMAN> backup database force;
By using the force option, you make RMAN back up all the specified files, even if the backup optimization feature is turned on.
6) To backup complete database
RMAN> backup database;
7) To backup the database plus archivelogs
RMAN> backup database plus archivelog;
8) To backup all archive logs
RMAN> backup archivelog all;
9) To backup specific data file
RMAN> backup datafile 5 tag dbfile_5_bkp;
A tag was also added to easily locate this datafile’s backup.
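Putting several of the commands above together, a minimal nightly backup script might look like the following sketch; the channel name and backup destination path are assumptions for illustration, and delete input removes each archived log once it is backed up:

RMAN> run {
  allocate channel c1 device type disk format '/backups/orcl/%U';
  backup database plus archivelog delete input;
  backup current controlfile;
  release channel c1;
}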
**CATALOG COMMAND**
1) Add user-managed copies of datafile to RMAN repository
RMAN> catalog datafilecopy '/u01/oracle/users.bkp';
RMAN> catalog datafilecopy '/u01/oracle/users.bkp' level 0;  (To catalog as incremental level 0 backup)
2) Add uncataloged backup piece to RMAN repository
RMAN> catalog backuppiece 'ertt2lu4_1_1';
3) To catalog multiple files ( say you copied production backup to target database for database cloning)
RMAN> catalog start with '/backups/source_bkp' noprompt;
The start with clause specifies that RMAN catalog all valid backup sets, datafile copies, and archived redo logs starting with the string pattern you pass.
4) To catalog all files in the flash recovery area
RMAN> catalog recovery area;

 

**REPORT COMMAND**
1) To find out which backups you need to make in order to conform to the retention policy you put in place
RMAN> report need backup;
The output of the report need backup command tells you which database files you must back up to comply with your retention policy.
2) To get a report about all the datafiles in a database
RMAN> report schema;
RMAN> report schema at time 'sysdate-1';  (from a past point in time)
3) To report on any obsolete backups
RMAN> crosscheck backup;
RMAN> report obsolete;
Always run the crosscheck command first in order to update the status of the backups in the RMAN repository to that on disk and tape.

 

 

**LIST COMMAND**

1) To review RMAN backups of datafiles, archived redo logs, and control files.
RMAN> list backup;
2) List the backups by just the backup files
RMAN> list backup by file;
3) Lists only backup sets and proxy copies but not image copies
RMAN> list backupset;
4) Lists only datafile, archived redo log, and control file copies
RMAN> list copy;
5) Lists backups by tag:
RMAN> list backupset tag 'full_database_backup';
6) To list the backups of all datafiles and archivelogs of the target database:
RMAN> list backup of database;
7) Lists all incarnations of a database
RMAN> list incarnation;
When you perform an open resetlogs operation, it results in the creation of a new incarnation of the database. When performing recovery operations on such a database, you might want to check the database incarnation.
8) Lists all restore points in the target database
RMAN> list restore point;
9) Lists the names of all recovery catalog scripts
RMAN> list script names;
10) Which of the backups of the target database have an expired status in the repository.
RMAN> list expired backup;
11) Which of the archived redo log backups have the expired status
RMAN> list expired archivelog all;
12) To restrict the list to backups and copies whose status is listed as available
RMAN> list recoverable backup;
13) To view all the restore points in the database
RMAN> list restore point all;

 

**CROSSCHECK COMMAND**

1) Cross-checking just backup sets
RMAN> crosscheck backupset;
2) Cross-checking a copy of a database
RMAN> crosscheck copy of database;
3) Cross-checking specific backupsets
RMAN> crosscheck backupset 10, 12;
4) Cross-checking using a backup tag
RMAN> crosscheck backuppiece tag = 'monthly_backup';
5) Cross-checking a control file copy;
RMAN> crosscheck controlfilecopy '/backups/control01.ctl';
6) Cross-checking backups completed after a specific time
RMAN> crosscheck backup of datafile '/u01/oracle/system01.dbf' completed after 'sysdate-7';
7)  Cross-checking of all archivelogs and the spfile
RMAN> crosscheck backup of archivelog all spfile;
8) Cross-checking all backups on disk and tape
RMAN> crosscheck backup;
The crosscheck command checks whether the backups still exist. The command checks backup sets, proxy copies, and image copies.

 

**DELETE COMMAND**

1) To remove both archived redo logs and RMAN backups
RMAN> delete backup;
RMAN always prompts you for confirmation before going ahead and deleting the backup files. You can issue the delete noprompt command to suppress the confirmation prompt. This will also remove the physical files from the backup media.
To make sure the repository and the physical media are synchronized, run "RMAN> crosscheck backup;" before running the above command.
2) To remove all image copies
RMAN> delete copy;
To make sure the repository and the physical media are synchronized, run "RMAN> crosscheck copy;" before running the above command.
3) To delete a specific backuppiece
RMAN> delete backuppiece 9;
4) To delete copy of controlfile under /backups
RMAN> delete copy of controlfile like ‘/backups/%’;
5) To delete backups with specific tag
RMAN> delete backup tag='double_bkp_prod';
6) To delete backups of a specific tablespace
RMAN> delete backup of tablespace sysaux device type sbt;
******************
You can also use the force, expired, and obsolete keywords with the delete command:
 
delete force ..: Deletes the specified files whether or not they actually exist on the media, and removes their records from the RMAN repository as well.

delete expired ..: Deletes only those files marked as expired per the crosscheck command.

delete obsolete ..: Deletes datafile backups and copies, as well as archived redo logs and log backups, that are recorded as obsolete in the RMAN repository.
The delete obsolete command relies only on the backup retention policy in force.
******************
7) To delete all archived redo logs
RMAN> delete archivelog all;
8) To delete already backed up archived redo logs
RMAN> delete archivelog all backed up 2 times to device type sbt;
9) To delete specific archived redo logs
RMAN> delete archivelog until sequence = 1234;
10) Delete archive logs after taking backup
RMAN> backup device type sbt archivelog all delete all input;
11) Delete stored script
RMAN> delete script full_disk_db;
If you have two scripts with the same name, one local and one global, the delete script command drops the local one, not the global one. If you want to drop the global script, you must use the keyword global in the command, as shown here:
RMAN> delete global script full_disk_db;

 

**CHANGE COMMAND**

1) Change the status of a backup set to unavailable
RMAN> change backupset 6 unavailable;
You usually do this when you don't want RMAN to use the backup/copy but you also don't want to delete it (perhaps it is temporarily not available physically on disk).
Once you mark a backup file unavailable, RMAN won’t use that file in a restore or recover operation.
2)  Change the status of a backup set to available again
RMAN> change backupset 6 available;
For example, say you performed a backup using an NFS-mounted disk and that disk subsequently becomes inaccessible for some reason; just issue the change command to set the status of the backup to unavailable. Later, once the disk becomes accessible again, you can change its status back to available.
3) To modify a regular consistent database backup into an archival backup:
RMAN> change backup tag 'initial_db_bkup' keep forever;
When you make an archival backup with the keep … forever option, RMAN disregards the backup retention time for these backups.
4) To change the archival backup to a normal database backup
RMAN> change backup tag 'initial_db_bkup' nokeep;
When you run the change … nokeep command, the backup set with the tag initial_db_bkup, which was previously designated as a long-term archival backup, will once again come under the purview of your configured retention policy.
5) To modify the time period for which you want to retain the archival backups
RMAN> change backupset 12 keep until time 'sysdate+60';
After the 60 days are up, the backup becomes obsolete and is eligible for deletion by the delete obsolete command.

 

**VALIDATE COMMAND**
1) To check all the datafiles and the archived redo logs for physical corruption without actually performing the backup
RMAN> backup validate database archivelog all;
2) To check for logical corruption without actually performing the backup
RMAN> backup validate check logical database archivelog all;
The check logical clause means that RMAN will check for logical corruption in addition to the default physical corruption checks.
3) To validate a single backup set
RMAN> validate backupset 5;
4) To validate all datafiles at once
RMAN> validate database;
Note that the validate command can check at a much more granular level than the backup … validate command. You can use the validate command with individual datafiles, backup sets, and even data blocks.
The validate command always skips all the data blocks that were never used in each of the datafiles it validates.
5)  To validate recovery area
RMAN> validate recovery area;
6) To validate all the recovery related files
 RMAN> validate recovery files;
7) To validate the spfile
RMAN> validate spfile;
8) To validate specific tablespace
 RMAN> validate tablespace <tablespace_name>;
9) To validate specific control file copy
 RMAN> validate controlfilecopy <filename>;
10) To validate specific backupset
RMAN> validate backupset <primary_key>;

ORACLE DATA GUARD INTERVIEW QUESTIONS

Some of the Oracle Data Guard related questions are listed below.

 

Q 1

What is Data Guard in simple language?

 

A 1

Your primary database is running and you want to reduce downtime from unplanned outages. You create a replica of this primary database (termed the standby database).
You regularly ship redo generated in the primary database to the standby database and apply it there. That is our 'Data Guard' standby database: it is in a continuous state of recovery, validating and applying redo to remain in sync with the primary database.

 


Q 2

Your standby database was out of reach because of a network issue. How will you synchronize it with the primary database again?

A 2

Data Guard automatically resynchronizes the standby following network or standby outages using redo data that has been archived at the primary.

 


Q 3

What is Redo Transport Services (RTS)?

A 3

This process takes care of the transmission of redo from a primary database to the standby database.

Below is how Redo Transport Services (RTS) works (a configuration sketch follows the notes):

1) Log Network Server (LNS) reads redo information from the redo buffer in SGA of PRIMARY Database
2) Log Network Server (LNS) passes redo to Oracle Net Services for transmission to the STANDBY database
3) Remote File Server (RFS) records the redo information transmitted by the LNS at the STANDBY database
4) Remote File Server (RFS) writes it to a sequential file called a standby redo log file (SRL) at the STANDBY database

** For multi-standby configuration, the primary database has a separate LNS process for each standby database.
** Two redo transport methods are supported with the LNS process: synchronous (SYNC) or asynchronous (ASYNC).
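For reference, redo transport is configured on the primary through the LOG_ARCHIVE_DEST_n parameters. A sketch of an ASYNC destination follows; the service name and DB_UNIQUE_NAME 'stby' are placeholders, and for SYNC transport you would use SYNC AFFIRM in place of ASYNC NOAFFIRM:

SQL> alter system set log_archive_dest_2 =
     'SERVICE=stby ASYNC NOAFFIRM VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=stby'
     scope=both;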

 


Q 4

What is the difference between SYNC and ASYNC redo transport method?

A 4

Synchronous transport (SYNC)

Also known as a “zero data loss” redo transport method.

Below is how it works:

1) Log Network Server (LNS) reads redo information from the redo buffer in SGA of PRIMARY Database
2) Log Network Server (LNS) passes redo to Oracle Net Services for transmission to the STANDBY database
3) Remote File Server (RFS) records the redo information transmitted by the LNS at the STANDBY database
4) Remote File Server (RFS) writes it to a sequential file called a standby redo log file (SRL) at the STANDBY database
5) Remote File Server (RFS) transmits an acknowledgement back to the LNS process on the primary database
6) Log Network Server (LNS) notifies the LGWR that transmission is complete on the primary database.
7) Log Writer (LGWR) acknowledges the commit to the user.

 

Asynchronous transport (ASYNC)

Unlike SYNC, asynchronous transport (ASYNC) eliminates the requirement that the LGWR wait for acknowledgement from the LNS. This removes the performance impact on the primary database irrespective of the distance between primary and standby locations.

If the LNS is unable to keep pace and the log buffer is recycled before the redo can be transmitted to the standby, the LNS automatically transitions to reading and sending from the online redo logs. Once the LNS is caught up, it automatically transitions back to reading and sending directly from the log buffer.

Below is how it works:

1) Log Network Server (LNS) reads redo information from the redo buffer in SGA of PRIMARY Database
2) Log Network Server (LNS) passes redo to Oracle Net Services for transmission to the STANDBY database
3) Remote File Server (RFS) records the redo information transmitted by the LNS at the STANDBY database
4) Remote File Server (RFS) writes it to a sequential file called a standby redo log file (SRL) at the STANDBY database

Steps 5, 6, and 7 discussed above for SYNC are not applicable here.

The only drawback of ASYNC is the increased potential for data loss. If a failure destroys the primary database before the transport lag is reduced to zero, any committed transactions that were part of the transport lag will be lost. So it is highly advisable to have enough network bandwidth to handle peak redo generation rates when using the ASYNC method.

 


Q 5

How can synchronous transport (SYNC) impact primary database performance?

A 5

SYNC guarantees protection for every transaction that the database acknowledges as having been committed, but at the same time LGWR must wait for confirmation that data is protected at the standby before it can proceed with the next transaction. This can impact primary database performance, depending on factors like

> the amount of redo information to be written
> available network bandwidth
> round-trip network latency (RTT)
> standby I/O performance writing to the SRL
> distance between the primary and standby databases, as network RTT increases with distance

 


Q 6

What is Data Guard’s Automatic Gap Resolution?

A 6

Say your database is using the ASYNC transport method and the instance load is at its peak. If the LNS is unable to keep pace and the log buffer is recycled before the redo can be transmitted to the standby, the LNS automatically transitions to reading and sending from the online redo logs. Once the LNS is caught up, it automatically transitions back to reading and sending directly from the log buffer.

Now in some cases there can be two or more log switches before the LNS has completed sending the redo from an online redo log file, and if in the meantime any such required online redo log file was archived, then that redo will be transmitted via Data Guard's gap resolution process, “Automatic Gap Resolution”.

OR

In another case, when your network or the standby database is down and your primary system is busy, a large log file gap will form before the connection between the primary and standby is restored.
Automatic Gap Resolution takes care of such scenarios with the following action plan:

1) The ARCH process on the primary database continuously pings the standby database during the outage to determine its status.
2) As soon as the standby is restored, the ARCH ping process queries the standby control file (via its RFS process) to determine the last complete log file that the standby received from the primary database.
3) Data Guard determines which log files are required to resynchronize the standby database and immediately begins transmitting them using additional ARCH processes.
4) The LNS process at the primary database will also attempt to make a connection to the standby database and, once it succeeds, will begin transmitting current redo. So first all the ARCH files are applied, and then the current redo log.

The Data Guard architecture enables gaps to be resolved quickly using multiple background ARCH processes.

 


Q 7

What is the difference between Physical standby and Logical standby database?

A 7

The Data Guard apply process on the standby database can apply redo information directly, in which case it is called a physical standby.
Or it can apply SQL, in which case it is called a logical standby.

Physical Standby:

In this case the standby database is an exact, block-by-block physical replica of the primary database.
The change vectors received by the RFS process are applied directly to the standby database using media recovery. Here the apply process reads data blocks, assembles redo changes from mappings, and then applies the redo changes to the data blocks directly.
Physical standby is the best choice for disaster recovery (DR) based upon its simplicity, transparency, high performance, and good data protection.

Logical Standby:

In this case the standby database uses the SQL Apply method to “mine” the redo by converting it to logical change records, and then building SQL transactions and applying the SQL to the standby database.

As this process of replaying the workload is more complex than the physical standby's process, it requires more memory, CPU, and I/O.

One good advantage here is that a logical standby database can be opened read-write while SQL Apply is active, which means you can update (create/insert/delete, etc.) local tables and schemas in the logical standby database.

 


Q 8

How does the Data Guard apply process work if the primary and standby databases involve Oracle RAC?

A 8

If the primary database is RAC but the standby is non-RAC:

Each primary Oracle RAC instance ships its own thread of redo, which is merged by the Data Guard apply process at the standby and applied in SCN order to the standby database.

If both the primary and standby databases are RAC:

If the standby is also an Oracle RAC database, only one instance (the apply instance) will merge and apply changes to the standby database. If the apply instance fails for any reason, the apply process will automatically fail over to a surviving instance in the Oracle RAC standby database when using the Data Guard broker.

 


Q 9

What is Active Data Guard Option (Oracle Database 11g Enterprise Edition)?

A 9

For a physical standby database prior to 11g, the database had to be in the mount state while media recovery was active, which means you could not query the standby database during the media recovery stage, as there was no read-consistent view.

The Active Data Guard 11g feature solves the read consistency problem by use of a “query” SCN. The media recovery process on the standby database advances the query SCN after all the changes in a transaction have been applied. The query SCN appears to users as the CURRENT_SCN column in the V$DATABASE view on the standby database. Read-only users will only be able to see data up to the query SCN, thereby guaranteeing the same read consistency as the primary database.

This enables a physical standby database to be open read-only while media recovery is active, making it useful for offloading read-only workloads.
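On an 11g physical standby, this is typically done by opening the database read-only and then restarting Redo Apply, along these lines:

SQL> alter database open read only;
SQL> alter database recover managed standby database using current logfile disconnect;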

Also, if you need read-write access to the standby database, you can use the SQL Apply (logical standby) method of Data Guard.

 


Q 10

What are the important database parameters related to Data Guard corruption prevention?

A 10

On the primary database:

a) DB_ULTRA_SAFE

Values can be DATA_AND_INDEX or DATA_ONLY. Setting DB_ULTRA_SAFE at the primary will also automatically set DB_LOST_WRITE_PROTECT=TYPICAL on the primary database.
In Oracle Database 11g Release 2 (11.2), the primary database automatically attempts to repair a corrupted block in real time by fetching a good version of the same block from a physical standby database.

 

On the standby database:

a) DB_BLOCK_CHECKSUM=FULL

DB_BLOCK_CHECKSUM detects redo and data block corruption, catching corruption on the primary database before it can propagate to the standby. This parameter requires minimal CPU resources.

 

b) DB_LOST_WRITE_PROTECT=TYPICAL
A lost write can occur when an I/O subsystem acknowledges the completion of a write although the write did not actually occur in persistent storage, creating a stale version of the data block. When the DB_LOST_WRITE_PROTECT initialization parameter is set, the database records buffer cache block reads in the redo log, and this information is used to detect lost writes.

You set DB_LOST_WRITE_PROTECT to TYPICAL in both primary and standby databases.
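A sketch of setting these parameters; note that DB_ULTRA_SAFE is a static parameter, so it requires scope=spfile and an instance restart, while the other two are dynamic:

On the primary:
SQL> alter system set db_ultra_safe = DATA_AND_INDEX scope=spfile;

On the standby:
SQL> alter system set db_block_checksum = FULL scope=both;
SQL> alter system set db_lost_write_protect = TYPICAL scope=both;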

 


Q 11

What are different Data Guard protection modes?

A 11

Data Guard protection modes implement rules that control how the configuration responds to failures, enabling you to achieve specific objectives for data protection, availability, and performance. (A broker command example follows the three modes below.)

 

a) Maximum Performance

– emphasis is on primary database performance over data protection.
– requires ASYNC (the default method) redo transport, so the LGWR process never waits for acknowledgment from the standby database.
– the network connection between primary and standby and the availability of the standby database DO NOT IMPACT primary database performance.

 

b) Maximum Availability

– first emphasis is on availability and second priority is zero data loss protection.
– requires SYNC redo transport, so primary database performance may be impacted by waiting for acknowledgment from the standby (this does not mean an indefinite wait if the standby database fails; the maximum wait equals the NET_TIMEOUT parameter value, in seconds).

 

c) Maximum Protection

– utmost priority is on data protection.
– also requires SYNC redo transport.
– unlike 'Maximum Availability', it does not consider the NET_TIMEOUT parameter, which means that if the primary does not receive acknowledgment from a SYNC standby database, it will stall and eventually abort the primary, preventing any unprotected commits from occurring.
– it is highly recommended to use a minimum of two SYNC standby databases at different locations if using 'Maximum Protection', to maintain high availability of the primary database.
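With the Data Guard broker in place, the protection mode is changed with a single command. For example, to move to Maximum Availability (the database name 'stby' is a placeholder, and its transport must first be set to SYNC):

DGMGRL> edit database 'stby' set property LogXptMode = 'SYNC';
DGMGRL> edit configuration set protection mode as MaxAvailability;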

 


Q 12
What is a Switchover event?

A 12

Switchover is useful for minimizing downtime during planned maintenance. It is a planned event in which Data Guard reverses the roles of the primary and a standby database.

The primary database runs unaffected while we are making the required changes on our standby database (e.g. patchset upgrades, full Oracle version upgrades, etc).
Once changes are complete, production is switched over to the standby site running at the new release.

This means that regardless of how much time is required to perform planned maintenance, the only production database downtime is the time required to execute a switchover, which can be less than 60 seconds.
The following operations happen when the switchover command is executed (a broker example follows the list):

1. The primary database is notified that a switchover is about to occur.
2. All users are disconnected from the primary.
3. A special redo record is generated that signals the End Of Redo (EOR).
4. The primary database is converted into a standby database.
5. The final EOR record is applied to the standby database; this guarantees that no data has been lost, and it converts the standby to the primary role.
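With the broker configured, this entire sequence is driven by one command; 'stby' is a placeholder for your standby database name:

DGMGRL> switchover to 'stby';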

 


Q 13

What is a Failover event?

A 13

The failover process is similar to the switchover event, except that the primary database never has the chance to write an EOR record, as this is an unplanned event.

Whether or not a failover results in data loss depends upon the Data Guard protection mode:

a) Maximum Protection >> No Data Loss

b) Maximum Availability >> No Data Loss (except when there was a previous failure (e.g. a network failure) that had INTERRUPTED REDO TRANSPORT and allowed the primary database to move ahead of standby)

c) Maximum Performance (ASYNC) >> may lose any committed transactions that were not transmitted to the standby database before the primary database failed.

 

Failover event can be of two types:

1) Manual

The administrator has complete control of primary-standby role transitions. This can lengthen the outage by the amount of time required for the administrator to be notified plus the time to manually execute the command.

2) Automatic

It uses Data Guard’s Fast-Start Failover feature which automatically detects the failure, evaluates the status of the Data Guard configuration, and, if appropriate, executes the failover to a previously chosen standby database.


Q 14

Which tools can be used for Data Guard Management?

 A 14

1) SQL*Plus – the traditional method; can prove the most tedious to use.
2) Data Guard broker – automates and centralizes the creation, maintenance, and monitoring of a Data Guard configuration. Simplifies and automates many administrative tasks. It has its own command line (DGMGRL) and syntax.
3) Enterprise Manager – requires that the Data Guard broker be enabled; a GUI to the Data Guard broker, replacing the DGMGRL command line and interfacing directly with the broker's monitor processes.

 


Q 15

What is Data Guard 11g snapshot standby?

A 15

With 11g, you can thoroughly test your changes on a true replica of your production system and database using the actual production workload.
A Data Guard 11g physical standby can now be converted to a snapshot standby, independent of the primary database, that is open read-write and can be used for preproduction testing. It uses Flashback Database and sets a guaranteed restore point (GRP) at the SCN before the standby was opened read-write.

NOTE: Primary database redo continues to be shipped to a snapshot standby, and while not applied, it is archived for later use.

You can convert this snapshot standby back into a synchronized physical standby database when testing is complete. The Redo Apply process at the standby will then apply all the primary database redo archived while it was a snapshot standby, until it is caught up with the primary database.
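With the broker, the conversion in each direction is a one-liner ('stby' is a placeholder):

DGMGRL> convert database 'stby' to snapshot standby;
DGMGRL> convert database 'stby' to physical standby;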

 


Q 16

What is the difference between Recovery Point Objective(RPO) and Recovery Time Objective (RTO)?

A 16

A) Recovery Point Objective(RPO)

RPO is concerned with data. It is the amount of data you are willing to lose when a failure occurs in your database system. Usually people define data loss in terms of time, so possible values can be 5 seconds of data loss, 2 hours of data loss, etc.

Remember that each standby database has its own set of attributes and parameters. This means you can mix zero data loss standby databases with minimal data loss standby databases in the same Data Guard configuration.

If you have decided that you want to implement a zero data loss strategy, then you should really focus on your network and redo transport.

B) Recovery Time Objective (RTO)

RTO is defined as how fast you can get back up and running (whereas RPO is concerned with data loss).

So with your RPO strategy you may lose, say, only about 6 seconds of data, as committed to your client; with RTO you need to formulate how fast clients can connect back to the database system after the failure has occurred.

 


Q 17

What are Standby Redo Log (SRL) files?

A 17

The SRL files are where the Remote File Server (RFS) process at your standby database writes the incoming redo so that it is persistent on disk for recovery. SRL files are important for better redo transport performance and data protection.

SRLs are a MUST in Maximum Availability or Maximum Protection mode and OPTIONAL (but recommended) in Maximum Performance mode.

If there are no standby redo log (SRL) files, then at each log switch in the primary database, the RFS process on the standby database that is serving an asynchronous standby destination has to create an archive log of the right size. While the RFS is busy creating the archive log file, the LNS process at the primary database has to wait, getting further and further behind the LGWR (in the case of Maximum Performance mode). That is why it is recommended to have standby redo log (SRL) files in Maximum Performance mode as well.

We generally configure them on the primary database as well, in preparation for a role transition between primary and standby.

Also, do not multiplex SRLs. Since Data Guard will immediately request a new copy of the archive log if an SRL file fails, there is no real need to have more than one copy of each.
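A sketch of adding SRL groups on the standby; the group numbers, path, and size are assumptions (the size should match your online redo logs, with one more SRL group than online groups per thread):

SQL> alter database add standby logfile thread 1 group 4 ('/u01/oradata/stby/srl04.log') size 200m;
SQL> alter database add standby logfile thread 1 group 5 ('/u01/oradata/stby/srl05.log') size 200m;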


Q 18

What is Fast Start Fail Over (FSFO)?

A 18

The main criticism of Oracle standby databases has always been that too much manual interaction is required in a disaster situation. FSFO helps fill this gap. FSFO quickly and reliably fails over the target standby database to the primary database role, without requiring you to perform any manual steps to invoke the failover. Please keep in mind that you need to have a broker configuration in place to be able to use the FSFO feature.


Q 19 

What is the concept of the OBSERVER in Fast Start Fail Over (FSFO)?

A 19

In a normal scenario, if you have to perform a switchover in your standby setup, you keep some kind of watch on your environment so that you know when your primary database is unavailable and you need to switch over to the standby database. Oracle automated this manual observation activity by providing an OBSERVER process, which constantly monitors the availability of the primary database. Now, if we ran the OBSERVER on the primary or standby database server itself, there would be a risk that the OBSERVER would go down together with that server in a disaster. That is why the observer is a separate OCI client-side component that runs on a different computer from the primary and standby databases.

Once the observer is started, no further user interaction is required. If both the observer and the designated standby database lose connectivity with the primary database for longer than the number of seconds specified by the FastStartFailoverThreshold configuration property, the observer will initiate a fast-start failover to the standby database.


Q 20

What are the high level steps for configuring Fast Start Fail Over (FSFO)?

A 20

To configure FSFO in your standby setup, the broad-level steps are as follows (a condensed DGMGRL session follows the list):

STEP 1: Determine which of the available standby databases is the best target for the failover. For example, you may want to choose a physical standby over a logical one.
STEP 2: Specify the target standby database with the FastStartFailoverTarget configuration property. This may not be required if you have only one standby database in the configuration.
STEP 3: Determine the protection mode you want. You will have to choose either maximum performance or maximum availability. This is more of a business decision, and you will have to get consensus from all stakeholders on which mode is right for you.
STEP 4: Set the FastStartFailoverThreshold configuration property. This property tells the observer process how long (in seconds) to wait before initiating failover.
STEP 5: Set other properties related to fast-start failover (optional). There are other properties, such as FastStartFailoverAutoReinstate and ObserverOverride, which may also be applicable to your specific requirements.
STEP 6: Enable additional fast-start failover conditions (optional). This step gives you more options to define when your primary database is considered unusable, for example: stuck archiver, corrupt control file, etc.
STEP 7: Enable FSFO using DGMGRL or Cloud Control. This is the main step, in which you enable FSFO.
STEP 8: Start the observer. You can use Cloud Control or DGMGRL to start the observer process.
STEP 9: Verify the fast-start failover environment. The DGMGRL command "SHOW FAST_START FAILOVER" can easily show you the status of FSFO.
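A condensed DGMGRL session covering steps 2 through 9; the database names 'prim' and 'stby' and the property values are illustrative:

DGMGRL> edit database 'prim' set property FastStartFailoverTarget = 'stby';
DGMGRL> edit configuration set protection mode as MaxAvailability;
DGMGRL> edit configuration set property FastStartFailoverThreshold = 30;
DGMGRL> enable fast_start failover;
DGMGRL> start observer;
DGMGRL> show fast_start failover;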