Monday, December 7, 2009

Oracle RAC One Node – part 1

This post is about the installation and configuration of Oracle RAC 11gR2 in a One Node configuration on VMware. To begin with, I want to describe RAC One Node a little bit more – it is a new configuration option, with a special licence and price, which is very similar to a failover cluster configuration. During normal operation only one instance is up and running, just like in a failover cluster; the difference is in the migration process. The following steps are performed during a migration:
  1. A second instance is started on the target node
  2. All sessions are migrated to the target instance – TAF has to be enabled in the client configuration
  3. The source instance is shut down in transactional mode
  4. After a timeout the source instance is shut down in abort mode (a rough sketch of these shutdown modes follows this list)
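Steps 3 and 4 correspond to the standard SQL*Plus shutdown modes. A minimal sketch of what effectively happens on the source instance (the exact mechanism is internal to the Omotion tool described later):

-- connected to the source instance as SYSDBA
-- 1) wait for open transactions to complete:
SHUTDOWN TRANSACTIONAL
-- 2) if the migration timeout expires first, terminate immediately:
SHUTDOWN ABORT
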
The installation and configuration is based on Linux CentOS 5.3 and Oracle 11g 11.2.0.1.

If you want to skip the Grid Infrastructure configuration tips, click here.

Oracle Grid Infrastructure
Oracle Grid Infrastructure has to be installed and configured on all nodes belonging to an Oracle RAC (this is a requirement for both RAC configurations – typical RAC and RAC One Node). In previous releases Oracle Grid Infrastructure was called Oracle Clusterware. In 11gR2 the name has been changed to Grid Infrastructure and a lot of changes have been made. The most important are:
  • Oracle ASM is part of Grid Infrastructure and no longer part of Oracle Database
  • Voting and cluster configuration files can use an ASM disk group or a cluster file system
  • Raw or block devices for voting and cluster configuration files are not supported at installation time
  • More RAM is required

The last change made me sad, as I have only 4 GB of RAM on my laptop, so I can allocate about 3 GB for the two VMs. But why not try? After a few tests I found a working configuration for both nodes.

RAC1 – node number 1
  • 1.5 GB of RAM allocated for VM
  • 2.0 GB of swap
  • 1 CPU
  • 10 GB of free space for Oracle Homes (both infrastructure and database)

RAC2 – node number 2
  • 1.0 GB of RAM allocated for VM
  • 2.0 GB of swap
  • 1 CPU
  • 10 GB of free space for Oracle Homes (both infrastructure and database)

Yes, there is a difference between the memory sizes of the nodes – start the OUI on node number 1, the one with more memory allocated.

Common configuration:

/etc/hosts
10.10.20.129    rac1.localdomain       rac1
192.168.226.10  rac1-priv.localdomain  rac1-priv
10.10.20.130    rac2.localdomain       rac2
192.168.226.20  rac2-priv.localdomain  rac2-priv
10.10.20.200    rac-cluster
10.10.20.210    rac1-vip.localdomain   rac1-vip
10.10.20.220    rac2-vip.localdomain   rac2-vip
RAC1-VIP and RAC2-VIP have to be assigned to the public network but must not be configured (brought up) before the installation. There is an additional entry for rac-cluster, which is the SCAN interface; it has to be in the same network as both VIP interfaces and likewise must not be configured beforehand.
The SCAN interface is a single entry point to the cluster – for more information see the RAC documentation.
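
Before starting the installer it is worth checking that all those names resolve on every node. A quick sanity check (this only queries /etc/hosts or DNS, so it works even though the VIP and SCAN addresses are not up yet):

getent hosts rac-cluster rac1-vip rac2-vip rac1-priv rac2-priv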

Now it is time to start the OUI and install the Grid Infrastructure. Below are some installation tips:
  • Installation option – choose Install and Configure Grid Infrastructure for a Cluster
  • Installation type – choose Typical installation
  • SCAN name – type rac-cluster
  • Add both nodes – RAC1 and RAC2 with the proper VIPs – note that there is no need for a private name anymore
  • Test SSH connectivity between the nodes and click Setup if there are any problems
  • Choose the 10.10.20.x subnet as public and 192.168.226.x as private
  • Specify the Oracle Base and the software location (which is the Oracle Home for Grid Infrastructure) – note that the Oracle Home for Grid has to be in a different location than the Oracle Base
  • Select a disk for the ASM disk group – it will be used for the voting and configuration files too – if required, change the Discovery Path to the correct value (e.g. /dev/sdc* for block devices or ORCL:* if you are using ASMLib)
  • If there are any problems with kernel settings or missing packages, solve them before the installation starts – you can ignore the memory, swap and NTP warnings, but you have to have at least the memory sizes specified above (a sysctl sketch follows this list).
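
If the prerequisite check flags kernel parameters, the values it expects are the standard ones from the 11gR2 installation guide. A minimal sketch for a small test box – verify each value against the guide for your platform, and note that kernel.shmmax should be sized to roughly half of physical RAM:

cat >> /etc/sysctl.conf <<EOF
fs.aio-max-nr = 1048576
fs.file-max = 6815744
kernel.shmall = 2097152
kernel.shmmax = 1054504960
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
EOF
sysctl -p    # load the new values without a reboot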
After about 15 minutes it is time for the last step – executing root.sh to configure and start the cluster infrastructure. Run root.sh on the node with more memory first.
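
Once root.sh has completed on both nodes, the cluster state can be verified with crsctl (the Grid home path below is only an example of a typical layout):

/u01/app/11.2.0/grid/bin/crsctl check cluster -all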


Oracle Database Oracle Home

Just perform a standard installation of Oracle 11gR2 binaries without database creation.

RAC One Node patch

This is the best time to install the patch which adds One Node support to our database Oracle Home. Why? Because it has to be installed when the DB is down, so applying it before database creation saves us any additional steps. The patch file is RACONENODE_p9004119_112010_LINUX.zip and it can be found on the Oracle Support pages.
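A sketch of the patch application, assuming the zip is unpacked into a staging directory and that the unpacked directory is named after the patch number (check the README inside the zip for the exact steps, and keep the DB down):

unzip RACONENODE_p9004119_112010_LINUX.zip -d /tmp/raconenode
cd /tmp/raconenode/9004119    # directory name assumed from the patch number
export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1    # example path
$ORACLE_HOME/OPatch/opatch apply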

Database creation

The most important thing is to create the database on one node only. On the first screen a RAC database has to be chosen, and then only one node (e.g. rac1) has to be selected.
The next important thing is the storage for the database. In our example all database files will be placed in an ASM disk group – the same one used to keep the Grid Infrastructure cluster configuration.
All other configuration settings have no impact on the RAC One Node configuration.

Service configuration

A new service has to be added to support the RAC One Node configuration. This service will be used in our client configuration and will be the entry point to our database.

srvctl add service -d testone -s serviceone -r testone1
where
  • testone – the database name
  • serviceone – the service name
  • testone1 – the instance name created in the previous step
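
Before going further it is worth making sure the service is registered and started; a quick check using the same names as above:

srvctl start service -d testone -s serviceone
srvctl status service -d testone -s serviceone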

RAC One Node configuration

When the database and the service are up, it is time to start the RAC One Node configuration.
To do that, the raconeinit tool, shipped with the patch installed earlier, has to be started.


Candidate Databases on this cluster:
#    Database  RAC One Node  Fix Required
===  ========  ============  ============
[1]  testone   NO            N/A
Enter the database to initialize [1]:
Database testone is now running on server rac1
Candidate servers that may be used for this DB: rac2

Enter the names of additional candidate servers where this DB may run (space delimited): rac2

Please wait, this may take a few minutes to finish.
Database configuration modified.

After that command the new configuration should be in place. Note that raconeinit renames the instance – from now on it runs as testone_1, which is why the client test later shows instance names testone_1 and testone_2. The current status can be checked with the following command: raconestatus


RAC One Node databases on this cluster:

Database  UP  Fix Required  Current Server  Candidate Server Names
========  ==  ============  ==============  ======================
testone   Y   N             rac1            rac1 rac2

Available Free Servers:

RAC One Node operations

The main RAC One Node operation is moving an instance between nodes. That operation can be done using the Omotion tool. Here is an example of an Omotion run:

RAC One Node databases on this cluster:

#    Database  Server  Fix Required
===  ========  ======  ============
[1]  testone   rac1    N
Enter number of the database to migrate [1]:
Specify maximum time in minutes for migration to complete (max 30) [30]: 5
Available Target Server(s) :
#    Server  Available
===  ======  =========
[1]  rac2    Y
Enter number of the target node [1]:

Omotion Started...
Starting target instance on rac2...
Migrating sessions...
Stopping source instance on rac1...
Omotion Completed...

=== Current Status ===
Database testone is running on node rac2

In that example the database instance has been moved from node rac1 to node rac2, and the instance on rac1 has been closed in transactional mode; in that scenario there were no remaining sessions on the rac1 instance. When there are outstanding sessions/transactions on the source node (in this case rac1), Omotion will try to shut that instance down in transactional mode and then, after the timeout, will shut it down in abort mode – see the example below.

RAC One Node databases on this cluster:

#    Database  Server  Fix Required
===  ========  ======  ============
[1]  testone   rac1    N

Enter number of the database to migrate [1]:
Specify maximum time in minutes for migration to complete (max 30) [30]: 5
Available Target Server(s) :
#    Server  Available
===  ======  =========
[1]  rac2    Y

Enter number of the target node [1]:
Omotion Started...
Starting target instance on rac2...
Migrating sessions...
Stopping source instance on rac1...
Timeout exceeded, aborting instance...
Omotion Completed...

=== Current Status ===
Database testone is running on node rac2

Client configuration

How does it look from the client perspective?
  1. TAF not configured – the session has to be reconnected after the instance migration
  2. TAF configured in the client TNS – only the current transaction has to be rolled back.
TAF example:
tnsnames.ora

testone =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac-cluster)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = testone)
      (FAILOVER_MODE =
        (TYPE = select)
        (METHOD = basic)
      )
    )
  )

Take a look at the address – there is no VIP anymore; now the SCAN name has to be entered in the TNS alias and resolved via DNS or the hosts file, just like all the RAC VIPs (rac1-vip and rac2-vip).
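
One way to confirm that TAF is really active for a session is to query the standard failover columns in v$session – a quick check, assuming a connection as SYSTEM:

select failover_type, failover_method, failed_over
from v$session
where username = 'SYSTEM';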
SQL*Plus test


sqlplus system@testone
SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME    HOST_NAME
---------------- ----------------------------------------------------------------
testone_1        rac2


SQL> select * from test;

ID
----------
1
2
3
4
5
6
7

7 rows selected.

SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME    HOST_NAME
---------------- ----------------------------------------------------------------
testone_1        rac2


Omotion has been started.

SQL> select instance_name, host_name from v$instance;
select instance_name, host_name from v$instance
*
ERROR at line 1:
ORA-25402: transaction must roll back


SQL> rollback;

Rollback complete.

SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME    HOST_NAME
---------------- ----------------------------------------------------------------
testone_2        rac1


SQL> select * from test;

ID
----------
1
2
3
4
5
6

6 rows selected.

SQL>

After the migration there were two changes – both the instance name and the host name have changed. It is not like a typical failover cluster, where the instance is migrated from one host to another. In RAC One Node a new instance is started on the second node, and during the migration window this configuration works as a typical RAC.

This is the end of part one – the next part, with more tests and operations, is coming soon.


4 comments:

Surachart Opun said...

Good job...
That's a good post.

I'd like to see what happens when rac2 goes down while a client is connected to it?

-)

Marcin Przepiorowski said...

Hi,

You have to wait until part two ;)
I'm going to present those cases there.

Daniel Williams said...

Great walkthrough - thank you for taking the time to do this!

Daniel Williams said...

Great walkthrough - thank you for taking the time to do this valuable service.