Dev's Weblog

We have moved to sysdbaonline.com

Huge Page Setup On Linux

What are hugepages and what are their advantages?
Hugepages is a mechanism that allows the Linux kernel to utilise the multiple page size capabilities of modern hardware architectures. Linux uses pages as the basic unit of memory – physical memory is partitioned and accessed using the basic page unit. The default page size is 4096 Bytes in the x86 architecture.

Hugepages allow large amounts of memory to be used with reduced overhead. The CPU provides a mechanism called the “Translation Lookaside Buffer” (TLB), which caches mappings from virtual memory addresses to physical memory addresses. The TLB is a limited hardware resource, so addressing a large amount of physical memory with the default page size exhausts the TLB and adds processing overhead.

The Linux kernel can set aside a portion of physical memory to be addressed using a larger page size. Since each page is larger, fewer pages are needed to cover the same memory, so there is less overhead managing them with the TLB.
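
To make the overhead argument concrete, the following sketch counts how many pages are needed to map the same region at the default page size and at a larger one. The 10 GiB region and the 2 MiB huge page size are illustrative assumptions (2 MiB is typical on x86, but the real value depends on the architecture).

```python
# Count how many pages are needed to cover a memory region at two page sizes.
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(region_bytes: int, page_bytes: int) -> int:
    """Number of pages required to cover region_bytes, rounded up."""
    return -(-region_bytes // page_bytes)


region = 10 * GIB                              # illustrative region size
small_pages = pages_needed(region, 4 * KIB)    # default x86 page size
huge_pages = pages_needed(region, 2 * MIB)     # assumed huge page size

print(small_pages)                # 2621440
print(huge_pages)                 # 5120
print(small_pages // huge_pages)  # 512 times fewer mappings to track
```

Each page needs its own virtual-to-physical mapping, so 512 times fewer pages means far fewer entries competing for the limited TLB.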

In the Linux 2.6 series of kernels, hugepages is enabled using the CONFIG_HUGETLB_PAGE feature when the kernel is built. All kernels supplied by Red Hat for the Red Hat Enterprise Linux 4 release have the feature enabled.

Systems with a large amount of memory can be configured to use it more efficiently by setting aside a portion dedicated to hugepages. The actual size of a huge page depends on the system architecture. A typical x86 system has a huge page size of 2048 kB. The huge page size can be found in /proc/meminfo:

# cat /proc/meminfo |grep Hugepagesize
Hugepagesize: 2048 kB

Follow these steps to enable Oracle to use hugepages.

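1. Calculate the number of hugepages with the following formula, where 2M is the huge page size reported above and the extra 100 pages provide headroom:

(SGA_SIZE / 2M) + 100

So for a 10G SGA, the hugepage count should be set to:

((10*1024)M / 2M) + 100 = 5220

You can find the SGA size with the show sga command at the SQL*Plus prompt.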
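
The formula in step 1 can be sketched as a small function. The function name and parameters are made up for illustration; the 2 MB page size and the headroom of 100 pages come from the formula above and should be adjusted to whatever Hugepagesize your system reports.

```python
def nr_hugepages(sga_mb: int, hugepage_mb: int = 2, headroom: int = 100) -> int:
    """Return (SGA size / huge page size) + headroom, rounding the division up."""
    pages_for_sga = -(-sga_mb // hugepage_mb)  # ceiling division
    return pages_for_sga + headroom


# 10G SGA with 2M huge pages, as in the worked example above
print(nr_hugepages(10 * 1024))  # 5220
```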

2. Add the following kernel parameter to the /etc/sysctl.conf file.

vm.nr_hugepages = 5220

3. Run the following command to load the new setting.

# sysctl -p

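4. Modify the /etc/security/limits.conf file to include the following entries. The memlock values are in kilobytes and must be at least as large as the memory set aside for hugepages.

oracle soft memlock 20086560
oracle hard memlock 20086560

Then reboot the machine.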
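
As a sanity check on step 4, the sketch below computes the smallest memlock value (in kB) that covers the hugepage pool and compares it with the value used above. The helper name is hypothetical; the inputs are the nr_hugepages and Hugepagesize values from this post.

```python
def min_memlock_kb(nr_hugepages: int, hugepagesize_kb: int = 2048) -> int:
    """Smallest memlock limit (kB) that covers the whole hugepage pool."""
    return nr_hugepages * hugepagesize_kb


required_kb = min_memlock_kb(5220)   # pool configured in step 2
configured_kb = 20086560             # value from limits.conf above

print(required_kb)                   # 10690560
print(configured_kb >= required_kb)  # True
```

The configured limit is larger than strictly required, which is harmless; a limit below the required value would prevent the SGA from being locked into hugepages.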

5. Check the total and free hugepage counts using the following command.

# cat /proc/meminfo | grep -i huge

This should return output like the following.

HugePages_Total:   5220
HugePages_Free:    5220
HugePages_Rsvd:       0
Hugepagesize:     2048 kB

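Also check the memlock setting using the ulimit -a command.

6. Start the Oracle database and check whether hugepages are being allocated by running the same command again. Once the SGA is placed in hugepages, HugePages_Free should drop or HugePages_Rsvd should rise.

# cat /proc/meminfo | grep -i huge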
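
The check in steps 5 and 6 can also be scripted. This is a minimal sketch that parses text in the /proc/meminfo format and reports how many hugepages are in use; the function names are made up for illustration, and the sample text mirrors the output shown in step 5.

```python
SAMPLE = """\
HugePages_Total:   5220
HugePages_Free:    5220
HugePages_Rsvd:       0
Hugepagesize:     2048 kB
"""


def parse_hugepage_info(text: str) -> dict:
    """Collect the numeric value of every meminfo line whose key mentions 'huge'."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and "huge" in key.lower():
            info[key.strip()] = int(fields[0])
    return info


def hugepages_in_use(info: dict) -> int:
    """Pages that are no longer free, i.e. allocated to some process."""
    return info["HugePages_Total"] - info["HugePages_Free"]


info = parse_hugepage_info(SAMPLE)
print(hugepages_in_use(info))  # 0 before the database is started
```

On a live system you would pass the contents of /proc/meminfo instead of SAMPLE, for example `parse_hugepage_info(open("/proc/meminfo").read())`.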

August 31, 2008 Posted by sdevang | Linux (Ubuntu & RedHat)

SRVCTL Command ( RAC )

srvctl is the tool Oracle recommends that DBAs use to interact with CRS and the cluster registry. Oracle does provide several tools that interface with the cluster registry and CRS more directly, at a lower level. srvctl, in contrast, is well documented and easy to use.

Using srvctl

Even if you are experienced with 9i srvctl, it’s worth taking a look at this section; 9i and 10g srvctl commands are slightly different.

srvctl must be run from the $ORACLE_HOME of the RAC you are administering. The basic format of a srvctl command is

srvctl <command> <target> [options]

where command is one of

enable|disable|start|stop|relocate|status|add|remove|modify|getenv|setenv|unsetenv|config

and the target, or object, can be a database, instance, service, ASM instance, or the nodeapps.
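
The command format above can be sketched as a small helper that assembles and validates a srvctl command line before it is run. This is only an illustration: the function name is hypothetical, the command list is copied from the text above, and the target keywords are assumed to be database, instance, service, asm and nodeapps. The sketch builds the argument list but does not execute srvctl.

```python
COMMANDS = {
    "enable", "disable", "start", "stop", "relocate", "status", "add",
    "remove", "modify", "getenv", "setenv", "unsetenv", "config",
}
# Assumed keywords for the targets named in the text.
TARGETS = {"database", "instance", "service", "asm", "nodeapps"}


def srvctl_argv(command: str, target: str, *options: str) -> list:
    """Build the argument list for: srvctl <command> <target> [options]."""
    if command not in COMMANDS:
        raise ValueError(f"unknown srvctl command: {command}")
    if target not in TARGETS:
        raise ValueError(f"unknown srvctl target: {target}")
    return ["srvctl", command, target, *options]


print(" ".join(srvctl_argv("stop", "database", "-d", "ORCL")))
# srvctl stop database -d ORCL
```

The resulting list could be handed to subprocess.run from the $ORACLE_HOME of the database being administered.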

Examples
=======

Example 1. Bring up the ORCL1 instance of the ORCL database.

[oracle@newserver oracle]$ srvctl start instance -d ORCL -i ORCL1

Example 2. Stop the ORCL database: all its instances and all its services, on all nodes.

[oracle@newserver oracle]$ srvctl stop database -d ORCL

Example 3. Stop the nodeapps on the newserver node. NB: Instances and services also stop.

[oracle@newserver oracle]$ srvctl stop nodeapps -n newserver

Example 4. Add the ORCL3 instance, which runs on the newserver node, to the ORCL clustered database.

[oracle@newserver oracle]$ srvctl add instance -d ORCL -i ORCL3 -n newserver

Example 5. Add the nodeapps for a new node, newserver, to the cluster.

[oracle@newserver oracle]$ srvctl add nodeapps -n newserver -o $ORACLE_HOME -A 10.177.56.56/255.255.252.0/eth1

(The -A flag precedes an address specification of the form address/netmask/interface.)

Example 6. To change the VIP (virtual IP) on a RAC node, use the command

[oracle@newserver oracle]$ srvctl modify nodeapps -n newserver -A new_address

Example 7. Find out whether the nodeapps on newserver are up.

[oracle@newserver oracle]$ srvctl status nodeapps -n newserver
VIP is running on node: newserver
GSD is running on node: newserver
Listener is not running on node: newserver
ONS daemon is running on node: newserver

Example 8. Disable the ASM instance on newserver for maintenance.

[oracle@newserver oracle]$ srvctl disable asm -n newserver

Debugging srvctl

Debugging srvctl in 10g couldn’t be easier. Simply set the SRVM_TRACE environment variable.

[oracle@newserver bin]$ export SRVM_TRACE=true

Let’s repeat Example 7 with SRVM_TRACE set to true:

[oracle@newserver oracle]$ srvctl status nodeapps -n newserver
/u01/app/oracle/product/10.1.0/jdk/jre//bin/java -classpath /u01/app/oracle/product/10.1.0/jlib/netcfg.jar:/u01/app/oracle/product/10.1.0/jdk/jre//lib/rt.jar:
/u01/app/oracle/product/10.1.0/jdk/jre//lib/i18n.jar:/u01/app/oracle/product/10.1.0/jlib/srvm.jar:
/u01/app/oracle/product/10.1.0/jlib/srvmhas.jar:/u01/app/oracle/product/10.1.0/jlib/srvmasm.jar:
/u01/app/oracle/product/10.1.0/srvm/jlib/srvctl.jar
-DTRACING.ENABLED=true -DTRACING.LEVEL=2 oracle.ops.opsctl.OPSCTLDriver status nodeapps -n newserver
[main] [19:53:31:778] [OPSCTLDriver.setInternalDebugLevel:165] tracing is true at level 2 to file null
[main] [19:53:31:825] [OPSCTLDriver.<init>:94] Security manager is set
[main] [19:53:31:843] [CommandLineParser.parse:157] parsing cmdline args
[main] [19:53:31:844] [CommandLineParser.parse2WordCommandOptions:900] parsing 2-word cmdline
[main] [19:53:31:866] [GetActiveNodes.create:212] Going into GetActiveNodes constructor…
[main] [19:53:31:875] [HASContext.getInstance:191] Module init : 16
[main] [19:53:31:875] [HASContext.getInstance:216] Local Module init : 19 …
[main] [19:53:32:285] [ONS.isRunning:186] Status of ora.ganges.ons on newserver is true
ONS daemon is running on node: newserver
[oracle@newserver oracle]$
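
Trace output like the above is easier to scan once it is split into fields. The sketch below assumes every trace line follows the "[thread] [time] [Class.method:line] message" layout seen in this output; that layout is inferred from the sample, not from Oracle documentation, and the function name is made up.

```python
import re

# [thread] [hh:mm:ss:ms] [Class.method:line] message
TRACE_LINE = re.compile(
    r"^\[(?P<thread>[^\]]+)\] \[(?P<time>[^\]]+)\] \[(?P<location>[^\]]+)\]\s*(?P<message>.*)$"
)


def parse_trace_line(line: str):
    """Return the fields of one SRVM_TRACE line, or None for ordinary output."""
    match = TRACE_LINE.match(line.strip())
    if match is None:
        return None
    method, _, lineno = match.group("location").rpartition(":")
    return {
        "thread": match.group("thread"),
        "time": match.group("time"),
        "method": method,
        "line": int(lineno),
        "message": match.group("message"),
    }


sample = "[main] [19:53:32:285] [ONS.isRunning:186] Status of ora.ganges.ons on newserver is true"
print(parse_trace_line(sample)["method"])  # ONS.isRunning
```

Lines that are ordinary srvctl output, such as "ONS daemon is running on node: newserver", do not match the pattern and come back as None, which makes it easy to separate trace noise from the actual result.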

Pitfalls

A little impatience when dealing with srvctl can corrupt your OCR, i.e., put it into a state where the information for a given object is inconsistent or partially missing. Specifically, the srvctl remove command provides the -f option, which forces removal of an object from the OCR. Use this option judiciously, as it can easily leave the OCR in an inconsistent state.

August 31, 2008 Posted by sdevang | Real Application Cluster

11G New Features

Active database Duplication using RMAN

===========================


Starting with 11g, a database can be duplicated without a prior copy of the database backup on the destination. Prior to 11g, duplication required the source database, a copy of a backup on the destination, and the destination database.

Beginning with 11g you can use RMAN or Enterprise Manager to create a duplicate database online. This feature instructs the source database to perform online image copies and archived log copies directly to the destination (auxiliary) instance. Preexisting backups are not required.

RMAN> DUPLICATE TARGET DATABASE
TO db_duplicate
FROM ACTIVE DATABASE
SPFILE PARAMETER_VALUE_CONVERT '/u02', '/u03'
SET SGA_MAX_SIZE = '500M'
SET SGA_TARGET = '250M'
SET LOG_FILE_NAME_CONVERT = '/u02', '/u03'
DB_FILE_NAME_CONVERT = '/u02', '/u03';

With the above command, the target database is duplicated to the database “db_duplicate” and the database file locations are changed from /u02 to /u03. Make sure that the /u03 partition already exists on the OS side.
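
The effect of the name-conversion clauses can be pictured as a simple pattern substitution on each file name. The sketch below is a simplified model of that behaviour, not RMAN's actual implementation, and the datafile paths are hypothetical.

```python
def convert_name(path: str, pattern: str, replacement: str) -> str:
    """Replace the first occurrence of pattern in path (simplified model)."""
    return path.replace(pattern, replacement, 1)


# Hypothetical source file names
source_files = [
    "/u02/oradata/orcl/system01.dbf",
    "/u02/oradata/orcl/users01.dbf",
]

for name in source_files:
    print(convert_name(name, "/u02", "/u03"))
# /u03/oradata/orcl/system01.dbf
# /u03/oradata/orcl/users01.dbf
```

Because the substitution is purely textual, both the pattern and the replacement should be written as full path prefixes, leading slash included.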

ASM Fast Mirror Resync

================

- In 10g, ASM assumes that an offline disk contains only stale data and reads nothing from it. Once a disk is put offline, ASM drops it from the disk group and recreates all of its extents from the mirror copies, a fairly time-consuming process that may take hours.
- ASM fast mirror resync significantly reduces the time required to resynchronize a disk after a transient failure. With this feature, when a disk goes offline ASM tracks all changes to its extents during the outage; when the disk comes back online, ASM quickly resyncs ONLY the extents that changed while it was offline.

- DISK_REPAIR_TIME is the attribute that must be set on the disk group. It determines how long a disk outage the ASM instance will tolerate while still being able to resync.

How to setup ASM fast mirror resync:

1. Use ALTER DISKGROUP to set the DISK_REPAIR_TIME attribute. The two statements below are equivalent, since 3.5 hours is 210 minutes.

ALTER DISKGROUP diskgrp1 SET ATTRIBUTE 'disk_repair_time' = '3.5h';
or
ALTER DISKGROUP diskgrp1 SET ATTRIBUTE 'disk_repair_time' = '210m';

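2. Use the ALTER DISKGROUP ... ONLINE DISK statement to bring the disk back online once the fault is repaired.

ALTER DISKGROUP diskgrp1 ONLINE DISK diskgrp_1;

A disk can also be taken offline manually, overriding the repair time for that operation. For example:

ALTER DISKGROUP diskgrp1 OFFLINE DISK diskgrp_1 DROP AFTER 10m;

This takes disk diskgrp_1 offline and drops it after 10 minutes unless it is brought back online first.

3. Query the V$ASM_OPERATION view while running any of the ALTER DISKGROUP commands; it displays the name and current state of the operation you are performing.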
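
The DISK_REPAIR_TIME values in step 1 use an hour or minute suffix. As a quick check that two settings describe the same window, this sketch converts such a value to minutes. The function name is hypothetical, and it only handles the h and m suffixes that appear in this post.

```python
def repair_time_minutes(value: str) -> float:
    """Convert a disk_repair_time string such as '3.5h' or '210m' to minutes."""
    value = value.strip().lower()
    if value.endswith("h"):
        return float(value[:-1]) * 60
    if value.endswith("m"):
        return float(value[:-1])
    raise ValueError(f"expected a value ending in 'h' or 'm': {value!r}")


print(repair_time_minutes("3.5h"))  # 210.0
print(repair_time_minutes("210m"))  # 210.0
```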

ASM Scalability

- 63 disk groups
- 10,000 ASM disks
- 4 petabytes per ASM disk
- 40 exabytes of storage
- 1 million files per disk group
- Maximum file size:
  External redundancy: 140 PB
  Normal redundancy: 42 PB
  High redundancy: 15 PB
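
The total storage figure follows from the disk limits in the list above. A short arithmetic sketch, assuming decimal prefixes (1 exabyte = 1000 petabytes):

```python
MAX_ASM_DISKS = 10_000
PB_PER_DISK = 4
PB_PER_EB = 1000  # assumption: decimal prefixes

total_pb = MAX_ASM_DISKS * PB_PER_DISK
total_eb = total_pb / PB_PER_EB

print(total_pb)  # 40000
print(total_eb)  # 40.0
```

With binary prefixes (1024 PB per EB) the same product comes to roughly 39 EB, so the 40 exabyte figure is best read as a rounded limit.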

August 31, 2008 Posted by sdevang | 11G, Standalone Oracle Database