Huge Page Setup On Linux
What are hugepages and what are their advantages?
Hugepages is a mechanism that allows the Linux kernel to utilise the multiple page size capabilities of modern hardware architectures. Linux uses pages as the basic unit of memory – physical memory is partitioned and accessed using the basic page unit. The default page size is 4096 Bytes in the x86 architecture.
Hugepages allows large amounts of memory to be utilised with reduced overhead. The CPU provides a hardware cache called the “Translation Lookaside Buffer” (TLB), which holds mappings of virtual memory addresses to actual physical memory addresses. The TLB is a limited hardware resource, so utilising a huge amount of physical memory with the default page size exhausts the TLB and adds processing overhead.
The Linux kernel can set aside a portion of physical memory to be addressed using a larger page size. Since each page is larger, fewer pages are needed to cover the same amount of memory, so there is less overhead managing the pages with the TLB.
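To make the overhead argument concrete, the sketch below counts how many page mappings are needed to cover a region of memory at each page size. The 10 GB region and the function name are illustrative assumptions, not part of any kernel interface.

```python
def pages_needed(region_bytes: int, page_bytes: int) -> int:
    """Number of pages required to map region_bytes, rounded up."""
    return -(-region_bytes // page_bytes)  # ceiling division


GIB = 1024 ** 3
region = 10 * GIB  # illustrative 10 GB region (assumption)

small = pages_needed(region, 4096)         # default x86 page size
huge = pages_needed(region, 2048 * 1024)   # 2048 kB huge page

print("4 kB pages:", small)
print("2 MB pages:", huge)
print("reduction factor:", small // huge)
```

Each page needs its own mapping, so the same region needs far fewer TLB entries when it is backed by huge pages.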
In the Linux 2.6 series of kernels, hugepages is enabled using the CONFIG_HUGETLB_PAGE feature when the kernel is built. All kernels supplied by Red Hat for the Red Hat Enterprise Linux 4 release have the feature enabled.
Systems with a large amount of memory can be configured to utilise the memory more efficiently by setting aside a portion dedicated to hugepages. The actual size of the page depends on the system architecture. A typical x86 system will have a huge page size of 2048 kB. The huge page size can be found by looking at /proc/meminfo:
# cat /proc/meminfo |grep Hugepagesize
Hugepagesize: 2048 kB
Follow these steps to enable Oracle to use hugepages.
1. Hugepage counting formula
(SGA_SIZE / 2M) + 100
So for a 10G SGA, the hugepage count should be set to:
((10*1024)M / 2M) + 100 = 5220
You can find the SGA size with the show sga command at the sqlplus prompt.
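The formula in step 1 can be written as a small helper. This is a minimal sketch: the function and parameter names are made up for illustration, and the 2 MB page size and the headroom of 100 pages are simply the values used in the formula above.

```python
def hugepages_for_sga(sga_mb: int, hugepage_mb: int = 2, headroom: int = 100) -> int:
    """Apply (SGA_SIZE / hugepage size) + headroom, rounding the division up."""
    pages = -(-sga_mb // hugepage_mb)  # ceiling division
    return pages + headroom


# 10G SGA with 2 MB huge pages, as in the worked example
print(hugepages_for_sga(10 * 1024))
```

If /proc/meminfo reports a different Hugepagesize on your system, pass that size (in MB) as hugepage_mb instead of relying on the default.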
2. Put the following kernel parameter into the /etc/sysctl.conf file.
vm.nr_hugepages = 5220
3. Run the following command.
# sysctl -p
4. Modify the /etc/security/limits.conf file to include the following entries.
oracle soft memlock 20086560
oracle hard memlock 20086560
Then reboot the machine.
5. Check the total and free hugepage counts using the following command.
# cat /proc/meminfo | grep -i huge
This should return output like the following.
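The steps above do not say how the memlock value was chosen. A common rule of thumb, stated here as an assumption rather than something taken from this procedure, is that memlock (expressed in kB) should be at least as large as the hugepage pool. The sketch below checks the configured value against that lower bound; the function name is hypothetical.

```python
def min_memlock_kb(nr_hugepages: int, hugepage_kb: int = 2048) -> int:
    """Size of the hugepage pool in kB (assumed lower bound for memlock)."""
    return nr_hugepages * hugepage_kb


required = min_memlock_kb(5220)   # pool size for vm.nr_hugepages = 5220
configured = 20086560             # value from limits.conf above

print("pool size (kB):", required)
print("memlock covers pool:", configured >= required)
```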
HugePages_Total: 5220
HugePages_Free: 5220
HugePages_Rsvd: 00
Hugepagesize: 2048 kB
Also check the memlock setting using the ulimit -a command.
6. Start the Oracle DB and check whether hugepages are being allocated by using the following command.
# cat /proc/meminfo | grep -i huge
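The check in steps 5 and 6 can also be scripted. The sketch below parses the hugepage lines of /proc/meminfo and reports how many pages are in use; it runs against the sample output from step 5 so that it works anywhere. The function names are illustrative assumptions.

```python
SAMPLE = """\
HugePages_Total: 5220
HugePages_Free: 5220
HugePages_Rsvd: 00
Hugepagesize: 2048 kB
"""


def parse_hugepage_info(text: str) -> dict:
    """Return the hugepage-related counters from /proc/meminfo-style text."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and "huge" in key.lower():
            info[key.strip()] = int(fields[0])  # drop a trailing unit such as kB
    return info


def pages_in_use(info: dict) -> int:
    """Pages taken out of the pool: total minus free."""
    return info["HugePages_Total"] - info["HugePages_Free"]


info = parse_hugepage_info(SAMPLE)
print(info)
print("in use:", pages_in_use(info))
```

On a live system the same function can be fed the contents of /proc/meminfo. After the database starts, a drop in HugePages_Free or a rise in HugePages_Rsvd suggests that the SGA is using the pool.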
SRVCTL Command (RAC)
srvctl is the tool Oracle recommends that DBAs use to interact with CRS and the cluster registry. Oracle does provide several tools to interface with the cluster registry and CRS more directly, at a lower level. srvctl, in contrast, is well documented and easy to use.
Using srvctl
Even if you are experienced with 9i srvctl, it’s worth taking a look at this section; 9i and 10g srvctl commands are slightly different.
srvctl must be run from the $ORACLE_HOME of the RAC you are administering. The basic format of a srvctl command is
srvctl <command> <target> [options]
where command is one of
enable|disable|start|stop|relocate|status|add|remove|modify|getenv|setenv|unsetenv|config
and the target, or object, can be a database, instance, service, ASM instance, or the nodeapps.
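The command format can be captured in a small helper that assembles the argument list and rejects anything outside the documented command set. This is only a sketch: the function name is hypothetical, and the target keywords (database, instance, service, asm, nodeapps) are assumed from the description and examples in this post.

```python
COMMANDS = {
    "enable", "disable", "start", "stop", "relocate", "status", "add",
    "remove", "modify", "getenv", "setenv", "unsetenv", "config",
}
# Target keywords assumed from the text and examples in this post
TARGETS = {"database", "instance", "service", "asm", "nodeapps"}


def build_srvctl(command: str, target: str, *options: str) -> list:
    """Return an argv list of the form: srvctl <command> <target> [options]."""
    if command not in COMMANDS:
        raise ValueError("unknown srvctl command: " + command)
    if target not in TARGETS:
        raise ValueError("unknown srvctl target: " + target)
    return ["srvctl", command, target, *options]


argv = build_srvctl("start", "instance", "-d", "ORCL", "-i", "ORCL1")
print(" ".join(argv))
```

The resulting list could be handed to subprocess.run on a host where srvctl is installed; the helper does not check that a given command and target combination is one srvctl actually supports.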
Examples
=======
Example 1. Bring up the ORCL1 instance of the ORCL database.
[oracle@newserver oracle]$ srvctl start instance -d ORCL -i ORCL1
Example 2. Stop the ORCL database: all its instances and all its services, on all nodes.
[oracle@newserver oracle]$ srvctl stop database -d ORCL
Example 3. Stop the nodeapps on the newserver node. NB: Instances and services also stop.
[oracle@newserver oracle]$ srvctl stop nodeapps -n newserver
Example 4a. Add the ORCL3 instance, which runs on the newserver node, to the ORCL clustered database.
[oracle@newserver oracle]$ srvctl add instance -d ORCL -i ORCL3 -n newserver
Example 4b. Add a new node, the newserver node, to a cluster.
[oracle@newserver oracle]$ srvctl add nodeapps -n newserver -o $ORACLE_HOME -A 10.177.56.56/255.255.252.0/eth1
(The -A flag precedes an address specification.)
Example 5. To change the VIP (virtual IP) on a RAC node, use the command
[oracle@newserver oracle]$ srvctl modify nodeapps -n newserver -A new_address
Example 6. Find out whether the nodeapps on newserver are up.
[oracle@newserver oracle]$ srvctl status nodeapps -n newserver
VIP is running on node: newserver
GSD is running on node: newserver
Listener is not running on node: newserver
ONS daemon is running on node: newserver
Example 7. Disable the ASM instance on newserver for maintenance.
[oracle@newserver oracle]$ srvctl disable asm -n newserver
Debugging srvctl
Debugging srvctl in 10g couldn’t be easier. Simply set the SRVM_TRACE environment variable.
[oracle@newserver bin]$ export SRVM_TRACE=true
Let’s repeat Example 6 with SRVM_TRACE set to true:
[oracle@newserver oracle]$ srvctl status nodeapps -n newserver
/u01/app/oracle/product/10.1.0/jdk/jre//bin/java -classpath /u01/app/oracle/product/10.1.0/jlib/netcfg.jar:/u01/app/oracle/product/10.1.0/jdk/jre//lib/rt.jar:
/u01/app/oracle/product/10.1.0/jdk/jre//lib/i18n.jar:/u01/app/oracle/product/10.1.0/jlib/srvm.jar:
/u01/app/oracle/product/10.1.0/jlib/srvmhas.jar:/u01/app/oracle/product/10.1.0/jlib/srvmasm.jar:
/u01/app/oracle/product/10.1.0/srvm/jlib/srvctl.jar
-DTRACING.ENABLED=true -DTRACING.LEVEL=2 oracle.ops.opsctl.OPSCTLDriver status nodeapps -n newserver
[main] [19:53:31:778] [OPSCTLDriver.setInternalDebugLevel:165] tracing is true at level 2 to file null
[main] [19:53:31:825] [OPSCTLDriver.<init>:94] Security manager is set
[main] [19:53:31:843] [CommandLineParser.parse:157] parsing cmdline args
[main] [19:53:31:844] [CommandLineParser.parse2WordCommandOptions:900] parsing 2-word cmdline
[main] [19:53:31:866] [GetActiveNodes.create:212] Going into GetActiveNodes constructor…
[main] [19:53:31:875] [HASContext.getInstance:191] Module init : 16
[main] [19:53:31:875] [HASContext.getInstance:216] Local Module init : 19 …
[main] [19:53:32:285] [ONS.isRunning:186] Status of ora.ganges.ons on newserver is true
ONS daemon is running on node: newserver
[oracle@newserver oracle]$
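With tracing on, the normal output is interleaved with trace lines of the form [thread] [time] [Class.method:line] message. The sketch below separates the two with a regular expression. The pattern is inferred from the sample output above and is an assumption about the format, not a documented specification.

```python
import re

# Pattern inferred from the sample trace above (assumption, not a spec)
TRACE_RE = re.compile(
    r"^\[(?P<thread>[^\]]+)\] "
    r"\[(?P<time>[\d:]+)\] "
    r"\[(?P<site>[^\]]+):(?P<line>\d+)\] "
    r"(?P<message>.*)$"
)


def parse_trace_line(line: str):
    """Return the fields of a trace line as a dict, or None for ordinary output."""
    match = TRACE_RE.match(line.strip())
    return match.groupdict() if match else None


trace = ("[main] [19:53:32:285] [ONS.isRunning:186] "
         "Status of ora.ganges.ons on newserver is true")
record = parse_trace_line(trace)
print(record["site"], record["line"], record["message"])

# Ordinary srvctl output does not match and comes back as None
print(parse_trace_line("ONS daemon is running on node: newserver"))
```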
Pitfalls
A little impatience when dealing with srvctl can corrupt your OCR, i.e., put it into a state where the information for a given object is inconsistent or partially missing. Specifically, the srvctl remove command provides the -f option, which forces removal of an object from the OCR. Use this option judiciously, as it can easily put the OCR into an inconsistent state.
11g New Features
Active Database Duplication using RMAN
===========================
Starting from 11g, a database can be duplicated without having a prior copy of the database backup on the destination. Prior to 11g, duplication required the source database, a copy of a backup on the destination, and the destination database.
Beginning with 11g you can use RMAN or Enterprise Manager to create a duplicate database online. This feature instructs the source database to perform online image copies and archived log copies directly to the destination (auxiliary) instance. Preexisting backups are not required.
RMAN> DUPLICATE TARGET DATABASE
TO db_duplicate
FROM ACTIVE DATABASE
SPFILE PARAMETER_VALUE_CONVERT '/u02', '/u03'
SET SGA_MAX_SIZE = '500M'
SET SGA_TARGET = '250M'
SET LOG_FILE_NAME_CONVERT = '/u02', '/u03'
DB_FILE_NAME_CONVERT = '/u02', '/u03';
Using the above command, the target database is duplicated to the database “db_duplicate” and the database file locations are changed from /u02 to /u03. Make sure that the /u03 partition already exists on the OS side.
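The effect of the /u02 to /u03 conversion on a file name can be pictured as a pattern replacement. The sketch below is a rough model only: the exact matching rules Oracle applies may differ, and the data file path is a made-up example.

```python
def convert_name(path: str, pairs) -> str:
    """Apply the first matching (old, new) pair to path; leave it unchanged otherwise."""
    for old, new in pairs:
        if old in path:
            return path.replace(old, new, 1)  # replace the first occurrence only
    return path


pairs = [("/u02", "/u03")]

# Hypothetical data file location on the source database
print(convert_name("/u02/oradata/orcl/system01.dbf", pairs))

# A file outside /u02 is left where it is
print(convert_name("/u01/oradata/orcl/users01.dbf", pairs))
```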
ASM Fast Mirror Resync
================
- In 10g, ASM assumes that an offline disk contains only stale data and reads no data from it. Once a disk is put offline, ASM drops it from the disk group and recreates all the extents from the mirror copy, which is a fairly time-consuming process and may take hours.
- ASM fast mirror resync significantly reduces the time required to resync after a transient failure of any disk. With this feature, when a disk goes offline ASM tracks all the changes made to its extents during the offline time, and when the disk comes back online ASM quickly resyncs ONLY the extents that were affected during the offline period.
- DISK_REPAIR_TIME is the attribute that needs to be set on the disk group. It determines how long a disk outage the ASM instance will tolerate while still being able to resync.
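The difference between a full rebuild and a fast resync can be illustrated with a toy model of a two-way mirror. This is a conceptual sketch only and says nothing about how ASM is implemented internally; the class and its methods are invented for illustration.

```python
class ToyMirror:
    """Two copies of each extent; the mirror copy may go offline."""

    def __init__(self, n_extents: int):
        self.primary = {i: 0 for i in range(n_extents)}
        self.mirror = dict(self.primary)
        self.mirror_online = True
        self.stale = set()  # extents written while the mirror was offline

    def write(self, extent: int, value: int) -> None:
        self.primary[extent] = value
        if self.mirror_online:
            self.mirror[extent] = value
        else:
            self.stale.add(extent)  # remember what must be resynced later

    def offline_mirror(self) -> None:
        self.mirror_online = False

    def online_mirror(self) -> int:
        """Copy only the stale extents back; return how many were copied."""
        copied = len(self.stale)
        for extent in self.stale:
            self.mirror[extent] = self.primary[extent]
        self.stale.clear()
        self.mirror_online = True
        return copied


dg = ToyMirror(1000)
dg.offline_mirror()
dg.write(3, 42)
dg.write(7, 99)
dg.write(3, 43)  # same extent again, still one stale entry
copied = dg.online_mirror()
print("extents resynced:", copied, "of", len(dg.primary))
print("copies identical:", dg.mirror == dg.primary)
```

Only the extents touched during the outage are copied, whereas dropping the disk and rebuilding it would mean recreating all of them.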
How to set up ASM fast mirror resync:
1. Use ALTER DISKGROUP to set the DISK_REPAIR_TIME attribute.
ALTER DISKGROUP diskgrp1 SET ATTRIBUTE 'disk_repair_time' = '3.5h';
or
ALTER DISKGROUP dg01 SET ATTRIBUTE 'disk_repair_time' = '210m';
2. Use the ALTER DISKGROUP ... OFFLINE DISK statement to take a disk offline, and ALTER DISKGROUP ... ONLINE DISK to bring it back online.
For example:
ALTER DISKGROUP diskgrp1 OFFLINE DISK diskgrp_1 DROP AFTER 10m;
This takes disk diskgrp_1 offline and drops it after 10 minutes unless it is brought back online first.
3. Query the V$ASM_OPERATION view while running any ALTER DISKGROUP command; it displays the name and current state of the operation being performed.
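The two values shown in step 1 describe the same outage window, which a quick conversion makes clear. The sketch below handles only the h and m suffixes that appear in this post; the function name is hypothetical.

```python
def repair_time_minutes(value: str) -> float:
    """Convert a disk_repair_time string such as '3.5h' or '210m' to minutes."""
    value = value.strip().lower()
    number, unit = value[:-1], value[-1]
    if unit == "h":
        return float(number) * 60
    if unit == "m":
        return float(number)
    raise ValueError("expected a value ending in h or m: " + value)


print(repair_time_minutes("3.5h"))
print(repair_time_minutes("210m"))
```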
ASM Scalability
- 63 Disk groups
- 10,000 ASM Disks
- 4 Petabytes per ASM disk
- 40 Exabytes of storage
- 1 Million files per disk group
- Maximum file size
External redundancy : 140 PB
Normal redundancy : 42 PB
High redundancy : 15 PB