Thursday, June 30, 2011

Performance Tuning on Linux for Oracle Database

  • Are you using all the resources available to you?
  • Many default settings in Linux suck
  • The font server for X Windows runs as a daemon by default, but do you need it?
  • Check out these tunings that can give you lots of computing juice...

Kernel

To run enterprise applications such as a database server on your Linux distribution, you may need to raise some of the default kernel parameter settings. For example, in the 2.4.x series kernel the default value of the message queue parameter msgmni allows only a limited number of simultaneous connections to a database, and the default shared memory limit, shmmax, is only 33,554,432 bytes (32 MB) on Red Hat Linux. Here are some values recommended by the IBM DB2 Support Web site for database servers to run optimally:

- kernel.shmmax=268435456 for 32-bit
- kernel.shmmax=1073741824 for 64-bit
- kernel.msgmni=1024
- fs.file-max=8192
- kernel.sem="250 32000 32 1024"
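
To see which of these recommendations a machine already meets, each current value can be compared with the recommended one. The following is a minimal sketch, not a definitive tool: compare_param is a hypothetical helper name, it handles single-number parameters only (not the four-field kernel.sem), and it assumes the values fit in the shell's integer range.

```shell
#!/bin/sh
# compare_param NAME RECOMMENDED CURRENT
# Hypothetical helper: report whether CURRENT reaches RECOMMENDED.
compare_param() {
    name=$1
    recommended=$2
    current=$3
    if [ "$current" -lt "$recommended" ]; then
        echo "$name: $current is below recommended $recommended"
    else
        echo "$name: $current meets recommended $recommended"
    fi
}

# Example on a Linux host (reads the live value from /proc):
#   compare_param kernel.msgmni 1024 "$(cat /proc/sys/kernel/msgmni)"
```

Passing the current value as an argument keeps the comparison separate from how the value is read, so the same helper works with /proc or with sysctl output.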

Shared Memory

To view the current setting, run:
# more /proc/sys/kernel/shmmax
To set a new value on the running system (it takes effect immediately but is lost at reboot), run:
# echo 268435456 > /proc/sys/kernel/shmmax
To set a new value permanently (so it survives reboots), add it to /etc/sysctl.conf. The file is read at boot, and running sysctl -p as root loads it immediately:
...
kernel.shmmax = 268435456
...
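
Editing the file by hand works, but repeated edits tend to leave duplicate entries. The sketch below shows one way to keep a single assignment per key. set_sysctl_conf is a hypothetical helper name, not a standard tool, and the default path assumes the usual /etc/sysctl.conf location.

```shell
#!/bin/sh
# set_sysctl_conf KEY VALUE [FILE]
# Hypothetical helper: leave exactly one "KEY = VALUE" line in FILE
# (default /etc/sysctl.conf), replacing any earlier assignment of KEY.
set_sysctl_conf() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    # Escape the dots in KEY so they match literally in the regex below.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    tmp=$(mktemp) || return 1
    # Copy every line except existing assignments of KEY.
    grep -v -E "^[[:space:]]*${pattern}[[:space:]]*=" "$file" > "$tmp" || true
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Write back through cat so FILE keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example (as root), followed by loading the file into the running kernel:
#   set_sysctl_conf kernel.shmmax 268435456
#   sysctl -p
```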

Semaphores

To view current settings, run command:
# more /proc/sys/kernel/sem 
250 32000 32 1024 
To set new values on the running system, which take effect immediately, run:
# echo 500 512000 64 2048 > /proc/sys/kernel/sem
The four fields, in order, mean:
SEMMSL - max semaphores per semaphore set (ID)
SEMMNS - max semaphores in the whole system; no more than SEMMNI*SEMMSL can ever be in use
SEMOPM - max operations per semop call
SEMMNI - max semaphore identifiers (sets)
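
Because the four numbers are positional, it is easy to misread them. The sketch below labels each field and prints the SEMMNI*SEMMSL product for comparison with SEMMNS. sem_fields is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# sem_fields "SEMMSL SEMMNS SEMOPM SEMMNI"
# Hypothetical helper: label the four fields of /proc/sys/kernel/sem.
sem_fields() {
    # Split the whitespace-separated fields into $1..$4.
    set -- $1
    if [ $# -ne 4 ]; then
        echo "expected 4 fields, got $#" >&2
        return 1
    fi
    echo "SEMMSL=$1 SEMMNS=$2 SEMOPM=$3 SEMMNI=$4"
    # Upper bound on semaphores that could exist: sets * semaphores per set.
    echo "SEMMNI*SEMMSL=$(( $4 * $1 ))"
}

# Example on a Linux host:
#   sem_fields "$(cat /proc/sys/kernel/sem)"
```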

ulimits

To view current settings, run command:
# ulimit -a
To set a new value for the current shell and the processes it starts, run:
# ulimit -n 8800
# ulimit -n 65536   # example of a high limit, recommended if server isn't shared; -1 is not a valid value, and Linux normally rejects "unlimited" for open files

Alternatively, if you want the changes to survive reboots, do the following:

- Exit all shell sessions for the user whose limits you want to change.
- As root, edit the file /etc/security/limits.conf and add these two lines toward the end:
 user1        soft    nofile          16000
 user1        hard    nofile          20000
  ** The two lines above change the maximum number of open file handles (nofile) to the new soft and hard limits.
- Save the file.
- Log in as user1 again. The new limits will be in effect.
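
After logging back in, it is worth confirming that the new limit was actually picked up. A minimal sketch, assuming a POSIX-style shell whose ulimit accepts -S and -H with -n (bash and dash do); nofile_at_least is a hypothetical helper name.

```shell
#!/bin/sh
# nofile_at_least N
# Hypothetical helper: succeed if the soft open-file limit of the
# current shell is at least N.
nofile_at_least() {
    want=$1
    soft=$(ulimit -Sn)
    # Some systems may report "unlimited" instead of a number.
    [ "$soft" = "unlimited" ] && return 0
    [ "$soft" -ge "$want" ]
}

# Example, run as user1 after logging in again:
#   nofile_at_least 16000 && echo "soft limit OK" || echo "soft limit too low"
#   ulimit -Hn    # shows the hard limit
```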


Message queues

To view current settings, run command:
# more /proc/sys/kernel/msgmni
# more /proc/sys/kernel/msgmax
To set new values on the running system, which take effect immediately, run:
# echo 2048 > /proc/sys/kernel/msgmni
# echo 64000 > /proc/sys/kernel/msgmax
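
The files under /proc/sys used above and the dotted names used in sysctl.conf refer to the same parameters: the dots in the name become directory separators. A small sketch of that mapping follows; sysctl_path is a hypothetical helper name, and the simple translation assumes no component of the name itself contains a dot.

```shell
#!/bin/sh
# sysctl_path NAME
# Hypothetical helper: print the /proc/sys file for a dotted sysctl name.
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Example on a Linux host:
#   cat "$(sysctl_path kernel.msgmni)"
```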

Network




Drivers for gigabit network interfaces expose many performance-related parameters, such as CPU affinity. TCP itself can also be tuned to increase network throughput for connection-hungry applications.





Tune TCP

To view a current TCP setting, run:
# sysctl net.ipv4.tcp_keepalive_time
net.ipv4.tcp_keepalive_time = 7200     (seconds, i.e. 2 hours)
where net.ipv4.tcp_keepalive_time is the name of a TCP tuning parameter.
To set a TCP parameter to a new value, run:
# sysctl -w net.ipv4.tcp_keepalive_time=1800
A list of recommended TCP parameters, values, and their meanings:
Tuning Parameter          Tuning Value          Description of impact
------------------------------------------------------------------------------
net.ipv4.tcp_tw_reuse     1                     Reuse sockets in the time-wait state
net.ipv4.tcp_tw_recycle   1                     Reuse sockets in the time-wait state
---
net.core.wmem_max         8388608               Increase the maximum write buffer queue size
---
net.core.rmem_max         8388608               Increase the maximum read buffer queue size
---
net.ipv4.tcp_rmem         4096 87380 8388608    Set the minimum, initial, and maximum sizes for the
                                                read buffer. Note that this maximum should be less
                                                than or equal to the value set in net.core.rmem_max.
---
net.ipv4.tcp_wmem         4096 87380 8388608    Set the minimum, initial, and maximum sizes for the
                                                write buffer. Note that this maximum should be less
                                                than or equal to the value set in net.core.wmem_max.
---
timeout_timewait          30                    Set with: echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
                                                Determines the time that must elapse before TCP/IP can
                                                release a closed connection and reuse its resources.
                                                This interval between closure and release is known as the
                                                TIME_WAIT state or twice the maximum segment lifetime
                                                (2MSL) state. During this time, reopening the connection
                                                to the client and server costs less than establishing a
                                                new connection. By reducing the value of this entry,
                                                TCP/IP can release closed connections faster, providing
                                                more resources for new connections. Adjust this parameter
                                                if the application needs closed connections released
                                                quickly and new ones created rapidly, or if throughput is
                                                low because many connections sit in the TIME_WAIT state.
                                                Strictly speaking, on Linux tcp_fin_timeout sets how long
                                                an orphaned connection may stay in the FIN_WAIT_2 state.
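
The table notes that the maximum in net.ipv4.tcp_rmem and net.ipv4.tcp_wmem should not exceed net.core.rmem_max and net.core.wmem_max. That rule can be checked before applying new values. The following is a sketch only, and tcp_buffer_max_ok is a hypothetical helper name.

```shell
#!/bin/sh
# tcp_buffer_max_ok "MIN INITIAL MAX" CEILING
# Hypothetical helper: succeed if MAX (third field of a tcp_rmem or
# tcp_wmem triple) is less than or equal to CEILING (the matching
# net.core.rmem_max or net.core.wmem_max value).
tcp_buffer_max_ok() {
    # After this, $1..$3 hold the triple and $4 holds the ceiling.
    set -- $1 "$2"
    [ $# -eq 4 ] || return 2
    [ "$3" -le "$4" ]
}

# Example on a Linux host:
#   tcp_buffer_max_ok "$(sysctl -n net.ipv4.tcp_rmem)" \
#                     "$(sysctl -n net.core.rmem_max)" || echo "raise rmem_max"
```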

Disk I/O




Choose the Right File System
Use the 'ext3' file system in Linux.
- It is an enhanced version of ext2
- It adds journaling, which gives a high level of data integrity (in the event of an unclean shutdown)
- It does not need a time-consuming full disk check after an unclean shutdown and reboot; the journal is replayed instead
- Faster writes - ext3 journaling optimizes hard drive head motion
To create an ext3 file system, run mke2fs with the -j (journal) option:

# mke2fs -j -b 2048 -i 4096 /dev/sda
mke2fs 1.32 (09-Nov-2002)
/dev/sda is entire device, not just one partition!
Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=2048 (log=1)
Fragment size=2048 (log=1)
13107200 inodes, 26214400 blocks
1310720 blocks (5.00%) reserved for the super user
First data block=0
1600 block groups
16384 blocks per group, 16384 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        16384, 49152, 81920, 114688, 147456, 409600, 442368, 802816, 1327104,
        2048000, 3981312, 5619712, 10240000, 11943936

Writing inode tables: done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 28 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
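
In the command above, -b sets the block size in bytes and -i sets the bytes-per-inode ratio, so the inode count in the output can be estimated from the other figures. A minimal sketch of that arithmetic follows. expected_inodes is a hypothetical helper name, a shell with 64-bit arithmetic is assumed, and mke2fs may round the real count to whole block groups.

```shell
#!/bin/sh
# expected_inodes BLOCKS BLOCK_SIZE BYTES_PER_INODE
# Hypothetical helper: one inode for every BYTES_PER_INODE bytes of the
# file system, whose total size is BLOCKS * BLOCK_SIZE bytes.
expected_inodes() {
    echo $(( $1 * $2 / $3 ))
}

# Figures from the mke2fs run above: 26214400 blocks of 2048 bytes, -i 4096.
#   expected_inodes 26214400 2048 4096    # prints 13107200
```

A smaller -i value creates more inodes, which helps when the file system will hold many small files, at the cost of space for the inode tables.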

Use 'noatime' File System Mount Option
Add the 'noatime' option to the file system entries in the boot-time configuration file /etc/fstab. With noatime, the kernel does not update a file's access time on every read, which avoids extra disk writes. This option works best if external storage, for example a SAN, is used:


# more /etc/fstab
LABEL=/                 /                       ext3    defaults        1 1
none                    /dev/pts                devpts  gid=5,mode=620  0 0
none                    /proc                   proc    defaults        0 0
none                    /dev/shm                tmpfs   defaults        0 0
/dev/sdc2               swap                    swap    defaults        0 0
/dev/cdrom              /mnt/cdrom              udf,iso9660 noauto,owner,kudzu,ro 0 0
/dev/fd0                /mnt/floppy             auto    noauto,owner,kudzu 0 0
/dev/sda                /database               ext3    defaults,noatime 1 2
/dev/sdb                /logs                   ext3    defaults,noatime 1 2
/dev/sdc                /multimediafiles        ext3    defaults,noatime 1 2
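
To confirm that a given mount point carries the option, the fourth field of its fstab entry can be inspected. The following is a sketch under the assumption of the standard six-column fstab layout; has_noatime is a hypothetical helper name.

```shell
#!/bin/sh
# has_noatime MOUNTPOINT [FSTAB]
# Hypothetical helper: succeed if the fstab entry for MOUNTPOINT lists
# "noatime" among its comma-separated mount options (fourth column).
has_noatime() {
    mountpoint=$1
    fstab=${2:-/etc/fstab}
    awk -v mp="$mountpoint" '
        $1 !~ /^#/ && $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "noatime") found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$fstab"
}

# Example:
#   has_noatime /database && echo "/database is mounted with noatime"
```

Note that fstab only describes what happens at the next mount; an already mounted file system keeps its old options until it is remounted.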

Tune the Elevator Algorithm in Linux Kernel for Disk I/O
Once the file system is chosen, several kernel and mount options affect its performance. One such kernel setting is the elevator algorithm. Tuning the elevator algorithm helps the system balance the need for low latency with the need to collect enough data to efficiently organize batches of read and write requests to the disk. The elevator algorithm can be adjusted with the following command:

# elvtune -r 1024 -w 2048 /dev/sda
/dev/sda elevator ID 2 
read_latency: 1024 
write_latency: 2048 
max_bomb_segments: 6 
The parameters are: read latency (-r), write latency (-w) and the device affected. 
Red Hat recommends using a read latency half the size of the write latency (as shown). 
As usual, to make this setting permanent, add the 'elvtune' command to the 
/etc/rc.d/rc.local script.
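
When adding the command to rc.local from a script, it helps to avoid appending the same line twice on repeated runs. A minimal sketch follows; append_once is a hypothetical helper name.

```shell
#!/bin/sh
# append_once LINE FILE
# Hypothetical helper: append LINE to FILE unless an identical line is
# already present.
append_once() {
    line=$1
    file=$2
    # -x: match the whole line, -F: fixed string, -q: no output.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (as root):
#   append_once 'elvtune -r 1024 -w 2048 /dev/sda' /etc/rc.d/rc.local
```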

Others

Disable Unnecessary Daemons (They Take up Memory and CPU)
There are daemons (background services) running on every server that are probably not needed. Disabling these daemons frees memory, decreases startup time, and decreases the number of processes that the CPU has to handle. A side benefit to this is increased security of the server because fewer daemons mean fewer exploitable processes.


Some example Linux daemons that run by default and can usually be disabled on a database server are listed below. To disable one, for example sendmail, use either command:
# /sbin/chkconfig --levels 2345 sendmail off
# /sbin/chkconfig sendmail off
Daemon      Description
------------------------------------------------------------------------------
apmd        Advanced power management daemon
autofs      Automatically mounts file systems on demand (e.g., mounts a CD-ROM automatically)
cups        Common UNIX Printing System
hpoj        HP OfficeJet support
isdn        ISDN modem support
netfs       Used in support of exporting NFS shares
nfslock     Used for file locking with NFS
pcmcia      PCMCIA support on a server
rhnsd       Red Hat Network update service for checking for updates and security errata
sendmail    Mail Transport Agent
xfs         Font server for X Windows
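
With several daemons to disable, it can be safer to generate the chkconfig commands first and review them before running anything. The sketch below only prints the commands; disable_daemon_commands is a hypothetical helper name, and the list of daemons passed to it is up to you.

```shell
#!/bin/sh
# disable_daemon_commands DAEMON...
# Hypothetical helper: print (not run) one chkconfig command per daemon.
disable_daemon_commands() {
    for daemon in "$@"; do
        echo "/sbin/chkconfig --levels 2345 $daemon off"
    done
}

# Review the output first, then pipe it to a root shell to apply it:
#   disable_daemon_commands apmd cups isdn pcmcia
#   disable_daemon_commands apmd cups isdn pcmcia | sh
```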

Shutdown GUI
Normally, there is no need for a GUI on a Linux server. All administration tasks can be done from the command line, by redirecting the X display, or through a Web browser interface.

To set the initial runlevel of a machine at boot to 3 (text mode) instead of 5 (GUI),
modify the /etc/inittab file as shown: