To check for disk space.

[root@london ~]# df -h
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda2    26G   16G  8.3G  66% /
/dev/sda1    99M   12M   83M  12% /boot
tmpfs       742M     0  742M   0% /dev/shm
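The Use% column can also be checked automatically. The following is only a minimal sketch: the function name check_disk_usage and the default threshold of 80 are assumptions made for illustration, and it expects one line per filesystem (as produced by df -P) with no spaces in mount point names.

```shell
# Print every mount point whose Use% is at or above a threshold.
# Reads df-style output on stdin. The function name and the default
# threshold (80) are illustrative assumptions, not part of df itself.
check_disk_usage() {
    threshold="${1:-80}"
    awk -v limit="$threshold" '
        NR > 1 {                       # skip the header line
            use = $5
            sub(/%/, "", use)          # strip the trailing percent sign
            if (use + 0 >= limit + 0)
                print $6 " is " use "% full"
        }'
}

# Usage: df -P | check_disk_usage 80
```

Fed the df output shown above with a threshold of 60, this would report only the root filesystem.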
To check the space consumed by a folder.

[root@canada ~]# du -s -h /u01/
5.9G /u01/
To check for system bottlenecks, use vmstat.

Usage:
vmstat 5 displays system resource usage every 5 seconds. Press Ctrl+C to exit.
vmstat 5 10 displays system resource usage every 5 seconds, for up to 10 reports.

[root@london ~]# vmstat 5 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd    free   buff  cache   si   so    bi    bo   in   cs us sy  id wa st
 0  0      0 1003736  32756 369368    0    0   647    50  920  225  4 12  79  5  0
 0  0      0 1003736  32756 369368    0    0     0     0  892  132  0  0  99  0  0
 0  0      0 1003736  32764 369368    0    0     0     5  887  130  0  0  98  1  0
 0  0      0 1003736  32764 369368    0    0     0     0  884  128  0  0  99  0  0
 0  0      0 1003736  32764 369368    0    0     0     0  886  137  0  3  97  0  0
 0  0      0 1003736  32772 369368    0    0     0    13  894  131  0  0 100  0  0
 0  0      0 1003736  32772 369368    0    0     0     0  887  133  0  0  99  0  0
 0  0      0 1003736  32772 369368    0    0     0     0  888  141  0  0 100  0  0
 0  0      0 1003736  32772 369368    0    0     0     0  882  133  0  1  99  0  0
 0  0      0 1003736  32772 369368    0    0     0    12  892  133  0  0 100  0  0
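Output like this can be scanned for the rule of thumb used in this post, that us+sy at or above 80% means the CPU is close to capacity. The sketch below is an illustration under assumptions: the function name flag_busy_cpu is made up, and the us and sy columns are located by their header names because column positions can differ between vmstat versions.

```shell
# Flag vmstat samples where us + sy reaches a threshold (default 80).
# Reads vmstat output on stdin. flag_busy_cpu is a hypothetical helper name.
# Note: the first vmstat sample reports averages since boot.
flag_busy_cpu() {
    threshold="${1:-80}"
    awk -v limit="$threshold" '
        $1 == "r" {                          # column-name header line
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("us" in col) && $1 ~ /^[0-9]+$/ {   # data rows only
            n++
            busy = $(col["us"]) + $(col["sy"])
            if (busy >= limit + 0)
                printf "sample %d: us+sy=%d%%\n", n, busy
        }'
}

# Usage: vmstat 5 10 | flag_busy_cpu 80
```

None of the ten samples above would be flagged at 80, since the highest us+sy value is 16 in the first row.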
vmstat column descriptions.

r      Number of processes in the run queue, waiting for CPU time. If this number stays above the number of CPUs on the server, there is a CPU bottleneck.
b      Number of processes in uninterruptible sleep (blocked queue, waiting for a resource such as filesystem I/O or an inode lock).
swpd   Amount of virtual memory used.
free   Amount of idle memory.
buff   Amount of memory used as buffers.
cache  Amount of memory used as cache.
si     Amount of memory swapped in from disk per second. Swap-in activity occurs when the server is short of RAM, so a sustained high value indicates a memory shortage.
so     Amount of memory swapped out to disk per second.
bi     Blocks read per second from disk.
bo     Blocks written per second to disk.
in     Number of interrupts per second.
cs     Number of context switches per second.
us     Percentage of CPU time spent running non-kernel code (user tasks).
sy     Percentage of CPU time spent running kernel code (system tasks).
id     Percentage of CPU time idle.
wa     Percentage of CPU time spent waiting for I/O, for example disk I/O.
st     Percentage of CPU time stolen from a virtual machine.

Interpret CPU utilization using vmstat.
1) us+sy is greater than or equal to 80% => the CPU is about to reach its load capacity.
2) us+sy is 100% => there is a CPU bottleneck.
3) sy is high => the application is making a lot of system calls to the kernel.

Identify top processes that are consuming server resources.
[root@london ~]# top
top - 19:46:42 up 2:44, 2 users, load average: 1.98, 0.88, 0.40
Tasks: 147 total, 6 running, 141 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.5%us, 64.2%sy, 16.1%ni, 0.0%id, 10.9%wa, 0.9%hi, 2.4%si, 0.0%st
Mem: 1518744k total, 1314932k used, 203812k free, 128716k buffers
Swap: 3068404k total, 0k used, 3068404k free, 957496k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 5374 root      34  19  6272 3772  628 R 30.0  0.2   0:43.26 prelink
 3880 root      15   0 37372  11m 5208 R  5.3  0.8   0:21.19 Xorg
 4218 root      15   0 40716  13m 9332 R  1.3  0.9   0:02.32 gnome-terminal
  520 root      10  -5     0    0    0 D  0.7  0.0   0:01.23 kjournald
  244 root      15   0     0    0    0 S  0.2  0.0   0:00.23 pdflush
 2968 root      15   0  5288 2608 2164 S  0.2  0.2   0:00.83 vmtoolsd
 3478 haldaemo  15   0  6368 4444 1688 S  0.2  0.3   0:04.36 hald
 4085 root      16   0  110m  15m  12m S  0.2  1.1   0:01.11 nautilus
 9134 root      15   0  2332 1060  804 R  0.2  0.1   0:00.03 top
    1 root      18   0  2072  656  560 S  0.0  0.0   0:01.61 init
    2 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/0
    3 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    4 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/0
    5 root      10  -5     0    0    0 S  0.0  0.0   0:00.02 events/0
    6 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 khelper
    7 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kthread
   10 root      10  -5     0    0    0 S  0.0  0.0   0:00.46 kblockd/0
In the first row, the CPU load average is shown over intervals of 1, 5 and 15 minutes. The 5 and 15 minute load averages matter most, because the 1 minute figure reacts to short spikes. On a single-core system, if the 5 and especially the 15 minute load average rises above 1.00, the load on the system is reaching undesirable levels and it is better to eliminate unwanted processes.

Column descriptions for top.

PID    Unique process identifier.
USER   OS username that is running the process.
PR     Priority of the process.
NI     Nice value (negative = higher priority, positive = lower priority).
VIRT   Total virtual memory used by the process.
RES    Non-swapped physical memory used.
SHR    Shared memory used by the process.
S      Process status.
%CPU   Percentage of CPU consumed since the last screen refresh.
%MEM   Percentage of physical memory consumed by the process.
TIME+  Total CPU time, shown to hundredths of a second.

Identify CPU and memory consuming processes.

Top processes consuming CPU.

[root@london ~]# ps -e -o pcpu,pid,user,tty,args | sort -n -k 1 -r | head
21.0  5027 oracle   ?    ora_m000_dup
 5.5 11313 root     ?    /bin/bash /usr/sbin/makewhatis -w
 0.3  8849 oracle   ?    ora_mman_dup
 0.2  3880 root     tty7 /usr/bin/Xorg :0 -br -audit 0 -auth /var/gdm/:0.Xauth -nolisten tcp vt7
 0.1  9023 oracle   ?    ora_cjq0_dup
 0.1  8861 oracle   ?    ora_mmon_dup
 0.1  8857 oracle   ?    ora_smon_dup
%CPU   PID USER     TT   COMMAND
 0.0  9037 oracle   ?    ora_q001_dup
 0.0  9035 oracle   ?    ora_q000_dup
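In the output above the header line lands in the middle of the list, because sort -n treats "%CPU" as 0 and sorts it among the 0.0 rows. One possible way to keep the header on top is sketched below. The function name top_by_first_column is an assumption for illustration; on Linux systems with procps, ps -e -o pcpu,pid,user,tty,args --sort=-pcpu | head is another option.

```shell
# Sort ps-style output numerically (descending) on the first column while
# keeping the header line first. Reads stdin; the optional argument is the
# number of rows to show. top_by_first_column is a hypothetical helper name.
top_by_first_column() {
    rows="${1:-10}"
    {
        IFS= read -r header              # first line is the column header
        printf '%s\n' "$header"
        sort -n -k 1 -r | head -n "$rows"
    }
}

# Usage: ps -e -o pcpu,pid,user,tty,args | top_by_first_column 10
```

The same helper works for the memory listing, since pmem is also the first column there.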
Top processes consuming memory.

[root@london ~]# ps -e -o pmem,pid,user,tty,args | sort -n -k 1 -r | head
 4.7  8857 oracle   ?    ora_smon_dup
 3.6  8861 oracle   ?    ora_mmon_dup
 3.2  9023 oracle   ?    ora_cjq0_dup
 2.1  8853 oracle   ?    ora_lgwr_dup
 1.9  8949 oracle   ?    ora_arc3_dup
 1.9  8947 oracle   ?    ora_arc2_dup
 1.9  8945 oracle   ?    ora_arc1_dup
 1.9  8936 oracle   ?    ora_arc0_dup
 1.8  9035 oracle   ?    ora_q000_dup
 1.7 12644 oracle   ?    ora_w000_dup
Shortcut trick: create aliases for the above commands and use those instead. This is similar to synonyms in a database.

alias topc='ps -e -o pcpu,pid,user,tty,args | sort -n -k 1 -r | head'
alias topm='ps -e -o pmem,pid,user,tty,args | sort -n -k 1 -r | head'

Next time, instead of typing the full command, just use the alias.

Example:

[root@london ~]# topc
 5.7 11313 root     ?    /bin/bash /usr/sbin/makewhatis -w
 0.2  8849 oracle   ?    ora_mman_dup
 0.2  3880 root     tty7 /usr/bin/Xorg :0 -br -audit 0 -auth /var/gdm/:0.Xauth -nolisten tcp vt7
 0.1  8861 oracle   ?    ora_mmon_dup
 0.1  8857 oracle   ?    ora_smon_dup
%CPU   PID USER     TT   COMMAND
 0.0  9037 oracle   ?    ora_q001_dup
 0.0  9035 oracle   ?    ora_q000_dup
 0.0  9023 oracle   ?    ora_cjq0_dup
 0.0  8988 oracle   ?    ora_qmnc_dup

[root@london ~]# topm
 4.8  8857 oracle   ?    ora_smon_dup
 3.6  8861 oracle   ?    ora_mmon_dup
 3.2  9023 oracle   ?    ora_cjq0_dup
 2.1  8853 oracle   ?    ora_lgwr_dup
 1.9  8949 oracle   ?    ora_arc3_dup
 1.9  8947 oracle   ?    ora_arc2_dup
 1.9  8945 oracle   ?    ora_arc1_dup
 1.9  8936 oracle   ?    ora_arc0_dup
 1.8  9035 oracle   ?    ora_q000_dup
 1.7  8851 oracle   ?    ora_dbw0_dup

Identify I/O problems.

iostat 10 shows device statistics every 10 seconds.

[root@london ~]# iostat 10
Linux 2.6.18-164.el5 (london) 03/11/2012

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.96    1.42   11.95    4.54    0.00   81.13

Device:      tps  Blk_read/s  Blk_wrtn/s  Blk_read  Blk_wrtn
sda        11.36      279.51      213.89   2322996   1777614
sda1        0.02        0.24        0.00      1992        22
sda2       11.31      279.03      213.88   2319010   1777592
sda3        0.02        0.19        0.00      1618         0
Column descriptions.

Device      Device or partition name.
tps         I/O transfers per second to the device.
Blk_read/s  Blocks read per second from the device.
Blk_wrtn/s  Blocks written per second to the device.
Blk_read    Total number of blocks read.
Blk_wrtn    Total number of blocks written.

iostat with extended statistics.

iostat -xd 5 (x is for extended statistics, d is for disk-only statistics)

Device:  rrqm/s   wrqm/s    r/s    w/s    rsec/s    wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda       43.37  4462.65  65.46  89.56  17712.45  36642.57   350.63     3.25  20.98   5.14  79.74
sda1       0.00     0.00   0.00   0.00      0.00      0.00     0.00     0.00   0.00   0.00   0.00
sda2      39.96  4462.65  63.86  89.56  17672.29  36642.57   354.04     3.24  21.13   5.20  79.74
sda3       3.41     0.00   1.61   0.00     40.16      0.00    25.00     0.01   6.12   3.62   0.58
hdc        0.00     0.00   0.00   0.00      0.00      0.00     0.00     0.00   0.00   0.00   0.00

rrqm/s and wrqm/s
The number of merged read and write requests queued per second. “Merged” means the operating system took multiple logical requests and grouped them into a single request to the actual device.

r/s and w/s
The number of read and write requests sent to the device per second.

rsec/s and wsec/s
The number of sectors read and written per second. Some systems also output rkB/s and wkB/s, the number of kilobytes read and written per second.

avgrq-sz
The average request size in sectors.

avgqu-sz
The average number of requests waiting in the device’s queue.

await
The average number of milliseconds required to respond to requests, including queue time and service time. Unfortunately, the iostat version shown here doesn’t show separate statistics for read and write requests, which are so different that they really shouldn’t be averaged together. However, you can probably chalk up high I/O waits to reads, because writes can often be buffered but reads usually have to be served directly from the spindles.

svctm
The average number of milliseconds the device spends servicing a request. Unlike await, it does not include the time a request spends waiting in the queue, which is why svctm (5.14 for sda above) is lower than await (20.98).
%util
The percentage of CPU time during which I/O requests were issued to the device. This really shows the device utilization, as the name implies, because when the value approaches 100%, the device is saturated.

Of these columns, the most important are await, %util and avgqu-sz. In the above example, the sda2 partition has the highest I/O utilization at 79.74%, with an average wait of 21.13 ms and 3.24 requests waiting in the device queue.

To identify the number of processors in Linux.

[root@canada ~]# cat /proc/cpuinfo | grep processor | wc -l
1

Check memory usage.

[root@canada sa]# free -m
             total       used       free     shared    buffers     cached
Mem:          1010        749        260          0        121        488
-/+ buffers/cache:        140        870
Swap:         2047          0       2047

In this output, the 2nd row (-/+ buffers/cache) gives the accurate picture of actual memory utilization, because it leaves out the memory held in buffers and cache: 749 - 121 - 488 = 140 MB used. The 3rd row gives the details of swap utilization.

Command or utility to print directory structure.

Using ls -R

[root@canada /]# ls -R /u02 | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/ /' -e 's/-/|/'
 /u02
 |-app
 |---oracle
 |-----product
 |-------10.2.0
 |---------db_1

Using the tree command

[root@canada /]# tree /u02
/u02
`-- app
    `-- oracle
        `-- product
            `-- 10.2.0
                `-- db_1

5 directories, 0 files
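Where the tree command is not installed and the sed pipeline is hard to remember, a directory listing can also be built from find. This is only a sketch under assumptions: print_dir_tree is a made-up name, the starting path is given without a trailing slash, and directory names contain no newlines.

```shell
# Print the directories under a starting path as an indented tree.
# print_dir_tree is a hypothetical helper name. find lists a directory
# before its contents, so children always follow their parent.
print_dir_tree() {
    root="${1:-.}"
    find "$root" -type d | awk -v root="$root" '
        $0 == root { print; next }               # the starting directory itself
        {
            rel = substr($0, length(root) + 2)   # path below root, no leading slash
            depth = split(rel, parts, "/")       # number of path components
            indent = ""
            for (i = 1; i < depth; i++) indent = indent "    "
            print indent "|-- " parts[depth]
        }'
}

# Usage: print_dir_tree /u02
```

Sibling directories appear in whatever order find returns them, which is not necessarily alphabetical.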
Monday, March 12, 2012
Tools & commands for analyzing server performance.