The vmstat command reports statistics about virtual memory. It is located in /usr/bin, is part of the bos.acct fileset, and is installable from the AIX base installation media.
The vmstat command summarizes the total active virtual memory used by all of the processes in the system, as well as the number of real-memory page frames on the free list. Active virtual memory is defined as the number of virtual-memory working segment pages that have actually been touched. This number can be larger than the number of real page frames in the machine, because some of the active virtual-memory pages may have been written out to paging space.
Syntax:
# vmstat [ -fsviItlw ] [Drives] [ Interval [Count] ]
Useful combinations of the vmstat command
- vmstat or vmstat Interval Count
- vmstat -v
The vmstat command writes virtual memory statistics to standard output. The first line of data is an average since the last system reboot. The example below shows a summary of virtual memory activity since the last system startup.
# vmstat

System configuration: lcpu=4 mem=7168MB ent=0

kthr    memory              page              faults        cpu
----- ----------- ------------------------ ------------ -------------------------
 r  b    avm     fre  re  pi  po  fr  sr  cy  in    sy  cs us sy id wa    pc    ec
 1  1 148162 1649286   0   0   0   0   0   0   0 29121 133  0  0 99  0  0.00   0.2
When determining whether a system might be short on memory, or whether some memory tuning needs to be done, run the vmstat command over a set interval and examine the pi and po columns in the resulting report. These columns indicate the number of paging space page-ins per second and the number of paging space page-outs per second. If the values are constantly non-zero, there might be a memory bottleneck. Occasional non-zero values are no concern, because paging is the main principle of virtual memory.
To use the vmstat command with Interval and Count, specify the update period in seconds as the interval and the number of iterations to perform as the count. The first report contains statistics since system startup; each subsequent report contains data collected during the preceding interval.
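A minimal way to script this check (a sketch: it assumes the default report layout shown in the example above, where pi and po are the 6th and 7th fields, and it skips the header lines):

# flag any interval where paging space page-ins/outs (pi/po) are non-zero
vmstat 5 12 | awk '$1 ~ /^[0-9]+$/ && ($6 + $7) > 0 { print "paging:", $0 }'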
vmstat – CPU/RAM
Here are common examples for viewing CPU/RAM usage on an AIX system.
- vmstat -t 5 3 shows 3 reports at 5-second intervals (-t also shows timestamps)
- vmstat -l 5 also shows large pages (alp: active large pages, flp: free large pages)
- vmstat -s displays the count of various events (such as paging-in and paging-out events)
- vmstat hdisk0 2 5 displays 5 summaries for hdisk0 at 2-second intervals
# vmstat -Iwt 2    (this is what IBMers use)

 kthr      memory                  page                                faults              cpu                  time
----------- --------------------- ------------------------------------ ------------------ ----------------------- --------
 r  b  p       avm      fre   fi  fo  pi  po  fr  sr   in    sy   cs us sy id wa    pc    ec hr mi se
 0  0  0   1667011    35713    0   0   0   0   0   0   16   488  250  0  0 99  0  0.01   0.3 11:38:56
 0  0  0   1667012    35712    0   0   0   0   0   0   16   102  236  0  0 99  0  0.01   0.1 11:38:58
 1  0  0   1664233    38490    0   1   0   0   0   0   12   218  245  0  0 99  0  0.01   0.3 11:39:00
 0  0  0   1664207    38515    0  15   0   0   0   0  164  5150  450  1  3 96  0  0.20   4.9 11:39:02
kthr: kernel threads
- r: threads placed on the run queue (runnable threads) or already executing (running)
- b: threads placed in the virtual memory waiting queue (blocked queue; waiting for a resource, e.g. blocked filesystem I/O or an inode lock)
(Kernel threads blocking on I/O are an indication of I/O workload and of possible inode lock contention.)
If the number of runnable threads (r) divided by the number of CPUs is greater than one -> possible CPU bottleneck.
(The r column should be compared with the number of CPUs (logical CPUs, as in uptime) to see whether we have enough CPUs or too many threads.)
High numbers in the blocked processes column (b) indicate slow disks.
(r) should always be higher than (b); if it is not, it usually means you have a CPU bottleneck.
An example:
lcpu=2, r=18 (18/2=9), so 8 threads are waiting per logical CPU. But you have to compare this number with the nature of the work being done: these threads may hold onto a CPU for a long time, or they may use the CPU (run) for a very short time and then get off it. If the queue can be emptied quickly, then 8 may not be a problem.
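A quick version of this ratio check can be scripted (a sketch: it parses lcpu from vmstat's own configuration line and takes r from the last report line):

# compare the run queue (r) with the number of logical CPUs
lcpu=$(vmstat 1 1 | sed -n 's/.*lcpu=\([0-9]*\).*/\1/p')
r=$(vmstat 1 2 | awk '$1 ~ /^[0-9]+$/ { val = $1 } END { print val }')
echo "$r $lcpu" | awk '{ printf "r=%d lcpu=%d ratio=%.1f\n", $1, $2, $1 / $2 }'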
memory
avm: The amount of active virtual memory (in 4 KB pages) in use, not including file pages.
Active virtual memory is defined as the number of virtual-memory working segment pages that have actually been touched.
from Earl Jew:
“Active Virtual Memory is computational memory which is active. AVM does not include any file buffer cache at all. AVM is the computational memory percentage that you see listed under topas. AVM includes the active pages out on the paging space. It is possible to have computational or virtual memory which was not recently active; that would not be in this calculation.”
“Over-commitment of memory would be a situation where AVM is greater than the installed RAM. It is good to keep AVM at or below 80%.”
(non computational memory is your file buffer cache)
fre: The size of your memory free list.
We don’t worry when fre is small, as AIX loves using every last drop of memory and does not return it as quickly as you might like. The lower limit of the free list is determined by the minfree parameter of the vmo command.
page
The ratio of fr (pages freed) to sr (pages scanned) shows how many pages had to be scanned to free that many pages. (If we scanned 1000 pages and freed 999, those memory pages were not in use recently; the ratio is a useful indicator.)
Look at the largest value of avm (active virtual pages in the vmstat output). Multiply it by 4 KB and compare that number with the installed RAM; ideally avm * 4096 should be smaller than the total RAM.
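This arithmetic can be scripted as well (a sketch: avm is the 3rd field in the default layout, and lsattr reports realmem in KB):

# compare the current avm (in 4 KB pages) with installed RAM
avm=$(vmstat 1 1 | awk '$1 ~ /^[0-9]+$/ { print $3 }')
realmem=$(lsattr -El sys0 -a realmem | awk '{ print $2 }')
echo "$avm $realmem" | awk '{ printf "AVM is %.1f%% of installed RAM\n", $1 * 4 / $2 * 100 }'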
faults
- in: interrupt rate (hardware interrupts, e.g. from network or SAN adapters; it is good if this is not high, as here)
- sy: system calls (this shows how much work is being done by the system; if it is a 6-digit number, the system is doing a lot of work)
- cs: context switches (process or thread switches; the rate is given in switches per second)
(A context switch occurs when the currently running thread is different from the previously running thread, so the previous thread is taken off the CPU.) It is not uncommon to see the context switch rate be approximately the same as the device interrupt rate (the in column).
If cs is high, it may indicate too much process switching is occurring, thus using memory inefficiently.
If a program is written inefficiently, it may generate an unusually large number of system calls. (sy)
If cs is higher than sy, the system is doing more context switching than actual work.
High r with high cs -> possible lock contention
Lock contention occurs whenever one process or thread attempts to acquire a lock held by another process or thread. The more granular the available locks, the less likely one process/thread will request a lock held by the other. (For example, locking a row rather than the entire table, or locking a cell rather than the entire row.)
When you are seeing blocked processes or high values on waiting on I/O (wa), it usually signifies either real I/O issues where you are waiting for file accesses or an I/O condition associated with paging due to a lack of memory on your system.
cpu
- us: % of CPU time spent in user mode (not running kernel code, no access to kernel resources)
- sy: % of CPU time spent in system mode (with access to kernel resources; all the NFS daemons and lrud are kernel processes)
- id: % of CPU time when the CPU is idle
- wa: % of CPU time when there was at least one I/O in progress (waiting for that I/O to finish)
- pc: physical capacity consumed (how much physical CPU is used)
- ec: entitled capacity consumed (as a percentage) (it correlates with the system calls column (sy))
When a wait process is running, it can show up either in id (idle) or wa (wait):
- wait%: if there is at least 1 outstanding thread waiting for something (such as an I/O to complete, or a read from disk)
- idle%: if there is nothing to wait for, it shows up as idle%
(If the CPU is waiting for data from real memory, the CPU is still considered busy.)
To measure true idle time, measure id+wa together:
– if id=0%, it does not mean all CPU is consumed, because wait (wa) can be 100% while waiting for an I/O to complete
– if wa=0%, it does not mean there are no I/O waits: as long as there are threads keeping the CPU busy, additional threads can be waiting for I/O, but this is masked by the running threads
If process A is running and process B is waiting on I/O, wa% would still show 0.
A 0 value does not mean I/O is not occurring; it means that the system is not waiting on I/O.
If process A and process B are both waiting on I/O, and there is nothing that can use the CPU, then you would see that column increase.
– if wa% is high, it does not necessarily mean there is an I/O performance problem; it can simply indicate that some I/O is being done while the CPU is not kept busy
– if id% is high, then there is likely no CPU or I/O problem
To measure CPU utilization, measure us+sy together (and compare it to physc):
– if us+sy is always greater than 80%, then the CPU is approaching its limits (but check physc as well, and check each logical CPU with "sar -P ALL")
– if us+sy = 100% -> possible CPU bottleneck, but in an uncapped shared LPAR check physc as well
– if sy is high, your application is issuing many system calls to the kernel and asking the kernel to do work; it measures how heavily the application is using kernel services
– if sy is higher than us, your system is spending less time on real work (not good)
Don’t forget to compare these values with output where each logical CPU can be seen (like "sar -P ALL 1 5"). Some examples of when the physical consumption of a CPU should also be examined when SMT is on:
– us+sy=16% but physc=0.56: it looks like 16% of a CPU is utilized, but actually more than half of a physical CPU (0.56) is being used.
– if us+sy=100 and physc=0.45, we have to look at both. If someone says 100% is used, then 100% of what? 100% of less than half a CPU (physc=0.45) is used.
– %usr+%sys=83% for lcpu 0 (output from the sar command). It looks like a high number at first sight, but if you check physc, you can see only 0.01 physical cores have been used, and the entitled capacity is 0.20, so this 83% is actually very little CPU consumption.
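The combined values are easy to print per interval with awk (a sketch: in the default layout us/sy/id/wa are fields 14-17 and pc/ec are fields 18-19):

# print utilization (us+sy), true idle (id+wa) and physical consumption per interval
vmstat 5 6 | awk '$1 ~ /^[0-9]+$/ { printf "us+sy=%d%% id+wa=%d%% physc=%s ec=%s%%\n", $14+$15, $16+$17, $18, $19 }'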
# vmstat -v
              4980736 memory pages
               739175 lruable pages
               432957 free pages
                    1 memory pools
                84650 pinned pages
                 80.0 maxpin percentage
                 20.0 minperm percentage
free pages
How many free pages we have. Earl's rule: a 5-digit free page count is ideal, 6 digits is generous, 4 digits means trouble, and 3 digits means you are in big trouble.
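The digit rule is easy to apply with a one-liner (a sketch, matching the 'free pages' line of vmstat -v):

# count the digits of the free page count and apply Earl's rule of thumb
vmstat -v | awk '/free pages/ { d = length($1); print $1, "free pages,", d, "digits:", (d >= 6 ? "generous" : (d == 5 ? "ideal" : (d == 4 ? "trouble" : "big trouble"))) }'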
pbuf
These are physical device buffers. pbufs are allocated in memory per LUN in the volume group (if you have more LUNs, there will be more pbufs). All I/Os to the LUNs in a volume group go through its pbufs (these are pinned memory structures). If you exhaust the pbufs, you will see pending disk I/Os blocked waiting for a pbuf, and you need to allocate more pbufs.
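To see whether pbufs are actually being exhausted, the per-volume-group counter can be checked with lvmo (a sketch: rootvg is just a placeholder VG name, and the parameter names pervg_blocked_io_count / pv_pbuf_count should be verified on your AIX level):

# show I/Os blocked for lack of a pbuf in a given volume group (rootvg is a placeholder)
lvmo -v rootvg -a | egrep 'pervg_blocked_io_count|pv_pbuf_count'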
psbuf
You have to compare this with 'paging space page outs' from vmstat -s. If the psbuf value is relatively high compared to paging space page outs, then you know there are high bursts of paging out that the psbufs can't handle. If the psbuf value is low, then paging out is moderate and there aren't such big peaks.
fsbuf
When AIX mounts a filesystem, it allocates a static number of fsbufs per filesystem (this is included in pinned memory). These numbers (the last 5 lines of the vmstat -v output) show how many times the buffer for the given filesystem type has been exhausted (no I/O can go through until the fs buffer unblocks).
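Since the pbuf, psbuf and fsbuf exhaustion counters all appear as 'blocked' lines at the end of the vmstat -v output, they can be reviewed at once (a sketch):

# review all buffer-exhaustion counters together
vmstat -v | grep -i blocked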
# vmstat -s
          15503846449 total address trans. faults
           3320663543 page ins
total address trans. faults
Every page-in/page-out causes one address translation fault.
- If the sum of page ins + page outs is higher than total address translation faults, it means data whose address translation has already been calculated is being paged in and out again, i.e. the same data is being read and written repeatedly.
- If the sum of page ins + page outs is smaller than total address translation faults, it means we are not re-reading/re-writing the same data; the additional faults probably come from process executions...
The value of total address translation faults can be compared to the sum of the 4 lines below it (page ins/outs and paging space page ins/outs). If the first line is larger than the sum of the 4 below, then translations have to be recalculated (in the TLB, Translation Lookaside Buffer) for contents that are already in memory.
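A sketch of this comparison with awk (it assumes the vmstat -s line labels shown above; the plain 'page ins'/'page outs' lines have exactly three fields, which keeps them distinct from the paging space lines):

vmstat -s | awk '
/total address trans. faults/  { faults = $1 }
NF == 3 && $2 == "page"        { sum += $1 }
/paging space page (ins|outs)/ { sum += $1 }
END { printf "addr. trans. faults = %s, page ins+outs = %.0f\n", faults, sum }'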
paging space page outs
Earl's rule: regardless of system uptime, if paging space page outs reaches 5 digits it should grab your attention, and every additional digit deserves 10 times more concern (6 digits: 10 times more; 7 digits: 100 times more; 8 digits ....).
pages examined - revolutions of the clock hand - pages freed
clock hand: it examines the pages in memory (in background at a very low priority).
lrud is a kernel process that does the scanning and freeing (sr and fr in vmstat -I). The clock hand is the pointer that lrud is using for scanning and freeing memory. It examines the pages and/or frees the pages.
If the system has nothing to do, the clock hand starts to examine pages, and if there are pages which have not been used, it frees them. 'Revolutions of the clock hand' means how many times lrud has scanned through all of memory since boot.
(If it is low, the system has been busy most of the time. If the system were totally idle for 60 days, it would be a 6-digit number.)
'pages examined' shows how many pages have been scanned by lrud, and 'pages freed' shows how many pages were freed. The ratio of pages examined to pages freed is useful to know (it shows how much work the system has to do to free some pages).
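The ratio can be pulled straight from vmstat -s (a sketch: the line labels 'pages examined by clock' and 'pages freed by the clock' are assumed to match your AIX level):

vmstat -s | awk '
/pages examined by clock/  { examined = $1 }
/pages freed by the clock/ { freed = $1 }
END { if (freed > 0) printf "%.2f pages examined per page freed\n", examined / freed }'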
free frame waits
This is incremented whenever the amount of free memory hits zero (i.e. how many times since boot there was no free memory), and the system had to scan and free memory before the request could be satisfied.
start I/Os - iodones
Shows how many I/Os were started and how many were completed (if an I/O is blocked or timed out, it is not done and has to be restarted). If iodones is higher than start I/Os, then probably NFS is running there. (page ins + page outs equals the start I/Os.)
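A sketch of the comparison (assuming the 'start I/Os' and 'iodones' line labels in vmstat -s):

vmstat -s | awk '
/start I\/Os/ { start = $1 }
/ iodones/    { done = $1 }
END { print "start I/Os:", start, "iodones:", done, (done > start ? "(iodones higher; NFS is a likely cause)" : "") }'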