COMPUTER AND NETWORK
SYSTEM ADMINISTRATION
Summer 1995 - Lesson 22
Performance Analysis
A. Introduction
1. Performance is affected by the efficiency of the
four main resources that a system offers:
- CPU speed
- memory speed and amount
- disk bandwidth
- network bandwidth
2. These are all related.
- NFS traffic depends on network bandwidth as well
as disk bandwidth
- disk bandwidth depends on memory if disk caching
is in place
3. What is good performance?
- to the user it is usually keyboard response time
- it may be execution time: 'How long does it take to
run my job?'
- the system administrator must distinguish between
poor performance caused by system malfunctioning
and that caused by heavy usage
- times of heavy usage are good times to analyze the
system and see where bottlenecks are
- this will help you determine where to put scarce
funds
4. Run time
- several system commands will time a job
- /usr/bin/time, /usr/5bin/time (Solaris), shell's built-in "time"
time "find ..."
user     system   wall      (U+S)/W   shared   ave. unshared   page
CPU      CPU      time                mem      data            faults
---------------------------------------------------------------------
0.200u   4.930s   0:10.73   47.8%     155k     150k            81pf
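The (U+S)/W column can be recomputed from the other fields. The following is a minimal sketch, assuming the line has the layout shown in the sample above; the function names parse_elapsed and cpu_share are made up for illustration and are not part of any system tool.

```python
def parse_elapsed(text):
    """Convert an elapsed-time field such as '0:10.73' or '1:02:03' to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def cpu_share(line):
    """Return (user, system, wall, percent) for one time line.

    Assumes the first three fields are user CPU ('...u'), system CPU ('...s')
    and wall-clock time, as in the sample from the notes.
    """
    fields = line.split()
    user = float(fields[0].rstrip("u"))
    system = float(fields[1].rstrip("s"))
    wall = parse_elapsed(fields[2])
    percent = 100.0 * (user + system) / wall   # this is the (U+S)/W column
    return user, system, wall, percent


sample = "0.200u 4.930s 0:10.73 47.8% 155k 150k 81pf"
user, system, wall, percent = cpu_share(sample)
print(f"user={user:.3f}s system={system:.3f}s wall={wall:.2f}s (U+S)/W={percent:.1f}%")
```

A low percentage, as here, means the job spent most of its wall-clock time waiting (for disk, network, or other processes) rather than computing.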
5. Interactive response time
- UNIX makes every effort to prioritize interactive response
time on the local machine
- it does this to prevent keyboard buffers from overflowing
- network delays are the primary culprit on a single-user workstation
- on a large multi-user server, CPU sharing is the primary culprit
B. Memory performance analysis
1. buying more memory is generally the cheapest way to
improve performance (especially now :)
2. generally, active processes require more physical memory
than is available
3. to make memory available the kernel begins to copy pages
of 'unneeded' memory to disk
4. the kernel may also copy the memory image of entire processes
to disk; this is called swapping
5. review the basic UNIX memory management algorithm
- early versions of UNIX (prior to 3BSD) were based on swapping only
- processes were swapped out in their entirety (except for shared
text)
- beginning with 3BSD, demand paging was implemented
- neither the working set model nor prepaging was incorporated
- whenever filling memory from file, though, UNIX tries to read
any pages of the file adjacent to the faulted page (called
fill-on-demand-clustering)
- the algorithm was developed using extensive simulation
6. The pagedaemon
- the paging algorithm was implemented partly in the kernel and
partly in a new process called the "pagedaemon" - process 2
- process 0 is the swapper, and process 1 is init
- the kernel memory is fixed (not paged)
- a data structure called the core map is also fixed
- the core map contains information about the contents of each
page frame in memory
- a page frame may contain a page (text, data, stack, page table) or
may be on the free list
- the pagedaemon's main job is to periodically check and see if the
free list is getting short, and if so, it frees up some pages
7. The 4.3 BSD UNIX page replacement algorithm
- the pagedaemon also executes the page replacement algorithm
- every 250 msec the pagedaemon awakes and checks to see if the
number of page frames on the free list is >= lotsfree - typically
1/4 of memory
- if there are enough free page frames the daemon goes back to sleep
- if not then the daemon starts looking for pages to eject
- a global algorithm is used - that is, a page from any process
is a candidate for ejection
- the page replacement algorithm that is used is the two-handed
clock algorithm
- this algorithm approximates global LRU
- the hands of the clock sweep across all page frames
- the first hand CLEARS a reference bit (in the core map)
- if a page is referenced then the reference bit is SET
- the second hand checks to see if the reference bit is set and
if it is not, then the page is a candidate for ejection
- it is written to disk (if dirty), and the page frame is placed on
free list
- the contents are not erased, so the page can be recovered if it is
needed before the frame is overwritten
- if hands are close together then only heavily used pages are
spared
- if hands are far apart and memory is large then practically every
page is spared and the page fault frequency (PFF) goes way up!
- UNIX keeps the hands 2 Mbytes apart (if memory is divided into 1 K
clusters and pages are scanned at about 200 pages per second then the
time between the first and last hand is about 10 seconds)
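The behavior of the two hands can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the 4.3 BSD kernel code: the class TwoHandedClock, its fields, and the function hand_gap_seconds are invented for illustration, the core map is reduced to a few Python lists, and the disk write is just a counter. The last function reproduces the hand-spacing arithmetic from the notes.

```python
class TwoHandedClock:
    """Toy model of the two-handed clock described above (not kernel code)."""

    def __init__(self, nframes, gap):
        self.referenced = [True] * nframes   # assume every page was recently used
        self.dirty = [False] * nframes
        self.free = []                       # frame numbers on the free list
        self.writes = 0                      # dirty pages written before freeing
        self.front = 0                       # position of the first (clearing) hand
        self.gap = gap                       # distance between the hands, in frames

    def touch(self, frame, write=False):
        """A process references a page: set its bit, reclaiming it if freed."""
        if frame in self.free:
            self.free.remove(frame)          # contents were not erased
        self.referenced[frame] = True
        self.dirty[frame] = self.dirty[frame] or write

    def step(self):
        """Advance both hands by one frame."""
        n = len(self.referenced)
        self.referenced[self.front] = False          # first hand CLEARS the bit
        back = (self.front - self.gap) % n           # second hand trails by `gap`
        if not self.referenced[back] and back not in self.free:
            if self.dirty[back]:
                self.writes += 1                     # stand-in for the disk write
                self.dirty[back] = False
            self.free.append(back)                   # frame goes on the free list
        self.front = (self.front + 1) % n


def hand_gap_seconds(gap_bytes, cluster_bytes, clusters_per_second):
    """Time a page has to be re-referenced before the second hand arrives."""
    return (gap_bytes / cluster_bytes) / clusters_per_second


clock = TwoHandedClock(nframes=8, gap=2)
clock.step()                 # first hand clears frame 0
clock.step()                 # first hand clears frame 1
clock.touch(1)               # frame 1 is referenced before the second hand arrives
for _ in range(4):
    clock.step()
print("free list:", clock.free)
print("gap in seconds:", hand_gap_seconds(2 * 1024 * 1024, 1024, 200))
```

In this run frame 1 is spared because it was touched between the two hands, while untouched frames behind the first hand end up on the free list. A smaller gap gives pages less time to be re-referenced, so only heavily used pages survive.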
8. swapper
- the swapper moves processes that have been idle for more than
20 seconds out to disk (preventative swapping - normal housekeeping)
- if the pagedaemon cannot keep the free list long enough, that is,
if the number of Kbytes of free memory falls below minfree, then
the swapper kicks in (desperation swapping)
- the swapper chooses a process to swap out based on 2 criteria:
> longest sleep time
> if none are sleeping, then use resident memory size
(the swapper chooses largest 4 processes, then picks the one
which has been resident longest)
- when a process is swapped out, everything goes - even the user
structure and the page tables
- swapping is much more expensive than paging so a highly loaded
system - that invokes swapping frequently - does not perform well
- UNIX (BSD) attempts to prevent swapping by making lotsfree large,
frequently one-fourth of memory
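The two selection criteria for desperation swapping can be sketched as follows. This is an illustration of the rule as stated in these notes, not the kernel's code; the Proc record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    pid: int
    sleep_secs: int      # 0 means the process is not sleeping
    rss_kb: int          # resident memory size in Kbytes
    resident_secs: int   # how long the process has been in memory


def choose_victim(procs):
    """Pick the process to swap out, following the two criteria above."""
    if not procs:
        return None
    sleeping = [p for p in procs if p.sleep_secs > 0]
    if sleeping:
        # Criterion 1: the process that has been asleep the longest.
        return max(sleeping, key=lambda p: p.sleep_secs)
    # Criterion 2: among the 4 largest processes, the one resident longest.
    largest = sorted(procs, key=lambda p: p.rss_kb, reverse=True)[:4]
    return max(largest, key=lambda p: p.resident_secs)


mixed = [Proc(1, 0, 500, 30), Proc(2, 45, 100, 300), Proc(3, 12, 900, 10)]
busy = [Proc(10, 0, 800, 5), Proc(11, 0, 700, 90), Proc(12, 0, 600, 40),
        Proc(13, 0, 500, 20), Proc(14, 0, 50, 999)]
print(choose_victim(mixed).pid)   # a sleeper exists, so size is ignored
print(choose_victim(busy).pid)    # nobody sleeps, so size and residency decide
```

Note that in the second list the small process 14 is never considered, even though it has been resident longest, because it is not among the four largest.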
9. When do we have problems?
- preventative swapping is normal
- a ps -aux usually shows many swapped out processes
- paging is also part of normal operations
> a new process must have new pages brought into memory
> it must also page in when it references a section of memory
that has not been used recently
- page faults always cause a performance degradation
- usually, the pagedaemon quickly fixes the problem by
getting rid of unneeded pages and loading the needed
ones
- when the pagedaemon fails, desperation swapping begins
- what types of processes are likely to be swapped out by
desperation swapping?
> ans: ones that sleep: editors, shells, generally interactive
processes
> keyboard response time goes to pot since a keystroke requires
a disk access (and the disk is probably heavily loaded at this
time)
10. how to diagnose
1. tools - BSD: vmstat
S5: sar
2. these tools report:
page-ins
page-outs
swap-ins
swap-outs
3. page-ins
- most UNIX systems use 'demand paging'
- when a process is started only the memory
maps for the process are loaded in physical
memory
- each first access to a page that is not yet in memory causes
a page fault, and each page is brought in 'on demand'
- the alternative is 'pre-paging'
- thus page-ins are normal
4. swap-ins
- a new process acts like a swap-in
- not very useful
5. page-outs
- this is a first indicator that your memory is
inadequate
- some page-out activity is normal
- does the frequency of page-outs dramatically
increase whenever system performance is sluggish?
- acceptable rate is O/S and hardware dependent
- in order to know you need to establish baselines of
activity
6. swap-outs
- example vmstat -S
 procs     memory                page                   disk           faults         cpu
 r  b  w  avm   fre  si  so   pi  po   fr  de   sr  d0  d1  d2  d3   in   sy   cs  us  sy  id
 0  0  0    0  3028   4   1    1   2    1   0    0   2   2   0   0    0   82  177  89  33   9
- procs
Number of processes:
r - runnable (not waiting for I/O or sleeping)
b - blocked for resources (i/o, paging, etc.)
w - runnable or short sleeper (< 20 secs) but
swapped
- any number but 0 in the w column indicates what?
> ans: desperation swapping
- memory
avm - number of active virtual Kbytes (used in last 20 secs)
fre - size of the free list in Kbytes
> when this gets close to lotsfree, 2M on xi, then page-outs
begin
- page
Report information about swapping, page faults, and paging
activity
Reported in units per second (averaged over last 5 seconds)
si - procs swap-ins
so - procs swap-outs (not due to idle)
pi - kilobytes per second paged in
po - kilobytes per second paged out
fr - kilobytes freed per second
de - anticipated short term memory shortfall in Kbytes
sr - pages scanned by clock algorithm, per-second
- disk
Report number of disk operations per second.
- faults
Report trap/interrupt rate averages per second over
last 5 seconds.
in - (non clock) device interrupts per second
sy - system calls per second
cs - CPU context switch rate (switches/sec)
- cpu
Give a breakdown of percentage usage of CPU time.
us - user time for normal and low priority processes
sy - system time
id - CPU idle
- we are most concerned with swap-outs and page-outs
 procs     memory                page                   disk           faults         cpu
 r  b  w  avm   fre  si  so   pi  po   fr  de   sr  d0  d1  d2  d3   in   sy   cs  us  sy  id
 0  0  0    0  2508  20   0    0   0    0   0    0  13   0   0   0  226  216  350   7   6  87
 0  0  0    0  2280   0   0   16   0    0   0    0   3   0   0   0  258  361  343   5   8  87
 0  0  0    0  2104  21   0  124  56  184   0  111   5   0   0   0  545  667  563  14  16  70
 0  0  0    0  2120   0   0   36  12   60   0   37   0   0   0   0  338  387  345   3   5  92
 0  0  0    0  2076   0   0   12   0   28   0   23   1   0   0   0  263  271  370   3   4  92
 0  1  0    0  2048   5   0    0   0   44  16   33   1   0   0   0  320  473  497   6   9  85
 8  1  0    0  2116  10   0    0   0  100   0   56  23   0   0   0  514  377  898  14  14  72
 0  0  0    0  2084   5   0   24  16  148   0   67   6   0   0   0  350  424  529   9  10  81
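Since swap-outs and page-outs are the columns of most concern, a small script can pull them out of saved vmstat -S output when establishing a baseline. The following is a minimal sketch, assuming the 22-column layout shown in these notes (column sets differ between systems). The data rows are copied from the listing above; the name cpu_sy is a made-up label for the second "sy" column so the two do not collide, and no acceptable-rate threshold is assumed, since that is O/S and hardware dependent.

```python
# Column names in the order shown in the notes. The cpu "sy" column is
# renamed cpu_sy (hypothetical name) to distinguish it from faults "sy".
COLUMNS = ("r b w avm fre si so pi po fr de sr d0 d1 d2 d3 "
           "in sy cs us cpu_sy id").split()

SAMPLE = """\
0 0 0 0 2508 20 0 0 0 0 0 0 13 0 0 0 226 216 350 7 6 87
0 0 0 0 2280 0 0 16 0 0 0 0 3 0 0 0 258 361 343 5 8 87
0 0 0 0 2104 21 0 124 56 184 0 111 5 0 0 0 545 667 563 14 16 70
0 0 0 0 2120 0 0 36 12 60 0 37 0 0 0 0 338 387 345 3 5 92
0 0 0 0 2076 0 0 12 0 28 0 23 1 0 0 0 263 271 370 3 4 92
0 1 0 0 2048 5 0 0 0 44 16 33 1 0 0 0 320 473 497 6 9 85
8 1 0 0 2116 10 0 0 0 100 0 56 23 0 0 0 514 377 898 14 14 72
0 0 0 0 2084 5 0 24 16 148 0 67 6 0 0 0 350 424 529 9 10 81
"""


def parse_row(line):
    """Turn one data line into a dict keyed by column name."""
    values = [int(v) for v in line.split()]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    return dict(zip(COLUMNS, values))


def summarize(rows):
    """Collect the figures the notes single out: page-outs and swap-outs."""
    return {
        "samples": len(rows),
        "paging_out": sum(1 for r in rows if r["po"] > 0),
        "swapping_out": sum(1 for r in rows if r["so"] > 0 or r["w"] > 0),
        "peak_po": max(r["po"] for r in rows),
        "min_fre": min(r["fre"] for r in rows),
    }


rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]
for key, value in summarize(rows).items():
    print(f"{key:>12}: {value}")
```

Running the same summary during normal and during sluggish periods gives the baseline comparison suggested above: what matters is whether page-outs rise sharply, and whether the w and so columns ever become nonzero.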