If you're running a fair few non-global zones on Solaris 10, you'll know full well how painfully slow the patching process is. Well, I'm pleased to say "not any more".

The Zones Parallel Patching feature was officially released on Tuesday and is contained in the latest Solaris 10 patch utilities patch, 119254-66 (SPARC) and 119255-66 (x86).

Getting the functionality is simple: just apply the patch as you would any other patch. Taking advantage of it is a different thing.

By default the behaviour is as before: NO parallel patching takes place. But that's easily changed: in the /etc/patch/pdo.conf file, set "num_proc" to the number of non-global zones to be patched in parallel.
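As a rough sketch of that change, here is a small helper that rewrites the setting and keeps a backup copy. The function name set_num_proc is made up for illustration, and it assumes the setting sits on its own uncommented line beginning with "num_proc=" (check the comments in your own pdo.conf first). It also assumes a POSIX shell such as ksh or bash.

```shell
# Hypothetical helper, not part of the patch utilities.
# Usage: set_num_proc FILE VALUE
# Assumes FILE holds an uncommented line starting with "num_proc=".
set_num_proc() {
    conf=$1
    value=$2
    # Keep a copy of the original file next to it.
    cp -p "$conf" "$conf.orig" || return 1
    # Rewrite only the active num_proc line; comment lines are untouched.
    # Writing back into the existing file keeps its owner and mode.
    sed "s/^num_proc=.*/num_proc=$value/" "$conf.orig" > "$conf"
}
```

On a real system that would be run as root, for example: set_num_proc /etc/patch/pdo.conf 5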

Prior to this feature, each non-global zone was patched sequentially. With this feature enabled, the global zone is still patched first, but the non-global zones can then be patched in parallel, leading to significant performance gains in patching operations on systems with zones.

The performance gain depends on a number of factors, including the number of non-global zones, the number of on-line CPUs, the speed of the system, the I/O configuration of the system, etc. Even so, a significant gain (up to 300% has been reported) can be expected when patching the non-global zones.

Here's the relevant note from the patch README file (you read these files, don't you ;-) ) that provides a bit more useful information on this new functionality:

NOTE 10: 119254-66 is the first revision of the patch utilities to deliver
"zones parallel patching". This new functionality allows multiple
non-global zones to be patched in parallel by patchadd. Prior to
revision 66, patchadd would patch all applicable non-global zones
sequentially, that is one after another. With zones parallel
patching, a sysadmin can now set the number of zones to patch in
parallel in a new configuration file for patchadd called
/etc/patch/pdo.conf.

The two factors that affect the number of non-global zones that
can be patched in parallel are:

1. Number of on-line CPUs

2. The value of num_proc in /etc/patch/pdo.conf

If the value of num_proc is less than or equal to 1.5 times
the number of on-line CPUs, then patchadd limits the maximum
number of non-global zones that will be patched in parallel
to num_proc. If the value of num_proc is greater than 1.5
times the number of on-line CPUs, then patchadd limits the
maximum number of non-global zones that will be patched in
parallel to 1.5 times the number of on-line CPUs. Note that
patchadd will patch all applicable non-global zones on a system;
the above description outlines only how patchadd determines
the maximum number of job slots to be used during parallel
patching of non-global zones.

An example of this in operation would be where:

num_proc=8
and number of on-line CPU's is 4

In this case the maximum setting for num_proc would be 6, that
is the maximum number of zones that could be patched in parallel
is 6. If there are more than this number of non-global zones on
the system, the first 6 will be patched in parallel, then the
remaining non-global zones will be patched as processes finish
patching the first 6 non-global zones.

There is only one patching process used to patch each non-global
zone, so if num_proc exceeds the number of installed zones, then
num_proc will be set to the number of non-global zones assuming
that num_proc does not exceed on-line CPU count * 1.5 as above.
Please see comments in /etc/patch/pdo.conf for more details on
setting num_proc.
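To make the rule in that note concrete, here is a minimal sketch of how I read it: the number of zones patched at once is the smallest of num_proc, 1.5 times the on-line CPU count, and the number of non-global zones. The function name max_parallel_zones is mine, and rounding 1.5 x CPUs down for odd CPU counts is an assumption, since the README doesn't say. It needs a POSIX shell such as ksh or bash for the arithmetic.

```shell
# Illustrative only: my reading of the README rule, not patchadd's code.
# Usage: max_parallel_zones NUM_PROC ONLINE_CPUS NON_GLOBAL_ZONES
max_parallel_zones() {
    num_proc=$1
    cpus=$2
    zones=$3
    # 1.5 x on-line CPUs in integer arithmetic (assumption: rounds down).
    limit=$(( cpus * 3 / 2 ))
    # num_proc applies when it is below the CPU-based cap.
    if [ "$num_proc" -lt "$limit" ]; then
        limit=$num_proc
    fi
    # One patching process per zone, so never more slots than zones.
    if [ "$zones" -lt "$limit" ]; then
        limit=$zones
    fi
    echo "$limit"
}

# The README example: num_proc=8 with 4 on-line CPUs and, say, 10 zones.
max_parallel_zones 8 4 10    # prints 6
```

The on-line CPU count to feed in can be read from the output of psrinfo, which lists each processor with its state.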

Of course, I didn't want to take someone else's word for it, so I tested this new functionality myself.

The Configuration

Host: T5240
OS: Solaris 10 10/08 (aka update 6) - SUNWCuser installation cluster on ZFS
Zone config: 5 very simple and basic sparse root zones without network interfaces:

# zoneadm list -cv
ID NAME STATUS PATH BRAND IP
0 global running / native shared
10 zone1 running /zones/zone1 native shared
11 zone2 running /zones/zone2 native shared
12 zone3 running /zones/zone3 native shared
13 zone4 running /zones/zone4 native shared
14 zone5 running /zones/zone5 native shared
#
# zonecfg -z zone1 info
zonename: zone1
zonepath: /zones/zone1
brand: native
autoboot: true
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: shared
inherit-pkg-dir:
dir: /lib
inherit-pkg-dir:
dir: /platform
inherit-pkg-dir:
dir: /sbin
inherit-pkg-dir:
dir: /usr
#

All zones were cloned from the first.

The Patching

Now, to keep things nice and simple, let's start with the umount patch 140796-01...

num_proc=1 (default):

# time patchadd 140796/140796-01
[...]
patchadd 140796/140796-01 31.46s user 18.05s system 45% cpu 1:49.48 total
#

num_proc=5:

# time patchadd 140796/140796-01
[...]
patchadd 140796/140796-01 42.60s user 24.00s system 170% cpu 39.032 total
#

Not bad. 5 zones patched in less than half the time for one simple patch. Let's try something more substantial like KU 138888-08...

num_proc=1:

# time patchadd 138888/138888-08
[...]
patchadd 138888/138888-08 95.89s user 89.16s system 53% cpu 5:49.15 total
#

num_proc=5:

# time patchadd 138888/138888-08
[...]
patchadd 138888/138888-08 131.29s user 121.02s system 152% cpu 2:45.89 total
#

Much the same story: a little under half the time. Not quite the 300% I've seen quoted (though I don't know what their config was) but it's still quite a considerable improvement.

Not satisfied, I backed out the KU and cloned 5 more zones (total 10) and tested again:

num_proc=1:

# time patchadd 138888/138888-08
[...]
patchadd 138888/138888-08 184.01s user 171.04s system 45% cpu 13:07.70 total
#

num_proc=10:

# time patchadd 138888/138888-08
[...]
patchadd 138888/138888-08 305.07s user 259.50s system 231% cpu 4:04.37 total
#

Woooohoooo!!! Now that's more like it. Patched in a third of the time!!!
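For anyone who wants the ratios rather than my eyeballing, here is a small sketch that turns the "total" wall-clock times from the runs above into speedup factors. The helper names to_seconds and speedup are made up for this post, and it assumes a POSIX shell such as ksh or bash.

```shell
# Convert a wall-clock time of the form [m:]s.ss into seconds.
to_seconds() {
    echo "$1" | awk -F: '{ if (NF == 2) print $1 * 60 + $2; else print $1 }'
}

# Usage: speedup SERIAL_TIME PARALLEL_TIME
# Prints serial / parallel to two decimal places.
speedup() {
    s=$(to_seconds "$1")
    p=$(to_seconds "$2")
    echo "$s $p" | awk '{ printf "%.2f\n", $1 / $2 }'
}

speedup 1:49.48 39.032     # umount patch, 5 zones: prints 2.80
speedup 5:49.15 2:45.89    # KU, 5 zones:           prints 2.10
speedup 13:07.70 4:04.37   # KU, 10 zones:          prints 3.22
```

So the 10-zone KU run finished a bit over three times faster, which is where "a third of the time" comes from.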

Monday's task will be to see how long it takes to apply the latest patch cluster to a Solaris 10 5/08 (aka u5) system on UFS with 10 zones, once with each setting - I know, I'm a sucker for pain. I actually wanted to try with u3, but the min supported OS for the T5240 I've got is s10u4+patches.

This is really impressive and a godsend for anyone running a lot of zones.