It’s been one of those periods of ‘nothing remarkable ever happens’. So, in desperation, I decided to try to find a blog post or two in the unremarkable instead.
Let’s start with my little HP Folio 13, my near-two-year-old notebook, of which I said in a recent blog piece, “the Folio only has 4GB RAM, so running multiple simultaneous VMs is not really an option: this Oracle will have to run on the physical machine or not at all”.
Absolutely accurate as it stands, in that the thing does indeed ship with only a 128GB hard disk and 4GB RAM, which is not enough to hold a decent party, let alone run a decent database.
However, I had reckoned without these guys. Their web site tools found me this:
It’s a 250GB mSATA hard drive (mSATA essentially being the innards of an ordinary solid state hard drive without the fancy external casing). At a stroke, and for relatively modest outlay, I was able to double my disk capacity and its speed. Virtualisation on such a storage platform becomes distinctly do-able.
My second purchase was this:
For a mere AU$100, that 8GB stick of laptop RAM doubles the laptop’s existing capacity -and, again at a stroke, makes it more than capable of hosting a 3-machine Oracle RAC.
Fitting these goodies was not a piece of cake, I have to say, what with me being blessed with fingers that are as dainty as a French boulangerie’s Baguette-Rex. For the most part, I followed the instructions provided by this kind Internet soul without incident, though I still managed to rip out the connector ribbons that make minor details like the keyboard and monitor work in my heavy-handed case-opening attempts. I’m pleased to report, however, that the relevant connectors appear to have been designed with complete klutzes in mind, so I was able to reconnect them when required, and the laptop is now operating normally once more.
So now I am blessed with a 16GB, 1.5TB SSHD monster of a Toshiba laptop for running anything serious (for example, a 2-node RAC and 2-node Data Guard setup, practicing for patches, failovers and switchovers). It is technically portable, and so I can brace my neck and arms and lug it into work on the train if I have to.
But with the peanut-sized hardware upgrades mentioned here, however clumsily fitted by yours truly, I am now additionally blessed with an 8GB, 250GB SSD svelte, barely-noticeable HP ultrabook that I can carry around for hours and not mind… and it’s good enough to run a Windows virtual machine with SQL Server and a 2-node Oracle RAC, so practising patching, SQL Server→Oracle replication and such database-y things is trivially easy, without breaking my neck or upper arms.
It’s nice to have rescued a near-two-year-old ultrabook from oblivion, too: the additional hardware has not only extended the original machine’s technical capacity, it’s just about doubled its useful lifetime as well.
Flushed with my new hardware capabilities, then, I recently decided to dry-rehearse the update of an Oracle 11.2.0.3.0 RAC to 11.2.0.3.9 (i.e., by applying the January 2014 PSU to it, which for Grid+RAC purposes is patch 17735354). It didn’t go awfully well, to be honest, and the reason it didn’t go very well was instructive!
The basic process of applying a Grid+RAC patch to a node is:
1. Copy the patchfile to an empty directory owned by the oracle user (I used /home/oracle/patches), and unzip it there.
2. Make sure the /u01/app/grid/OPatch and /u01/app/oracle/product/11.2.0/db_1/OPatch directories on all nodes are wiped and replaced with the latest unzipped p6880880 download (that gets your patching binaries right).
3. Create an ‘ocm response file’ by issuing the command /u01/app/grid/OPatch/ocm/bin/emocmrsp -no_banner -output /home/oracle/ocm.rsp (on all nodes).
4. Become the root user, set your PATH to include /u01/app/grid/OPatch and then launch opatch auto /home/oracle/patches -ocmrf /home/oracle/ocm.rsp
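The steps above can be sketched as a little script. This is purely illustrative, using the paths from this article: I write it to a file and syntax-check it rather than run it, since a real run happens as root on a live cluster.

```shell
# Sketch of the patching steps as a script (illustrative only; the paths
# are the ones used in this article, not universal defaults).
cat > /tmp/apply_psu.sh <<'EOF'
#!/bin/sh
GRID_HOME=/u01/app/grid
PATCH_DIR=/home/oracle/patches

# Steps 1 and 2 (copy/unzip the patch, refresh both OPatch directories
# from the latest p6880880 download) are assumed already done on every node.

# Step 3: generate the OCM response file as the oracle user (on all nodes)
su - oracle -c "$GRID_HOME/OPatch/ocm/bin/emocmrsp -no_banner -output /home/oracle/ocm.rsp"

# Step 4: as root, put OPatch on the PATH and launch the automated patcher
PATH=$GRID_HOME/OPatch:$PATH
export PATH
opatch auto "$PATCH_DIR" -ocmrf /home/oracle/ocm.rsp
EOF

# Syntax-check the sketch without executing it
sh -n /tmp/apply_psu.sh && echo "sketch parses OK"
```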
After you launch the patch application utility at Step 4, it’s all supposed to be smooth sailing. Unfortunately, whenever I did this on Gamow (the primary node of my standby site, and thus the first to be patched in a ‘standby-first’ scenario), I got this result:
2014-02-17 12:56:45: Starting Clusterware Patch Setup
Using configuration parameter file: /u01/app/grid/crs/install/crsconfig_params
Stopping RAC /u01/app/oracle/product/11.2.0/db_1 ...
Stopped RAC /u01/app/oracle/product/11.2.0/db_1 successfully
patch /home/oracle/patches/17592127/custom/server/17592127 apply successful for home /u01/app/oracle/product/11.2.0/db_1
patch /home/oracle/patches/17540582 apply successful for home /u01/app/oracle/product/11.2.0/db_1
Stopping CRS...
Stopped CRS successfully
patch /home/oracle/patches/17592127 apply failed for home /u01/app/grid
Starting CRS...
CRS-4123: Oracle High Availability Services has been started.
Failed to patch QoS users.
Starting RAC /u01/app/oracle/product/11.2.0/db_1 ...
Started RAC /u01/app/oracle/product/11.2.0/db_1 successfully
opatch auto succeeded.
If you read it fast enough, you might just glance at the last line there and think everything is tickety-boo: “opatch auto succeeded”, after all! You might even scan through some of the lines shown getting to that point which say happy things like, “17592127 apply successful for home /u01/app/oracle/product/11.2.0/db_1” and conclude that all’s well. But a keener eye is needed to notice that *one* line says “17592127 apply failed for home /u01/app/grid” and another mentions something about having “Failed to patch QoS users” . So what’s going on: is opatch being successful or not?
The answer lies in the log file which it tells you it’s created. Mine had this sort of stuff in it:
2014-02-17 13:06:51: Successfully removed file: /tmp/fileS5bCZV
2014-02-17 13:06:51: /bin/su exited with rc=1
2014-02-17 13:06:51: Error encountered in the command /u01/app/grid/bin/qosctl -autogenerate
> Syntax Error: Invalid usage
>
> Usage: qosctl <username> <command>
>
> General
> username - JAZN authenticated user. The users password will always be prompted for.
>
> Command are:
> -adduser <username> <password> |
> -checkpasswd <username> <password> |
> -listusers |
> -listqosusers |
> -remuser <username> |
> -setpasswd <username> <old_password> <new_password> |
> -help
>
> End Command output
2014-02-17 13:06:51: Running as user oracle: /u01/app/grid/bin/crsctl start resource ora.oc4j
2014-02-17 13:06:51: s_run_as_user2: Running /bin/su oracle -c ' /u01/app/grid/bin/crsctl start resource ora.oc4j '
2014-02-17 13:07:06: Removing file /tmp/file102UrG
2014-02-17 13:07:06: Successfully removed file: /tmp/file102UrG
2014-02-17 13:07:06: /bin/su successfully executed
Again, that last line shows opatch has a nasty habit of declaring success at the drop of a hat! It may distract you from seeing that there’s been a syntactical problem: the patch tool was trying to execute qosctl -autogenerate and encountered a syntax error instead. Clearly, the qosctl program didn’t like “autogenerate” as a command switch. Perhaps at this point you think, “Another fine Oracle stuff-up, but as I don’t use Quality of Service features anyway, this won’t be of significance to me”.
Unfortunately, it will, because the syntax error here is not really what you’re supposed to be looking at. The syntax error is the clue: this autogenerate command would be syntactically correct if the qosctl binaries had been patched to 11.2.0.3.9 (because the autogenerate switch was introduced somewhere around 11.2.0.3.5). So it can only be a syntax error if the binaries haven’t been patched successfully. And if this particular qosctl binary wasn’t patched, there’s a very good chance that some other binaries that you do make use of will have been skipped too.
But to see evidence for whether that’s a problem or not, you have to look upwards in the patching log, and keep a sharp eye out for this:
2014-02-17 13:05:22: The apply patch output is
Oracle Interim Patch Installer version 11.2.0.3.6
Copyright (c) 2013, Oracle Corporation. All rights reserved.

Oracle Home       : /u01/app/grid
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/grid/oraInst.loc
OPatch version    : 11.2.0.3.6
OUI version       : 11.2.0.3.0
Log file location : /u01/app/grid/cfgtoollogs/opatch/opatch2014-02-17_13-05-18PM_1.log

Verifying environment and performing prerequisite checks...
Prerequisite check "CheckSystemSpace" failed.
The details are:
Required amount of space(6601.28MB) is not available.
UtilSession failed: Prerequisite check "CheckSystemSpace" failed.
Log file location: /u01/app/grid/cfgtoollogs/opatch/opatch2014-02-17_13-05-18PM_1.log
OPatch failed with error code 73
2014-02-17 13:05:22: patch /home/oracle/patches/17592127 apply failed for home /u01/app/grid
So this comes from about 1 minute before the qosctl syntax error report… and is clearly the source of the original ‘failed to apply’ error that was displayed as part of opatch’s screen output. And the cause for that error is now apparent: the patch failed because a ‘CheckSystemSpace’ prerequisite failed. Or, in plain English, I haven’t got enough free disk space to apply this patch.
If you’re like me, that will surprise you. My file system has a reasonable amount of free space, after all:
[oracle@gamow db_1]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2              21G   15G  5.3G  74% /
tmpfs                 1.9G  444M  1.5G  24% /dev/shm
balfour:/griddata      63G  3.1G   57G   6% /gdata
balfour:/dbdata        63G  3.1G   57G   6% /ddata
5.3GB of free space is not exactly generous, but it’s non-trivial, too… and yet it seems not to be enough for this patch to feel comfortable.
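Given that, a thirty-second pre-flight check would have saved me all the grief. This sketch compares the available space on a filesystem against the 6601MB the prerequisite check wanted; the figure is specific to this particular patch, and I’m checking / here because my nodes put everything on one filesystem (adjust the mount point to taste):

```shell
REQUIRED_MB=6601   # the figure from this patch's CheckSystemSpace failure

# POSIX 'df -P' puts available space in column 4; -m reports megabytes
# (a common GNU/BSD extension, not strictly POSIX).
AVAIL_MB=$(df -Pm / | awk 'NR==2 {print $4}')

if [ "$AVAIL_MB" -ge "$REQUIRED_MB" ]; then
  echo "OK to patch: ${AVAIL_MB}MB free, ${REQUIRED_MB}MB required"
else
  echo "Not enough room: only ${AVAIL_MB}MB free, ${REQUIRED_MB}MB required"
fi
```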
Anyway, to cut a long story short(er): never just focus on the bleeding obvious errors reported by OPatch. Dig deeper, look harder… you’ll probably find something which explains that the obscurely-worded “failed to patch QoS users” is actually just a plea for more disk space.
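A crude but effective way to do that digging is to scan the whole screen output and log for failure words instead of trusting the summary line. Here a few lines of the output quoted earlier in this post stand in as sample data (the /tmp file name is just for illustration):

```shell
# A sample of the opatch auto screen output quoted above
cat > /tmp/opatch_screen.log <<'EOF'
patch /home/oracle/patches/17540582 apply successful for home /u01/app/oracle/product/11.2.0/db_1
patch /home/oracle/patches/17592127 apply failed for home /u01/app/grid
Failed to patch QoS users.
opatch auto succeeded.
EOF

# Ignore the cheery final line; count anything that smells of failure.
grep -icE 'fail|error' /tmp/opatch_screen.log   # prints 2
```

Anything greater than zero means the final “opatch auto succeeded” deserves a closer look.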
I’ll wrap this blog piece up by saying that I deliberately create my RAC nodes with only 25GB hard disks (it says so in the instructions!). I wondered after this experience whether I’d need to modify my Salisbury and Asquith articles to specify a larger hard disk size than that… but actually, it turns out not to be necessary. Instead, make sure you delete the contents of the /osource directory before you start patching (that means wiping out the binaries needed for installing Oracle and Grid… by now, you need neither, of course). So, if you do this:
[oracle@gamow osource]$ cd grid
[oracle@gamow grid]$ rm -rf *
[oracle@gamow grid]$ cd ..
[oracle@gamow osource]$ cd database
[oracle@gamow database]$ rm -rf *
[oracle@gamow database]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2              21G   12G  8.2G  59% /
tmpfs                 1.9G  444M  1.5G  24% /dev/shm
balfour:/griddata      63G  3.1G   57G   6% /gdata
balfour:/dbdata        63G  3.1G   57G   6% /ddata
…then I can promise you that 8.2GB of free space is adequate and the 11.2.0.3.9 PSU will be applied without error, second time of asking.
Of course, you may prefer simply to increase the size of the hard disk you’re working on so that there’s loads of free space, regardless of whether you delete things or not. That’s the approach I first took, too… and I ran into all sorts of problems when I tried it. But that’s a story for another blog piece, I think!