Salisbury and Asquith, my ‘frameworks’ for automated, nearly-hands-free building of Oracle servers, are retiring. Which is to say, I’m not going to maintain them any more.

My attempts over the years to persuade my System Admin colleagues at work that RAC via NFS (as Salisbury uses) might be a good idea have all fallen on deaf ears, Kevin Closson’s fine articles on the subject notwithstanding. So Salisbury became a bit of a dead end after that, which is why I cooked up Asquith. Asquith uses real iSCSI (as real as anything a virtual environment can cook up, anyway!) and layers ASM on top of that and thus provided me with a playground that much more faithfully reflects what we do in our production environment.

But it’s a pain having two frameworks doing pretty much the same job. So now I’m phasing them out and replacing them with Churchill. The Churchill framework uses NFS (because it’s much easier to automate the configuration of that than it is of iSCSI), but it then creates fake hard disks in the NFS shares and layers ASM on top of the fake hard disks. So you end up with a RAC that uses ASM, but without the convoluted configuration previously needed.

The other thing we do in production at work is split the ownership of the Grid Infrastructure and the Database (I don’t know why they decided to do that: it was before my time. The thing is all administered by one person -me!- anyway, so the split ownership is just an annoyance as far as I’m concerned). Since I’ve been bitten on at least one significant occasion where mono- and dual-ownership models do things differently, I thought I might as well make Churchill dual-ownership aware. You don’t have to do it that way: Churchill will let you build a RAC with everything owned by ‘oracle’ if you like. But it does it by default, so you end up with ‘grid’ and ‘oracle’ users owning different bits of the cluster, unless you demand otherwise.

Other minor changes from Asquith/Salisbury: Churchill doesn’t do Oracle 11.2.0.1 installations, since that version’s now well past support. You can build Churchill infrastructures with 11.2.0.3, 11.2.0.4 and 12.1.0.2. Of those, only the last is freely available from otn.oracle.com.

Additionally, the bootstrap lines have changed a little. You now invoke Alpher/Bethe installations by a reference to “ks.php” instead of “kickstart.php” (I don’t like typing much!). And there’s a new bootstrap parameter: “split=y” or “split=n”. That turns on or off the split ownership model I mentioned earlier. By default, “split” will be “y”.

Finally, I’ve made the whole thing about 5 times smaller than before by the simple expedient of removing the Webmin web-based system administration tool from the ISO download. I thought it was a good idea at the time to include it for Asquith and Salisbury but, in fact, I’ve never subsequently used it and it made the framework ISO downloads much bigger than they needed to be. Cost/benefit wasn’t difficult to do: Webmin is gone (you can always obtain it yourself and add it to your servers by hand, of course).

The thing works and is ready for download right now. However, it will take me quite some time to write up the various articles and so on, so bear with me on that score. All the documentation, as it gets written, will be accessible from here.

The short version, though, is you can build a 2-node RAC and a 2-node Active data guard with six basic commands:

  • ks=hd:sr1/churchill.ks (to build the Churchill Server)
  • ks=http://192.168.8.250/ks.php?sk=1 oraver=11203 ksdevice=eth0 (to build Alpher)
  • ks=http://192.168.8.250/ks.php?sk=2 oraver=11203 filecopy=n ksdevice=eth0 (to build Bethe)
  • ks=http://192.168.8.250/atlee.ks (to build Atlee, the file server for the Data Guard nodes)
  • ks=http://192.168.8.250/ks.php?sk=3 oraver=11203 ksdevice=eth0 (to build Gammow)
  • ks=http://192.168.8.250/ks.php?sk=4 oraver=11203 filecopy=n ksdevice=eth0 (to build Dalton)

With Churchill and the rest of the crew, I can now build a pretty faithful replica of my production environment in around 2 hours. Not bad.

Salisbury and Asquith will remain available from the front page until the Churchill documentation is complete; after that, they’ll disappear from the front page but remain available for download from the Downloads page, should anyone still want them.

Asquith 2.0

I’ve decided: There will be no Asquith 2.0 that runs on Red Hat/CentOS 7.

There are a lot of stumbling blocks, some of which I’ve documented here recently -including things like iSCSI target configurations no longer being easily scriptable, the use of systemd and the use of dynamic names for network devices. No doubt, all of these problems will be resolved over time by the upstream developers, but they currently make it practically impossible to construct a highly-automated, self-building Asquith framework. (Salisbury, needing only NFS, is a much better proposition, but even there the network device naming issue presents automation difficulties).

Since Red Hat 6.x is supported until 2020, I’ll pass on Red Hat 7 and its assorted clones. I rather imagine quite a lot of real-life enterprises might do the same!

CentOS 7 – Not Recommended

I’ve been poking around with CentOS 7, Red Hat 7 and Oracle Enterprise Linux 7 extensively in the past couple of weeks, in the hope of producing a 7-compliant version of Asquith. For the sake of the rest of this post, lets agree to call all those distros, generically, Enterprise Linux 7.x

It’s been quite a ride, because an awful lot has been changed in the transition from Enterprise Linux 6.x to 7.x. I’ll list just some of the differences that have specifically tripped me up here:

  • How you invoke a Kickstart installation in the first place has changed
  • How you do firewalling has changed (firewalld not iptables)
  • The packages and package groups has changed
  • The way you configure iSCSI targets has changed, dramatically, and uses an interactive shell that isn’t suitable for scripting and doesn’t work with Kickstart
  • The way you disable and enable, stop and start services has changed (systemd v. init scripts)
  • The way network devices are named is now “intelligent”… and completely borks Kickstart

I’ll just explain a little more about that last point, by the way, since I’ve not mentioned it at all in any previous blogs. The gist of it is that you probably know and love your network interfaces as things like “eth0” and “eth1”, and have done for years. But they aren’t called that any more. Oh no. Instead, you get names such as “enp0s3” and “eno16777736” …and (this is the particularly cunning bit): you get different names depending on what your hardware and your BIOS is capable of.

The idea behind the change is logical and admirable in and of itself: in a server with two Ethernet cards, you were never entirely sure which one would pick up the “eth0” designation and which the “eth0”. Whereas now, the names are bus/slot dependent and are thus assigned deterministically: the card in slot 3 gets the ‘s3’ name and the one in slot 4 gets an ‘s4’ name. Simple, although you won’t know what your interface is going to be called until after the installation has itemised all your hardware and assigned the appropriate names.

The trouble is that in Kickstart, we used to do this sort of thing:

network --device eth0 --bootproto static --ip 192.168.8.101 --netmask 255.255.255.0 --nameserver 192.168.8.250 --hostname alpher.dizwell.home
network --device eth1 --bootproto static --ip 10.0.0.101 --netmask 255.255.255.0 --nameserver 192.168.8.250  hostname alpher-priv

That is to say: assign one set of network attributes to an interface called “eth0” and another to one called “eth1” …and do those assignments before the installation has completed.

So how do you write equivalent lines for an Enterprise Linux 7 Kickstart script when don’t know whether your interfaces are going to be called “enp0s3” or “eno16777736” or something completely different until after the installation has finished? You can’t. It’s just impossible to write one Kickstart script to run on any hardware now, because you won’t know what device names are ahead of time.

A case in point: my laptop and my home desktop PC, running a virtual machine in VMware Workstation, both produce Linux guests that end up with a network interface called “eno1677736”, but my work PC (also running VMware) produces guests that have a “enp0s3” network interface. One Kickstart script cannot do duty for both environments… and I have no idea what other variants my readers and would-be Asquith users might end up with, so I can’t even start taking them into account!

It’s a mess. In plain words: Enterprise Linux 7 breaks Kickstart installations.

I’ve had to work around it for now by reducing the Kickstart network line down to:

network --hostname=asquith.dizwell.home

…which doesn’t attempt to configure networking interfaces at all. Later on in my Kickstart script, in the %post section, I then cheat like mad and run this:

for f in `ls /sys/class/net`; do
  if [[ $f != "lo" ]]; then 
    cat > /etc/sysconfig/network-scripts/ifcfg-$f << EOF
DNS1=192.168.8.250
GATEWAY=
BOOTPROTO=none
DEVICE=$f
ONBOOT=yes
IPV6INIT=no
TYPE=Ethernet
IPADDR0=192.168.8.250
PREFIX0=24
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
NAME="System $f"
EOF
  fi
done

…which simply writes a network configuration file for any non-loopback interface it finds listed in the /sys/class/net directory. This is taking place after the installation has all-but completed, so by that stage the finished interface names should be available. As that stands, it will write the same configuration for both interfaces in a 2-interface server, which is obviously not right… but a little bit of bash if-then-else’ing should see that right. For Asquith itself, which only has one network interface to worry about, this code as written will work no matter what interface names your installation decides to bestow upon your guest.

But it’s not “right” and it’s certainly not elegant, and the fact can’t be dodged that in their eagerness to embrace “meaningful network interface names”, the Enterprise Linux developers broke Kickstart. Hopefully, they will un-break it before too long.

Anyway… all these changes I’ve been whinging about of late apply equally well to all three variants of Enterprise Linux that I’ve been working with. It doesn’t matter whether you use CentOS, RHEL or OEL: Kickstart (for example) will struggle to configure your network interfaces correctly in all of them.

I do, however, have a special word of opprobrium to hurl uniquely in CentOS’s direction: it’s the only one of the three distros that decided it was more important to have a word processor on your Enterprise Linux server than a working Oracle database.

Let me explain: When you register for the Red Hat trial and download the 3.4 GB Red Hat 7 ISO; or when you download the Oracle Enterprise Linux 4GB ‘V46135-01’ ISO…. in both cases you end up with a single DVD image which contains a mix of 32-bit and 64-bit libraries/packages. As you know, Oracle’s Linux installs still require a mix of 32-bit and 64-bit packages to work properly (for example, you need glibc-x86_64 and glibc-i686 before things will compile correctly during the ‘linking phase’ of the Oracle database installation). So Red Hat and OEL both provide distro installation media which can satisfy those requirements.

But if you download the 4GB CentOS 7 installation DVD, you get pure, 64-bit only packages. No .i686 software exists at all, and thus no Oracle software installations are possible with it. I asked on the CentOS forums why they decided to package things up quite differently than their upstream vendor (i.e., Red Hat) did and the only explanation someone offered was that “to make room for the LibreOffice software, they had to ditch the i686 libraries”. I’m not sure if that’s so (a DVD ISO can be 4.7GB, so there’s room for 700MB of extras on the CentOS DVD even as it stands), but if it were true, it’s a weird choice: we package our Enterprise Linux distro so that you can run a word processor, but not an Enterprise-class database. You figure the logic of that, because I can’t see any in it.

You can, of course, install Oracle database software on CentOS by the “simple” expedients of either (a) connecting your server directly to the Internet and downloading the relevant 32-bit packages with yum; or (b) downloading the CentOS “Everything” ISO, instead of the plain-vanilla “DVD ISO”. The Everything ISO is 6.7GB in size and does include the i686 software packages you’ll need. But that means it’s nearly 3GB bigger than OEL or RHEL’s Oracle-ready equivalents.

I shall be interested to see how Scientific Linux do their packaging when the time comes (they are currently stuck at version 6.5, so I don’t know when or if SL7.0 will be making an appearance).

In the meantime, I shall have no choice but to strongly recommend NOT using CentOS 7 as a platform for Oracle databases. I’ll be switching all my development work to OEL 7.x, which is Oracle-database-ready AND can be downloaded and updated for free. CentOS just seems too weirdly and obtusely different from the other Enterprise Linux distros to be worth bothering with at the moment.

Updated to add: This isn’t the only point at which CentOS diverges in annoying ways from Red Hat or Oracle’s treatment of what is supposed to be essentially the same distro: try doing an lsb_release -r to see what version your distro reports itself to be. Red Hat reports 7.0. OEL reports 7.0. CentOS, however, decides it will be clever and report 7.0.1406. Version number reporting is important to Asquith, because it determines where you’ll fetch your software from when building client servers. Having one distro decide to be different from all the others is therefore, frankly, rather annoying!

Asquith 1.09

It’s been a long time coming, but I’ve just released a new version of Asquith which now supports installing 11.2.0.4 standalone and clustered databases. Previously, in the 11g product range, it only supported 11.2.0.1 and 11.2.0.3 versions. It still supports 12c, too, of course.

The only change in behaviour over the previous version is that you supply a ORAVER=11204 bootstrap parameter when booting a member server (having previously copied the 11.2.0.4 installation media to your Asquith server first, of course).

It will take a while to update various pages/articles to reflect the new ORAVER option, but hopefully by the end of the weekend I’ll have it all done.

Note that Salisbury doesn’t get this update: Asquith and Salisbury parted ways some time ago.

A Miscellany

It’s been one of those periods of ‘nothing remakarkable ever happens’. So, in desperation, I decided to try to find a blog post or two in the unremarkable instead.

Let’s start with my little HP Folio 13, my near-two-year-old notebook, of which I said in a recent blog piece, “the Folio only has 4GB RAM, so running multiple simultaneous VMs is not really an option: this Oracle will have to run on the physical machine or not at all”

Absolutely accurate as it stands, in that the thing does indeed ship with only 128GB hard disk and 4GB RAM, which is not enough to hold a decent party, let alone run a decent database.

However, I had reckoned without these guys. Their web site tools found me this:

It’s a 250GB mSATA hard drive (mSATA essentially being the innards of an ordinary solid state hard drive without the fancy external casing). At a stroke, and for relatively modest outlay, I was able to double my disk capacity and its speed. Virtualisation on such a storage platform becomes distinctly do-able.

My second purchase was this:

For a mere AU$100, that 8GB stick of laptop RAM doubles the laptop’s existing capacity -and, again at a stroke, makes it more than capable of hosting a 3-machine Oracle RAC.

Fitting these goodies was not a piece of cake, I have to say, what we me being blessed with fingers that are as dainty as a french Boulangerie’s Baguette-Rex. For the most part, I followed the instructions provided by this kind Internet soul without incident, though I still managed to rip out the connector ribbons that make minor details like the keyboard and monitor work in my heavy-handed case opening attempts. I’m pleased to report, however, that the relevant connectors appear to have been designed with complete Klutzes in mind, so I was able to reconnect them when required and the laptop is now operating normally once more.

So now I am blessed with a 16GB, 1.5TB SSHD monster of a Toshiba laptop for running anything serious (for example, a 2-node RAC and 2-node Data Guard setup, practicing for patches, failovers and switchovers). It is technically portable, and so I can brace my neck and arms and lug into work on the train if I have to.

But with the peanut-sized hardware upgrades mentioned here, however clumsily fitted by yours truly, I am now additionally blessed with an 8GB, 250GB SSHD svelte, bare-noticeable HP ultrabook that I can carry around for hours and not mind… and it’s good enough to run a Windows virtual machine with SQL Server and a 2-node Oracle RAC, so practicsing patching, SQL Server→Oracle replication and such database-y things is trivially easy, without breaking my neck or upper arms.

It’s nice to have rescued a near-two-year-old ultrabook from oblivion, too, because with the additional hardware has not only extended the original machine’s technical capacity, it’s just about doubled its useful lifetime, too.

Flushed with my new hardware capabilities, then, I recently decided to dry-rehearse the update of an Oracle 11.2.0.3.0 RAC to 11.2.0.3.9 (i.e., by applying the January 2014 CPU patchset to it, which for Grid+RAC purposes is patch 17735354). It didn’t go awfully well, to be honest -and the reason it didn’t go very well was instructive!

The basic process of applying a Grid+RAC patch to a node is:

  1. Copy the patchfile to an empty directory owned by the oracle user (I used /home/oracle/patches), and unzip it there
  2. Make sure the /u01/app/grid/OPatch and /u01/app/oracle/product/11.2.0/db_1/OPatch directories on all nodes are wiped and replaced with the latest unzipped p6880880 download (that gets your patching binaries right)
  3. Create an ‘ocm response file’ by issuing the command /u01/app/grid/OPatch/ocm/bin/emocmrsp -no_banner -output /home/oracle/ocm.rsp (on all nodes)
  4. Become the root user, set your PATH to include /u01/app/grid/OPatch and then launch opatch auto /home/oracle/patches -ocmrf /home/oracle/ocm.rsp

After you launch the patch application utility at Step 4, it’s all supposed to be smooth sailing. Unfortunately, whenever I did this on Gamow (the primary node of my standby site and thus the first site to be patched in a ‘standby first’ scenario), I got this result:

2014-02-17 12:56:45: Starting Clusterware Patch Setup
Using configuration parameter file: /u01/app/grid/crs/install/crsconfig_params

Stopping RAC /u01/app/oracle/product/11.2.0/db_1 ...
Stopped RAC /u01/app/oracle/product/11.2.0/db_1 successfully

patch /home/oracle/patches/17592127/custom/server/17592127  apply successful for home  /u01/app/oracle/product/11.2.0/db_1 
patch /home/oracle/patches/17540582  apply successful for home  /u01/app/oracle/product/11.2.0/db_1 

Stopping CRS...
Stopped CRS successfully

patch /home/oracle/patches/17592127  apply failed  for home  /u01/app/grid

Starting CRS...
CRS-4123: Oracle High Availability Services has been started.
Failed to patch QoS users.

Starting RAC /u01/app/oracle/product/11.2.0/db_1 ...
Started RAC /u01/app/oracle/product/11.2.0/db_1 successfully

opatch auto succeeded.

If you read it fast enough, you might just glance at the last line there and think everything is tickety-boo: “opatch auto succeeded”, after all! You might even scan through some of the lines shown getting to that point which say happy things like, “17592127 apply successful for home /u01/app/oracle/product/11.2.0/db_1” and conclude that all’s well. But a keener eye is needed to notice that *one* line says “17592127 apply failed for home /u01/app/grid” and another mentions something about having “Failed to patch QoS users” . So what’s going on: is opatch being successful or not?

The answer lies in the log file which it tells you it’s created. Mine had this sort of stuff in it:

2014-02-17 13:06:51: Successfully removed file: /tmp/fileS5bCZV
2014-02-17 13:06:51: /bin/su exited with rc=1

2014-02-17 13:06:51: Error encountered in the command /u01/app/grid/bin/qosctl -autogenerate
>  Syntax Error: Invalid usage
>
>  Usage: qosctl <username> <command>
>
>    General
>      username - JAZN authenticated user. The users password will always be prompted for.
>
>    Command are:
>      -adduser <username> <password> |
>      -checkpasswd <username> <password> |
>      -listusers |
>      -listqosusers |
>      -remuser <username> |
>      -setpasswd <username> <old_password> <new_password> |
>      -help 
>
>  End Command output
2014-02-17 13:06:51: Running as user oracle: /u01/app/grid/bin/crsctl start resource ora.oc4j
2014-02-17 13:06:51: s_run_as_user2: Running /bin/su oracle -c ' /u01/app/grid/bin/crsctl start resource ora.oc4j '
2014-02-17 13:07:06: Removing file /tmp/file102UrG
2014-02-17 13:07:06: Successfully removed file: /tmp/file102UrG
2014-02-17 13:07:06: /bin/su successfully executed

Again, that last line shows opatch has a nasty habit of declaring success at the drop of a hat! It may distract you from seeing that there’s been a syntactical problem: the patch tool was trying to execute qosctl -autogenerate and encountered a syntax error instead. Clearly, the qosctl program didn’t like “autogenerate” as a command switch. Perhaps at this point you think, “Another fine Oracle stuff-up, but as I don’t use Quality of Service features anyway, this won’t be of significance to me”.

Unfortunately, it will -because the syntax error here is not really what you’re supposed to be looking at. The syntax error is the clue: this autogenerate command would be syntactically correct if the qosctl binaries had been patched to 11.2.0.3.9 (because the autogenerate switch was introduced somewhere around 11.2.0.3.5). So it can only be a syntactical error if the binaries haven’t been patched successfully. And if this particular qosctl binary wasn’t patched, there’s a very good chance that some other binaries that you do make use of will have been skipped too.

But to see evidence for whether that’s a problem or not, you have to look upwards in the patching log, and keep a sharp eye out for this:

2014-02-17 13:05:22: The apply patch output is Oracle Interim Patch Installer version 11.2.0.3.6
 Copyright (c) 2013, Oracle Corporation.  All rights reserved.

 Oracle Home       : /u01/app/grid
 Central Inventory : /u01/app/oraInventory
    from           : /u01/app/grid/oraInst.loc
 OPatch version    : 11.2.0.3.6
 OUI version       : 11.2.0.3.0
 Log file location : /u01/app/grid/cfgtoollogs/opatch/opatch2014-02-17_13-05-18PM_1.log

 Verifying environment and performing prerequisite checks...
 Prerequisite check "CheckSystemSpace" failed.
 The details are:
 Required amount of space(6601.28MB) is not available.
 UtilSession failed:
 Prerequisite check "CheckSystemSpace" failed.
 Log file location: /u01/app/grid/cfgtoollogs/opatch/opatch2014-02-17_13-05-18PM_1.log

 OPatch failed with error code 73

2014-02-17 13:05:22: patch /home/oracle/patches/17592127  apply failed  for home  /u01/app/grid

So this comes from about 1 minute before the qosctl syntax error report… and is clearly the source of the original ‘failed to apply’ error that was displayed as part of opatch’s screen output. And the cause for that error is now apparent: the patch failed because a ‘CheckSystemSpace’ prerequisite failed. Or, in plain English, I haven’t got enough free disk space to apply this patch.

If you’re like me, that will surprise you. My file system has a reasonable amount of free space, after all:

[[email protected] db_1]$ df -h
Filesystem         Size  Used Avail Use% Mounted on
/dev/sda2           21G   15G  5.3G  74% /
tmpfs              1.9G  444M  1.5G  24% /dev/shm
balfour:/griddata   63G  3.1G   57G   6% /gdata
balfour:/dbdata     63G  3.1G   57G   6% /ddata

5.3GB of free space is not exactly generous, but it’s non-trivial, too… and yet it seems not to be enough for this patch to feel comfortable.

Anyway, to cut a long story short(er): never just focus on the bleeding obvious errors reported by OPatch. Dig deeper, look harder …you’ll probably find something which explains that the obscure-stated “failed to patch QoS users” is actually just a plea for more disk space.

I’ll wrap this blog piece up to say that I deliberately create my RAC nodes with only 25GB hard disks (it says so in the instructions!). I wondered after this experience whether I’d need to modify my Salisbury and Asquith articles to specify a larger hard disk size than that…. but actually, it turns out not to be necessary. Instead, make sure you delete the contents of the /osource directory before you start patching (that means wiping out the biinaries needed for installing Oracle and Grid… by now, you need neither, of course). If you do this, therefore:

[[email protected] osource]$ cd grid
[[email protected] grid]$ rm -rf *
[[email protected] grid]$ cd ..
[[email protected] osource]$ cd database
[[email protected] database]$ rm -rf *
[[email protected] database]$ df -h
Filesystem         Size  Used Avail Use% Mounted on
/dev/sda2           21G   12G  8.2G  59% /
tmpfs              1.9G  444M  1.5G  24% /dev/shm
balfour:/griddata   63G  3.1G   57G   6% /gdata
balfour:/dbdata     63G  3.1G   57G   6% /ddata

…then I can promise you that 8.2GB of free space is adequate and the 11.2.0.3.9 PSU will be applied without error, second time of asking.

Of course, you may prefer simply to increase the size of the hard disk you’re working on so that there’s loads of free space, regardless of whether you delete things or not. That’s the approach I first took, too… and I ran into all sorts of problems when I tried it. But that’s a story for another blog piece, I think!

Asquith and the new Red Hat

Whilst I was busy planning my Paris perambulations, Red Hat went and released version 6.5 of their Enterprise Server distro. Oracle swiftly followed …and, even more remarkably, CentOS managed to be almost equally as swift, releasing their 6.5 version on December 1st. Scientific Linux has not yet joined this particular party, but I assume it won’t be long before they do.

I had also assumed Asquith would work unchanged with the new distro -but I hadn’t banked on the clumsy way I originally determined the distro version number which actually meant it all fell into a nasty heap of broken. Happily, it only took a minute or so to work out which bit of my crumbly code was responsible and that’s now been fixed.

Asquith therefore has been bumped to a new 1.07 release, making it entirely compatible with any 6.5 Red Hat-a-like distro (and any future 6.x releases, come to that).

Another feature of this release is that the ‘speedkeys’ parameters have been altered so that they assume the use of version 6.5 of the relevant distro. That is, if you build your RAC nodes by using a bootstrap line that reads something like ks=http://192.168.8.250/kickstart.php?sk=1…, then you’ll be assumed to be using a 6.5 distro and the source OS for the new server will be assumed to reside in a <distro>/65 directory.

If you want to continue using 6.4 or 6.3 versions, of course, you can still spell that out (ks=http://192.168.8.250/kickstart.php?distro=centos&version=63…). You just can’t use speedkeys to do it.

An equivalent update to Salisbury has also just been released.

Archibald Primrose, cut-throat, thief and leader of the infamous Slethwick Street gang of nineteenth century East London pick-pockets was…

Er, no.

Sorry… wrong notes. That’s actually Archibald Primrose, 5th Earl of Rosebery, sometime Prime Minister of the United Kingdom of Great Britain and Ireland (as it was back then).

An easy mistake to make, I rather think, all the same.

Anyway, “Slasher” Rosebery makes it to these pages because his name is associated with the secondary storage server a Data Guard environment will need to use. In the language of this blog, Rosebery is to Asquith what Balfour is to Salisbury: the secondary server in an Active Data Guard configuration using ASM via iSCSI shares. A new article on how to build one has just gone up.

Introducing Asquith

In life, Herbert Henry Asquith was prime minister of the United Kingdom from 1908 to 1916.

In the context of this blog, however, his is the name that will be attached to a new way of auto-building Oracle servers, of the standalone, RAC and RAC+Data Guard variety.

Salisbury, of course, has been doing that job for several months now, so why the need for Asquith? Well… Salisbury works fine… but is maybe not very realistic, in the sense that Salisbury’s use of NFS for shared storage has put some people off. So Asquith is effectively the same as Salisbury -except that he uses ASM for his shared storage, not NFS.

In my view, that perhaps makes him a little more ‘realistic’ than the Salisbury approach, but definitely results in a more useful learning environment (because now you can get to play with the delights of ASM disk groups and so forth, which is an important part of managing many production environments these days).

1. Asquith v. Salisbury

Other than his choice of storage, however, Asquith is pretty much identical to Salisbury: an Asquith server, just like a Salisbury server, provides NTP, DNS and other network services to the ‘client servers’, which can be standalone Oracle servers, part of a multi-node RAC or even part of a multi-node, multi-site Data Guard setup. If you’re doing RAC, the shared storage needed by each RAC node is provided by Asquith acting as an iSCSI target. The clients act in their turn as iSCSI initiators.

The only other significant difference between Salisbury and Asquith is that Asquith never auto-builds a database for you, not even in standalone mode. I figured that if you’re going to go to the trouble of using ASM, you’re doing ‘advanced stuff’, and don’t need databases auto-created for you. If automatic-everything is what you’re after, therefore, stick to using Salisbury. For this reason, too, Asquith does not provide an auto-start script for databases: since it uses ASM, it’s assumed you’ll install Oracle’s Grid software -and that provides the Oracle Restart utility which automates database restarts anyway. A home-brew script is therefore neither needed nor desirable.

All-in-all, Asquith is so similar to Salisbury that I’ve decided that the first release of Asquith should be called version 1.04, because that’s the release number of the current version of Salisbury. They will continue to be kept in lock-step for all future releases.

And this hopefully also makes it clear that Asquith doesn’t make Salisbury redundant: both will continue to be developed and updated, and each complements the other. It’s simply a question of which shared storage technology you prefer to use. If you like the simplicity of NFS and traditional-looking file systems, use Salisbury. If you want to learn and get familiar with ASM technology, then use Asquith. Each has its place, in other words, and both are useful.

2. Building an Asquith Server

In true Salisbury fashion, the job of building the Asquith server itself is completely automated, apart from you pointing to the asquith.ks kickstart file when first building it.

Your Asquith server can run OEL 6.x, Scientific Linux 6.x or CentOS 6.x -where x is either 3 or 4. In all cases, only 64-bit OSes are allowed. The Oracle versions its supports, like Salisbury, are 11.2.0.1, 11.2.0.3 or 12.1.0.1 The Asquith server needs a minimum of 60GB disk space, 512MB RAM, one network card and two DVD drives. The O/S installation disk goes in the first one; the Asquith ISO goes in the second.

The server is built by hitting <Tab> when the installation menu appears, and typing this on the bootstrap line:

ks=hd:sr1/asquith.ks

Once built, you need to copy your Oracle software to the /var/www/html directory of the new Asquith server, using file names of a specific and precise format. Depending on which version you intend to install on other client servers, you need to end up with files called:

  • oradb-11201-1of2.zip
  • oradb-11201-2of2.zip
  • oragrid-11201.zip
  • oradb-11203-1of2.zip
  • oradb-11203-2of2.zip
  • oragrid-11203.zip
  • oradb-12101-1of2.zip
  • oradb-12101-2of2.zip
  • oragrid-12101-1of2.zip
  • oragrid-12101-2of2.zip

You can, of course, have all 10 files present in the same /var/www/html directory if you intend to build a variety of Oracle servers running assorted different Oracle versions.

You can additionally (but entirely optionally) copy extra O/S installation media to the /var/www/html directory if you want future ‘client’ servers to use an O/S different to that used to build Asquith itself. Asquith automatically copies its own installation media to the correct sub-directories off that /var/www/html folder -so if you used CentOS 6.4 to build Asquith, you’ll already have a /var/www/html/centos/64 directory from which clients can pull their installation media. You would need to copy the DVD1 installation media for OEL and Scientific Linux to corresponding “oel/xx” and “sl/xx” sub-directories if you wanted to use all three Red Hat clones for the ‘client’ servers (where ‘xx’ can be either 63 or 64).

3. Building Asquith Clients

When building Asquith clients, you need to boot them with appropriate, locally-attached installation media. The netinstall disks for each distro are suitable, for example.The distro/version you boot with will be the distro/version your Asquith client will end up running. You cannot, for example, boot with a Scientific Linux netinstall disk, point it at Asquith and hope to complete a CentOS kickstart installation. As a consequence, what you boot your clients with must match something you’ve already copied to Asquith in full. If you boot a client with an OEL 6.4 netinstall disk, the DVD 1 media for Oracle Enterprise Linux 6.4 must already have been copied to Asquith’s own /var/www/html/oel/64 directory, in other words.

4. Asquith Bootstrap Parameters

You build an Asquith client by again pressing <Tab> on the boot menu at initial startup and then passing various parameters to the bootstrap line that’s then revealed. All bootstrap lines must start:

ks=http://192.168.8.250/kickstart.php?

You then add additional parameters as follows:

Parameter Compulsory? Possible Values (case sensitive)
distro Yes centos, oel or sl
version Yes 63 or 64
hostname No any valid name for the server being built
domain No any valid domain name of which the server is a part
rac No Is this server to be part of a RAC? If so, it will find its shared storage on the Asquith server. If not, no shared storage will be configured (any future database would be stored on the local server’s disk).
ip No IP of the server (the public IP if a RAC)
ic No IP of the server’s interconnect (if it’s to be part of a RAC)
dg No Is this server to be part of a Data Guard site? If so, it will find its shared storage on a Rosebery server, not on Asquith.

The parameters can come in any order, separated by ampersands (i.e., by the & character), and there must be no spaces between them. For example:

ks=http://192.168.8.250/kickstart.php?distro=centos&version=64&hostname=my_racnode&domain=dizwell.com&rac=y&ip=16.25.34.23&ic=10.0.0.2

(That example might wrap here, but is in fact typed continuously, without any line breaks or spaces).

Note that “rac=” and “dg=” are mutually exclusive. One causes the built server to use Asquith as its source of shared storage; the other directs the server to use Rosebery for its shared storage (I’ll talk more about Rosebery in Section 7 below). If your Data Guard servers are themselves to be part of a cluster, therefore, you just say “dg=y”, not “rac=y&dg=y”.

After you construct an appropriate bootstrap line, you must additionally add three space-separated Kickstart constants, as follows:

Constant Compulsory?
ksdevice= No eth0, eth1 or any other valid name for a network interface
oraver= Yes 11201, 11203, 12101 or NONE
filecopy= No y or n

ksdevice and filecopy are only relevant if you’re building a RAC: a RAC node must have two network cards, and you use ksdevice to say which of them should be used for installation purposes. The usual answer is eth0. If you miss this constant off, the O/S installer itself will prompt you for the answer, so you only need to supply one now if you want a fully-automated O/S install.

The second node of a RAC needs to have paths and environment variables set up in anticipation of Oracle software being ‘pushed’ to it from the primary node -but it itself doesn’t need a direct copy of the Oracle installation software. Hence ‘filecopy=n’ will suppress the copying of the oradb…zip files from Asquith to the node. If you miss this constant off, an answer of ‘y’ will default, which will mean about 4GB of disk space may be consumed unnecessarily. It’s not the end of the world if it happens, though.

The oraver constant is required, though. It lets the server build process create appropriate environment variables and directories, suitable for running Oracle eventually. You can only specify 11201, 11203 or 12101 depending on which version of Oracle you intend, ultimately, to run on the new server. If you don’t ever intend to run Oracle on your new server, you can say “oraver=none”, and after a basic O/S install, nothing else will be configured on the new server.

A complete bootstrap line, suitable for the first node of an intended 2-node RAC, might therefore look like this:

ks=http://192.168.8.250/kickstart.php?distro=centos&version=64&hostname=my_racnode1&domain=dizwell.com&rac=y&ip=16.25.34.21&ic=10.0.0.1 oraver=12101 filecopy=y ksdevice=eth0

Notice there are spaces between the three constants, and between them and the original part of the bootstrap line. Here’s another example, this time for the second node of a Data Guard RAC:

ks=http://192.168.8.250/kickstart.php?distro=oel&version=63&hostname=my_dgnode2&domain=dizwell.com&dg=y&ip=16.25.34.26&ic=10.0.0.6 oraver=12101 filecopy=n ksdevice=eth0

5. Asquith Speed Keys

It’s not really that much typing when you come to do it, but if you want to make things even quicker, there are four ‘speed keys’ available to you:

Speed Key Effect
sk=1 The server will be called alpher.dizwell.home, with IP 192.168.8.101 and Interconnect IP of 10.0.0.101. It will run as the first node of a RAC and is configured to look to Asquith as its shared storage source.
sk=2 The server will be called bethe.dizwell.home, with IP 192.168.8.102 and Interconnect IP of 10.0.0.102. It will run as the second node of a RAC and is configured to look to Asquith as its shared storage source.
sk=3 The server will be called gamow.dizwell.home, with IP 192.168.8.103 and Interconnect IP of 10.0.0.103. It will run as the first node of a RAC but is configured to look to Rosebery as its shared storage source.
sk=4 The server will be called dalton.dizwell.home, with IP 192.168.8.104 and Interconnect IP of 10.0.0.104. It will run as the first node of a RAC and is configured to look to Rosebery as its shared storage source.

If you want to use one of these speed keys, your bootstrap line becomes:

ks=http://192.168.8.250/kickstart.ks?sk=2 oraver=11203 filecopy=n ksdevice=eth0

Note that you still have to supply the three Kickstart constants -but at least you don’t have to supply any of the normal parameters. In fact, you only have to supply the oraver constant, so it could be even shorter to type, if you’d prefer.

6. Creating Databases and Clusters

All Asquith client servers end up being created with a root user, whose password is dizwell and an oracle user whose password is oracle. Use the operating system’s own passwd command to alter those after the O/S installation is complete if you like.

All Asquith client servers are also built with an appropriate set of Oracle software (if requested), stored in the /osource directory. Grid/Clusterware will be in the /osource/grid directory and the main Oracle RDBMS software will be in the /osource/databasedirectory. Your job is therefore simply to launch the relevant installer, like so:

/osource/grid/runInstaller
/osource/database/runInstaller

If you don’t want to run a RAC or use ASM, just pretend the grid software’s not there! If you do, standard operating procedures apply:

  • Run the /osource/grid/runInstaller
  • Do an advanced installation
  • Select to use ASM, keep the default DATA diskgroup name
  • Change the Disk Discovery Path to be /dev/asm*
  • Use External redundancy levels (at this stage, Asquith doesn’t do redundancy)
  • Click ‘Ignore All’ if any ‘issues’ are discovered
  • Run the root scripts on the various nodes when prompted

Once the Clusterware is installed, you can install the database in the usual way:

  • Run /osource/database/runInstaller
  • Do a typical installation
  • Select to use Automatic Storage Management -the DATA disk group should be automatically available
  • Supply passwords where appropriate
  • Ignore any prerequsite failures
  • Run the root script when prompted.

It’s all pretty painless, really -which is precisely the point!

7. Rosebery

Just as a Salisbury server is accompanied by a Balfour server when building a Data Guard environment, so Asquith has his Rosebery. (Archibald Primrose, 5th Earl of Rosebery, Prime Minister of Great Britain 1895-1896). A Rosebery server is built in the same way as an Asquith server (that is, 60GB hard disk minimum; 512MB RAM minimum, 1 NIC), but doesn’t need a second DVD drive from which to find its kickstart file: for that, you simply point it at Asquith.

The bootstrap line to build a Rosebery server is thus:

ks=http://192.168.8.250/rosebery.ks

After that, the Rosebery server builds automatically. It then provides a new iSCSI target for client servers built with the dg=y parameter in their bootstrap lines to connect to. In short, Rosebery provides shared storage to clients, just as Asquith does -and therefore provides a secondary, independent storage sub-system for Data Guard clients to make use of.

8. Conclusion

Asquith (and Rosebery) provide a conveniently-built infrastructure in which Standalone, RAC and Data Guard Oracle servers can be constructed with ease. It automates away a lot of the network and storage “magic” that is usually the preserve of the professional Systems Administrator, leaving the would-be Oracle Database Administrator to concentrate on actual databases! By employing ASM as its shared storage technology, Asquith/Rosebery allow the DBA to explore and learn an important aspect of modern Oracle database management.

I’ll be putting up a section of the site for Asquith to match the one that already exists for Salisbury. Until then, the only place for any Asquith documentation (and the only link to download the all-important Asquith ISO) is this article itself.