Churchill on Windows?!

I’ve had many requests over the years to repeat my ‘Churchill Framework’ on Windows, “Churchill” being my mostly-automated way of building a virtual RAC using Linux as the operating system of choice.

I’ve always refused: if you want a desktop RAC on your Windows PC, why not just deploy Churchill ‘proper’ and have three virtual machines running CentOS? It’s a RAC, and it’s still “on” Windows, isn’t it?!

Well, of course, that wasn’t quite the point my correspondents were making. They wanted a desktop RAC running on top of purely Windows operating systems. They aren’t Linux users, and they’re not interested in working at a command line. Could I please oblige?

Again, I’ve always said no, because Windows costs lots of money. It’s easy to build a 3-node or a 6-node setup in Linux, because you aren’t paying $1000 a pop every time you install your operating system! It seemed to me that RAC-on-Windows was a nice idea (I had it working back in 2001 with 9i on Windows 2000 after all), but it wasn’t very practical as a learning platform.

Happily for my correspondents, I’ve now changed my view in that regard. All the Windows-based would-be DBAs of my acquaintance are working for companies that supply them with MSDN subscriptions. And Microsoft’s Technet evaluation options allow even people with no MSDN access to download and use Windows Server 2012 and beyond for free, for at least 6 months.

So I’ve given in. There’s now available a new article for doing Desktop RAC using nothing but Windows. It bears a passing resemblance to ‘proper’ Churchill: there are three servers to build, with one acting as the supplier of shared storage and needed network services to the others. There’s even the use of iSCSI to provide the virtual shared storage layer. But it’s about as non-Churchill as it gets, really, because everything is hand-built… which explains the enormous number of screenshots and the overall length of the article!

Hypothetical

Suppose that about six weeks ago you, as a proactive kind of DBA, had noticed that your 2TB database was running at about 80% disk usage and had accordingly asked the SysAdmin to provision an additional 2TB slice of the SAN so that you could then add a new ASM disk to your database.

Imagine that the SysAdmin had provisioned as requested, and you as the DBA had applied the change in the form of adding a new ASM disk to your production instance -and that, in consequence, you’d been running at a much healthier 50% disk usage ever since. You’d probably feel pretty good at having been so proactive and helpful in avoiding space problems, right?

Suppose that weeks pass and it is now late October…

Now imagine that for some reason or other that made sense at the time, you kick off a new metadata-only Data Pump export which, workplace distractions being commonplace, you lose sight of, until 6 hours after you started it, you get told there’s a Sev 1 because the non-ASM, standard file system to which your data_pump_dir points has hit 100% usage and there’s no more free space. Foolish DBA!

But no matter, right? You just kill the export job, clear up the relevant hard disk… suddenly the OS is happy there’s space once more on its main hard disk.

But pile up the hypotheticals: the OS reports itself happy, but suppose you nevertheless discover that as a result of the space problems caused by your export, none of the nodeapps are listed as running on Node 1 and any attempt to start them with srvctl on that node ends with an error message to the effect that it can’t contact the OHASD/CRSD on that node.

Suppose GV$INSTANCE still returns a count of 2: Node 1 is therefore still up, but no-one can connect to it, because no services are able to run on it. Basically, your Node 1 has kind-of left the building and the only possibility of getting it back, you might reasonably think, would be a whole node reboot. Thank God Node 2 is still up and has no difficulty working alone for a few hours! It’s good enough to cope with the rest of the day’s workload anyway.

So, in this hypothetical house of horrors, suppose that you arrange a scheduled outage in which you will reboot Node 1 and wait for it to come back up as a fully-fledged cluster member once more. It should only be a matter of moments before Node 1 is back to its normal happy state, noticing that the non-ASM disk has loads of space once more, right?

Only, imagine that it doesn’t. Imagine instead that it takes at least 10 minutes to restart and, in fact, it’s response-less at that point and looking like it might take another 10 minutes more. Imagine, indeed, that after another 10 minutes on top of that lot, maybe you look at the ASM alert log for Node 1 and find these entries:

ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "1" is missing from group number "2"

At this point, hypothetically… you might start adding 2 and 2 together and getting an approximation of 4: for you would know that disk 1 is the new 2TB one that you added to the database way back in September.

But why would that new disk, which has been in daily and heavy use ever since, be posing a problem now, rather than before now? You might start idly wondering whether, potentially, when it was provisioned, it was provisioned incorrectly somehow. This being the first reboot since that time, tonight (for it is now past midnight) is maybe the first opportunity which that mis-provisioning has had a chance to reveal itself?

You might at this point very well make a big mental note: on no account reboot node 2, because if it loses the ability to read ASM disks too, the entire primary site will have been destroyed.

It would make for an interesting night, wouldn’t it? Especially if the SysAdmin who did the disk provisioning back in September was no longer available for consultation because he was on paternity leave. In New Zealand.

What might you as the DBA do about this state of affairs? Apart from panic, I mean?!

Well, first I think you might very well get your manager to call the SysAdmin and get him off paternity leave in a hurry -and he might take a quick look over the disks and confirm that he’d partitioned the disk back in September to start from cylinder 0… which is, er… a big no-no.

It is, in fact, perhaps the biggest no-no you can do when provisioning disk space for Oracle ASM. This is because doing so means your physical partition table starts at cylinder 0… but, unfortunately, Oracle’s ASM-specific information gets written at the beginning of the disk you give it, so it over-writes the partition table information with its own ASM-specific data. When ASM data replaces disk partition data… you don’t have any disk partitions anymore. Though you won’t know about it yet, because the disk partition information was read into memory at the time the disk was added and has thus been readable ever since.

To stop that happening, you’re supposed to make sure you start your partitions at something other than cylinder 0. Then Solaris can write partition info literally at cylinder 0, and Oracle’s ASM data can start… NOT at cylinder 0!
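A check along those lines is easy enough to sketch. What follows is only an illustration, under assumptions of my own: that you have already pulled each slice's first sector out of whatever tool you use to print the disk label, and that slice 2 is the conventional whole-disk slice (which is expected to start at 0 and is therefore ignored). The function name is hypothetical.

```python
# Assumption: slice 2 is the conventional whole-disk slice on the label,
# so it legitimately starts at sector 0 and is not reported.
WHOLE_DISK_SLICE = 2


def slices_starting_at_zero(first_sectors):
    """Return the slice numbers that start at sector 0 (i.e. at cylinder 0).

    first_sectors maps slice number -> first sector of that slice.
    The whole-disk slice is skipped.
    """
    return sorted(
        slice_no
        for slice_no, first in first_sectors.items()
        if first == 0 and slice_no != WHOLE_DISK_SLICE
    )
```

Anything this returns is a slice you would not want to hand to ASM as it stands.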

Apparently, the only operating system that even allows you to add cylinder-0-partitioned disks is Solaris: Oracle on other operating systems spots the potential for disaster and prevents you from adding it in the first place. Tough luck if, in this hypothetical situation, you’re stuck using Solaris, then!

Until you try and re-mount a disk after a reboot, you don’t know the partition table has been destroyed by Oracle’s ASM shenanigans. The partition information is in memory and the operating system is therefore happy. You can run like this forever… until you reboot the affected server, at which point the ASM over-write of the disk partition information proves fatal.

The second thing you might do is raise a severity 1 SR with Oracle to see if there’s any possible way of fixing the partition table on this disk without destroying its ASM-ness. However, Oracle support being what it is, chances are good that they will simply hum-and-haw and make dark noises about checking your backups. (Have you ever restored a 2TB database from tape? I imagine it might take one or two days…or weeks…)

So then you might start thinking: we have a Data Guard set up. Let’s declare a disaster, switch over to the secondary site, and thus free up the primary’s disks for being re-partitioned correctly. And at this point, hypothetically of course, you might then realise that when we added a disk to the ASM groups back in September on primary… er… we probably also did exactly the same on the standby!

This means (or would mean, because this is just hypothetical, right?!) that our disaster recovery site would be just as vulnerable to an inadvertent reboot or power outage as our primary is. And then you’d probably get the sysadmin who’s been contacted by phone to check the standby site and confirm your worst suspicions: the standby site is vulnerable.

At this point, you would have a single primary node running, provided it didn’t reboot for any reason. And a Data Guard site running, so long as it didn’t need to reboot. That warm glow of ‘my data is protected’ you would have been feeling about 12 hours ago would have long since disappeared.

Hypothetically speaking, you’ve just broken your primary and the disaster recovery site you were relying on to get you out of that fix is itself one power failure away from total loss. In which case, your multi-terabyte database that runs the entire city’s public transport system would cease to exist, for at least several days whilst a restore from tape took place.

If only they had decided to use ‘normal redundancy’ on their ASM disk groups! For then you would be able to drop the bad disk forcibly and know that other copies of data stored on the remaining good disks would suffice. But alas, they (hypothetically) had adopted external redundancy, for it runs on a SAN and SANs never go wrong…

At this point, you’ve been up in the wee small hours of the night for over 12 hours, but you might nevertheless come up with a cunning plan: use the fact that node 2 is still up (just!) and get it to add a new, good disk to the disk group and re-balance. The data is distributed off the badly-configured disk onto the new one (which you’ve made triply sure was not partitioned at cylinder 0!)

You could then drop the badly-configured disk, using standard ASM ‘drop disk’ commands. The data would then be moved off the bad disks onto the good ones. You could then remove the bad disk from the ASM array and your Data Guard site would, at least, be protected from complete failure once more.
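The order of operations matters more than the commands themselves, so here is a small sketch that simply prints the statements in the sequence just described, for review before anything gets near SQL*Plus. The disk group name, device path, ASM disk name and rebalance power are all hypothetical placeholders of mine, not values from the incident.

```python
def asm_disk_swap_plan(diskgroup, new_disk_path, bad_disk_name, power=4):
    """Return the SQL statements, in order, for add / rebalance / drop / rebalance."""
    # Re-run this query until it returns no rows: no rows means the
    # rebalance for the preceding step has finished.
    check = "SELECT operation, state, est_minutes FROM v$asm_operation;"
    return [
        f"ALTER DISKGROUP {diskgroup} ADD DISK '{new_disk_path}' REBALANCE POWER {power};",
        check,
        f"ALTER DISKGROUP {diskgroup} DROP DISK {bad_disk_name} REBALANCE POWER {power};",
        check,
    ]


# Hypothetical names, purely for illustration.
for statement in asm_disk_swap_plan("DATA", "/dev/rdsk/c9t9d9s6", "DATA_0001"):
    print(statement)
```

The statements would be run from the ASM instance on the node that is still up, which is the whole point of the plan.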

Of course, Oracle support might tell you that it won’t work, because you can’t drop a disk from a disk group with external redundancy… because they seem to have forgotten that the second node is still running. And you’ve certainly never tried this before, so you’re basically testing really critical stuff out on your production standby site first. But what choice do you have, realistically?!

So, hypothetically of course, you’d do it. You’d add a disk, wait for a rebalance to complete (and notice that ASM’s ability to predict when a rebalance operation is finished is pretty hopeless: if it tells you 25 minutes, it means an hour and a half). And then you’d drop a disk and wait for a rebalance to complete. And then you’d reboot one of the Data Guard nodes… and when it failed to come back up, you might slump in the Slough of Despond and declare failure. Managers being by this time very supportive, they might propose that we abandon in-house efforts to achieve a fix, and call in Oracle technical staff for on-site help. And that decision having been taken in an emergency meeting, you might idly re-glance at your Data Guard site and discover that not only is +ASM1 instance up and running after all, but so is the database instance #1. It’s actually all come up fine, but you had lacked the patience to wait for it to sort itself out and had declared failure prematurely. Impatient DBA!

Flushed with the (eventual) success of getting the Data Guard site running on all-known-good-disks, you might want to hurry up and get the primary site repaired in like manner. Only this is a production environment under heavy change management control, so you’ll likely be told it can only be fiddled with at 11pm. So you would be looking at having worked 45 hours non-stop before the fix is in.

Nevertheless, hypothetically, you might manage to stay up until 11pm, perform the same add/rebalance/drop/rebalance/reboot trick on the primary’s node 2… and, at around 3am, discover yourself the proud owner of a fully-functioning 2-node RAC cluster once again.

(The point being here that Node 2 on the primary was never rebooted, though that reboot had been scheduled to happen and the SysAdmin sometimes reboots both nodes at the same time, to ‘speed things up’ a bit! Had it been rebooted, it too would have failed to come back up and the entire primary site would have been lost, requiring a failover from the now-re-protected standby. But since Node 2 is still available, it can still do ASM re-structuring, using the ‘add-good-disk; rebalance; drop bad-disk; rebalance’ technique.)

There might be a little bit of pride at having been able to calmly and methodically work out a solution to a problem that seemed initially intractable. A bit of pleasure that you managed to save a database from having to be restored from tape (with an associated outage measured in days that would have cost the company millions). There might even be a bit of relief that it wasn’t you letting an export consume too much disk space that was the root cause, but a sysadmin partitioning a disk incorrectly weeks ago.

It would make for an interesting couple of days, I think. If it was not, obviously and entirely, hypothetical. Wouldn’t it??!

Asquith 1.09

It’s been a long time coming, but I’ve just released a new version of Asquith which now supports installing 11.2.0.4 standalone and clustered databases. Previously, in the 11g product range, it only supported 11.2.0.1 and 11.2.0.3 versions. It still supports 12c, too, of course.

The only change in behaviour over the previous version is that you supply an ORAVER=11204 bootstrap parameter when booting a member server (having previously copied the 11.2.0.4 installation media to your Asquith server, of course).

It will take a while to update various pages/articles to reflect the new ORAVER option, but hopefully by the end of the weekend I’ll have it all done.

Note that Salisbury doesn’t get this update: Asquith and Salisbury parted ways some time ago.

Asquith and the new Red Hat

Whilst I was busy planning my Paris perambulations, Red Hat went and released version 6.5 of their Enterprise Server distro. Oracle swiftly followed …and, even more remarkably, CentOS managed to be almost as swift, releasing their 6.5 version on December 1st. Scientific Linux has not yet joined this particular party, but I assume it won’t be long before they do.

I had also assumed Asquith would work unchanged with the new distro -but I hadn’t banked on the clumsy way I originally determined the distro version number which actually meant it all fell into a nasty heap of broken. Happily, it only took a minute or so to work out which bit of my crumbly code was responsible and that’s now been fixed.

Asquith therefore has been bumped to a new 1.07 release, making it entirely compatible with any 6.5 Red Hat-a-like distro (and any future 6.x releases, come to that).

Another feature of this release is that the ‘speedkeys’ parameters have been altered so that they assume the use of version 6.5 of the relevant distro. That is, if you build your RAC nodes by using a bootstrap line that reads something like ks=http://192.168.8.250/kickstart.php?sk=1…, then you’ll be assumed to be using a 6.5 distro and the source OS for the new server will be assumed to reside in a <distro>/65 directory.

If you want to continue using 6.4 or 6.3 versions, of course, you can still spell that out (ks=http://192.168.8.250/kickstart.php?distro=centos&version=63…). You just can’t use speedkeys to do it.

An equivalent update to Salisbury has also just been released.

Here’s a curious thing

My 2-node production RAC had been suffering from ‘checkpoint incomplete’ messages in the alert log for a while, so back at the end of October, I finally got off my bottom and bothered to take a look: only to discover the beast had been created with just 2 logs per thread and each of only 50MB.

My ‘standard, do it without even thinking’ approach to online logs has long been: 4 logs per thread minimum, each at least 500MB in size.

So, this database was under-specc’d by quite a long way. No problem: it is easy enough to issue alter database add logfile thread 1 '/blah/blah/log4a.rdo' size 500M; several times until the requisite number of logs of the right size has been created. Problem solved.

Now, this 2-node RAC happens to be the Primary database in a Primary-Standby Active Data Guard setup. I did idly wonder whether the creation of the 500MB online logs would automatically happen over on the standby site, especially since we long ago issued the command alter system set standby_file_management=auto, but since “online” logs are never used on a genuinely standby database, it didn’t seem important to check it out one way or another. (I should clarify that redo generated by a primary is shipped to an Active Data Guard standby by LGWR and stored in standby redo logs, from where they are read by the managed recovery process, and out of which archived redo logs are thus generated. So standby logs are definitely used at an active data guard standby database, but not the “online” logs… they are there for when disaster strikes and the standby needs to become the new primary).

So, anyway: long story short, I increased the size of the primary’s online logs and didn’t bother to check what had happened over on the standby. Redo continued to flow from the primary to the standby, and a check of the latency of redo transmission showed that all was well (the standby never lagged the primary by more than 12 seconds). All’s well that ends well, I guess.

Except that, one day, for no real reason, I did this:

SQL> archive log list
Database log mode Archive Mode
Automatic archival Enabled
Archive destination +FRA
Oldest online log sequence 23445
Next log sequence to archive 23448
Current log sequence 23448

That’s on the primary node 1. And just for the hell of it, I did the same thing on the standby node 1:

SQL> archive log list
Database log mode Archive Mode
Automatic archival Enabled
Archive destination +FRA
Oldest online log sequence 23244
Next log sequence to archive 0
Current log sequence 23245

And that’s an apparent discrepancy of around 200 archive logs! This worried me, so I checked the alert log of the standby:

Thu Nov 07 11:39:23 2013
RFS[2]: Selected log 10 for thread 1 sequence 23448 dbid -2003148368 branch 798942256
Thu Nov 07 11:39:23 2013
Media Recovery Waiting for thread 1 sequence 23448 (in transit)
Recovery of Online Redo Log: Thread 1 Group 10 Seq 23448 Reading mem 0

…which showed that the standby was actually processing redo from time 23448 or so, which is exactly the ‘time’ being displayed by the archive log list command when run on the primary node 1. So the alert log was saying “no discrepancy”, but the SQL*Plus archive log list command was saying “200 logs out of whack!”.
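For what it’s worth, the size of the apparent gap is just the difference between the two ‘Current log sequence’ figures. A minimal sketch of that arithmetic, using the numbers reported by the two archive log list runs (the function names are my own):

```python
import re


def current_sequence(archive_log_list_output):
    """Extract the number from the 'Current log sequence' line."""
    match = re.search(r"Current log sequence\s+(\d+)", archive_log_list_output)
    if match is None:
        raise ValueError("no 'Current log sequence' line found")
    return int(match.group(1))


def apparent_gap(primary_output, standby_output):
    """How far behind the standby *claims* to be, in log sequences."""
    return current_sequence(primary_output) - current_sequence(standby_output)


primary = """Oldest online log sequence 23445
Next log sequence to archive 23448
Current log sequence 23448"""

standby = """Oldest online log sequence 23244
Next log sequence to archive 0
Current log sequence 23245"""

print(apparent_gap(primary, standby))  # 203
```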

Puzzled, I dug a little deeper:

SQL> select to_char(first_time,'DD-MON-YYYY HH24:MI')
 2 from v$archived_log where sequence#=23245 and thread#=1;

TO_CHAR(FIRST_TIM
-----------------
25-OCT-2013 10:11

The standby’s response to the archive log list command showed that it thought log 23245 was the last one applied. This query shows that specific log to have been created a couple of weeks ago, on 25th October at 10:11AM. So what happened around then that apparently stalled the increment of the redo log sequence number? Well, here’s primary node 1’s alert log for the relevant time:

Fri Oct 25 10:17:25 2013
alter database add logfile '+DATA/proddb/log1a.rdo' size 500m
Completed: alter database add logfile '+DATA/proddb/log1a.rdo' size 500m

…and that’s me resizing the primary’s online redo logs, at about 10:17AM on 25th October!

Personally, I think this is a bug. The standby was always receiving the latest redo, into its standby logs, as designed. Yet the SQL*Plus command was returning incorrect data, apparently flummoxed by the size discrepancy in the online logs between the primary and standby sites.

But whether it’s a bug or not, the fix-up suggested itself: take the standby out of managed recovery mode (alter database recover managed standby database cancel), switch to manual file management (alter system set standby_file_management=manual) and then add new online redo logs of the right size and drop the originals. When all log groups are of 500MB, simply reverse the process: file management becomes auto once more and recovery of the standby is re-commenced. Net result: output from archive log list immediately ‘catches up’ and starts displaying exactly the log sequence numbers that the primary reports.
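Spelled out as an ordered list of statements, that fix-up looks something like the sketch below. Treat it as an outline rather than a script to paste: the group numbers and the '+DATA' destination are hypothetical, and the final statement assumes you restart recovery with real-time apply, which is my assumption rather than something stated above.

```python
def standby_redo_resize_steps(new_groups, old_groups, destination="+DATA", size_mb=500):
    """Return the standby-side statements, in order.

    new_groups: (thread, group) pairs for the replacement online logs.
    old_groups: group numbers of the undersized logs to drop.
    """
    steps = [
        "ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;",
        "ALTER SYSTEM SET standby_file_management=MANUAL;",
    ]
    steps += [
        f"ALTER DATABASE ADD LOGFILE THREAD {thread} GROUP {group} "
        f"('{destination}') SIZE {size_mb}M;"
        for thread, group in new_groups
    ]
    steps += [f"ALTER DATABASE DROP LOGFILE GROUP {group};" for group in old_groups]
    steps += [
        "ALTER SYSTEM SET standby_file_management=AUTO;",
        # Assumption: recovery is restarted with real-time apply.
        "ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;",
    ]
    return steps


# Hypothetical group numbers, for illustration only.
for statement in standby_redo_resize_steps([(1, 11), (1, 12)], [1, 2]):
    print(statement)
```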

Anyway, and happily, I don’t generally go around resizing redo logs more than once in the lifetime of a database. Just be aware that SQL*Plus gets a bit upset if you do and neglect to do it equivalently on both sides of your Data Guard setup.

Archibald Primrose, cut-throat, thief and leader of the infamous Slethwick Street gang of nineteenth century East London pick-pockets was…

Er, no.

Sorry… wrong notes. That’s actually Archibald Primrose, 5th Earl of Rosebery, sometime Prime Minister of the United Kingdom of Great Britain and Ireland (as it was back then).

An easy mistake to make, I rather think, all the same.

Anyway, “Slasher” Rosebery makes it to these pages because his name is associated with the secondary storage server a Data Guard environment will need to use. In the language of this blog, Rosebery is to Asquith what Balfour is to Salisbury: the secondary server in an Active Data Guard configuration using ASM via iSCSI shares. A new article on how to build one has just gone up.

Introducing Asquith

In life, Herbert Henry Asquith was prime minister of the United Kingdom from 1908 to 1916.

In the context of this blog, however, his is the name that will be attached to a new way of auto-building Oracle servers, of the standalone, RAC and RAC+Data Guard variety.

Salisbury, of course, has been doing that job for several months now, so why the need for Asquith? Well… Salisbury works fine… but is maybe not very realistic, in the sense that Salisbury’s use of NFS for shared storage has put some people off. So Asquith is effectively the same as Salisbury -except that he uses ASM for his shared storage, not NFS.

In my view, that perhaps makes him a little more ‘realistic’ than the Salisbury approach, but definitely results in a more useful learning environment (because now you can get to play with the delights of ASM disk groups and so forth, which is an important part of managing many production environments these days).

1. Asquith v. Salisbury

Other than his choice of storage, however, Asquith is pretty much identical to Salisbury: an Asquith server, just like a Salisbury server, provides NTP, DNS and other network services to the ‘client servers’, which can be standalone Oracle servers, part of a multi-node RAC or even part of a multi-node, multi-site Data Guard setup. If you’re doing RAC, the shared storage needed by each RAC node is provided by Asquith acting as an iSCSI target. The clients act in their turn as iSCSI initiators.

The only other significant difference between Salisbury and Asquith is that Asquith never auto-builds a database for you, not even in standalone mode. I figured that if you’re going to go to the trouble of using ASM, you’re doing ‘advanced stuff’, and don’t need databases auto-created for you. If automatic-everything is what you’re after, therefore, stick to using Salisbury. For this reason, too, Asquith does not provide an auto-start script for databases: since it uses ASM, it’s assumed you’ll install Oracle’s Grid software -and that provides the Oracle Restart utility which automates database restarts anyway. A home-brew script is therefore neither needed nor desirable.

All-in-all, Asquith is so similar to Salisbury that I’ve decided that the first release of Asquith should be called version 1.04, because that’s the release number of the current version of Salisbury. They will continue to be kept in lock-step for all future releases.

And this hopefully also makes it clear that Asquith doesn’t make Salisbury redundant: both will continue to be developed and updated, and each complements the other. It’s simply a question of which shared storage technology you prefer to use. If you like the simplicity of NFS and traditional-looking file systems, use Salisbury. If you want to learn and get familiar with ASM technology, then use Asquith. Each has its place, in other words, and both are useful.

2. Building an Asquith Server

In true Salisbury fashion, the job of building the Asquith server itself is completely automated, apart from you pointing to the asquith.ks kickstart file when first building it.

Your Asquith server can run OEL 6.x, Scientific Linux 6.x or CentOS 6.x -where x is either 3 or 4. In all cases, only 64-bit OSes are allowed. The Oracle versions it supports, like Salisbury, are 11.2.0.1, 11.2.0.3 or 12.1.0.1. The Asquith server needs a minimum of 60GB disk space, 512MB RAM, one network card and two DVD drives. The O/S installation disk goes in the first one; the Asquith ISO goes in the second.

The server is built by hitting <Tab> when the installation menu appears, and typing this on the bootstrap line:

ks=hd:sr1/asquith.ks

Once built, you need to copy your Oracle software to the /var/www/html directory of the new Asquith server, using file names of a specific and precise format. Depending on which version you intend to install on other client servers, you need to end up with files called:

  • oradb-11201-1of2.zip
  • oradb-11201-2of2.zip
  • oragrid-11201.zip
  • oradb-11203-1of2.zip
  • oradb-11203-2of2.zip
  • oragrid-11203.zip
  • oradb-12101-1of2.zip
  • oradb-12101-2of2.zip
  • oragrid-12101-1of2.zip
  • oragrid-12101-2of2.zip

You can, of course, have all 10 files present in the same /var/www/html directory if you intend to build a variety of Oracle servers running assorted different Oracle versions.
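Since the names have to be exact, a quick check of what’s actually sitting in the directory can save a failed build. This is just a sketch of such a check: the file names come from the list above, but the helper itself is hypothetical and is not part of Asquith.

```python
# Expected media file names per Oracle version, as listed above.
MEDIA_FILES = {
    "11201": ["oradb-11201-1of2.zip", "oradb-11201-2of2.zip", "oragrid-11201.zip"],
    "11203": ["oradb-11203-1of2.zip", "oradb-11203-2of2.zip", "oragrid-11203.zip"],
    "12101": [
        "oradb-12101-1of2.zip",
        "oradb-12101-2of2.zip",
        "oragrid-12101-1of2.zip",
        "oragrid-12101-2of2.zip",
    ],
}


def missing_media(oraver, present):
    """Return the expected file names for oraver that are not in `present`."""
    have = set(present)
    return [name for name in MEDIA_FILES[oraver] if name not in have]
```

You might feed it the result of os.listdir("/var/www/html") on the Asquith server; an empty list back means that version is ready to be installed on clients.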

You can additionally (but entirely optionally) copy extra O/S installation media to the /var/www/html directory if you want future ‘client’ servers to use an O/S different to that used to build Asquith itself. Asquith automatically copies its own installation media to the correct sub-directories off that /var/www/html folder -so if you used CentOS 6.4 to build Asquith, you’ll already have a /var/www/html/centos/64 directory from which clients can pull their installation media. You would need to copy the DVD1 installation media for OEL and Scientific Linux to corresponding “oel/xx” and “sl/xx” sub-directories if you wanted to use all three Red Hat clones for the ‘client’ servers (where ‘xx’ can be either 63 or 64).

3. Building Asquith Clients

When building Asquith clients, you need to boot them with appropriate, locally-attached installation media. The netinstall disks for each distro are suitable, for example. The distro/version you boot with will be the distro/version your Asquith client will end up running. You cannot, for example, boot with a Scientific Linux netinstall disk, point it at Asquith and hope to complete a CentOS kickstart installation. As a consequence, what you boot your clients with must match something you’ve already copied to Asquith in full. If you boot a client with an OEL 6.4 netinstall disk, the DVD 1 media for Oracle Enterprise Linux 6.4 must already have been copied to Asquith’s own /var/www/html/oel/64 directory, in other words.
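The mapping from ‘what I booted with’ to ‘where Asquith must already hold the matching media’ is mechanical, so here it is as a tiny sketch. The function name is mine, and the allowed values are simply the distros and versions named in this article.

```python
WEB_ROOT = "/var/www/html"
DISTROS = {"centos", "oel", "sl"}
VERSIONS = {"63", "64"}


def media_dir(distro, version):
    """Directory on the Asquith server that must hold the DVD 1 media."""
    if distro not in DISTROS:
        raise ValueError(f"unknown distro: {distro}")
    if version not in VERSIONS:
        raise ValueError(f"unsupported version: {version}")
    return f"{WEB_ROOT}/{distro}/{version}"


print(media_dir("oel", "64"))  # /var/www/html/oel/64
```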

4. Asquith Bootstrap Parameters

You build an Asquith client by again pressing <Tab> on the boot menu at initial startup and then passing various parameters to the bootstrap line that’s then revealed. All bootstrap lines must start:

ks=http://192.168.8.250/kickstart.php?

You then add additional parameters as follows:

Parameter Compulsory? Possible Values (case sensitive)
distro Yes centos, oel or sl
version Yes 63 or 64
hostname No any valid name for the server being built
domain No any valid domain name of which the server is a part
rac No Is this server to be part of a RAC? If so, it will find its shared storage on the Asquith server. If not, no shared storage will be configured (any future database would be stored on the local server’s disk).
ip No IP of the server (the public IP if a RAC)
ic No IP of the server’s interconnect (if it’s to be part of a RAC)
dg No Is this server to be part of a Data Guard site? If so, it will find its shared storage on a Rosebery server, not on Asquith.

The parameters can come in any order, separated by ampersands (i.e., by the & character), and there must be no spaces between them. For example:

ks=http://192.168.8.250/kickstart.php?distro=centos&version=64&hostname=my_racnode&domain=dizwell.com&rac=y&ip=16.25.34.23&ic=10.0.0.2

(That example might wrap here, but is in fact typed continuously, without any line breaks or spaces).

Note that “rac=” and “dg=” are mutually exclusive. One causes the built server to use Asquith as its source of shared storage; the other directs the server to use Rosebery for its shared storage (I’ll talk more about Rosebery in Section 7 below). If your Data Guard servers are themselves to be part of a cluster, therefore, you just say “dg=y”, not “rac=y&dg=y”.

After you construct an appropriate bootstrap line, you must additionally add three space-separated Kickstart constants, as follows:

Constant Compulsory? Possible Values
ksdevice= No eth0, eth1 or any other valid name for a network interface
oraver= Yes 11201, 11203, 12101 or NONE
filecopy= No y or n

ksdevice and filecopy are only relevant if you’re building a RAC: a RAC node must have two network cards, and you use ksdevice to say which of them should be used for installation purposes. The usual answer is eth0. If you miss this constant off, the O/S installer itself will prompt you for the answer, so you only need to supply one now if you want a fully-automated O/S install.

The second node of a RAC needs to have paths and environment variables set up in anticipation of Oracle software being ‘pushed’ to it from the primary node -but it itself doesn’t need a direct copy of the Oracle installation software. Hence ‘filecopy=n’ will suppress the copying of the oradb…zip files from Asquith to the node. If you miss this constant off, ‘y’ is assumed, which will mean about 4GB of disk space may be consumed unnecessarily. It’s not the end of the world if it happens, though.

The oraver constant is required, though. It lets the server build process create appropriate environment variables and directories, suitable for running Oracle eventually. You can only specify 11201, 11203 or 12101 depending on which version of Oracle you intend, ultimately, to run on the new server. If you don’t ever intend to run Oracle on your new server, you can say “oraver=none”, and after a basic O/S install, nothing else will be configured on the new server.

A complete bootstrap line, suitable for the first node of an intended 2-node RAC, might therefore look like this:

ks=http://192.168.8.250/kickstart.php?distro=centos&version=64&hostname=my_racnode1&domain=dizwell.com&rac=y&ip=16.25.34.21&ic=10.0.0.1 oraver=12101 filecopy=y ksdevice=eth0

Notice there are spaces between the three constants, and between them and the original part of the bootstrap line. Here’s another example, this time for the second node of a Data Guard RAC:

ks=http://192.168.8.250/kickstart.php?distro=oel&version=63&hostname=my_dgnode2&domain=dizwell.com&dg=y&ip=16.25.34.26&ic=10.0.0.6 oraver=12101 filecopy=n ksdevice=eth0
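
If you’d rather generate these lines than type them, here is a minimal Python sketch of how one might be assembled. The function name bootstrap_line and its arguments are my own inventions for illustration and are not part of Asquith; the server address and the parameter names are simply lifted from the two examples above.

```python
from urllib.parse import urlencode

# Values the oraver constant accepts, per the table of constants.
VALID_ORAVER = {"11201", "11203", "12101", "none"}


def bootstrap_line(params, oraver, filecopy=None, ksdevice=None,
                   server="192.168.8.250"):
    """Assemble a bootstrap line (hypothetical helper, not part of Asquith)."""
    if oraver.lower() not in VALID_ORAVER:
        raise ValueError(f"oraver must be one of {sorted(VALID_ORAVER)}")
    if filecopy is not None and filecopy not in ("y", "n"):
        raise ValueError("filecopy must be 'y' or 'n'")
    # Normal parameters live inside the URL, joined by ampersands...
    line = f"ks=http://{server}/kickstart.php?{urlencode(params)}"
    # ...whereas the constants follow the URL, each separated by a space.
    line += f" oraver={oraver}"
    if filecopy is not None:
        line += f" filecopy={filecopy}"
    if ksdevice is not None:
        line += f" ksdevice={ksdevice}"
    return line


if __name__ == "__main__":
    node1 = {"distro": "centos", "version": "64", "hostname": "my_racnode1",
             "domain": "dizwell.com", "rac": "y", "ip": "16.25.34.21",
             "ic": "10.0.0.1"}
    print(bootstrap_line(node1, "12101", filecopy="y", ksdevice="eth0"))
```

Leaving filecopy or ksdevice as None simply omits that constant, which mirrors the fact that only oraver is compulsory.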

5. Asquith Speed Keys

It’s not really that much typing when you come to do it, but if you want to make things even quicker, there are four ‘speed keys’ available to you:

Speed Key Effect
sk=1 The server will be called alpher.dizwell.home, with IP 192.168.8.101 and Interconnect IP of 10.0.0.101. It will run as the first node of a RAC and is configured to look to Asquith as its shared storage source.
sk=2 The server will be called bethe.dizwell.home, with IP 192.168.8.102 and Interconnect IP of 10.0.0.102. It will run as the second node of a RAC and is configured to look to Asquith as its shared storage source.
sk=3 The server will be called gamow.dizwell.home, with IP 192.168.8.103 and Interconnect IP of 10.0.0.103. It will run as the first node of a RAC but is configured to look to Rosebery as its shared storage source.
sk=4 The server will be called dalton.dizwell.home, with IP 192.168.8.104 and Interconnect IP of 10.0.0.104. It will run as the second node of a RAC and is configured to look to Rosebery as its shared storage source.

If you want to use one of these speed keys, your bootstrap line becomes:

ks=http://192.168.8.250/kickstart.php?sk=2 oraver=11203 filecopy=n ksdevice=eth0

Note that you can still supply all three Kickstart constants, but at least you don’t have to supply any of the normal parameters. In fact, only the oraver constant is compulsory, so the line could be even shorter to type, if you’d prefer.
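
The speed-key table is regular enough to be captured in a few lines. The following Python sketch is purely illustrative: the names SPEED_KEYS and speed_key_details are hypothetical, and this is not a description of how Asquith implements speed keys internally. It only reproduces the host names, addresses and storage sources listed in the table above.

```python
# Host names for each speed key, as listed in the speed-key table.
SPEED_KEYS = {1: "alpher", 2: "bethe", 3: "gamow", 4: "dalton"}


def speed_key_details(sk):
    """Return the settings a given speed key implies (illustrative only)."""
    host = SPEED_KEYS[sk]  # raises KeyError for anything other than 1-4
    return {
        "hostname": f"{host}.dizwell.home",
        # Public and interconnect addresses both end in 100 + speed key.
        "ip": f"192.168.8.{100 + sk}",
        "interconnect": f"10.0.0.{100 + sk}",
        # Keys 1 and 2 use Asquith for storage; 3 and 4 use Rosebery.
        "storage": "Asquith" if sk in (1, 2) else "Rosebery",
    }


if __name__ == "__main__":
    for key in sorted(SPEED_KEYS):
        print(key, speed_key_details(key))
```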

6. Creating Databases and Clusters

All Asquith client servers end up being created with a root user, whose password is dizwell, and an oracle user, whose password is oracle. Use the operating system’s own passwd command to alter those after the O/S installation is complete, if you like.

All Asquith client servers are also built with an appropriate set of Oracle software (if requested), stored in the /osource directory. Grid/Clusterware will be in the /osource/grid directory and the main Oracle RDBMS software will be in the /osource/database directory. Your job is therefore simply to launch the relevant installer, like so:

/osource/grid/runInstaller
/osource/database/runInstaller

If you don’t want to run a RAC or use ASM, just pretend the grid software’s not there! If you do, standard operating procedures apply:

  • Run the /osource/grid/runInstaller
  • Do an advanced installation
  • Select to use ASM, keep the default DATA diskgroup name
  • Change the Disk Discovery Path to be /dev/asm*
  • Use External redundancy levels (at this stage, Asquith doesn’t do redundancy)
  • Click ‘Ignore All’ if any ‘issues’ are discovered
  • Run the root scripts on the various nodes when prompted

Once the Clusterware is installed, you can install the database in the usual way:

  • Run /osource/database/runInstaller
  • Do a typical installation
  • Select to use Automatic Storage Management (the DATA disk group should be automatically available)
  • Supply passwords where appropriate
  • Ignore any prerequisite failures
  • Run the root script when prompted

It’s all pretty painless, really, which is precisely the point!

7. Rosebery

Just as a Salisbury server is accompanied by a Balfour server when building a Data Guard environment, so Asquith has his Rosebery (Archibald Primrose, 5th Earl of Rosebery, Prime Minister of Great Britain 1894-1895). A Rosebery server is built in the same way as an Asquith server (that is, 60GB hard disk minimum; 512MB RAM minimum, 1 NIC), but doesn’t need a second DVD drive from which to find its kickstart file: for that, you simply point it at Asquith.

The bootstrap line to build a Rosebery server is thus:

ks=http://192.168.8.250/rosebery.ks

After that, the Rosebery server builds automatically. It then provides a new iSCSI target for client servers built with the dg=y parameter in their bootstrap lines to connect to. In short, Rosebery provides shared storage to clients, just as Asquith does, and therefore provides a secondary, independent storage sub-system for Data Guard clients to make use of.

8. Conclusion

Asquith (and Rosebery) provide a conveniently-built infrastructure in which Standalone, RAC and Data Guard Oracle servers can be constructed with ease. They automate away a lot of the network and storage “magic” that is usually the preserve of the professional Systems Administrator, leaving the would-be Oracle Database Administrator to concentrate on actual databases! By employing ASM as its shared storage technology, Asquith/Rosebery allow the DBA to explore and learn an important aspect of modern Oracle database management.

I’ll be putting up a section of the site for Asquith to match the one that already exists for Salisbury. Until then, the only place for any Asquith documentation (and the only link to download the all-important Asquith ISO) is this article itself.

Oracle 12c and Salisbury

Version 1.04 of Salisbury is now available. It contains two key enhancements over the previous version: (1) it automatically initialises hard disks, even when they contain no previous partition information; and (2) it works to create standalone, RAC or RAC+Data Guard Oracle Version 12c setups.

I have not updated the Salisbury home page yet, though, to link to the new release (this post is the only place to do so at the moment). That’s because I have yet to update all the associated articles to reflect a bit of syntax-tweaking I’ve had to introduce. Once I do that, I’ll make 1.04 available from the “proper” place.

In the meantime, here’s a quick explanation of that syntax change, brought about because of a silly design flaw I introduced to begin with.

When you’re setting up a Salisbury RAC, you usually want the Oracle software copied across to the first node, but not to the second (because the second node doesn’t need it: it gets the software ‘pushed’ to it during the Oracle installation anyway, from the first, as part of the standard RAC installation process). To accomplish this, I originally had you say ORAVER=1120x on the bootstrap line when building your first node, and ORAVER=NONE when building the second.

Even though you said ORAVER=NONE, I still set up paths and environment variables which are correct for running Oracle 11g …because that was the only version of Oracle then available.

You now see the problem, I hope! Saying ORAVER=NONE certainly tells me you don’t want the Oracle software copied to your new server… but now I don’t know whether I should set paths and variables to expect, eventually, an 11g or 12c installation. The arrival of a new Oracle version creates an ambiguity that using one bootstrap parameter cannot overcome.

The solution was to invent a new bootstrap parameter: FILECOPY=y or FILECOPY=n. It does what it says on the lid: a value of “y” means you do want the Oracle software copied from Salisbury to the new server’s /osource directory. A value of “n” means you don’t. Meanwhile, ORAVER changes meaning ever-so-slightly: it now says what version of Oracle you intend to run, regardless of whether the installation software is to be copied to the new server or not.

In other words, for the first node of a new RAC, you’d say something like:

...oraver=12101 filecopy=y

…and for the second node, you’d say:

...oraver=12101 filecopy=n

This applies to 12c installations and to 11g ones, equally well. Technically, you can still say “ORAVER=NONE”, but this now means you don’t intend to run Oracle at all, so no directories or environment variables associated with running Oracle will be created for you at all. If you’re building 11g RACs using Salisbury, you will need to remember this new need to specify two parameters where one previously sufficed.
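
To make the division of labour between the two parameters concrete, here is a small Python sketch of the decision they encode. It is only my illustration of the rules just described, not the actual Salisbury code. The function name build_plan and the returned dictionary keys are hypothetical, and I am assuming that filecopy defaults to “y” when omitted and that the accepted versions are 11201, 11203 and 12101, as with Asquith.

```python
# Versions assumed to be accepted; "none" means Oracle will never be run.
SUPPORTED_VERSIONS = ("11201", "11203", "12101")


def build_plan(oraver, filecopy="y"):
    """Describe what a build would do for the given bootstrap options."""
    oraver = oraver.lower()
    if oraver == "none":
        # No Oracle intended: no directories or variables, and no software.
        return {"configure_env_for": None, "copy_software": False}
    if oraver not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported oraver value: {oraver}")
    return {
        # ORAVER alone decides which version the environment is set up for...
        "configure_env_for": oraver,
        # ...and FILECOPY alone decides whether the software is copied over.
        "copy_software": filecopy.lower() == "y",
    }


if __name__ == "__main__":
    print("first node: ", build_plan("12101", "y"))
    print("second node:", build_plan("12101", "n"))
```

The point of the split is visible in the second call: the environment is still prepared for 12c even though nothing is copied to the node.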

Other than that slight change to bootstrap options, everything else remains as it was. In particular, the “speed keys” still work for 12c, just as they did for 11g, so “sk=1” builds you a node called “alpher” with IP address 192.168.8.101, “sk=2” builds you “bethe” on 192.168.8.102, and so on.

Of course, you will need to upload 12c software to the Salisbury server before you can build subsidiary Salisbury nodes at all: Oracle themselves made a change here by making the Grid Clusterware come in two zip files instead of one.

As before, you are required to change the names of the downloaded files before Salisbury can make use of them. In the case of 12c, you will need to end up with files named:

  • oradb-12101-1of2.zip
  • oradb-12101-2of2.zip
  • oragrid-12101-1of2.zip
  • oragrid-12101-2of2.zip
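
If you want to double-check the renaming before kicking off a build, something like the following Python sketch would do. The names REQUIRED_12C_FILES and missing_12c_files are mine, invented for illustration; the function just compares a list of file names you hand it against the four names above, and it is up to you to point it at whichever directory you uploaded the software to.

```python
# The four file names Salisbury expects for 12c, as listed above.
REQUIRED_12C_FILES = [
    "oradb-12101-1of2.zip",
    "oradb-12101-2of2.zip",
    "oragrid-12101-1of2.zip",
    "oragrid-12101-2of2.zip",
]


def missing_12c_files(present):
    """Return the required 12c file names absent from the given names."""
    present = set(present)
    return [name for name in REQUIRED_12C_FILES if name not in present]


if __name__ == "__main__":
    import os
    import sys

    # Directory to check is supplied on the command line; defaults to ".".
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_12c_files(os.listdir(target))
    print("All present" if not missing else f"Missing: {missing}")
```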

Under the hood, as I explained in my last post, I’ve had to relax the NFS settings to be “insecure” so that 12c’s propensity to use Direct NFS doesn’t cause the database creation process to blow up. This new setting back-applies to 11g installations, too (not that you’d notice).

As I say, once I get a chance to update the doco linked to the home page, I’ll link to version 1.04 from there, too. In the meantime, this post will be the only place to link to it. Have fun!

It would be a shame if something happened to it…

I have finally gotten around to documenting the Salisbury approach to building an Active Data Guard set-up (that is, 2-node RAC replicating to a 2-node RAC, with the standby in open read-only mode), thereby protecting your data from anything that might unfortunately befall your production RAC.

The article is here.

The article concludes with ARCH doing the log shipping, which isn’t actually the best way of going about things, though it does achieve a high-availability objective. I’ll follow up shortly with altering protection modes and configuring Data Guard broker… but the article is already so long that I felt compelled to relegate those subjects to follow-up articles rather than include them in the main billing.

Keen eyes will note that the screenshots in the latest article are distinctly different from those in the build-a-2-node-RAC one: it’s what happens when Fedora is wiped from your laptop and Windows 8 replaces it part-way through!