Today is an important day…

As the Ferryman in Benjamin Britten’s Opera “Curlew River” puts it, “Today is an important day”.

For today would have been Britten’s 101st birthday. Exactly one year ago today, I was settling down at the back of the Maltings Concert Hall, Snape for the Centenary concert (and a good one it was, too!) Twenty-six years ago, I was settling down in my seat at the Wigmore Hall for his 75th anniversary concert. And thus it has often been for more than half my life: today is spent playing pretty much nothing but Britten from dawn to dusk, and we pray that ToH thinks to do the vacuuming tomorrow rather than today!

Birthdays are for giving, of course (as I constantly have to remind ToH!) In this case, I’ve decided to release version 1.3 of Churchill, which has now been tested for 11.2.0.3, 11.2.0.4 and 12.1.0.2, for standalone, RAC, RAC+Data Guard and 12c Cloud Control installations. I’ve also taken the opportunity to tidy things up a lot, so necessary files are housed more appropriately, rather than all being plonked into a single directory. There are some more documentation issues that arise as a result of the clean-up, but those are relatively minor and should be done by tomorrow. Assuming I am not made to do the vacuuming tomorrow as penance…

Update 25th November: Beware of birthday gifts bought in a hurry! The new 1.3 ISO of Churchill was missing a key file (the ksh RPM), without which all attempts to run the root scripts at the end of a Grid Infrastructure install would fail. Oops. Now corrected (without incrementing the version number again: call it “1.3 Update 1” if you like… Microsoft can be such an inspiration!).

Gone

As promised, Salisbury and Asquith have been “retired” and have accordingly disappeared from the front page. They can still be reached from the Downloads page, though, should anyone still need them.

Churchill is now very nearly completely documented and replaces both. The only thing still missing is the description of how to create a Wilson server to act as an Enterprise Manager Cloud Control, and that should be finished by the end of the week.

I’ve also set up my own “ownCloud” hosting service and am hosting the Churchill ISO from there rather than from Dropbox. I think it’s all working and the necessary files are available to all, but if you run into any problems, just drop me a line and I’ll get it sorted ASAP.

Salisbury and Asquith, my ‘frameworks’ for automated, nearly-hands-free building of Oracle servers, are retiring. Which is to say, I’m not going to maintain them any more.

My attempts over the years to persuade my System Admin colleagues at work that RAC via NFS (as Salisbury uses) might be a good idea have all fallen on deaf ears, Kevin Closson’s fine articles on the subject notwithstanding. So Salisbury became a bit of a dead end after that, which is why I cooked up Asquith. Asquith uses real iSCSI (as real as anything a virtual environment can cook up, anyway!) and layers ASM on top of that and thus provided me with a playground that much more faithfully reflects what we do in our production environment.

But it’s a pain having two frameworks doing pretty much the same job. So now I’m phasing them out and replacing them with Churchill. The Churchill framework uses NFS (because it’s much easier to automate the configuration of that than it is of iSCSI), but it then creates fake hard disks in the NFS shares and layers ASM on top of the fake hard disks. So you end up with a RAC that uses ASM, but without the convoluted configuration previously needed.
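A minimal sketch of the ‘fake hard disk’ idea, assuming the disks are nothing more than zero-filled files sitting in the shared directory. The share path, file name and size below are hypothetical and not taken from Churchill itself:

```shell
# Hypothetical values: on a real server, $share would be an NFS mount point.
share="${ASM_SHARE:-$(mktemp -d)}"
size_mb="${ASM_DISK_MB:-8}"     # tiny by default so the sketch runs anywhere

# Create a zero-filled file of fixed size to stand in for a disk.
# (bs is given in bytes for portability: 1048576 bytes = 1MB.)
dd if=/dev/zero of="$share/asmdisk01" bs=1048576 count="$size_mb" 2>/dev/null

ls -l "$share/asmdisk01"
```

ASM would then be told to discover its disks in that directory rather than among real block devices.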

The other thing we do in production at work is split the ownership of the Grid Infrastructure and the Database (I don’t know why they decided to do that: it was before my time. The thing is all administered by one person -me!- anyway, so the split ownership is just an annoyance as far as I’m concerned). Since I’ve been bitten on at least one significant occasion where mono- and dual-ownership models do things differently, I thought I might as well make Churchill dual-ownership aware. You don’t have to do it that way: Churchill will let you build a RAC with everything owned by ‘oracle’ if you like. But it does it by default, so you end up with ‘grid’ and ‘oracle’ users owning different bits of the cluster, unless you demand otherwise.

Other minor changes from Asquith/Salisbury: Churchill doesn’t do Oracle 11.2.0.1 installations, since that version’s now well past support. You can build Churchill infrastructures with 11.2.0.3, 11.2.0.4 and 12.1.0.2. Of those, only the last is freely available from otn.oracle.com.

Additionally, the bootstrap lines have changed a little. You now invoke Alpher/Bethe installations by a reference to “ks.php” instead of “kickstart.php” (I don’t like typing much!). And there’s a new bootstrap parameter: “split=y” or “split=n”. That turns on or off the split ownership model I mentioned earlier. By default, “split” will be “y”.

Finally, I’ve made the whole thing about 5 times smaller than before by the simple expedient of removing the Webmin web-based system administration tool from the ISO download. I thought it was a good idea at the time to include it for Asquith and Salisbury but, in fact, I’ve never subsequently used it and it made the framework ISO downloads much bigger than they needed to be. Cost/benefit wasn’t difficult to do: Webmin is gone (you can always obtain it yourself and add it to your servers by hand, of course).

The thing works and is ready for download right now. However, it will take me quite some time to write up the various articles and so on, so bear with me on that score. All the documentation, as it gets written, will be accessible from here.

The short version, though, is that you can build a 2-node RAC and a 2-node Active Data Guard with six basic commands:

  • ks=hd:sr1/churchill.ks (to build the Churchill Server)
  • ks=http://192.168.8.250/ks.php?sk=1 oraver=11203 ksdevice=eth0 (to build Alpher)
  • ks=http://192.168.8.250/ks.php?sk=2 oraver=11203 filecopy=n ksdevice=eth0 (to build Bethe)
  • ks=http://192.168.8.250/atlee.ks (to build Atlee, the file server for the Data Guard nodes)
  • ks=http://192.168.8.250/ks.php?sk=3 oraver=11203 ksdevice=eth0 (to build Gamow)
  • ks=http://192.168.8.250/ks.php?sk=4 oraver=11203 filecopy=n ksdevice=eth0 (to build Dalton)

With Churchill and the rest of the crew, I can now build a pretty faithful replica of my production environment in around 2 hours. Not bad.

Salisbury and Asquith will remain available from the front page until the Churchill documentation is complete; after that, they’ll disappear from the front page but remain available for download from the Downloads page, should anyone still want them.

Asquith 2.0

I’ve decided: There will be no Asquith 2.0 that runs on Red Hat/CentOS 7.

There are a lot of stumbling blocks, some of which I’ve documented here recently -including things like iSCSI target configurations no longer being easily scriptable, the use of systemd and the use of dynamic names for network devices. No doubt, all of these problems will be resolved over time by the upstream developers, but they currently make it practically impossible to construct a highly-automated, self-building Asquith framework. (Salisbury, needing only NFS, is a much better proposition, but even there the network device naming issue presents automation difficulties).

Since Red Hat 6.x is supported until 2020, I’ll pass on Red Hat 7 and its assorted clones. I rather imagine quite a lot of real-life enterprises might do the same!

Indirection

Another in the series of the fun things that are new about CentOS 7: setting up an iSCSI target. Asquith used to do this with this sort of code:

lvcreate -l 70%VG -n asquith-dbdata vg1
echo "<target iqn.2013-08.home.dizwell:asquith.dbdata>" >> /etc/tgt/targets.conf
echo "       backing-store /dev/vg1/asquith-dbdata" >> /etc/tgt/targets.conf
echo "</target>" >> /etc/tgt/targets.conf
chkconfig tgtd on
service tgtd start

This is all good, old-fashioned stuff involving writing something to a configuration file and then starting a daemon to use it.

But this is not how we do it now. Oh no. In CentOS 7 (and its Red Hat-y and Oracle-y equivalents, of course), we use an interactive tool called targetcli to invoke an independent ‘shell’ in which these sorts of commands are issued instead:

backstores/block create name=dbdata dev=/dev/mapper/vg1-asquith--dbdata
iscsi/ create iqn.2014-07.home.dizwell:asquith.dbdata
cd iscsi/iqn.2014-07.home.dizwell:asquith.dbdata/tpg1
portals/ create
luns/ create /backstores/block/dbdata
set attribute authentication=0
set attribute generate_node_acls=1
exit

…and the final exit there takes you back to your original shell. (I probably should say that targetcli itself is not actually that new, having first been released in about 2009… but it’s now the default way of doing things in the 7 release of Enterprise Linux, and that’s new -at least, to me!).

Anyway, targetcli is definitely nice and easy and there are no services to worry about: it all just starts getting shared by magic. About the only way to check anything is actually working after you’ve issued all those targetcli commands is to do:

netstat -ant

…before and after. If port 3260 is not in use before, but is in use afterwards, then you know it’s working properly.
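That before-and-after check can be wrapped up as a rough sketch like the one below. The function name iscsi_port_listening is made up for illustration, and it assumes netstat’s usual layout (local address in the fourth column, state in the last):

```shell
# Hypothetical helper: reads netstat-style output on stdin and succeeds
# only if something is listening on TCP port 3260 (the iSCSI default).
iscsi_port_listening() {
    awk '$NF == "LISTEN" && $4 ~ /:3260$/ { found = 1 } END { exit !found }'
}

if netstat -ant 2>/dev/null | iscsi_port_listening; then
    echo "Port 3260 is in use: the target is being shared"
else
    echo "Nothing is listening on port 3260"
fi
```

Run it once before the targetcli commands and once after: the message should change from the second to the first.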

The real bummer about the new technique, however, is that it’s not really very scriptable. It’s an interactive tool after all, and appears to expect a system admin to be sitting at the end of the keyboard… which is not much use if you’re trying to get this all to happen as part of a Kickstart auto-build, say!

I did work out that the old trick of piping things together will help. For example, if the above commands are re-written slightly to be:

echo "cd /" | targetcli
echo "backstores/block create name=dbdata dev=/dev/mapper/vg1-asquith--dbdata" | targetcli
echo "iscsi/ create iqn.2014-07.home.dizwell:asquith.dbdata" | targetcli
echo "cd iscsi/iqn.2014-07.home.dizwell:asquith.dbdata/tpg1" | targetcli
echo "portals/ create" | targetcli
echo "luns/ create /backstores/block/dbdata" | targetcli
echo "set attribute authentication=0" | targetcli
echo "set attribute generate_node_acls=1" | targetcli

…then each of the commands in double-quotes will be passed through to the targetcli shell in turn and executed in just the same way as if you’d typed things interactively.
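An alternative worth knowing about: targetcli will also take a single command as arguments on its own command line, running it without entering the interactive shell. The sketch below assumes that behaviour and uses full paths instead of cd. The tcli wrapper and its DRY_RUN switch are hypothetical additions so that the commands can be inspected before anything is changed:

```shell
# Dry-run by default: print each command instead of running it.
# Set DRY_RUN=0 on a real target server (as root) to execute them.
tcli() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "targetcli $*"
    else
        targetcli "$@"
    fi
}

iqn="iqn.2014-07.home.dizwell:asquith.dbdata"
tpg="/iscsi/$iqn/tpg1"

tcli /backstores/block create name=dbdata dev=/dev/mapper/vg1-asquith--dbdata
tcli /iscsi create "$iqn"
tcli "$tpg/portals" create
tcli "$tpg/luns" create /backstores/block/dbdata
tcli "$tpg" set attribute authentication=0
tcli "$tpg" set attribute generate_node_acls=1
```

Because every path is absolute, no command depends on where an earlier one left the targetcli ‘current directory’.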

Excellent… but I need these commands in a slightly different context. What I’m after is for Kickstart to create a shell script that contains these commands so that it can then execute that shell script later on to automagically set up iSCSI target sharing when building a CentOS 7 Asquith Server. That means I need Kickstart to run commands which create a script which contains these commands. And at that point, I’m asking for the commands to be re-written in the following manner:

echo "echo \"cd /\" | targetcli" >> /root/iscsiconfig.sh
echo "echo \"backstores/block create name=dbdata dev=/dev/mapper/vg1-asquith--dbdata\" | targetcli" >> /root/iscsiconfig.sh
echo "echo \"iscsi/ create iqn.2014-07.home.dizwell:asquith.dbdata\" | targetcli" >> /root/iscsiconfig.sh
echo "echo \"cd iscsi/iqn.2014-07.home.dizwell:asquith.dbdata/tpg1\" | targetcli" >> /root/iscsiconfig.sh
echo "echo \"portals/ create\" | targetcli" >> /root/iscsiconfig.sh
echo "echo \"luns/ create /backstores/block/dbdata\" | targetcli" >> /root/iscsiconfig.sh
echo "echo \"set attribute authentication=0\" | targetcli" >> /root/iscsiconfig.sh
echo "echo \"set attribute generate_node_acls=1\" | targetcli" >> /root/iscsiconfig.sh

The levels of indirection here start to do my head in!

Take the first command: echo "echo \"cd /\" | targetcli" >> /root/iscsiconfig.sh

That’s my earlier echo-and-pipe-to-targetcli command wrapped up in an echo command of its own. Why? Because Kickstart will perform the ‘outer echo’ and thus write a line of text reading just echo "cd /" | targetcli to a shell script called iscsiconfig.sh. So when Kickstart later runs iscsiconfig.sh, the correct targetcli command is finally run.

So basically, we’re nesting an echo inside an echo. Kickstart will run the ‘outer echo’ so that the ‘inner echo’ command gets written to a script file. Only when that script is itself later run will the ‘inner echo’ actually be run and do anything.

Now, the way I’ve written my original inner echoes is perhaps peculiar to me: double quotes make visual sense to me and, for plain strings like these with nothing in them to expand, there’s no functional difference in Bash between double and single quotes. Without a major functional difference, I prefer doubles. But if you start nesting your echoes, the ‘inner echo’ has to escape its double quotes, otherwise they get treated as delimiters of the outer echo command rather than as literal characters to be written out.

In other words, the command we eventually want to run might be:

echo "cd /" | targetcli

…where the double-quotes are not escaped, because they delimit what is to be echoed. But the command we have to run to get this command written into a shell script is:

echo "echo \"cd /\" | targetcli" >> /root/iscsiconfig.sh

So the ‘outer’ echo is now a command to write the words echo "<something>" into a shell script -but the double-quotes used by this inner echo have to be regarded as literal text, not as parts of the outer echo command. Hence they need to be preceded by a “\” to turn them into literals (“escaped”, in the lingo).
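To see the escaping at work without touching /root, the outer echo can be pointed at a scratch file instead (the temporary file here is only a stand-in for iscsiconfig.sh):

```shell
out="$(mktemp)"   # stand-in for /root/iscsiconfig.sh

# Outer echo: each \" is written to the file as a plain " character.
echo "echo \"cd /\" | targetcli" >> "$out"

cat "$out"
# The file now holds exactly one line:
#   echo "cd /" | targetcli
```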

Is your head hurting yet?! It gets worse (a bit)!

Remember that these echoed echo commands are being written into a shell script. Shell scripts need to start with a line saying where the shell executables are to be found. In the world of the Bourne Again Shell, that means starting things with a line which reads:

#!/bin/bash

Now, we want a command that echoes that into a shell script, before we later go on to execute the shell script. No problems …we just do this:

echo "#!/bin/bash" > /root/iscsiconfig.sh

This is merely as before: we’re wrapping the command we eventually want executed inside an echo statement, using double quotes as delimiters to define what gets echoed, and finishing off with a redirection out to the shell script that will contain the command.

Except that it won’t work:

# echo "#!/bin/bash" > /root/iscsiconfig.sh
-bash: !/bin/bash": event not found

You might think that one or more of the “shebang” characters (the ”#!” at the start of the line being echoed) need escaping, so that something like

echo "\#\!/bin/bash" > /root/iscsiconfig.sh

…would do the trick. And indeed, the above escaped command will “work” instead of producing an ‘event not found’ error, but it doesn’t work very well! Just try displaying the contents of the file created with that modified command:

# cat iscsiconfig.sh
\#\!/bin/bash

This shows you that the escape characters have actually become part of the contents of the shell script, rather than interpreted as escape characters. Present as literals, though, the escape characters mean the shell script can’t actually work when invoked. So this won’t do.

Odd though it might seem at first sight, this behaviour is actually perfectly cromulent and well-documented in the Bash manual. Specifically:

A double quote may be quoted within double quotes by preceding it with a backslash. If enabled, history expansion will be performed unless an ‘!’ appearing in double quotes is escaped using a backslash. The backslash preceding the ‘!’ is not removed.

So what’s the fix? Well, the simplest I can think of is to …er, use single quotes. You’ll find that:

echo '#!/bin/bash' > /root/iscsiconfig.sh

…works in the sense of not itself returning an error AND works in the sense that it writes the correct command into the shell script file we’re trying to create.
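One further wrinkle: in Bash, history expansion is switched on by default only for interactive shells, so the ‘event not found’ error is mostly something you meet when testing at a prompt. A quick way to see the difference, assuming bash is on the PATH, is to run the double-quoted version non-interactively:

```shell
# bash -c starts a non-interactive shell, where history expansion is off,
# so the "!" inside double quotes is just an ordinary character here.
bash -c 'echo "#!/bin/bash"'
```

Single quotes remain the safer habit, since they behave the same way in both kinds of shell.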

But now, at this point, you realise it’s a bit silly to have a mix of single and double-quotes in the same set of commands, so you think that you could go back to the original targetcli commands and re-write them using single quotes (after all, the manual makes it clear that there’s no real difference between single and double quotes except for the way four literal characters are treated).

This means your ‘doing it directly’ commands would be written as:

echo 'cd /' | targetcli
echo 'backstores/block create name=dbdata dev=/dev/mapper/vg1-asquith--dbdata' | targetcli
echo 'iscsi/ create iqn.2014-07.home.dizwell:asquith.dbdata' | targetcli
echo 'cd iscsi/iqn.2014-07.home.dizwell:asquith.dbdata/tpg1' | targetcli
echo 'portals/ create' | targetcli
echo 'luns/ create /backstores/block/dbdata' | targetcli
echo 'set attribute authentication=0' | targetcli
echo 'set attribute generate_node_acls=1' | targetcli

And they do, indeed, all work as advertised in this form. But now you want to apply that ‘layer of indirection’ that comes from the fact that you’re writing a script to write a script… so you might end up with this:

echo '#!/bin/bash' > /root/iscsiconfig.sh
echo 'echo 'cd /' | targetcli' >> /root/iscsiconfig.sh
echo 'echo 'backstores/block create name=dbdata dev=/dev/mapper/vg1-asquith--dbdata' | targetcli' >> /root/iscsiconfig.sh
echo 'echo 'iscsi/ create iqn.2014-07.home.dizwell:asquith.dbdata' | targetcli' >> /root/iscsiconfig.sh
echo 'echo 'cd iscsi/iqn.2014-07.home.dizwell:asquith.dbdata/tpg1' | targetcli' >> /root/iscsiconfig.sh
echo 'echo 'portals/ create' | targetcli' >> /root/iscsiconfig.sh
echo 'echo 'luns/ create /backstores/block/dbdata' | targetcli' >> /root/iscsiconfig.sh
echo 'echo 'set attribute authentication=0' | targetcli' >> /root/iscsiconfig.sh
echo 'echo 'set attribute generate_node_acls=1' | targetcli' >> /root/iscsiconfig.sh

I have to say, I don’t like the look of that, because gut instinct tells me double-quotes are needed there somewhere (and since, in this case, single and double quotes are interchangeable, there’s no reason why you couldn’t come up with a rule that says ‘inner echoes get doubles, outer echoes use singles’ and thus end up with something that looks a bit clearer than the above and yet remains consistent!).

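But it will all work, albeit slightly by accident: the inner single quotes simply close and re-open the outer ones, so they never make it into the script. What gets written is echo cd / | targetcli, with no quotes at all, and that happens to be fine because none of these commands contains anything the shell would treat specially.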

Which is the main thing 8-o

Though my head still hurts!
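For the curious, here is a small sketch of what nested single quotes actually write to the file, alongside two ways of getting quotes into the script on purpose. The temporary file is just a stand-in for /root/iscsiconfig.sh:

```shell
out="$(mktemp)"   # stand-in for /root/iscsiconfig.sh

# 1. Nested singles: the inner quotes end and restart the outer string,
#    so no quotes reach the file.   Writes: echo cd / | targetcli
echo 'echo 'cd /' | targetcli' > "$out"

# 2. The '\'' idiom: close the string, add an escaped quote, reopen it.
#    Writes: echo 'cd /' | targetcli
echo 'echo '\''cd /'\'' | targetcli' >> "$out"

# 3. Singles outside, doubles inside: no escaping needed at all.
#    Writes: echo "cd /" | targetcli
echo 'echo "cd /" | targetcli' >> "$out"

cat "$out"
```

The third form is the ‘inner echoes get doubles, outer echoes use singles’ rule in action.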

(Incidentally, you don’t need to point knowingly, laughing your head off all the while, exclaiming that I should have used a here document technique to avoid all those nested echoes in the first place. It’s true and I know (and knew) it. But lots of single-line echoes are how I start writing stuff. Only when I know it works do I start applying layers of ‘elegance’ by code tidy-ups such as that. This blog was dedicated to those whose heads hurt, not those who can write shell scripts more elegantly than me… there are far too many of those!)
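For completeness, a sketch of what the here document version might look like. Quoting the delimiter switches off all expansion in the body, so neither the shebang nor the double quotes need any escaping. The temporary file is a stand-in for /root/iscsiconfig.sh:

```shell
script="$(mktemp)"   # stand-in for /root/iscsiconfig.sh

# The quoted delimiter ('EOF') stops the shell expanding anything in the
# body, so every line lands in the file exactly as typed.
cat > "$script" <<'EOF'
#!/bin/bash
echo "cd /" | targetcli
echo "backstores/block create name=dbdata dev=/dev/mapper/vg1-asquith--dbdata" | targetcli
echo "iscsi/ create iqn.2014-07.home.dizwell:asquith.dbdata" | targetcli
echo "cd iscsi/iqn.2014-07.home.dizwell:asquith.dbdata/tpg1" | targetcli
echo "portals/ create" | targetcli
echo "luns/ create /backstores/block/dbdata" | targetcli
echo "set attribute authentication=0" | targetcli
echo "set attribute generate_node_acls=1" | targetcli
EOF

chmod +x "$script"
```

One level of indirection disappears entirely: what you see between the two EOF markers is what ends up in the script.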

CentOS 7 and Kickstart

I have long since given up hoping that things which work fine in one version of Red Hat/CentOS/etc will continue to work in the next version, unmolested. But it’s still darn’d annoying when stuff you know works fine in version X breaks in slightly mysterious ways in version X+1. Kickstart (the tool for automating CentOS/Red Hat deployments) is a case in point.

A Kickstart file which worked fine for CentOS 5.x, for example, turns out to contain entirely the wrong syntax as far as CentOS 6.x is concerned -so a re-write is required to make it functional once more.

The bad news is that this pattern continues with CentOS 7, to the point where even invoking a Kickstart installation has changed (and accordingly cost me quite a lot of time tracking down precisely where the problem is).

Short version: in the past, you invoked a Kickstart installation by pressing Tab on the first boot menu and then typing something like ks=hd:sr1/kickstart.ks. Now you do it by pressing Tab on the first boot menu and typing ks=cdrom:/dev/sr1:/kickstart.ks. It’s a subtle change but it makes all the difference!
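Purely as an illustration of how the two forms relate, the old-style argument can be rewritten mechanically. The function name is made up, and the pattern covers only the optical-drive form shown above, not every ks= variant:

```shell
# Hypothetical helper: turn ks=hd:<device>/<path> into
# ks=cdrom:/dev/<device>:/<path>. Anything else passes through unchanged.
ks_to_el7() {
    echo "$1" | sed 's#^ks=hd:\([^/]*\)\(/.*\)$#ks=cdrom:/dev/\1:\2#'
}

ks_to_el7 'ks=hd:sr1/kickstart.ks'
```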

Longer version: the change is explained pretty well in this bug report for Fedora 19 (Red Hat 7 and its clones are really a re-jigged version of Fedora 18/19, so the bug reports for the one often apply to the other).

Actually, the new syntax is clearer and more logical than the old, so I probably shouldn’t complain… but I’m in that sort of mood at the moment, so I will :-)

There are lots of other changes, too, of course: software package groups have changed, which makes a version 6.x script useless for doing 7.x installs, just for starters.

All of which is by way of explanation for late delivery on promised RH7-ish versions of Asquith and Salisbury. They are coming (update: no they’re not!), but somewhat more slowly than I had anticipated, because of the “fun” I’ve been having with version 7’s Kickstart!

Introducing Asquith

In life, Herbert Henry Asquith was prime minister of the United Kingdom from 1908 to 1916.

In the context of this blog, however, his is the name that will be attached to a new way of auto-building Oracle servers, of the standalone, RAC and RAC+Data Guard variety.

Salisbury, of course, has been doing that job for several months now, so why the need for Asquith? Well… Salisbury works fine… but is maybe not very realistic, in the sense that Salisbury’s use of NFS for shared storage has put some people off. So Asquith is effectively the same as Salisbury -except that he uses ASM for his shared storage, not NFS.

In my view, that perhaps makes him a little more ‘realistic’ than the Salisbury approach, but definitely results in a more useful learning environment (because now you can get to play with the delights of ASM disk groups and so forth, which is an important part of managing many production environments these days).

1. Asquith v. Salisbury

Other than his choice of storage, however, Asquith is pretty much identical to Salisbury: an Asquith server, just like a Salisbury server, provides NTP, DNS and other network services to the ‘client servers’, which can be standalone Oracle servers, part of a multi-node RAC or even part of a multi-node, multi-site Data Guard setup. If you’re doing RAC, the shared storage needed by each RAC node is provided by Asquith acting as an iSCSI target. The clients act in their turn as iSCSI initiators.

The only other significant difference between Salisbury and Asquith is that Asquith never auto-builds a database for you, not even in standalone mode. I figured that if you’re going to go to the trouble of using ASM, you’re doing ‘advanced stuff’, and don’t need databases auto-created for you. If automatic-everything is what you’re after, therefore, stick to using Salisbury. For this reason, too, Asquith does not provide an auto-start script for databases: since it uses ASM, it’s assumed you’ll install Oracle’s Grid software -and that provides the Oracle Restart utility which automates database restarts anyway. A home-brew script is therefore neither needed nor desirable.

All-in-all, Asquith is so similar to Salisbury that I’ve decided that the first release of Asquith should be called version 1.04, because that’s the release number of the current version of Salisbury. They will continue to be kept in lock-step for all future releases.

And this hopefully also makes it clear that Asquith doesn’t make Salisbury redundant: both will continue to be developed and updated, and each complements the other. It’s simply a question of which shared storage technology you prefer to use. If you like the simplicity of NFS and traditional-looking file systems, use Salisbury. If you want to learn and get familiar with ASM technology, then use Asquith. Each has its place, in other words, and both are useful.

2. Building an Asquith Server

In true Salisbury fashion, the job of building the Asquith server itself is completely automated, apart from you pointing to the asquith.ks kickstart file when first building it.

Your Asquith server can run OEL 6.x, Scientific Linux 6.x or CentOS 6.x -where x is either 3 or 4. In all cases, only 64-bit OSes are allowed. The Oracle versions it supports, like Salisbury, are 11.2.0.1, 11.2.0.3 or 12.1.0.1. The Asquith server needs a minimum of 60GB disk space, 512MB RAM, one network card and two DVD drives. The O/S installation disk goes in the first one; the Asquith ISO goes in the second.

The server is built by hitting <Tab> when the installation menu appears, and typing this on the bootstrap line:

ks=hd:sr1/asquith.ks

Once built, you need to copy your Oracle software to the /var/www/html directory of the new Asquith server, using file names of a specific and precise format. Depending on which version you intend to install on other client servers, you need to end up with files called:

  • oradb-11201-1of2.zip
  • oradb-11201-2of2.zip
  • oragrid-11201.zip
  • oradb-11203-1of2.zip
  • oradb-11203-2of2.zip
  • oragrid-11203.zip
  • oradb-12101-1of2.zip
  • oradb-12101-2of2.zip
  • oragrid-12101-1of2.zip
  • oragrid-12101-2of2.zip

You can, of course, have all 10 files present in the same /var/www/html directory if you intend to build a variety of Oracle servers running assorted different Oracle versions.
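Since the file names have to be exact, a quick check before building any clients can save a failed install. The sketch below is not part of Asquith: the two function names are hypothetical, and the only facts assumed are the file names listed above.

```shell
# Print the file names expected for one Oracle version
# (11201, 11203 or 12101). Only 12101 has a two-part grid download.
expected_files() {
    ver="$1"
    echo "oradb-$ver-1of2.zip"
    echo "oradb-$ver-2of2.zip"
    case "$ver" in
        12101) echo "oragrid-$ver-1of2.zip"
               echo "oragrid-$ver-2of2.zip" ;;
        *)     echo "oragrid-$ver.zip" ;;
    esac
}

# Print every expected file that is absent from the given directory.
missing_files() {
    dir="$1"; ver="$2"
    expected_files "$ver" | while read -r f; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}

missing_files /var/www/html 11203
```

No output means everything needed for that version is in place.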

You can additionally (but entirely optionally) copy extra O/S installation media to the /var/www/html directory if you want future ‘client’ servers to use an O/S different to that used to build Asquith itself. Asquith automatically copies its own installation media to the correct sub-directories off that /var/www/html folder -so if you used CentOS 6.4 to build Asquith, you’ll already have a /var/www/html/centos/64 directory from which clients can pull their installation media. You would need to copy the DVD1 installation media for OEL and Scientific Linux to corresponding “oel/xx” and “sl/xx” sub-directories if you wanted to use all three Red Hat clones for the ‘client’ servers (where ‘xx’ can be either 63 or 64).

3. Building Asquith Clients

When building Asquith clients, you need to boot them with appropriate, locally-attached installation media. The netinstall disks for each distro are suitable, for example. The distro/version you boot with will be the distro/version your Asquith client will end up running. You cannot, for example, boot with a Scientific Linux netinstall disk, point it at Asquith and hope to complete a CentOS kickstart installation. As a consequence, what you boot your clients with must match something you’ve already copied to Asquith in full. If you boot a client with an OEL 6.4 netinstall disk, the DVD 1 media for Oracle Enterprise Linux 6.4 must already have been copied to Asquith’s own /var/www/html/oel/64 directory, in other words.

4. Asquith Bootstrap Parameters

You build an Asquith client by again pressing <Tab> on the boot menu at initial startup and then passing various parameters to the bootstrap line that’s then revealed. All bootstrap lines must start:

ks=http://192.168.8.250/kickstart.php?

You then add additional parameters as follows:

Parameter | Compulsory? | Possible Values (case sensitive)
distro    | Yes         | centos, oel or sl
version   | Yes         | 63 or 64
hostname  | No          | any valid name for the server being built
domain    | No          | any valid domain name of which the server is a part
rac       | No          | Is this server to be part of a RAC? If so, it will find its shared storage on the Asquith server. If not, no shared storage will be configured (any future database would be stored on the local server’s disk).
ip        | No          | IP of the server (the public IP if a RAC)
ic        | No          | IP of the server’s interconnect (if it’s to be part of a RAC)
dg        | No          | Is this server to be part of a Data Guard site? If so, it will find its shared storage on a Rosebery server, not on Asquith.

The parameters can come in any order, separated by ampersands (i.e., by the & character), and there must be no spaces between them. For example:

ks=http://192.168.8.250/kickstart.php?distro=centos&version=64&hostname=my_racnode&domain=dizwell.com&rac=y&ip=16.25.34.23&ic=10.0.0.2

(That example might wrap here, but is in fact typed continuously, without any line breaks or spaces).

Note that “rac=” and “dg=” are mutually exclusive. One causes the built server to use Asquith as its source of shared storage; the other directs the server to use Rosebery for its shared storage (I’ll talk more about Rosebery in Section 7 below). If your Data Guard servers are themselves to be part of a cluster, therefore, you just say “dg=y”, not “rac=y&dg=y”.

After you construct an appropriate bootstrap line, you must additionally add three space-separated Kickstart constants, as follows:

Constant  | Compulsory? | Possible Values
ksdevice= | No          | eth0, eth1 or any other valid name for a network interface
oraver=   | Yes         | 11201, 11203, 12101 or NONE
filecopy= | No          | y or n

ksdevice and filecopy are only relevant if you’re building a RAC: a RAC node must have two network cards, and you use ksdevice to say which of them should be used for installation purposes. The usual answer is eth0. If you miss this constant off, the O/S installer itself will prompt you for the answer, so you only need to supply one now if you want a fully-automated O/S install.

The second node of a RAC needs to have paths and environment variables set up in anticipation of Oracle software being ‘pushed’ to it from the primary node -but it itself doesn’t need a direct copy of the Oracle installation software. Hence ‘filecopy=n’ will suppress the copying of the oradb…zip files from Asquith to the node. If you miss this constant off, an answer of ‘y’ will default, which will mean about 4GB of disk space may be consumed unnecessarily. It’s not the end of the world if it happens, though.

The oraver constant is required, though. It lets the server build process create appropriate environment variables and directories, suitable for running Oracle eventually. You can only specify 11201, 11203 or 12101 depending on which version of Oracle you intend, ultimately, to run on the new server. If you don’t ever intend to run Oracle on your new server, you can say “oraver=none”, and after a basic O/S install, nothing else will be configured on the new server.

A complete bootstrap line, suitable for the first node of an intended 2-node RAC, might therefore look like this:

ks=http://192.168.8.250/kickstart.php?distro=centos&version=64&hostname=my_racnode1&domain=dizwell.com&rac=y&ip=16.25.34.21&ic=10.0.0.1 oraver=12101 filecopy=y ksdevice=eth0

Notice there are spaces between the three constants, and between them and the original part of the bootstrap line. Here’s another example, this time for the second node of a Data Guard RAC:

ks=http://192.168.8.250/kickstart.php?distro=oel&version=63&hostname=my_dgnode2&domain=dizwell.com&dg=y&ip=16.25.34.26&ic=10.0.0.6 oraver=12101 filecopy=n ksdevice=eth0
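Because one stray space or missing ampersand breaks the whole line, it can help to assemble it mechanically and then copy the result. This is only a sketch: bootstrap_line is a hypothetical helper, not something Asquith provides, and it knows nothing beyond the rules described above.

```shell
# Hypothetical helper. Usage: bootstrap_line "<constants>" param=value ...
# Joins the parameters with "&" (no spaces) and appends the
# space-separated Kickstart constants.
bootstrap_line() {
    constants="$1"; shift
    # rac= and dg= are mutually exclusive, so refuse a line with both.
    case " $* " in
        *" rac="*)
            case " $* " in
                *" dg="*)
                    echo "rac= and dg= are mutually exclusive" >&2
                    return 1 ;;
            esac ;;
    esac
    query=""
    for p in "$@"; do
        query="${query:+$query&}$p"
    done
    echo "ks=http://192.168.8.250/kickstart.php?$query $constants"
}

bootstrap_line "oraver=12101 filecopy=y ksdevice=eth0" \
    distro=centos version=64 hostname=my_racnode1 domain=dizwell.com \
    rac=y ip=16.25.34.21 ic=10.0.0.1
```

Called like that, it reproduces the first-node RAC example given earlier in this section.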

5. Asquith Speed Keys

It’s not really that much typing when you come to do it, but if you want to make things even quicker, there are four ‘speed keys’ available to you:

Speed Key Effect
sk=1 The server will be called alpher.dizwell.home, with IP 192.168.8.101 and Interconnect IP of 10.0.0.101. It will run as the first node of a RAC and is configured to look to Asquith as its shared storage source.
sk=2 The server will be called bethe.dizwell.home, with IP 192.168.8.102 and Interconnect IP of 10.0.0.102. It will run as the second node of a RAC and is configured to look to Asquith as its shared storage source.
sk=3 The server will be called gamow.dizwell.home, with IP 192.168.8.103 and Interconnect IP of 10.0.0.103. It will run as the first node of a RAC but is configured to look to Rosebery as its shared storage source.
sk=4 The server will be called dalton.dizwell.home, with IP 192.168.8.104 and Interconnect IP of 10.0.0.104. It will run as the second node of a RAC and is configured to look to Rosebery as its shared storage source.

If you want to use one of these speed keys, your bootstrap line becomes:

ks=http://192.168.8.250/kickstart.php?sk=2 oraver=11203 filecopy=n ksdevice=eth0

Note that you still have to supply the Kickstart constants -but at least you don’t have to supply any of the normal parameters. In fact, only the oraver constant is compulsory, so the line could be even shorter to type, if you’d prefer.

6. Creating Databases and Clusters

All Asquith client servers end up being created with a root user, whose password is dizwell, and an oracle user, whose password is oracle. Use the operating system’s own passwd command to alter those after the O/S installation is complete if you like.

All Asquith client servers are also built with an appropriate set of Oracle software (if requested), stored in the /osource directory. Grid/Clusterware will be in the /osource/grid directory and the main Oracle RDBMS software will be in the /osource/database directory. Your job is therefore simply to launch the relevant installer, like so:

/osource/grid/runInstaller
/osource/database/runInstaller

If you don’t want to run a RAC or use ASM, just pretend the grid software’s not there! If you do, standard operating procedures apply:

  • Run the /osource/grid/runInstaller
  • Do an advanced installation
  • Select to use ASM, keep the default DATA diskgroup name
  • Change the Disk Discovery Path to be /dev/asm*
  • Use External redundancy levels (at this stage, Asquith doesn’t do redundancy)
  • Click ‘Ignore All’ if any ‘issues’ are discovered
  • Run the root scripts on the various nodes when prompted

Once the Clusterware is installed, you can install the database in the usual way:

  • Run /osource/database/runInstaller
  • Do a typical installation
  • Select to use Automatic Storage Management; the DATA disk group should be automatically available
  • Supply passwords where appropriate
  • Ignore any prerequisite failures
  • Run the root script when prompted.

It’s all pretty painless, really, which is precisely the point!

7. Rosebery

Just as a Salisbury server is accompanied by a Balfour server when building a Data Guard environment, so Asquith has his Rosebery (Archibald Primrose, 5th Earl of Rosebery, Prime Minister of Great Britain 1894-1895). A Rosebery server is built in the same way as an Asquith server (that is, 60GB hard disk minimum, 512MB RAM minimum, 1 NIC), but doesn’t need a second DVD drive from which to find its kickstart file: for that, you simply point it at Asquith.

The bootstrap line to build a Rosebery server is thus:

ks=http://192.168.8.250/rosebery.ks

After that, the Rosebery server builds automatically. It then provides a new iSCSI target for client servers built with the dg=y parameter in their bootstrap lines to connect to. In short, Rosebery provides shared storage to clients, just as Asquith does, and therefore provides a secondary, independent storage sub-system for Data Guard clients to make use of.

8. Conclusion

Asquith (and Rosebery) provide a conveniently-built infrastructure in which Standalone, RAC and Data Guard Oracle servers can be constructed with ease. They automate away a lot of the network and storage “magic” that is usually the preserve of the professional Systems Administrator, leaving the would-be Oracle Database Administrator to concentrate on actual databases! By employing ASM as their shared storage technology, Asquith and Rosebery allow the DBA to explore and learn an important aspect of modern Oracle database management.

I’ll be putting up a section of the site for Asquith to match the one that already exists for Salisbury. Until then, the only place for any Asquith documentation (and the only link to download the all-important Asquith ISO) is this article itself.

Oracle 12c and Salisbury

Version 1.04 of Salisbury is now available. It contains two key enhancements over the previous version: (1) it automatically initialises hard disks, even when they contain no previous partition information; and (2) it works to create standalone, RAC or RAC+Data Guard Oracle Version 12c setups.

I have not updated the Salisbury home page yet, though, to link to the new release (this post is the only place to do so at the moment). That’s because I have yet to update all the associated articles to reflect a bit of syntax-tweaking I’ve had to introduce. Once I do that, I’ll make 1.04 available from the “proper” place.

In the meantime, here’s a quick explanation of that syntax change, brought about because of a silly design flaw I introduced to begin with.

When you’re setting up a Salisbury RAC, you usually want the Oracle software copied across to the first node, but not to the second (because the second node doesn’t need it: it gets the software ‘pushed’ to it from the first node during the Oracle installation anyway, as part of the standard RAC installation process). To accomplish this, I originally had you say ORAVER=1120x on the bootstrap line when building your first node, and ORAVER=NONE when building the second.

Even though you said ORAVER=NONE, I still set up paths and environment variables which are correct for running Oracle 11g …because that was the only version of Oracle then available.

You now see the problem, I hope! Saying ORAVER=NONE certainly tells me you don’t want the Oracle software copied to your new server… but now I don’t know whether I should set paths and variables to expect, eventually, an 11g or 12c installation. The arrival of a new Oracle version creates an ambiguity that using one bootstrap parameter cannot overcome.

The solution was to invent a new bootstrap parameter: FILECOPY=y or FILECOPY=n. It does what it says on the lid: a value of “y” means you do want the Oracle software copied from Salisbury to the new server’s /osource directory. A value of “n” means you don’t. Meanwhile, ORAVER changes meaning ever-so-slightly: it now says what version of Oracle you intend to run, regardless of whether the installation software is to be copied to the new server or not.

In other words, for the first node of a new RAC, you’d say something like:

...oraver=12101 filecopy=y

…and for the second node, you’d say:

...oraver=12101 filecopy=n

This applies to 12c installations and to 11g ones, equally well. Technically, you can still say “ORAVER=NONE”, but this now means you don’t intend to run Oracle at all, so no directories or environment variables associated with running Oracle will be created for you. If you’re building 11g RACs using Salisbury, you will need to remember this new need to specify two parameters where one previously sufficed.
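
To make the new division of labour concrete, here is a minimal sketch of the decision the two parameters now drive. The function name describe_build and the wording of its output are purely illustrative assumptions; Salisbury’s real Kickstart logic is not shown in this post. I have also assumed that ORAVER=NONE wins regardless of what FILECOPY says, which the text above implies but does not state outright.

```shell
# Hypothetical summary of what a given ORAVER/FILECOPY pair asks for.
# Not Salisbury's actual code; an illustration of the rules described above.
describe_build() {
    local oraver="$1" filecopy="$2"

    # ORAVER=NONE: no Oracle at all, so nothing Oracle-related is set up.
    # (Assumption: FILECOPY is ignored in this case.)
    case "$oraver" in
        NONE|none) echo "no Oracle environment"; return 0 ;;
    esac

    # Otherwise ORAVER picks the version the environment is prepared for,
    # and FILECOPY alone decides whether software lands in /osource.
    case "$filecopy" in
        y) echo "environment for ${oraver}; software copied to /osource" ;;
        n) echo "environment for ${oraver}; no software copied" ;;
        *) echo "filecopy must be y or n" >&2; return 1 ;;
    esac
}

describe_build 12101 y   # first RAC node
describe_build 12101 n   # second RAC node
```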

Other than that slight change to bootstrap options, everything else remains as it was. In particular, the “speed keys” still work for 12c, just as they did for 11g, so “sk=1” builds you a node called “alpher” with IP address 192.168.8.101, “sk=2” builds you “bethe” on 192.168.8.102, and so on.

Of course, you will need to upload 12c software to the Salisbury server before you can build subsidiary Salisbury nodes at all: Oracle themselves made a change here by making the Grid Clusterware come in two zip files instead of one.

As before, you are required to change the names of the downloaded files before Salisbury can make use of them. In the case of 12c, you will need to end up with files named:

  • oradb-12101-1of2.zip
  • oradb-12101-2of2.zip
  • oragrid-12101-1of2.zip
  • oragrid-12101-2of2.zip
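
Before building any 12c nodes, it may be worth confirming that all four renamed files are actually in place. The following is a small sketch of such a check. The function name check_12c_media is hypothetical, and you must pass in whichever directory you uploaded the software to, since this post does not name it.

```shell
# Hypothetical pre-flight check: reports any of the four expected 12c
# zip files that are missing from the directory given as the argument.
# Returns 0 if all four are present, 1 otherwise.
check_12c_media() {
    local dir="$1" missing=0 f
    for f in oradb-12101-1of2.zip oradb-12101-2of2.zip \
             oragrid-12101-1of2.zip oragrid-12101-2of2.zip; do
        if [ ! -f "${dir}/${f}" ]; then
            echo "missing: ${f}"
            missing=1
        fi
    done
    return "$missing"
}

# Example (the directory is whatever you uploaded the files to):
#   check_12c_media /path/to/uploaded/software && echo "all 12c media present"
```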

Under the hood, as I explained in my last post, I’ve had to relax the NFS settings to be “insecure” so that 12c’s propensity to use Direct NFS doesn’t cause the database creation process to blow up. This new setting back-applies to 11g installations, too (not that you’d notice).

As I say, once I get a chance to update the doco linked to the home page, I’ll link to version 1.04 from there, too. In the meantime, this post will be the only place to link to it. Have fun!

Salisbury Fun and Games

Salisbury isn’t particularly clever in the way that it manages to combine an Oracle installation with an Operating System installation. The “magic” is in these few lines of code:

echo "#!/bin/bash" > /home/oracle/installoracle.sh
echo "/osource/database/runInstaller -waitforcompletion -ignoreSysPrereqs -ignorePrereq -responseFile /osource/standalonedb.rsp" >> /home/oracle/installoracle.sh
chmod 775 /home/oracle/installoracle.sh

su oracle -c "/home/oracle/installoracle.sh"

That’s to say, it creates a little shell script that, when called, runs Oracle’s runInstaller with a bunch of switches. And then it calls it. Not exactly difficult.

Except that it doesn’t work for 12c.

It starts well enough, but then just stops working, for no apparent reason:

Starting Oracle Universal Installer...
Checking Temp space: must be greater than 500 MB.   Actual 13296 MB    Passed
Checking swap space: must be greater than 150 MB.   Actual 4095 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2013-07-07_08-32-58AM. Please wait ...
[[email protected] ~]$

…and that’s the last that’s ever heard from it, for it seemingly just dies shortly afterwards.

If you let the O/S installation finish and then execute exactly the same shell script, though, it works perfectly. So it’s clearly not a syntactical thing: the same commands work post-O/S installation but fail during it. My best guess is that it’s a runlevel thing. Database Configuration Assistant (dbca) has long complained about needing to be in a certain runlevel before it can work; now it seems that the OUI feels the same way, though 11g’s OUI never did.

Anyway, as a result of this change in behaviour by Oracle’s software, I’ve had to rejig Salisbury quite a bit so that it doesn’t try launching the Oracle installation during the O/S install. Instead, it merely creates a set of scripts, which are then executed on first reboot. The O/S installation phase takes a lot less time than before, of course; the time taken to complete the first reboot commensurately shoots through the roof! But at least it all works, for both 11g and 12c.
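
The general idea of “write the scripts now, run them once at first boot” can be sketched with a marker file, as below. I should stress that this is a generic illustration of the technique and an assumption on my part: the function name run_once, the marker-file approach and the example paths are all hypothetical, and this is not necessarily how Salisbury itself arranges things.

```shell
# Hypothetical sketch: run a command only if it has not succeeded before.
# The first argument is a marker file; the rest is the command to run.
run_once() {
    local marker="$1"
    shift
    if [ -e "$marker" ]; then
        return 0                # already done on an earlier boot: skip
    fi
    # Run the command; create the marker only if it succeeded, so that
    # a failed attempt is retried on the next boot.
    "$@" && touch "$marker"
}

# Example of how a boot-time script might call it (paths are made up):
#   run_once /var/lib/firstboot.done su oracle -c /home/oracle/installoracle.sh
```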

So now, as a result of this rejigging, you can press ESC during the first reboot and see this sort of thing:

You will still need to manually invoke the createdb.sh shell script (as the oracle user) to have a single-instance database created post-install, however.

So, it’s all working as I’d expected, but I now have to test on all the other distros, make sure I haven’t accidentally broken anything… and that it also still works for creating RAC+Data Guard setups. I’ll have the 12c-enabled version of Salisbury uploaded just as soon as all that testing is completed. Watch this space…

No more disk re-initialization

It has long bugged me that my Kickstart scripts will quite happily build an entire virtual machine without you having to lift a finger …but not if it’s a virtual machine that’s using brand new virtual hard disks. If you’re installing onto virgin hard disks, you’ll likely get prompted with something like this:

Today, it annoyed me enough that I actually decided to do something about it. The fix turns out to be a simple one-word addition to your Kickstart script: zerombr. Stick that above the ‘clearpart’ line that begins the partitioning of your hard drive, and it will have the effect of auto-initializing any drive it needs to.

In Salisbury Kickstart files, for example, you’ll currently find this code:

clearpart --all
part / --fstype=ext4 --size 20000 --grow
part swap --size 1024

…which means “clear all partitions, then create a root partition of at least 20GB, and a swap partition of 1GB”. This works fine unless there are no readable partitions to clear (such as when your disk has never previously been used). So the new code will read:

zerombr
clearpart --all
part / --fstype=ext4 --size 20000 --grow
part swap --size 1024

…and that means your Salisbury servers can now be built truly and completely without manual intervention, after the first bootstrap line has been typed.

The code change hasn’t made its way into the Salisbury ISO as yet: there are a couple of other changes I want to wrap up first. But it will be there soon.