Dizwell Informatics

News from Nowhere

The Dizwell Blog

Virtualization with KVM

No matter how much I’d rather not run Windows, there are times I have to -principally because work insists on using Checkpoint’s VPN software for which no Linux client exists. So, when I want to work from home, I have to connect to the office in a Windows 7 VM and use tools like Putty or NX Client to manage the various work PCs and servers (all of which are now, ironically enough, Linux boxes). It’s a pain, and if anyone knows how to use openssl or openvpn to connect to a Checkpoint VPN1 SecuRemote VPN, I’d love to be let in on the secret!

Anyway, a Windows VM is essential -and for years I’ve been using VMware Workstation to run one. I paid my US$189 several years ago (interesting to see that price hasn’t budged a cent since!), and I’ve always found it just a fraction more intuitive and well-behaved than, say, Parallels or VirtualBox. VirtualBox has the distinct advantage of being free, of course -and is now owned by Oracle, which seems to be continuing development efforts quite nicely. But the fact remains, I’ve never really warmed to it: I’m just a VMware Workstation fanboy, I guess! (I stress the Workstation in that product name, however: I’ve never liked the zero-cost VMware Server product, since it seems to require klunky web-based interfaces to achieve anything much.) On the other hand, I got VMware’s ESXi bare metal virtualisation installed at work and it’s never missed a beat, running all of our Oracle dev and test environments extremely well. (Though I will point out the irony that ESXi lacks a native Linux client and I am therefore forced to use a VMware Workstation VM running Windows 7 on my Linux-running work PC just so I can manage the ESXi box, which is running a Linux kernel! Go figure!!)

Anyway, I have dabbled in various virtualization technologies in my time, both hypervisors and host-based ones. Citrix Xen Server, for example, was a good hypervisor, but a little inflexible to manage as compared to VMware’s similar ESXi offering. Microsoft’s Hyper-V was certainly slick, but I had terrible performance issues in the presence of an Nvidia graphics card -and I wasn’t the only one. See, for example, this page of complaints. It’s been a year since I ran any Windows OS natively, either at home or at work, so I’ve not tried Hyper-V since -but according to this Wikipedia article -see the Graphics issues on the host paragraph-, the graphics problems persist (but who trusts Wikipedia?!). Funnily enough, using the Xen virtualization features in Red Hat Enterprise Linux 5.5 is very similar to using Hyper-V: both installations slot ‘underneath’ your physical host’s OS install, turning it, effectively, into a virtualized guest (albeit a “parent” one). The moment Xen goes in, for example, a uname -a command in a terminal will reveal that you’re no longer running a standard Linux kernel, but a special “xenified” one (which poses all sorts of problems when you are running proprietary graphics drivers which expect only ever to have to compile against ‘standard’ kernels, for example).

But there’s been one virtualization technology I’ve not used before now: KVM (stands for ‘kernel-based virtual machine’, not ‘keyboard, video, mouse’ as in a KVM switch!). As its name suggests, it’s built into the Linux kernel -and has thus been shipping as a standard part of Red Hat Enterprise Linux since 5.4 days (around about this time last year, basically). Fedora 13, too, includes KVM ‘out of the box’ (as do a lot of other distros, including Ubuntu). It’s not installed or enabled by default, but it’s right there, in the repositories, just waiting for a simple one-line installation command. What’s more, when you do install those KVM packages, unlike when you install Xen, you don’t end up altering the host OS’s status: uname -a still outputs exactly the same as it always did, in other words. This is simply because (the clue is in the name!) the hypervisor is already built into your existing kernel, so you don’t need a special kernel to make use of it. Not disturbing the host’s kernel in this way makes installing things like Nvidia graphics cards (see posts passim!) not a drama, and is thus a Very Good Thing™.

Installing KVM on Fedora 13 is simple:

su - root
yum install qemu-kvm virt-manager virt-viewer python-virtinst
libvirtd
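One thing worth checking if guests refuse to start or crawl along: KVM depends on the CPU's hardware virtualization extensions (Intel VT-x or AMD-V), which show up as the vmx or svm flags in /proc/cpuinfo. The following is only a minimal sketch of that check; the function name has_virt_flags and its optional file argument are invented here for illustration and are not part of any Fedora tool.

```shell
# Succeeds if the given cpuinfo file lists the vmx (Intel VT-x) or svm (AMD-V) flag.
# The file argument defaults to /proc/cpuinfo; it exists only so the check can be
# pointed at a sample file.
has_virt_flags() {
    cpuinfo_file="${1:-/proc/cpuinfo}"
    grep -E -w -q 'vmx|svm' "$cpuinfo_file" 2>/dev/null
}

if has_virt_flags; then
    echo "CPU advertises hardware virtualization"
else
    echo "no vmx/svm flag found (it may be switched off in the BIOS)"
fi
```

If the flag is missing, the usual suspect is a BIOS setting rather than anything in the packages just installed.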

Once the libvirtd daemon is running, you can fire up Applications→System Tools→Virtual Machine Manager. Click the ‘new virtual machine’ icon in the top-left and then, basically, follow the prompts of the ensuing wizard to build your first virtual machine. And that’s about it! It’s really incredibly simple.

The only tricky bit comes if you want your new VM to look like an independent host on your network. That requires “bridged” networking, which doesn’t exist until you manually create it (it would be nice if someone was to develop a graphical tool for achieving this!) Worse, bridged connections don’t work with the fancy new ‘network manager’ way of doing networking that Fedora (and Ubuntu, actually) has adopted. So, if you want bridged connections for your VMs on those distros, here’s what you have to do:

As root, issue the command

system-config-network

Find the eth0 item and click the Edit button. Switch ‘Controlled by NetworkManager’ off, and switch both ‘Activate device when computer starts’ and ‘Allow all users to enable and disable the device’ on. Click OK and then File→Save to preserve the changes.

Now you’ve just disabled the new-fangled Network Manager, so you have to make sure the old-fashioned network control starts at each reboot:

chkconfig network on

You now create a new bridge network interface by issuing the command:

gedit /etc/sysconfig/network-scripts/ifcfg-br0

Add the following lines to the new text file thus created:

DEVICE=br0
TYPE=Bridge
BOOTPROTO=dhcp
ONBOOT=yes
DELAY=0

The typing here has to be precise -it’s very case-sensitive, for example, so ‘bridge’ as a “type” entry won’t work, where ‘Bridge’ will!

You now tell the eth0 interface that it is to be bridged. Do that by issuing the command:

gedit /etc/sysconfig/network-scripts/ifcfg-eth0

Add the following line to the file’s existing contents:

BRIDGE=br0
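For orientation, here is roughly what the edited ifcfg-eth0 might end up looking like. This is an illustrative sketch only: the MAC address is a placeholder, NM_CONTROLLED=no is what I would expect the earlier system-config-network step to have written, and your own file will almost certainly carry other lines. The address settings (BOOTPROTO=dhcp) live in ifcfg-br0 in this arrangement.

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth0 (illustrative; your file will differ)
DEVICE=eth0
HWADDR=00:11:22:33:44:55   # placeholder MAC address: keep the one already in your file
ONBOOT=yes
NM_CONTROLLED=no           # assumed result of switching NetworkManager control off
BRIDGE=br0                 # the one line added in this step
```

The value of BRIDGE has to match the DEVICE name used in ifcfg-br0 exactly.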

Now you can re-start the network so the new configuration is activated:

service network restart

Note that your physical PC now connects to the rest of the world via the br0 interface, which happens to know (thanks to the edits above) that the physical eth0 is responsible for handling its traffic. But, as far as your physical PC is concerned, eth0 is actually a non-active interface in its own right. Br0 takes over that role, though functionally it all amounts to the same thing.

Finally, the trouble with this setup is that br0 is a physical network interface, seen and used by your physical PC. But that’s not much use to a virtual guest machine! So now we have to add a virtual interface to our physical interface -and that’s a job for a utility called tunctl. That utility probably needs to be installed to start with, so the relevant command is:

yum install tunctl

Next, issue these commands in sequence:

tunctl -t tap0
brctl addif br0 tap0

The first command creates a virtual interface called “tap0”; the second attaches it to the ‘br0’ bridge, so that anything plugged into tap0 shares br0’s connection to the physical network.

Once all that’s done, you can go back to virtual machines you’ve already created and add new network hardware -this time, a bridged interface will be available to you. You can remove the previous NAT one, if you like (or simply disable it within the guest OS). New guests can be created, obviously, that use the right sort of ‘let me at the world!’ interface from the get-go.

One final bit of advice as far as KVM experiments are concerned: having to start libvirtd manually before you begin is a bit of a pain. If you want to ensure libvirtd is started automatically whenever your PC reboots (and thus avoid the need to run it manually in a terminal session), just go to System→Administration→Services and click the libvirtd item, then the [Enable] button. Once it has a green check mark next to it, it’s scheduled to auto-start.

Apart from the bridged network issue, however, KVM is an absolute doddle to install, configure and run. Performance in the Windows 7 virtual machine I use is excellent -the only drawback is that the virtualized graphics hardware isn’t up to displaying the fancy, semi-transparent Aero interface. But that’s not much of a problem for me. I miss only two other things from my VMware Workstation days: movie capture and snapshots. KVM provides a menu option to take a still screen capture of your guest, which is fine. But it doesn’t have the option to capture screen motion/activity as a movie (this is something the freebie VMware Server product also lacks). There are workarounds, of course (yum install recordmydesktop puts a movie-capturing application at your disposal which will more-or-less do the job), but it would be nice to have the functionality built-in.

The lack of snapshots is a bit more of a drama, to be honest. There are snapshot capabilities that can (probably!) be used, thanks to the use of the qcow2 virtual hard disk format -and you’re supposed to be able to drop into a terminal and issue a qemu-img command that will do the necessary. But I haven’t tried it; I believe it only works for a VM that’s been shut down… and in any case, it all sounds a bit tricky at this stage. I’m really more after a ‘take snapshot’ button in the Virtual Machine Manager window, to be honest! Meanwhile, there is a simple button to do VM cloning (though, again, the VM has to be shut down for the duration), which will do me well enough in the meantime. But this is certainly an area of VM management that it would be nice to see some development on in the next year or two!
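For the record, the qemu-img route would look something like the sketch below. It is untested here, so treat it as a starting point: the image path and snapshot name are hypothetical, and it assumes the guest's disk really is in qcow2 format and that the guest is shut down.

```shell
# Hypothetical image path and snapshot name: substitute your own.
# Shut the guest down first; qemu-img should not modify a disk that is in use.
IMG="${IMG:-/var/lib/libvirt/images/win7.qcow2}"
SNAP="${SNAP:-clean-install}"

if command -v qemu-img >/dev/null 2>&1 && [ -f "$IMG" ]; then
    qemu-img snapshot -c "$SNAP" "$IMG"    # create an internal snapshot
    qemu-img snapshot -l "$IMG"            # list the snapshots held in the image
    # qemu-img snapshot -a "$SNAP" "$IMG"  # roll the disk back to the snapshot
    # qemu-img snapshot -d "$SNAP" "$IMG"  # delete the snapshot
else
    echo "qemu-img or $IMG not found, nothing done"
fi
```

The apply and delete lines are commented out on purpose, since both change the disk image irreversibly.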

Other than those slight niggles (oh, there’s one more: no drag-and-drop between host and guest), I think KVM is an excellent virtualization platform, and my trusty copy of VMware Workstation has remained firmly on the bookshelf for this PC’s recent rebuild.

Quandary Resolved. For now.

It’s been a long-running saga, so let me first summarise:

  • Bored with Ubuntu and thinking that I’d prefer something more Red Hat-ish, because that’s what we now run our Oracle databases on at work
  • Therefore, install Fedora -and quite enjoy it
  • Except that CD ripping is found to be broken and Stellarium refuses to display properly using the open source drivers
  • Installing “proper” ATI drivers is impossible, because ATI don’t support the xorg version used by Fedora
  • Because K3B is such a good CD ripper (and burner), I investigate KDE-based distros, but can’t stand any of them for long. Discover, however, that living with a mix of KDE and Gnome apps isn’t actually a bad thing but rather gives you the best of both worlds.
  • CD ripping resolved, therefore, by running a Gnome distro but with some KDE apps installed (like K3B). Still leaves the Stellarium/ATI Graphics problem…
  • So I install OpenSuse 11.3 -and hate every moment of it! Stellarium works and the ATI drivers install, but the distro sucks in lots of little ways (in addition to the litany mentioned last time, I should add that discovering sshd is not enabled by default was a bit of a surprise!)
  • After two days with OpenSuse, I reverted to Ubuntu 10.04: Stellarium worked, K3B still does CD ripping. But I’m still bored with Ubuntu. Worse, I find some of its control changes now annoying: I want my Log Out option to be under my System menu, thanks all the same, not a little button tucked over the right-hand side of the top panel. Also, I know I can switch the windows close/minimise/maximise buttons back to the right-hand side of the window, but I don’t see why I should have to -and I know that the developers are cooking things up for 10.10 that will expect the controls to be on the left, where they put them without much consultation. All a bit Microsoft-ish if you ask me.

So, having been through just about every vaguely-plausible distro and desktop environment out there and received only disappointment for his pains, what’s a boy to do??

Buy an Nvidia graphics card is the answer!

Actually, I happened to have one sitting around in a cupboard, so I whipped out the ATI monster and slipped it into the PCI Express slot in its stead. It sounds a tad drastic but it means I’ve been able to re-install Fedora, have no graphics problems, Stellarium works, K3B works, menus are where I expect them to be, ditto windows controls …and it all behaves very like a Red Hat Enterprise distro, so I feel at home at work, if you get my drift.

Everything is thus tickety-boo …if you overlook the minor matter of having to trash more than $300-worth of ATI graphics card to get there. I have said it before, but it bears mentioning once again: ATI (i.e., AMD) should be ashamed of themselves. The quality of their Linux drivers, the convoluted installation process and the tendency of any system fitted with them to crash or otherwise have “interesting” graphical glitches happen at random moments -it all adds up to an abysmal way of doing business. I won’t say that Nvidia are completely blameless (their installation procedure isn’t exactly brilliant -on Fedora, at least, you have to manually disable the open source drivers by editing the grub.conf file before the installation will succeed, which isn’t what I’d call terribly user-friendly), but they make ATI look like a bunch of incompetent amateurs by comparison.

Funnily enough, my PC at work -which I try to keep more-or-less in synch, distro-wise, with my home PC- experienced exactly the same grief with ATI drivers, even though it was running a copy of Centos 5.5 (which is a clone of Red Hat Enterprise Linux, which ATI claims to be a fully-supported distro). I gave in there, too, and did actually go out and buy a $50 Nvidia Geforce 8400GT… which also immediately made all my graphical and stability problems melt away. So it seems to me to be a generic “feature” of ATI cards that they screw up most Linux distros!

I never had that problem with Ubuntu with the ATI card, I will admit -but then I never tried to install my own ATI drivers in that distro, either. Clicking the ‘activate proprietary drivers’ button is all it takes in Ubuntu (and is all it really ought to take anywhere else, Nvidia and ATI included), but I have no idea which drivers it actually causes to be installed. Had such an ‘automated installation’ feature been available in Fedora, I guess none of this saga would have arisen -but ATI haven’t exactly come to the party in terms of supporting Fedora nearly six months after its release, so I still say it’s more ATI’s fault than anyone else’s.

Anyway, I’m a happy Fedora man again -and discovering the joys of KVM virtualisation for the first time (very impressive, is the short version). And if anyone wants a $300 ATI graphics card, feel free to ask.

Let me count the ways

Well, that didn’t last long!

OpenSuse 11.3, I mean. It’s quite possibly the nastiest distro I’ve used in a very, very long time. Let me count the ways!

  • The default ‘start’ menu is horrible. Novell in its wisdom decided that the standard Gnome Apps/Places/System menu is not good enough for their distro and thus replaced it with something that more resembles the giant thing you get in Vista/Windows 7 when you click the ‘Start Orb’. It’s also at the bottom of the screen, not the top. Clearly, a lot of design thought has gone into this change to ‘standard’ Gnome layouts -but I hate it. Happily, by adding a new panel here, adding ‘Main Menu’ items to those panels there, and generally buggering about for long enough, you can get things back to the way Gnome usually is -but it’s effort that shouldn’t be required.
  • Assuming you’ve added back the traditional “Applications/Places/System” Gnome menu, you may think you’re on the home straight. But alas, the menu structure revealed ‘underneath’ those three menu headings is completely non-standard and utterly bizarre. When you install the VLC media player, for example, in every other distro I’ve seen, it gets added as an item under ‘Audio/Video’ or ‘Multimedia’ off the main Applications menu. Not in OpenSuse, however. There, it gets added as an item under another menu, so you end up having to click Applications → Multimedia → Video Player → VLC. Similarly, Handbrake doesn’t appear as its own item, but gets rolled onto a new ‘Media Editing’ submenu. I hate extra mouse clicks for no reason, and that’s two of them too many! I won’t even get into the business of why one menu sports a noun (“Video Player”) and one a present participle (“Video Editing” …why not “video editor”?). The same sort of thing happens under the Games menu: we get “Board Games” and “Card Games”, which is all well and good… but then an item called “Puzzle”. Not even plural puzzles, note. Let alone “Puzzle Games”. Trivia, I suppose, but annoying all the same: a bit of grammatical consistency wouldn’t go amiss.
  • How many different ways are there to skin a cat? OpenSuse lets you install software at the command line with Zypper. Then there’s System → System → Install/Remove Software (and I just love the double-up on ‘System’ in the menu structure at this point!) But there’s also System → System → Yast → Software → Software Management. And, just in case you didn’t think that was enough, there’s Applications → System → Yast → Software → Software Management, too. How many menus called “System” do you need in, er, a system, anyway? (It makes writing directions/guides a pain in the neck, if you really wanted to know). And how many menu items pointing to Yast is overkill? Whatever the answer to that, OpenSuse has too many. One more example, then: to update your system, you could do System → System → Software Update. Or you can do Applications → System → Configuration → Software Update. Exactly the same option in two completely different places! OpenSuse basically renders the System menu completely pointless, in fact.
  • Chromium is broken. I don’t know if this is an OpenSuse thing or a Google thing: I’ve seen reports of it mentioning Ubuntu, for example. But it was all working just fine for me in Fedora. The problem is the Sync tool that allows you to have one set of bookmarks, themes, extensions, preferences and autofill details shared amongst all the desktops on all the PCs you happen to have installed Chrome onto. It’s a great feature -and it’s broken in OpenSuse. The thing authenticates well enough. Then it asks you which bits of data you want to sync. And then it sits there, rotating its hourglass-equivalent thing for ever and ever. It’s bug 51829, if you’re interested.
  • ATI graphics drivers work. Eventually. Sort of. One of my major issues with Fedora is that there are no official ATI graphics drivers available for it, because Fedora uses a very recent xorg version (as I mentioned last time). The good news is that ATI drivers are available for OpenSuse. The bad news is that their installation procedure is Byzantine, prone to failure (resulting in no X session at all, but unceremonious dumping at a command line), and liable to break at the drop of a hat. This morning, for example, I booted a VMware virtual machine that had virtual accelerated graphics and got a warning saying the drivers had crashed and would therefore be disabled for the duration. It was only a virtual machine affected, and it’s probably ATI’s fault, not OpenSuse’s, but it’s the sort of thing that leaves a nasty taste in the mouth. Or, again, take the fact that as I’m writing this post, my cursor has simply disappeared. Only to re-appear at a time and place of its choosing. Graphical weirdness like that I can do without, frankly. When the drivers are installed, however, I will admit: Stellarium displays and functions flawlessly, which is more than can be said of what’s possible on Fedora.
  • It all looks a bit weird. Yup, I agree that one’s a bit vague… but it’s the best I can do! The whole thing looks a bit ‘spidery’ for my tastes: the menu fonts are a bit thin and weedy, for example. In fairness, it could be said that the fonts were ‘precise’ and ‘sharp’… but they just look a bit thin and weedy to me!

Well, I could go on, but I don’t think I need to. It’s not that OpenSuse is a bad distro, you understand. Just that it’s peculiarly different in lots of niggly little ways from ‘standard’ distros -and I can’t see any real justification for the departures decided upon by the developers. Aside from the fiddly, niggly differences, there are quite a lot of just plain badly thought-out things (like the bazillion different ways to launch the same program) that really annoy me. I can tell I’m never really going to feel entirely at home with it, to be frank… so two days after installing it, it has to go.

Which leaves me in a bit of a quandary, I guess. With Fedora I can have sensible, default Gnome with a Stellarium that won’t work at all and a CD ripper that’s fundamentally broken. With Ubuntu, I can have everything apparently working, but in a “kiddies distro” kind of way. Or I can endure the peculiarities of OpenSuse and have an adult system with a broken menu structure, no Chrome synchronisation but a functional CD ripper and Stellarium. As they say, Linux certainly gives you lots of choices!

The Green One

I wrote some time ago now about how it’s impossible to rip CDs properly using the application the Gnome desktop is supplied with by default to perform precisely that task. As a result, I spent a couple of weeks wandering around the byways and alleyways of the KDE desktop -and, generally, I was a little bit impressed and quite a bit put off! Most of the put-off came from the fact that every distro seems to do KDE slightly differently, and I can’t stand having to make choices like that!

However, as I write this, I am still using a trusty, Gnome-based Fedora 13 desktop, albeit one with a few KDE apps strewn about it: I do all my DVD burning using K3B, for example, and not the Gnome default application of Brasero. In the end, Gnome is the environment I like best (not sure that will always be true… there are some pretty dramatic developments in the Gnome desktop in the pipeline), though I am grateful to have the CPU and RAM grunt necessary to run particular KDE apps when occasion arises. I think we call this ‘eclecticism’: pick and mix the best bits from whatever takes your fancy and don’t get hung up being especially ‘purist’ about which desktop you use. So, in the end, it wasn’t necessary to switch distros or desktops to achieve functional CD ripping: stick with what you know and add in functionality as needed. Suits me just fine.

However, I am now officially bored with Fedora. It’s not taken too long to get there: I blogged about switching to Fedora (from Ubuntu) only as far back as July 11th last… so about six weeks, then! As well as the lack of Gnome-based CD ripping, the other thing that’s really annoyed me about Fedora this time round is its inability to run Stellarium properly:

You’ll note from that screen capture that the bottom controls consist of nothing but greyed-out parallelograms. The red triangle in the bushes is supposed to be the letter ‘S’, to indicate we’re looking south. Lord alone knows what the white triangle next to the bright star is supposed to be -a star or constellation label, I think.

That’s how all text appears, anyway.

The net result, as you can see, is that it’s basically unusable and has been this way since day 1 of my Fedora experiment. I believe it has something to do with another “issue” that’s been bugging me with Fedora all that time, too: the lack of proper graphic drivers. The Fedora team like to be ‘cutting edge’ in most things and, when it came to Fedora 13, they decided to use bang up-to-date Xorg libraries. That’s version 1.8 rather than, say, Ubuntu’s version 1.7. The problem is, I have an ATI graphics card, and ATI only compile their drivers for use with version 1.7 Xorg libraries. That’s ATI’s fault, of course, and no reflection on the Fedora folk -but it doesn’t really matter whose fault it is, functionally. Either way, I end up not being able to install ATI drivers and instead have to rely on opensource equivalents (which wobble the windows well enough -but, clearly, bomb out when it comes to rendering the heavens properly in Stellarium!)

I’ve put up with this state of affairs for six weeks because Fedora has otherwise been perfectly good enough. And it’s mid-Winter here, so it’s been bum-numbingly cold to be out at night star-gazing! The need for Stellarium has not, therefore, been particularly great. As we approach September and its equinox, however, we move into Spring -and that’s when the need for Stellarium to be properly functional starts to increase.

Long story short, therefore: I’ve decided to give OpenSuse 11.3 a run. I’ll keep you posted on how it turns out. Whilst it’s yet another OS change, I would like to just mention in passing that if I can make it to October, I’ll have been running a purely Linux desktop for a year, with no recourse to a sly Windows re-installation or two in the meantime. True of work and home, too, which is even more surprising (since work is, aside from my PC and the database servers, an entirely Windows-running shop). Anyway: watch this space.

Enkidu

Rachel has been her usual busy self! Her latest joey is this little one:

We’re calling “it” (who knows if it’s male or female at this stage?!) Enkidu, given that we ran out of US presidents and sitcom characters quite a while ago. No doubt Gilgamesh will be along soon enough!

Sound Juicer Not So Juicy

There is a nasty bug in Sound Juicer, the Audio CD ripping application that ships with Fedora 13 (and every other Gnome distribution on the planet, I expect).

First you insert a CD -and if you like listening to classical music, it is quite likely you will see a yellow panel declaring that ‘Could not find Unknown Title by Unknown Artist on MusicBrainz’. Fair enough: I usually end up supplying my own titles, artists and track names anyway, because other people submitting to these CD databases often have a peculiar idea of how to do it properly.

But anyway: you finish ripping that CD; you insert another, and this time you get this:

Since Google is my friend, I’m fairly confident I am a victim of bug 544843, which was described back in April 2010.

I have to admit, of course, that I’ve not paid a cent for Sound Juicer and I’m not on a support contract for it either… but I still can’t help feeling disappointed that an application whose sole purpose in life is to rip an audio CD can’t do it, apparently because of some weird interaction with a music track lookup service (which is very much its secondary role in life).

You’ll note from that bug report that, apart from being able to describe what happens accurately enough, no-one has seen fit to explain what exactly is causing the problem, what the workarounds are and when a fix is to be expected.

So basically, the hot news is: Gnome desktops can’t currently rip CDs with the tool provided for the job.

I don’t really have a practical workaround, other than not using Sound Juicer. Instead, I installed asunder (a simple yum install asunder works for Fedora 13). That seems a bit archaic, but does the job. Actually, it’s not necessary for it even to do its job: if you run it, get it to find the track listing for the same CD that caused the problem for Sound Juicer, then shut it down without having ripped anything, and then launch Sound Juicer again -well, this time Sound Juicer will display the correct number of tracks and let you rip them. It will still moan about not knowing what those tracks are, because MusicBrainz is, apparently, so useless. But at least the ripping functionality will be there… until the next CD.

So the routine becomes: insert CD. Run Asunder. Close Asunder. Run Sound Juicer. Manually edit tracks and artist details. Rip. Repeat as necessary.

Which is, of course, utterly bonkers and the reason why I’m now planning on ditching Fedora. Or, at least, since it’s not particularly Fedora at fault, of course, but Gnome, I might just be tempted to plunge headlong into the choppy waters that are KDE -simply because Sound Juicer is Gnome’s default audio ripping application, and so KDE should be free of its curse.

I’ll keep you posted, anyway…

Update 27/1/17: Seven years after I first wrote about this, the bug linked to says “closed won’t fix”… but that’s because it’s a bug raised against Fedora 14 which is now long out of support. As far as I can tell, Sound Juicer itself works fine in Fedora 25, the latest Gnome distro I’ve worked with.

PostgreSQL versus Oracle

It’s been an interesting few days, comparing the performance of a production Oracle database performing full-text search on about 10 million records with a rough-and-ready PostgreSQL prototype (with the aim of working out whether the business can save itself a couple of hundred thousand dollars in licensing fees, of course!)

Both systems ran on separate servers with identical hardware: 24GB of RAM; a pair of mirrored 1TB hard drives for the OS and software installation; a pair of mirrored 256GB solid state hard disks for the database storage itself; a pair of quad-core, hyperthreaded Xeon CPUs; 64-bit Centos 5.5 (yup, we run production Oracle on Centos… gasp!) The data structures were near-identical in each case: two tables, each containing about half the records, queried individually with the results then “union all’d” into a single set of matching records. One column in each table contained all the comma-separated keywords describing the documents’ contents. That column was indexed with Oracle Text in one case, or a gin/tsvector index in the case of PostgreSQL. The data used was almost identical in each case, too: 9.4 million rows selected from a separate, third database and copied into the two sets of local search tables. Apart from a couple of thousand records that crept into the Oracle database that weren’t there at the time the PostgreSQL one was populated, there was no real difference between the two data sets. The Oracle version in use was 11.2; the PostgreSQL version in use was 8.4.4.

Each search performed on the production Oracle database is always captured and timed for monitoring and management purposes. I mined the record of those captures for a set of about 27,000 distinct search ‘phrases’ which I could run against the PostgreSQL database. In this way, production Oracle timings could be compared to equivalent PostgreSQL benchmark times.

If you add up the total number of seconds it took the Oracle database to respond to all 5000 searches (and to return all the results), you get a grand total of 3,448 seconds. On PostgreSQL (identical hardware, near-identical data), it took 2,207 seconds. Overall, therefore, PostgreSQL took about 36% less time than Oracle on full text searches. Not bad for free!
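The 36% figure is the reduction in total elapsed time. Expressed the other way round, as a throughput ratio, the same two totals give a slightly different-looking number, so here is the arithmetic spelled out. The function name compare_totals is made up for this sketch.

```shell
# compare_totals ORACLE_SECS POSTGRES_SECS
# Prints the reduction in elapsed time and the equivalent throughput ratio.
compare_totals() {
    awk -v o="$1" -v p="$2" 'BEGIN {
        printf "%.1f%% less time, %.2fx throughput\n", (o - p) / o * 100, o / p
    }'
}

compare_totals 3448 2207
```

With the totals above this prints `36.0% less time, 1.56x throughput`: the same result, seen from two angles.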

But I wondered how each database handled searches that resulted in many matches. In other words, was PostgreSQL consistently faster than Oracle, or did it handle high-match searches better or worse than Oracle? Well, here’s the graph (click on it to make it more legible):

As you can see, as the searches return more and more records, both Oracle’s and PostgreSQL’s response times start to slow down (not, perhaps, surprisingly)… but PostgreSQL slows down a lot earlier than Oracle and a lot more dramatically. For a search term that matches only 15,000 records (which happens to be our default match, hence that vertical red line), PostgreSQL is faster than Oracle by quite a margin (about 60%). But for a search that matches 200,000 records, PostgreSQL is about twice as slow as Oracle; and by the time you’re matching more than 500,000 records, PostgreSQL is more than four times slower than Oracle. Given that we very rarely perform searches that match that many records (on the grounds that that’s not so much a search as a trawl!), here are the same figures, but zoomed in a bit to the left-hand side of the graph:

I think the speed advantage PostgreSQL has over Oracle for a “reasonable” number of matches is much clearer here. But I couldn’t help but notice that the orange line (PostgreSQL) was a lot ‘wavier’ than the blue (Oracle): that basically means that whilst Oracle is slower than PostgreSQL, it’s more consistent in its response times. Across this range of matches, PostgreSQL is always faster than Oracle, but it behaves in a rather less predictable fashion, especially when the number of records being returned is more than about 25,000. I suspect this is a consequence of Oracle managing its own buffer cache, whilst PostgreSQL relies much more heavily on the file system cache -and Lord knows what the file system might want to do to that at any given time!

Finally, I wondered how each database coped with search strings of different lengths. Some people might just want documents about “negligence”, for example; others might ask for “negligence, banking, insurance, fraud, liability”… would the length of the search term submitted affect the performance of search at all?

I find this graph particularly fascinating. Oracle (blue) has no problem coping with more and more words in the search phrase: search response times come down almost linearly as the search terms proliferate, probably because each new search term makes the search more precise -and therefore the set of results gets smaller and smaller. But PostgreSQL (orange) behaves in almost completely the opposite way: the more search terms you supply, the slower full text search gets (despite the size of the results set getting ever-smaller). It would seem that Oracle has an optimisation here that PostgreSQL lacks. I can only guess that Oracle constructs a bitmap of each search term, direct from the Oracle Text index, and then does a trivial bit of bit-wise ANDing -it now knows which records are true for all search terms and then goes to fetch them. Similarly, I can only guess that PostgreSQL has to scan for the results of each search term, and only at the end performs row-elimination to arrive at the set of rows which are true for all search terms… and that it’s this multiple scanning of the large amount of data which causes the orange line to slope upwards as the number of search terms increases.
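To make those two guesses concrete, here is a toy Python sketch. It is emphatically not how either database is actually implemented: the tiny `index` dictionary, the row numbers and both function names are invented purely for illustration. The first function builds one bitmap per term and combines them with a bit-wise AND before fetching anything; the second fetches the rows for every term and only eliminates the non-matching ones at the end.

```python
# Toy inverted index: keyword -> set of row ids (all data is made up).
index = {
    "negligence": {1, 2, 3, 5, 8},
    "banking": {2, 3, 8, 9},
    "fraud": {3, 8, 10},
}

def bitmap_search(terms):
    """Guess 1: AND together one bitmap per term, then fetch the survivors."""
    result = None
    for term in terms:
        bitmap = 0
        for row in index.get(term, ()):
            bitmap |= 1 << row                  # set the bit for each matching row
        result = bitmap if result is None else result & bitmap
    if result is None:
        return [], 0
    rows = [row for row in range(result.bit_length()) if result >> row & 1]
    return rows, len(rows)                      # rows fetched == rows returned

def scan_then_filter(terms):
    """Guess 2: fetch rows for every term, eliminate non-matches at the end."""
    fetched = []
    for term in terms:
        fetched.extend(index.get(term, ()))     # work grows with every extra term
    wanted = len(terms)
    rows = sorted(r for r in set(fetched) if fetched.count(r) == wanted)
    return rows, len(fetched)                   # rows fetched >= rows returned

terms = ["negligence", "banking", "fraud"]
print(bitmap_search(terms))
print(scan_then_filter(terms))
```

Both functions return the same matching rows, plus a count of how many rows they had to touch to get there. In the first, that count shrinks as terms are added; in the second it can only grow, which is the shape of the orange line.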

So, all in all, it’s an interesting result. Do you get anything worth $10,000 when you buy your (dual-CPU SE1) Oracle license? Sure you do: you get a full text search that can handle huge numbers of matches and large numbers of search terms with aplomb. PostgreSQL in contrast struggles with both these extremes. But, on the other hand, for absolutely zero dollars, you get a screamingly-fast full text search capability with PostgreSQL -one that can perform excellently provided the ‘extremes’ aren’t met. Considering that any search containing more than a half-dozen search terms is more like an essay than a realistic search; and considering that returning half a million matches is more a data dump than a sensible search facility, I’d have to conclude that PostgreSQL is more than capable of being the basis of a very cheap, very viable, very capable search engine for quite large data sets.

I have to say, I like PostgreSQL enormously, despite my very-much-Oracle background. It’s highly standards compliant, which I like from a theoretical perspective. It’s nicely hands-on -you can fiddle, and tweak, and delve into it, in a way that Oracle has deliberately tried to move away from since about version 8.0. It has all the functionality most Oracle users could ever ask for, plus some. And it performs really very well indeed. Impressive, actually.

Gentium

A completely free font called Gentium (and its denser cousin, Gentium Book Basic) can be obtained from this website.

Just download the “Gentium Basic/Gentium Book Basic” link, all 848KB of it, and then unpack the resulting zip file. In Fedora, you can then view each font in turn by right-clicking it and selecting the Open with Font Viewer menu option. A button to Install Font will then be visible: click it, and you’re done. Repeat for each additional font variant (bold, bold italic, italic). The uncompressed folder can be deleted once all eight font files have been installed in this way.

Note that the font(s) may already be available from your distro’s standard package manager.

Linux Mint, for example, definitely has ‘ttf-sil-gentium’ and ‘ttf-sil-gentium-basic’ as packages in the standard repositories (so, presumably, Ubuntu does too). Try there first, anyway, before seeking to install from the website previously linked to.

Update (27/1/2017): In Fedora 25, you can install the font and its various weights with the commands:

sudo dnf install sil-gentium-fonts
sudo dnf install -y sil-gentium-basic-book-fonts

(The ‘book’ fonts are slightly heavier/bolder versions of the basic fonts).

Heterogeneous Database Connectivity

I’m not sure I’ve ever put quite so many syllables in a Blog Post Title before?! But this is a little cri du coeur on the topic of getting Oracle databases talking to non-Oracle databases, such as PostgreSQL: because the technology is bloody difficult and doesn’t work properly!

There are two aspects to the problem, of course: getting Oracle talking to the ‘foreign’ database and/or getting the foreign database talking to Oracle. In the first case (Oracle-to-other), you can use the Oracle Gateways products to do the deed. A lot of vendor-specific ones are licensable, but a generic ODBC one is actually part of a standard 11g install and doesn’t, therefore, involve any additional software purchases or installations at all. In the second case (Other-to-Oracle), there will be all sorts of options -but the one I am specifically aware of in the PostgreSQL context is something called DBI-Link, which uses a bit of Perl, a sprinkling of YAML and a dash of ODBC to make the connection.

Neither approach is particularly difficult to set up, though I think it involves more typing than I’m entirely comfortable with!

In the case of Oracle’s ODBC Gateway, you first use UnixODBC on your Oracle box to create a Linux-to-other connection. This involves creating a System data source in /etc/odbc.ini, and specifying where the PostgreSQL ODBC drivers can be found by editing /etc/odbcinst.ini. Such connections are almost trivial to set up (and can be tested with the isql tool that ships with UnixODBC). That done, you then visit $ORACLE_HOME/hs/admin and create an init.ora with a name that matches the data source name you’ve just created (so, if you edited odbc.ini and created a data source called hjr, you’d now create an inithjr.ora file in the hs/admin directory). Then you move to $ORACLE_HOME/network/admin and edit your listener.ora, adding a SID_LIST entry which says ‘program=dg4odbc’. Finally, you edit your tnsnames.ora so that there’s an alias, marked ‘HS=OK’, which points to the local host, requesting a connection to a SID whose name matches the data source name previously created. Restart your listener to get the new settings picked up, and you should be able to fire up SQL*Plus, create a database link that uses the new tnsnames alias and be able to select * from some_table@my_new_link. The listener will receive the call, know to invoke the dg4odbc program, which will know how to read the /etc/odbc.ini stuff: shortly thereafter, your PostgreSQL data will be winging its way into the Oracle database.
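To give a feel for the amount of typing involved, here is a small Python sketch that simply prints the five files a setup like this touches. Treat it as a rough outline under stated assumptions rather than a tested recipe: the host name, database name, ORACLE_HOME and library paths are all placeholders I have invented, and only the data source name hjr comes from the description above. Note that in Oracle’s documented layout the PROGRAM = dg4odbc setting lives in listener.ora, while the (HS = OK) clause belongs to the tnsnames.ora alias.

```python
# Every value below is a placeholder chosen for illustration.
DSN = "hjr"                                   # data source name from the text
ORACLE_HOME = "/u01/app/oracle/product/11.2.0/dbhome_1"   # assumed path
ODBC_MANAGER_LIB = "/usr/lib64/libodbc.so"    # assumed UnixODBC library path
PG_DRIVER_LIB = "/usr/lib64/psqlodbc.so"      # assumed PostgreSQL driver path

# Tells UnixODBC where the PostgreSQL driver lives.
odbcinst_ini = f"""[PostgreSQL]
Description = PostgreSQL ODBC driver
Driver      = {PG_DRIVER_LIB}
"""

# The System data source itself (host and database are hypothetical).
odbc_ini = f"""[{DSN}]
Driver     = PostgreSQL
Servername = pghost.example.com
Port       = 5432
Database   = searchdb
"""

# Gateway init file: its name must be init<DSN>.ora.
init_ora = f"""HS_FDS_CONNECT_INFO = {DSN}
HS_FDS_SHAREABLE_NAME = {ODBC_MANAGER_LIB}
set ODBCINI = /etc/odbc.ini
"""

# Listener entry that hands calls for this SID to dg4odbc.
listener_ora = f"""SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = {DSN})
      (ORACLE_HOME = {ORACLE_HOME})
      (PROGRAM = dg4odbc)
    )
  )
"""

# Alias pointing at the local listener, flagged as a heterogeneous service.
tnsnames_ora = f"""{DSN} =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))
    (CONNECT_DATA = (SID = {DSN}))
    (HS = OK)
  )
"""

files = {
    "/etc/odbcinst.ini": odbcinst_ini,
    "/etc/odbc.ini": odbc_ini,
    f"{ORACLE_HOME}/hs/admin/init{DSN}.ora": init_ora,
    f"{ORACLE_HOME}/network/admin/listener.ora": listener_ora,
    f"{ORACLE_HOME}/network/admin/tnsnames.ora": tnsnames_ora,
}

for path, text in files.items():
    print(f"--- {path}\n{text}")
```

With files along those lines in place, the ODBC half can be checked on its own with isql before Oracle is involved at all, which at least tells you which half of the chain is broken.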

Except that it almost certainly won’t! I can get the above working when both the Oracle and PostgreSQL databases are 32-bit, almost with my eyes shut. But you make either one of them (let alone both!) a 64-bit database running on a 64-bit server and you’re stuffed. I spent six hours trying to get it working earlier this week, and nothing I did made the magic moment, where foreign data finally turns up in an Oracle context, happen.

I’m not even going to get into the pig’s breakfast which is PostgreSQL’s DBI-Link software. I’ll just say that, for starters, the software which you download from the official PostgreSQL ‘foundry page’ is 3 years out of date and the author has a more up-to-date version available which, if you’re lucky, you’ll find by burrowing about a bit through a mountain of mailing list contributions. Secondly, there are approximately four hundred and thirteen elements which may, or may not, affect your chances of success. LD_LIBRARY_PATH might need changing, or it may not; it might need to reference /lib64 before, or maybe after, /lib. Your setting of NLS_LANG may or may not be an issue. You might have to set TWO_TASK, but then again you may not have to -and if you do, don’t ask too closely what it’s supposed to be set to. Feel free, variously, to experiment with your path, your ORACLE_HOME, your environment variables… there’s a very, very long list of them, any one of which might or might not be contributing to the problem! Oh, and whether or not you auto-start your PostgreSQL database may make a difference, too! Each time you adjust one of these degrees of freedom, you can certainly try selecting data from your Oracle database, but it probably won’t work.

Frankly, I’ve never seen anything like it. It’s the technological equivalent of a soufflé: if it stays up, it’s a miracle, but chances are it will just collapse in a non-functioning heap and look a mess. After 11 hours of trying, I gave up.

I thought I might put together a short series showing the “fun” you can have with this stuff. We would start with a strictly 32-bit world, where everything more or less works as advertised. Then we’d try exactly the same thing in a 64-bit universe and see how hilarious it is when nothing at all works. But I’m so exhausted by my efforts earlier this week that I don’t think I can look a database in the eye again for quite a while! So, consider it an item on the to-do list!

Meanwhile, I did finally get 64-bit Oracle data into a 64-bit PostgreSQL database by the not-so-simple expedient of using three servers. One running Oracle, one running PostgreSQL and one running EnterpriseDB (which is PostgreSQL with “Oracle-compatible” knobs on). EnterpriseDB makes talking to Oracle a piece of cake -incredibly simple, actually, via JDBC as far as I can tell. Being at heart a PostgreSQL database itself, EnterpriseDB also is trivially easy to get talking to a ‘native PostgreSQL’ database, via PostgreSQL’s dblink software. The syntax is ghastly, but it is easy to set up. So, if you create views in EnterpriseDB on tables selected from the Oracle database, you can subsequently query those views in the native PostgreSQL database via the dblink software. It is, of course, slow as all buggery, but it is at least functional!

Personally, I don’t think Oracle-to-Other should be that hard -and in 32-bit land, it isn’t. But in 64-bit land, it’s a mess, and it oughtn’t to be. So there!

Unmetered

Living where I live (i.e., slightly less than 100km from Sydney city centre), we cannot get “proper” broadband. No cable company flogs its wares in our neighbourhood; no local telephone exchange is equipped for anything more than tin-can-and-string telephony, so ADSL is right off the options list! Our phone lines are so bad, we can’t even get ISDN -and that’s supposed to be accessible by 98% of the population! In desperation, we use Wireless Broadband, courtesy of Telstra Bigpond.

The plans available are enough to make you cry: the best I get is a 10GB monthly download limit for an eye-watering $129. Once you hit 10GB, there’s no extra data allowance available for purchase (and, in fairness, no extra charges): you just get shaped to speeds that would make a lethargic snail look sprightly. Friends and colleagues remark casually about their 200GB plans for $90 and wonder why I walk away in a hurry!

Nevertheless, I am generally happy with Bigpond: the connection seems always-on for about 95% of the time; the reception is good; speeds are excellent; and, when I need to, I can pack the whole thing, together with my netbook, and have Internet access on the morning train. Convenient, speedy, reliable -what more could you want (apart from more bandwidth and lower costs!)?

And my feeble 10GB allowance goes a lot further than you might think, thanks to the wonders of files.bigpond.com. Here, you’ll find Linux distro ISOs galore, a yum or apt-get update repository for the likes of Ubuntu or Fedora… and every single byte of this data munificence is completely unmetered, meaning that none of it counts in any way towards your 10GB monthly limit. Some months, therefore, I’ll actually download in excess of 25 or 30GB of stuff: 10GB on the ‘plan’ and maybe 20GB from files.bigpond.com, which magically ‘doesn’t count’!

Cue the inevitable sting in the tail: on June 30th, with all of three days’ notice, files.bigpond.com was taken down. No more unmetered access to anything… and therefore about 2/3rds of my effective monthly download limit abolished, at a stroke. This, as you can imagine, did not make for the happiest day of my life when I found out about it the moment I tried to download a new Fedora 13 ISO and kept getting redirected to the Bigpond home page. (Why they couldn’t email us to warn us, I have no idea: they’re happy enough to email a notification every time a phone bill arrives, after all!)

This was actually a deal-breaker for me. A loss of effective functionality so severe meant that I was fed up enough to go find another ISP. Actually, it turns out that there isn’t a single ISP in Australia that offers unmetered downloads for wireless broadband accounts, which is a bummer of major proportions! But there are lots of ISPs who will sell you a 6GB plan for $50 or so, with extra 6GB data blocks available on demand for about the same price (think Internode, for example). Once you’ve resigned yourself to never having access to unmetered downloads again, it’s a simple calculation to work out that Internode will sell you 12GB for $100, compared to Bigpond’s 10GB for $130: it’s not hard to work out where to go to!
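The sums are simple enough, but here they are as a Python snippet anyway, using only the rounded figures quoted in this post (the `cost_per_gb` helper is just something I’ve made up for the purpose). The last line shows why the unmetered mirror mattered so much: count the 20GB or so of free downloads, and Bigpond’s effective price per gigabyte was the lowest of the lot.

```python
def cost_per_gb(dollars, gigabytes):
    """Monthly price divided by the gigabytes you actually get to download."""
    return dollars / gigabytes

bigpond_metered = cost_per_gb(130, 10)          # 10GB plan, nothing unmetered
internode = cost_per_gb(100, 12)                # two 6GB blocks at about $50 each
bigpond_with_mirror = cost_per_gb(130, 10 + 20) # plus roughly 20GB unmetered

print(f"Bigpond, metered only:   ${bigpond_metered:.2f} per GB")
print(f"Internode, 2 x 6GB:      ${internode:.2f} per GB")
print(f"Bigpond, with unmetered: ${bigpond_with_mirror:.2f} per GB")
```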

The only thing you can hold against Internode is that they use the Optus wireless network, which is half the speed of Telstra’s on a good day -and I’ve had reception difficulties with them in the past (though their coverage maps now indicate a lot of that should be ancient history). But still, more data for less money: what’s not to like?!

More out of a sense of duty than actually expecting a decent reply, I took the trouble to write to the Bigpond sales people in these terms: I like Bigpond’s wireless service; I don’t want to change providers; but without that unmetered content, your product is sub-standard and non-competitive. Please tell me some good news that means I won’t have to change.

The usual two days’ wait for a reply ticked by.

Then yesterday, I got it: dear Howard, please be advised that we’ve made http://mirror.aarnet.edu.au unmetered.

Well, this is a game-changer …and very, very unexpected (it’s always unexpected when Telstra/Bigpond actually listen to their customers!) Aarnet is an excellent mirror -much better, in fact, than the original files.bigpond.com. It has all the appropriate distros in DVD ISO format (apart from Centos, which is a bit of a bummer), and yum and apt-get repositories for updates. CPAN is there, so is Mozilla, Apache and a lot of others. For that to be unmetered makes me even happier than I was before and renders any thought of moving to the likes of Internode completely moot. Well done, Bigpond!

Which begs the question, I suppose: why pull the plug on a valuable resource, only to put the plug back in after you’ve pissed off a significant proportion of your customers? If it’s that easy to unmeter a site like aarnet, why not arrange to do that first, and then announce that since a gold-plated unmetered site is now available, there’s no need for the home-brew bronze alloy version? It would have been the sensible thing to do, I think (unless they simply had no idea about their customers and honestly weren’t expecting the storm of protest and discontent their original switch-off decision provoked).

It reminds me a bit of Julia Gillard’s approach to being Prime Minister: announce a regional processing centre for refugees in East Timor one day and only then start negotiating with the government of that country as to whether it’s actually possible to do! Surely the negotiations might usefully have preceded the announcement? But then that would mean having to hold off on the announcement whilst the practicalities were nailed down. It’s always harder to actually achieve something (i.e., actually do some governing!) and then announce it than the other way around, of course: which is presumably why it’s so often the other way around these days!

Anyway, Bigpond get at least half a thumbs-up from me for being relatively nimble in their ability to turn a mess of their own making into a positive. And I shall now get back to downloading some more ISOs… unmetered!