SQL Server Replication to Oracle 12c

No version of SQL Server supports replicating to Oracle 12c out of the box.

This took us a little by surprise at work, since things had been replicating just fine in our equally-12c test environment. The difference in Production that arose this week, however, was that somebody decided to re-initialise the SQL Server publish-and-subscribe replication in that environment. This caused SQL Server to drop the replication-controlling table from Oracle and to re-create it from scratch -and only at that point does the lack of 12c support reveal itself.

The symptoms are easy to describe: if you check your replication monitor in the SQL Server Management Studio, you’ll see this error (or variations on it):

sqlserverrepl01

The cause of this is quite easy to find. In the SQL Server msdb database (under Programmability > Stored Procedures > System Stored Procedures, if you’re browsing in the Management Studio), you’ll find a procedure called sys.sp_MSrepl_createdatatypemappings. It contains this code snippet:

IF OBJECT_ID(N'dbo.MSdbms_datatype', 'U') IS NULL
 BEGIN
 print 'Creating table MSdbms_datatype'

create table MSdbms_datatype
 (
 datatype_id int NOT NULL IDENTITY,
 dbms_id int NOT NULL,
 type sysname NOT NULL,
 createparams int NOT NULL DEFAULT 0,
 CONSTRAINT pk_MSdbms_datatype PRIMARY KEY (datatype_id),
 CONSTRAINT fk_MSdbms_datatype_dbms_id FOREIGN KEY (dbms_id) REFERENCES MSdbms (dbms_id)
 )
 
 exec dbo.sp_MS_marksystemobject 'MSdbms_datatype'

--
 -- Define default dbms data types
 --
 exec sys.sp_MSrepl_MSSQLdatatypes 'MSSQLServer'
 exec sys.sp_MSrepl_DB2datatypes 'DB2'
 exec sys.sp_MSrepl_ORAdatatypes 'Oracle'
 exec sys.sp_MSrepl_ORAdatatypes 'Oracle', '8'
 exec sys.sp_MSrepl_ORAdatatypes 'Oracle', '9'
 exec sys.sp_MSrepl_ORAdatatypes 'Oracle', '10'
 exec sys.sp_MSrepl_ORAdatatypes 'Oracle', '11'
 exec sys.sp_MSrepl_SASdatatypes 'SYBASE'
 END

You’ll note that it’s creating a table to govern the ‘data type mappings’ between SQL Server and various other databases -in other words, and as an example, where SQL Server says ‘INT’ or ‘BIGINT’, Oracle will say ‘NUMBER’. SQL Server needs to be able to translate its data types to those used by ‘foreign’ databases, and this code sets that translation mechanism up.

Unfortuately, you’ll also note that whilst the code goes on to populate this data mapping table with translation entries for Oracle versions 8, 9, 10 and 11… there’s nothing at all for Oracle 12! It’s this lack of an explanation for what SQL Server data types should become which Oracle 12c ones that explains replication’s inability to create its MSREPL7 table correctly.

If, despite receiving that earlier error when first attempting to create a new subscription to an Oracle database, you describe the MSREPL7 table that will actually have been created in Oracle, you’ll see this sort of thing:

sqlserverrepl02

Notice all those “INTERVAL DAY(2) TO SECOND(6)” columns? They’re actually supposed to be NUMBER(10) and NUMBER(3) columns. Those BINARY FILE LOB columns are all supposed to be RAW(16), too.

MSREPL7 is the ‘governor’ for SQL Server replication: it records what tables are supposed to be replicating and what transaction they’re up to in the destination (Oracle) database. When the SQL Server source database next wants to send transactions to Oracle, it’s this table which tells it which transactions have already been sent and which have not. So when this ‘governor’ table is screwed up, you won’t get very far with any replication!

As an aside, things are slightly more complicated if you upgrade to 12c from a previously-working 11g. In that case, you’d already have had replication working nicely, even after completing the 12c upgrade process, and thus the MSREPL7 table would have all the right data types and there’d be no impediment to replication continuing to work. The other tables you’re replicating will also already exist in your newly-upgraded 12c database and they’ll have all the right columns too. So they will replicate nicely as well. So, you will think that all is well… because it is!

It’s only when someone decides to reinitialize  replication that it will all go pear-shaped, because it’s only at that point that SQL Server will drop the tables being replicated and attempt to re-create them… with all the wrong data types! When you then attempt to push a number from SQL Server into a column in Oracle that’s been freshly-created as an INTERVAL data type, that’s when sparks will fly.

This is why we didn’t realise that upgrading to 12c would be an issue in our various test environments: no-one had attempted to re-initialize the replication; and with many multi-hundred-million rows to be pushed across into Oracle, it was a fair bet that no-one would be daft enough to do that in Production either! Unfortunately, it turned out that my gambling instincts were wrong on this occasion: someone was daft enough to hit the wrong menu options, and thus the trouble began.

I couldn’t find much about this problem on Google, which was rather surprising (if one major database has trouble replicating to another major database, I would have expected to hear a lot about it by now!)

I did find enough to make me look at the column types in MSREPL7, and notice that all sorts of odd data types were in use. That got me hunting down any references to how SQL Server knows what data type maps to what in Oracle. And finally, that got me to the stored procedure the code from which appears above.

Looking at that code, it’s just begging for an extra line to be added to it, no?! I mean, if it read:

exec sys.sp_MSrepl_ORAdatatypes 'Oracle'
 exec sys.sp_MSrepl_ORAdatatypes 'Oracle', '8'
 exec sys.sp_MSrepl_ORAdatatypes 'Oracle', '9'
 exec sys.sp_MSrepl_ORAdatatypes 'Oracle', '10'
 exec sys.sp_MSrepl_ORAdatatypes 'Oracle', '11'
 exec sys.sp_MSrepl_ORAdatatypes 'Oracle', '12'

…then maybe all the troubles would go away, right?

But you can’t just run that sort of thing in the standard SQL Server Management Studio. For this, you need to run it in DAC (or ‘dedicated administrative connection’) mode. To do that, fire up SSMS as normal and connect to your SQL Server instance as normal. Click New Query to start a new query window. Finally, click the ‘change connection’ button, just to the left of the database selector:

sqlserverrepl03

That will pop-up a new dialog box to allow you to make a fresh connection to your SQL Server instance:

sqlserverrepl04

But notice this time that I’ve stuck ADMIN: in front of my usual ‘server name’ connection string. That will connect you in the right ‘god mode’. You can’t really tell anything is different from a non-DAC connection, apart from the ‘admin:’ identifiers scattered all round the SSMS interface if you look in all the right places!

Anyway, once you’ve got a DAC connection, you can type in this lot of SQL and run it:

USE master

GO

exec sys.sp_MSrepl_ORAdatatypes 'Oracle', '12'

GO

exec sys.sp_MSrepl_MSSQL_ORA_datatypemappings @source_dbms = N'MSSQLSERVER', @destination_dbms = N'ORACLE', @destination_version = '12'

GO

If all goes according to plan, you should see this sort of response:

sqlserverrepl05

Once the data types have been created for Oracle 12, drop the MSREPL7 table from your Oracle database and then try re-initializing your replication subscription once again. This time, it should just work:

sqlserverrepl06

That shows you me dropping MSREPL7 in a SQL*Plus window in blue; then I went and re-initialized the replication in SQL Server (which you can’t see me doing). You then see me describing the structure of MSREPL7 in the blue window: notice (a) the table has been re-created despite just being dropped; and (b) it’s been created with NUMBER and RAW data types for its columns, not INTERVALS and BLOBs. The Synchronization Status progress window finally shows that replication transactions were successfully delivered to the Oracle database.

You can also issue the required fixup SQL in the command-line sqlcmd tool. Connect to your database in DAC mode with the “-A” switch:

sqlcmd -S winsql\winsqldb -E -A

The -S takes the form it does here because it describes the server AND the instance to connect to (I use named instances). Hence “winsql” is the server name and “winsqldb” is the instance to connect to running on that server. The -E switch means “use Windows authentication”, because that’s what I do! If you use named users, you’d use the -U switch and specify the username (and the -P switch if you fancied typing his password in clear text on the command line!). The -A switch means, as I’ve mentioned, connect in DAC mode.

Note that DAC mode is only available locally: that is, you have to be working on the actual server that’s hosting the SQL Server instance. If you insist on running the fixup SQL remotely, you can enable remote DAC access by right-clicking the instance in SSMS, clicking Facets, selecting Surface Area Configuration from the drop-down menu and then setting RemoteDacEnabled to True. You also need the SQL Browser service to be running (which you can switch on from the normal services utility, or by running the SQL Server Configuration Manager, clicking the SQL Server Services item and then the SQL Browser item in the right-hand pane (right-click it and click Start).

Incidentally: you may have thought (as I innocently did) that upgrading your SQL Server from (say) 2008 R2 to (say) 2016 would help. All I can tell you is that the code snippet which appears at the top of this piece is identically 12c-deficient in SQL Server 2016 as it is in any previous version (which I found surprising and completely inexplicable: I’m prepared to forgive SQL Server vintage 2008 for not knowing about Oracle 12c; I find it much harder to understand why SQL Server 2016 is silent on the subject!) So, as far as I know, this trick of manually adding a reference to 12c in DAC mode is the only way to get any version of SQL Server replicating to Oracle at all.

Whether bolting on a ’12’ to the end of a bit of code and running it is actually supported by Microsoft, I can’t say: I’ve looked for any indication it’s a properly-supported technique in all the usual places and turned up nothing either way.

But if you’re replicating SQL Server to Oracle 11g and thinking of upgrading to 12c any time soon, it might be useful to bear in mind that you’re going to need this technique should you ever need to re-initialize your subscriptions 🙂

We interrupt our broadcast…

…to announce that we’ve just sold our house. So it’s bye-bye wallabies and kangas. 🙁

The plan is now to rent a small-ish apartment somewhere nearer the city for a while. In the longer term, we’re planning to buy a house back in the UK: our days in Oz are numbered (simply because we want to travel more -and whereas 8 hours in the air from Heathrow gets you to most of Europe and lots of North America, 8 hours in the air from Sydney gets you to about the edge of Australian airspace).

Moving house is never fun. So expect the blog upkeep to suffer in the meantime. But I’ll do my best to keep things ticking over.

State of Play revisited

churchill150I mentioned some time ago that Churchill was looking pretty sick since no RHCSL 6.x distro with a version number greater than 6.6 seemed to deliver its software on a single DVD any more, and since multi-DVD installs break Churchill, Churchill was doomed.

But I was (happily) quite wrong on that point. It turns out that, in fact, CentOS is the only RHCSL distro that has chosen to package its distro in such a way that it breaks Churchill.

I’ve checked OEL 6.6, 6.7 and 6.8; and then I checked Scientific 6.6., 6.7 and 6.8; and I even checked genuine Red Hat 6.6, 6.7 and 6.8… and every single one of them works fine with Churchill.

So, as I mentioned last time, that now means Scientific Linux is the distro that Churchill will assume is in use if you use its ‘speed keys’ to build your Oracle server nodes. (Rather more bluntly, it means I’ve stopped using CentOS at all, for anything). You can still use OEL or Red Hat if you prefer, by spelling out your requirements on the boot strap line. But use a speed key to save some typing, and SL will be assumed.

And, of course, if you want to use CentOS 6.7 or 6.8 to build your Oracle servers using Churchill, forget it: because they just don’t work.

Why is the new default Scientific Linux and not, say, OEL? Simply because it’s (a) free to obtain without signing up for anything and (b) it’s not as butt-ugly as OEL when you boot it up (white text on a red background is neither pretty nor particularly legible).

Other than this issue of CentOS stopping working, the latest version of Churchill (we’re now up to version 1.6) introduces a rather more radical change of tack: no more support for Oracle 11g installations, of any sort, whatsoever. Of course, version 1.5 is still available which does still do 11g quite nicely -but 11g is so long in the tooth now, and completely out of support unless you happen to be using 11.2.0.4, for which an expensive paid support agreement is needed before it can even be downloaded, that it seemed a waste of time to keep 11g support in newer Churchill releases. So, 12c only it is.

Which of course means that Churchill’s hardware requirements have just gone through the roof to match Oracle’s own. A 2GB VM just doesn’t cut it any more. The new hardware minima are: 5GB RAM and a 40GB hard disk. So building a 2-node RAC and a 2-node Active Data Guard setup on your desktop now basically demands a 32GB desktop… and no laptops need apply.

All of which in turn means that I think it might be time to re-write the Churchill documentation (especially as it currently exists only as pages on the old version of this site): take the opportunity to spruce it up, refresh it, make it part of the native WordPress site and so on. Expect something soon-ish.

You can’t kick this around anymore…

nixonscan-finalRichard Nixon is the US president I admire the most, I think. He had a grasp of strategic thinking that few have matched. My admiration is often a source of amusement (and embarrassment) to my American friends, though, because that grasp of strategy failed him spectacularly during the Watergate affair, resulting in his resignation and disgrace.

But from my father, I sought, and was granted, special permission to watch his resignation speech, live, at 2AM London time in August 1974. I was a pretty committed 10 year-old! I also strongly suspect that without Nixon, none of us might be here today: the Cold War could have got very hot and very nasty on numerous occasions, until Nixon’s policy of détente made us all a lot safer.

Anyway, my fascination with the man lead me, several years ago now, to look to obtain his autograph (as one does with heroes). But they were (and are) not cheap, and ToH would not permit money-siphoning to take place on such a grand scale, despite my entreaties that a great man warrants grand money-wastage! Cheaper ones can be found if you look long enough, though-but I wasn’t that diligent back then, so everything seemed awfully pricey.

And then I happened to stumble across an auction on Ebay, for a letter allegedly signed by Nixon whilst he was serving as Vice President -so back in 1960. The asking price was about US$89, and no international shipping.

Well, at that price, it couldn’t be genuine, could it?! But the lady selling it said it had been sent to her father for some reason, he had died recently, she had no use for it and therefore wasn’t asking the world for it either. However, she couldn’t substantiate it more than that and knew nothing about its authenticity. But she would ship it to me in Australia if I paid some token sum or other (maybe an extra US$25).

And that is how I came to acquire my Nixon autograph for the piffling price of about US$115. It eventually arrived safely in the post -and this was my first chance to actually look and touch what I’d paid for. Well: I wasn’t disappointed. It certainly looked genuine (the signature is strong and definitely from a fountain pen). It also felt genuine: the paper is definitely old, the typewriting is definitely from a real typewriter.

Looks can be deceptive, of course! Nevertheless, the thing was framed and put in a place of honour on my study wall, where it has remained to this day.

But was it genuine??! It would bug me occasionally.

So early this week, I finally decided to do something about it: I sent the scanned copy of it you can see at the top of this piece to the Nixon Presidential Library, in Yorba Linda, California. I asked them if they had any records proving that, as the letter claimed, Nixon had been in Akron, Ohio on October 1st 1960 -and whether they could tell me anything about the addressee, Mr. Giampetro.

Their website is not the best! They don’t appear to know how to get https certificates to work, either! So I wasn’t hugely hopeful… but efficiency lurks in other corners of the library, it would seem, because this evening, not two days after I emailed, I received a very nice email from their archivist, one Dorissa Martinez, saying (and I quote):

According to our Pre-Presidential Materials (Laguna Niguel) Appearances (Series 207) finding aid, Vice President Nixon attended a rally at Memorial Hall in Akron, Ohio on October 1, 1960.

After searching through the Pre-Presidential Materials (Laguna Niguel) General Correspondence (Series 320) collection and Campaign 1960: PPS 57: Election. 1960. Post Election Correspondence, Acknowledgments, and Thank You’s, box 5, folder Thank you Letters Completed – Nationwide, I was unable to locate materials relating to Frank Giampetro.

I’m not sure I made entire sense of either sentence! But the general gist of it is that if you poke around the Pre-Presidential Materials long enough, you can indeed find evidence that Nixon was in Akron, Ohio on the date the letter claims he was. It doesn’t prove my letter is genuine… but it goes a long way to reassuring me on the point anyway.

It’s a shame there weren’t records that Frank Giampetro was, for example, a Republican Party hired driver, routinely used to transport any bigwigs that came by Ohio… so that part of the mystery remains. I’d certainly like to find out more about him (but the letter stays with me, come what may!)

Anyhow: I don’t think it would have mattered to me either way, but it’s nice to have my own little bit of near-confirmed Nixon-alia in front of me as I type. Let’s just hope greatness rubs off and shady dealings don’t!

State of Play

churchill150It has been almost a year since I did anything to my Churchill automation framework. (If you didn’t know or realise, Churchill lets you easily create virtual multi-node RAC and Active Data Guard environments).

I took a look at it lately to see about refreshing it. I discovered that whilst Churchill’s speed keys will assume that you’re using CentOS 6.6, CentOS has actually released versions 6.7 and 6.8 since the last Churchill refresh (as have Red Hat, Scientific Linux and Oracle, of course)! It is also still the case that no version of Churchill works with a RHCSL 7.x O/S (because systemd screws up network automation).

No worries, I thought to myself: tweak a few things and at least bring Churchill up-to-date-ish with version 6.8 as the new speedkey default.

No chance!

In their wisdom, the CentOS developers decided to split the main DVD release of 6.8 over two DVDs. That means Churchill’s trick of copying its own installation media onto its own hard disks (for subsequent automated network installs onto the RAC nodes to use) fails, because it can only copy one of the two install DVDs. (There is a special, 6.8-only Dual Layer DVD you can use, and that would work… but I decided not to go there for now).

This, combined with the systemd debacle, basically means that Churchill is dead in the water, and I therefore have made only one more tweak to it -whereby Scientific Linux version 6.7 is assumed to be the speedkey default- and after that, I won’t be maintaining it further.

Something new is required to automate 12c RACs on RHCSL 7.x environments… and I don’t know what that will be yet! Watch this space, I guess…

Fun Fedora 24

Just as my playing with the new Linux Mint release begins, so the Fedora team finalise a new version of their distro: Fedora 24 was released on 21st June.

It’s still very blue; it’s still very Gnome-y and therefore pretty awful as far as I’m concerned and I wouldn’t personally touch it with a feathered hat-band, let alone a bargepole.

But it’s out and therefore my Bogart preinstaller script, which makes Fedora a suitable platform for running Oracle Enterprise Edition, needs a run in the park to make sure it still works with the new version. Happily it does without any substantial changes at all.

However, I took the opportunity to do two things with Bogart. One was to remove its ability for preparing for an 11g installation. I know 11.2.0.4 is still supported, but you can’t get hold of that without a support contract; and if you’ve got a support contract, you won’t likely be wanting to run Oracle on an unsupported platform like Fedora! Meanwhile, any other version of 11g you can get your hands on has long-since been de-suppported… so Bogart is now 12c only.

And that means, two: I’ve re-written the Oracle-on-Fedora article to reflect it’s new only-12c-ness.

The revised article is here, and the updated Bogart preinstaller script is here.

Minty Fresh

Mint2Having just finished a push for domestic consistency by installing Ubuntu 16.04 on practically every machine I control (which therefore definitely excludes a certain Significant Other’s Windows PC!), I now see that the Linux Mint crew have just released a new version of their green and fragrant distro.

Never one to let the opportunity of a new distro installation pass me by if I can help it, I have accordingly just installed the Mate version onto my ancient(circa 2009) laptop… and it’s running nicely. Pretty slow, of course, but hardly unbearable. I probably don’t want to be doing Blender renders on it any time soon, but it’s fine for a bit of web editing, browsing, video playing and photo up-touching.

It also has its windows controls (maximise, minimise and close) on the right side of the window title bar (which is to say, yes it’s on the right-hand side, but it’s also the correct side as far as I’m concerned!) It is a small cosmetic change from vanilla Ubuntu, I guess; but it’s enough that it might persuade me to deploy it on my main PC instead of the current Ubuntu install. Time will tell.

I noticed in passing that though I had written simple installation scripts for Oracle 12c for a lot of other distros I’ve dabbled with over the years, I had curiously neglected to write one for Linux Mint, of whatever vintage.

So I’ve rectified that. On the Oracle articles page is a new one for installing 12.1.0.2 on Linux Mint 18. I’ve run out of inspiration for my script names now, so this one is just called Mentha -which, as we all know, is the Latin for ‘mint’.

I tested it on both the Cinnamon and Mate spins, but the screenshots in the article are all from Mate (for absolutely no reason at all, other than I happened to have it handy at the time the screenshots needed to be taken).

I should perhaps mention that I’m trialling the use of new ‘slideshow’ technology in this new article: it means the bazillion screenshots don’t take up nearly so much room on the screen and fade neatly from one to the other as you step through the slideshow. The image captions provide the instructions. It looks good -but there are at least two issues with it. One, if you’re using script blockers, the slideshow won’t work properly. And two, the plain-text version of the article (obtainable by clicking the ‘print’ link at the very end of the it) displays all the screenshots sans instructions, which is somewhat less than useful! If that is an issue for you, let me know and I’ll re-think… I am not entirely sure how many people bother with the plain-text versions of articles, so I don’t know how much of a deal-breaker this is. As I say, tell me (in the comments) if it is one for you…

Server Error: 550

RH_bestpractice_img_commonBouncesAs I mentioned last time, I have abandoned efforts (for the moment) to get my own postfix/dovecot/etc/etc mail server working due to security issues that are non-trivial to work around. I also mentioned that I only found out about the security issues after I’d written six articles about putting the pieces together!

I suppose it was inevitable that a number of people would ask me to publish the six articles despite the end result not being entirely satisfactory from a security point of view… so I have, and they are available from the Linux Articles page.

Return to Sender

Postfix_logoI have spent the past week or so -and not an inconsiderable amount of coin- trying to build a CentOS7 server that uses Postfix, Dovecot, Spamassassin, Amavis and Roundcube so that I can finally ditch the ever-more-privacy-intrusive mail services from the likes of Gmail and Outlook. It is a madness that comes over me once every year or so. The enormous, steaming pile of disparate software pieces that have to bolted together just-so to get anything working at all has never really put me off trying… but my previous efforts, though functional, haven’t really been fully functional and never took security too seriously.

This time I got really close to doing it properly. Everything worked fine after just the seventh installation-from-scratch attempt, and six nice how-to articles ready to publish. 95% of the work had been sorted out on the first pass, I have to say -but the missing 5% was the very devil to get right. It was all to do with delivering Spam (because no spam detection algorithm is perfect), but into a junk mail folder rather than the main inbox. The tool to do that is called ‘dovecot-pigeonhole’ and the protocol involved is called ‘sieve’… and configuring sieve and dovecot in a way that actually works is a nightmare. But I did it. It worked. Spam moved all by itself to a place where I could save it if I wanted to or delete it if I didn’t.

And then I logged on with the wrong password… and Roundcube (the webmail interface) let me in regardless!

WTF?!

Turns out that by design and default, MySQL (and thus Dovecot and thus Roundcube) uses ‘CRYPT’ as its password encryption technology, which is based on DES, and (here’s the punchline) only uses the first 8 characters of your supplied password. So yes, I had mis-typed the 23rd character of my 26-character long password (it happens!)… but Roundcube had let me in anyway because the first 8 characters had been typed in an impeccably accurate manner.

Several things to say about this. The first one is that it’s bloody annoying! Second, it’s bloody stupid. And third, after an hour reading about how you can configure things to do SHA512-CRYPT (which doesn’t chop your passwords in thirds without telling you), I spent 45 minutes trying to implement it without success before giving up in utter frustration.

I am now using outlook.com again!

Of the 98 hurdles between me and a working, secure, web-accessible, flexible, virus-blocking, spam-delivering email server, I had managed to leap, more or less gracefully, 97 of them. But truncating passwords is too much like plain stupidity to make the effort involved in limping over the 98th just too much bother.

I wonder if there is a non-postfix, non-dovecot means of doing a mail server… I have a spare server in Paris to make use of, after all…

All done

finish01The move to a new server has now completed, and if you’re reading this then I assume everything worked as intended.

There are a few minor, known issues: anything I’ve ever linked to in my file repository at diznix.com is now broken, because it was simpler to create a new Owncloud file respository from scratch than to try to migrate the thing over. Fortunately, there are only a dozen or so such links: they should be sorted by the end of the day.

Otherwise, it’s all lovely, clean and fresh and working well!

(But if you do encounter glitches or problems with anything not working, feel free to let me know in the comments).

There was one nasty surprise that arose as a result of switching from ye olde, stable CentOS (MySQL version 5.1.something) to flash-harry Ubuntu (MySQL version 5.7.something). Between those two versions, MySQL invented something called “Strict Mode” -and in 5.7, this actually became the default operating mode. Specifically, the “STRICT TRANS TABLES” mode was enabled -and this governs the way MySQL reacts to things like implicit data conversions in a SQL statement. In non-strict mode, for example, an attempt to set a VARCHAR column to a value of 1 would raise a warning, but not fail. In strict mode, it fails. (As any fule knowe, if you want a varchar to contain the string value 1, you need to wrap your ‘1’ in quotation marks!)

The relevance to my server move? As per this article, I now re-compute the comment counts for my posts every time I publish a new one -and my query to do that is this:

UPDATE dizwp.wp_posts wpp
LEFT JOIN
(SELECT comment_post_id AS c_post_id, count(*) AS cnt FROM dizwp.wp_comments
 WHERE comment_approved = 1 
 GROUP BY comment_post_id) wpc
ON wpp.id=wpc.c_post_id
SET wpp.comment_count=wpc.cnt
WHERE wpp.post_type IN ('post', 'page')
      AND (wpp.comment_count!=wpc.cnt OR (wpp.comment_count != 0 AND wpc.cnt IS NULL));

Spot the error? Neither did I for a long time!! However:

mysql> desc wp_comments;
+----------------------+---------------------+
| Field                | Type                |
+----------------------+---------------------+
| comment_ID           | bigint(20) unsigned |
| comment_post_ID      | bigint(20) unsigned |
| comment_author       | tinytext            |
| comment_author_email | varchar(100)        |
| comment_author_url   | varchar(200)        |
| comment_author_IP    | varchar(100)        |
| comment_date         | datetime            |
| comment_date_gmt     | datetime            |
| comment_content      | text                |
| comment_karma        | int(11)             |
| comment_approved     | varchar(20)         |

Spot it now??

COMMENT_APPROVED is declared to be of type varchar(20).

My update statement says: WHERE comment_approved = 1

The WHERE clause is treating the number ‘1’ as a number, because it hasn’t any quotation marks around it. That is not good in strict mode when the column you are querying is a VARCHAR. 🙁

Two solutions, then: disable strict mode by editing /etc/mysql/mysqld.conf.d/mysqld.cnf and appending the line:

sql_mode=ONLY_FULL_GROUP_BY,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION

…after which you do a service mysql restart and the whole of MySQL is now running in non-strict mode and will do implicit data conversions without falling over.

Or you just re-write the original update statement to put quotation marks around the 1 in the WHERE clause:

WHERE comment_approved = '1'

I assume that WordPress knows all about strict mode and doesn’t need it disabled, therefore the addition of the quotes to my query seemed the more sensible option… but I imagine this is one of those niggly little issues that is quite likely to break an awful lot of MySQL applications if you’re not careful!

Anyway: apart from that, no major damage done, and I’ve edited the original article to add mention of this not-so-minor potential deal-breaker!