Tuesday, April 24, 2007

RAC Alert Log Consolidation Script from M2

M2 - aka Michael Möller - of Miracle A/S has created a neat, little program for Oracle RAC (ab)users. Extremely open source. Windows only. Beta, until proven otherwise.

It will neatly display time-coordinated output from 2-8 alert logs side-by-side.
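I haven't seen M2's source yet, so what follows is not his program, just a rough sketch of the general idea in Python: read each alert log, group the lines under the timestamp line that precedes them, merge all the logs by time, and print one column per instance. The timestamp format ("Tue Apr 24 10:15:32 2007"), the function names `parse_entries` and `side_by_side`, and the column width are all assumptions of mine. The sketch also assumes each log is already in chronological order and that the day and month names parse in an English locale.

```python
import heapq
from datetime import datetime

# Assumed timestamp format, e.g. "Tue Apr 24 10:15:32 2007".
TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_entries(lines):
    """Turn raw log lines into (timestamp, text) pairs.

    A line that parses as a timestamp sets the time for the lines after it.
    Blank lines and lines before the first timestamp are skipped.
    """
    entries = []
    current_ts = None
    for line in lines:
        text = line.strip()
        try:
            current_ts = datetime.strptime(text, TS_FORMAT)
            continue
        except ValueError:
            pass  # not a timestamp line, so treat it as message text
        if current_ts is not None and text:
            entries.append((current_ts, text))
    return entries


def side_by_side(logs, width=30):
    """Merge several parsed logs by time and lay them out in columns.

    logs is a list of entry lists, one per instance (hypothetical layout).
    Returns a list of strings, one row per log message.
    """
    # Tag every entry with its column so the merge never compares text
    # from two different logs that share a timestamp.
    tagged = [
        [(ts, col, text) for ts, text in entries]
        for col, entries in enumerate(logs)
    ]
    rows = []
    for ts, col, text in heapq.merge(*tagged):
        cells = [""] * len(logs)
        cells[col] = text[:width]
        padded = " | ".join(cell.ljust(width) for cell in cells)
        rows.append(ts.strftime("%H:%M:%S") + " | " + padded)
    return rows
```

Feed it the lines of each alert log via `parse_entries`, pass the results to `side_by_side`, and print the rows. `heapq.merge` only works if every input is already sorted, which is why the chronological-order assumption matters.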

Send me an email at mno@MiracleAS.dk and I'll send you source and documentation.

Looks cool, Michael!

Mogens

Friday, April 20, 2007

Me & Spiderman

Don't worry about the Danish speak here:

http://crn.dk/index.php/news/video/id=23459

It's my video blog on Computer Reseller News (crn.dk), and it's been going on for a while.

This time I'm talking about Fast Beer (TM) by the big standard breweries like Carlsberg, where they use enzymes to speed up the brewing process from 3-4 weeks or more to two-nice days (max).

But watch the Spiderman trailer before and after that enzyme talk. That's cool :-))).

Monday, April 16, 2007

So few really need uptime

Here's my latest take on high availability:

1. Very, very few really need it.
2. The vast majority can live with being down for days.
3. Those who really, truly need it must be able to spend a lot of everything on it - continually.
4. Those who buy HA-gear (software and hardware) without realising how much they have to spend on it will suffer downtime much worse than without the HA-gear.

Pity those folks who buy the gear and think it works like advertised, out of the box, like the vendor said, like the POC showed, and so on. They're the true victims.

Typically, it takes 8-9 months to truly test and stabilise a RAC system. As I've said somewhere else, some people elect to spend all of those nine months before going production whereas others split it so that some of the time is spent before and, indeed, some of it after going production.

But that's not all: Even when the system has been stabilised and runs fine, it will go down a couple of times a year, or more often, and create problems that you never saw before.

It's then time to call in external experts. But instead of just fixing the current cause of your IT crisis, I'd like to suggest that you treat the situation as one where you need to spend a good deal of resources on stabilising your system again - until the next IT crisis shows up.

Your system will never be truly stable when it's complex. The amount of effort and money you'll need to spend on humans who can react to problems and run the system day-to-day, and - very important - on keeping them on their toes with realistic (terribly expensive) test systems, courses, drills on realistic gear, networks of people who can help right now, and so forth... is huge.

The ironic thing is this: If you decide that you can live with downtime, and therefore with a much less complex system - your uptime will increase. Of course.

Mark Souza of Microsoft said it: Complexity is the enemy of availability.