What I did with my Saturday
Aug. 22nd, 2005 01:30 pm
Saturday was, as has been the tendency for the last couple of months, spent in work.
Bright and early, I bounced out of bed, filled with enthusiasm for the mission.
Actually, it was more like crawling out of bed, thinking 'oh god, not another Saturday in work'.
However, the first order of the day was logging in remotely to shut down the Oracle databases and start backups.
After doing so, figuring they wouldn't take all that long, I set out for work. Stopped for a bite to eat in Tescos, and mooched on into work for about 10:30.
Found that the backups hadn't finished yet, so instead carried on SAN-attaching another server (a Broadvision box, and not in production, so it wasn't urgent).
After lunch, found that the backups had completed at about 15:30, so took down said Oracle server to install a patch cluster.
And because it's a slow beast, was still installing the patch cluster at 19:30.
At which point, I rebooted, and got a wonderful error:
/kernel/drv/su: Undefined kernel symbol "miocpullup".
Then a few more errors, followed by a kernel panic, and a reboot.
Which was nice.
Took it down into single user again, booting from my handy Solaris 8 CD.
Re-applied the patch in question. It had failed, apparently, due to 'corrupt pkginfo'. Same problem.
Rebooted again (taking 5-10 minutes each time, because CD boots have to load an OS into RAM) to try and remove the package:
Cannot remove patch as a prior installation failed, re-run pkgadd
So, was stuck with 'can't install, can't uninstall'.
Plan B was to 'hand install' the files from the patch in question.
It failed.
Plan C was to copy the 'failing' binaries from a known good server. That kind of worked, but basically each time I rebooted, I found another one with the problem.
Plan D was an ad-hoc backout of the patch, by grabbing relevant files from said 'known good' server, and splatting them all over the files this patch had killed.
That kind of worked. By which time, it was 23:30.
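A minimal sketch of how the Plan C/D hunt for damaged files could be scripted, rather than finding them one reboot at a time. This is not what was run on the night: it assumes both the broken root and a copy of the known good root are reachable as local directories, that Python is available, and that a list of the patch's files has been put together by hand. The function names and the example paths are hypothetical.

```python
import hashlib
import os


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_mismatches(good_root, broken_root, rel_paths):
    """Compare each relative path under both roots.

    rel_paths must be relative (no leading slash), e.g. "kernel/drv/su".
    Returns a list of (rel_path, reason) for files that are missing on
    the broken side or whose contents differ from the good copy.
    """
    mismatches = []
    for rel in rel_paths:
        good = os.path.join(good_root, rel)
        broken = os.path.join(broken_root, rel)
        if not os.path.isfile(good):
            continue  # nothing to compare against, so skip it
        if not os.path.isfile(broken):
            mismatches.append((rel, "missing"))
        elif file_digest(good) != file_digest(broken):
            mismatches.append((rel, "differs"))
    return mismatches
```

Something like find_mismatches("/mnt/good", "/a", ["kernel/drv/su"]) would then give the list of files worth copying across in one go. Both mount points are made up for the example.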
So I went home.
Telnet appears to be somewhat broken, but the server is ssh-able and Oracle is running.
So it's 'serviceable'.
But so nicely fucked that I can't migrate it to the new SAN. I can't add patches. I can't remove patches. I'm dubious about even installing anything else, as /var seems to be doing bogus things to inodes.
Plan E is 'find another box, rebuild as Solaris 8, and move everything across'. Which is probably yet another weekend.
Joy.
no subject
Date: 2005-08-22 03:12 pm (UTC)
I'll think about it :)