I've just been told by 'directorial fiat' that the backups of our Oracle cluster shall be stopped henceforth, because one of the apps is running a bit slow.
I don't like that.
We are tasked with looking for workarounds for the problem. The problem is that our backup data volumes have increased dramatically (we're moving around 2-3 TB a night) and our network infrastructure hasn't been upgraded in 5 years. (We have 100Mb to every desktop, and a 155Mb backbone.) So we are faced with wallpapering over cracks.
Again.
I don't like that either.
Oh, and the person having the problem whinged at his director, who has in turn whinged at our director, who has whinged at our management team, who have whinged at us. Which is good really, because that was the first we'd heard about it.
I don't like that either.
no subject
Date: 2006-01-20 11:18 am (UTC)
no subject
Date: 2006-01-20 11:59 am (UTC)
no subject
Date: 2006-01-20 12:01 pm (UTC)
WHAT?!
No backups, are they mad?
Are they sure it's a network capacity issue? Are they even sure what a network is?
Just when is this random act of management supposed to kick in?
no subject
Date: 2006-01-20 12:07 pm (UTC)
They are not sure it's a capacity issue. That's only what my colleague and I have been saying; we haven't had a consultant in, therefore we cannot have an official opinion on the matter.
We have a network bottleneck that's running at 40-80% all day. We have a backup server that's (over) running at 100GB/hour+ for 14-16 hours per day. With 8 tape drives, any 'glitch' has notable knock-on effects (such as a server with a bad network link tying up a tape drive for 16 hours).
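For a rough sense of scale, here is a back-of-envelope sketch of those numbers. It assumes decimal units, assumes the 'Tb' and 'Gb' figures above mean terabytes and gigabytes, ignores ATM cell and protocol overhead, and pretends all the backup traffic crosses a single link, none of which is stated here. The function names are made up for illustration.

```python
# Back-of-envelope link arithmetic. Assumptions: decimal units,
# no protocol overhead, all traffic over one link.

def gb_per_hour(link_mbit_per_s: float) -> float:
    """Theoretical ceiling of a link, in gigabytes per hour."""
    bytes_per_s = link_mbit_per_s * 1_000_000 / 8
    return bytes_per_s * 3600 / 1_000_000_000


def hours_to_move(terabytes: float, link_mbit_per_s: float) -> float:
    """Hours needed to push the given volume through a saturated link."""
    return terabytes * 1000 / gb_per_hour(link_mbit_per_s)


if __name__ == "__main__":
    links = [("100Mb link", 100), ("155Mb backbone", 155), ("gigabit", 1000)]
    for name, mbit in links:
        print(f"{name}: {gb_per_hour(mbit):.2f} GB/hour ceiling, "
              f"{hours_to_move(2, mbit):.1f} h for 2 TB, "
              f"{hours_to_move(3, mbit):.1f} h for 3 TB")
```

On those assumptions a saturated 155Mb link tops out below 100GB/hour, so the nightly volume cannot all be going over the backbone; some of it must be taking other paths.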
no subject
Date: 2006-01-20 12:59 pm (UTC)
no subject
Date: 2006-01-20 01:04 pm (UTC)
Presumably your manager and the director have gone back to the business and told them that if they want a faster network/no backups then they will have to accept the risk, in writing. Or would that be putting too much trust and credibility in the management?
no subject
Date: 2006-01-20 02:03 pm (UTC)
If a car park is full, it is not faulty, it is just full.
If a road is congested, it is not faulty, it's just busy.
analogies.
no subject
Date: 2006-01-20 02:34 pm (UTC)
no subject
Date: 2006-01-20 02:37 pm (UTC)
no subject
Date: 2006-01-20 01:21 pm (UTC)
I'd also raise the issue with management about correct error reporting procedure. Do you have Service Level Requests or equivalent business mumbo-jumbo over at your end?
Having said that, it could be one of those unintentional Chinese-whisper cascades you get with management, where you say 'this network is a bit slow', your boss overhears and goes to his manager with a 'my people can't work due to slow IT', and so forth.
no subject
Date: 2006-01-20 02:02 pm (UTC)
Chinese whispers is a definite possibility, especially when you have techy -> manager -> director -> director -> manager -> techy.
Which is one of the reasons We Don't Do It.
no subject
Date: 2006-01-20 02:29 pm (UTC)
Your job & reference history could depend on it.
no subject
Date: 2006-01-20 02:44 pm (UTC)
I'm doing arse covering at the moment.
I _do_ have a fallback plan on how to recover, but it's ugly.
Oracle databases really suck to rebuild :/
no subject
Date: 2006-01-20 03:10 pm (UTC)
When was the last time you were asked to submit a buildup plan for disapproval? I recall in the last year you've added at least one storage appliance, and most of your kit is ready for gigabit or fibre-channel, right?
The way I feel it, the data is looming ever closer to crashdown, simply by the way the fates conspire to make it increasingly difficult to do effective recovery.
What happens if you start cold calling offsite data storage shops and get big fat blue-sky quotes from them to contrast against a backbone upgrade?
This may not be a terribly dangerous situation now, Ed. You know, though, that the length of time left before it becomes one is directly proportional to the MTBFR! (At this point you don't *really* trust this kit, so much as add another wad of chewing gum, aye?) And didn't you just come back up from some nightmare a couple of weeks ago??
no subject
Date: 2006-01-20 03:22 pm (UTC)
We have a _fair_ number of gigabit capable machines. However our backbone remains ATM, 155Mb. And our 'server' network is 95% 100Mb, with a few gigabits that don't do a lot of good, because they have nowhere to go.
Basically, we've already had the 'we're understaffed' meltdown, we're getting to the 'our kit is too old' meltdown.