(no subject)
Jul. 10th, 2008 06:42 pm
Tonight I have work to do.
This ... as a rule, I find somewhat annoying, irritating and depressing. However, tonight I'm doing something a little bit different, and when there is novelty value, it is at least a little less irritating.
So anyway. Tonight, I'm doing an online volume expansion: taking a 100GB volume (LUN) and expanding it to 300GB, whilst the server keeps running and the data stays available.
Which is actually a pretty neat trick.
Here's how it's done.
First off, you need a BCV device (BCV stands for business continuance volume, which is a fancy way of saying 'a disk you use for backups and disaster recovery'). This device needs to have the same number of tracks as the device you're expanding, and it can't be RAID-5*. You also need to create new volume components of the same geometry. Since the primary device is a RAID 5+0 device, 5x20GB, I need to create 10 more 20GB RAID-5 devices. I also need to create 5x20GB 'normal' devices, configured as BCVs, to use for this expansion.
create dev count=5, emulation=FBA, size=21846, config=BCV, disk_group=1;
create dev count=10, emulation=FBA, size=21846, config=RAID-5, disk_group=2;
Which gives the result of:
New symdevs: 2B61:2B6F
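As a quick sanity check on that range, here is a minimal Python sketch that expands the hex range and splits it the way the later steps use it. The helper name `expand_symdev_range` is made up for illustration, and the split into five BCV devices followed by ten RAID-5 devices is inferred from the commands further down rather than reported by the array.

```python
def expand_symdev_range(spec):
    """Expand an inclusive 'START:END' hex range into a list of device IDs."""
    start, end = (int(part, 16) for part in spec.split(":"))
    return [f"{n:04X}" for n in range(start, end + 1)]

devs = expand_symdev_range("2B61:2B6F")

# Assumption: the first five IDs are the BCV devices and the remaining
# ten are the RAID-5 devices, matching how they are used later on.
bcv_members = devs[:5]
raid5_members = devs[5:]

print(len(devs), bcv_members[0], bcv_members[-1],
      raid5_members[0], raid5_members[-1])
# prints: 15 2B61 2B65 2B66 2B6F
```

Fifteen devices in total, which matches 5 + 10 from the two create lines.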
To make this a little more interesting, the device I'm trying to expand is synchronously replicated to a remote site. I can't do a config change on it in that state, so before doing anything more, I need to stop that:
symrdf -g RDFDG split DEV001
symrdf -f RDF_Pairs.txt -sid 760 -rdfg 3 deletepair -force
(the file RDF_Pairs.txt contains just the device id I'm working with)
I also need to turn my BCV devices into a BCV meta device:
form meta from dev 2B61 config=stripe, stripe_size=2 cyl;
add dev 2b62:2b65 to meta 2b61;
This also needs to be done on both local and DR.
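The arithmetic behind "same number of tracks" can be sketched roughly as below. The member size of 21846 cylinders comes from the create lines above. The 15 tracks per cylinder and 64 KiB per track are my assumptions about the FBA geometry (they are what make 21846 cylinders come out at about 20GB), so check them against your own array before trusting the numbers.

```python
TRACKS_PER_CYL = 15   # assumption: FBA geometry, 15 tracks per cylinder
TRACK_KIB = 64        # assumption: 64 KiB tracks, not stated in this post
MEMBER_CYLS = 21846   # from the 'create dev' lines

def member_gib(cyls=MEMBER_CYLS):
    """Approximate size of one member device in GiB."""
    return cyls * TRACKS_PER_CYL * TRACK_KIB / (1024 * 1024)

def meta_tracks(members, cyls=MEMBER_CYLS):
    """Total track count of a meta built from equally sized members."""
    return members * cyls * TRACKS_PER_CYL

print(round(member_gib(), 2))       # one member, roughly 20
print(meta_tracks(5))               # tracks in a 5-member meta
print(round(5 * member_gib()))      # size before expansion, roughly 100
print(round(15 * member_gib()))     # size after expansion, roughly 300
```

Under those assumptions, a 5-member BCV meta has the same track count as a 5-member source meta built from members of the same size, which is the condition the expansion needs.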
Then I need to start the expansion running. However, I only need to preserve the data at our local site, because the remote will have to be resynced anyway. So what I'll be doing at the remote site is using the BCV device I've created to act as a 'gold' copy (the remote copy is identical to the local anyway).
add dev 2b66:2b6f to meta 0537, protect_data=TRUE, bcv_meta_head=2b61;
And that'll take a few hours to run, since it essentially has to reshuffle 100GB of data around, to be spread out across 15 disks rather than 5.
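To get a feel for why so much data has to move, here is a toy model of round-robin striping in Python. It is a simplified sketch, not the array's real on-disk layout. The 2-cylinder stripe size is borrowed from the BCV meta command above and only assumed to apply to the meta being expanded, and the function names are made up for illustration.

```python
STRIPE_CYLS = 2  # assumption: same stripe size as the 'form meta' line

def member_for(cyl, members, stripe=STRIPE_CYLS):
    """Round-robin striping: index of the member holding a logical cylinder."""
    return (cyl // stripe) % members

def fraction_moved(total_cyls, old_members, new_members):
    """Fraction of cylinders that land on a different member after expansion."""
    moved = sum(
        1 for cyl in range(total_cyls)
        if member_for(cyl, old_members) != member_for(cyl, new_members)
    )
    return moved / total_cyls

# 5 members of 21846 cylinders each hold the existing data.
existing_cyls = 5 * 21846
print(fraction_moved(existing_cyls, 5, 15))  # about two thirds in this model
```

In this model about two thirds of the existing cylinders end up on a different member when going from 5 to 15, and even the ones that stay on the same member generally shift position within it, so nearly all of the data gets rewritten.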
In the meantime, over on the DR site:
symdg create temp_gold_copy
symld -g temp_gold_copy add 0537
symbcv -g temp_gold_copy add ld 2b61
symmir -g temp_gold_copy establish -full
This'll get the two devices synchronising, and once that's done, I'll split them again.
So, after that happens, probably tomorrow, on the Windows server we'll do a SCSI rescan (under Disk Management) and then use 'diskpart' to extend the volume:
diskpart
select disk 2
select partition 1
extend
And it's actually that easy.
Then it's necessary to resync the SRDF device:
symrdf createpair -f RDF_Pairs.txt -sid 760 -type RDF1 -rdfg 3 -invalidate r2
When talking about SRDF, the R1 device is the 'source' and the R2 is the target. This command declares that the R2 should be 'trashed' and have the R1 data replace it. Which is right, because it's only the R1 that we'll be actively extending.
symrdf -g devicegroup set mode acp_disk DEV001
symrdf -g devicegroup establish -full DEV001
Having done that, it's a case of leaving it to sync for a while. 'acp' is 'adaptive copy', which is less intensive than synchronous when there's a backlog (and there will be, as we've got 300GB to sync).
After that, it's dissolving the BCV meta device:
dissolve meta 2b61;
And then delete the devices:
delete dev 2b61:2b65;
About 10 hours work, over two evenings.
Yay for overtime, boo for working out of hours.
no subject
Date: 2008-07-11 07:55 am (UTC)
Yours sound more interesting though, since you're actually doing something - I'm just sitting around while someone else does an upgrade. As you say, Yay for overtime, boo for working out of hours.
no subject
Date: 2008-07-15 06:52 pm (UTC)
Mine did. Nicely volume expanded, and resynced over SRDF. Which was nice.
no subject
Date: 2008-07-16 12:45 pm (UTC)
Our Flare upgrade went fine, but the dial-in software took a couple of hours to get nowhere, only just sorted today.
Jon spotted on Friday that when Networker went in, the EMC consultant zoned the NAS in wrongly, so it's actually now in a configuration unsupported by EMC... EMC to correct it before we can go on with the next Flare / NAS upgrade. Fun isn't it :-)