Sep. 21st, 2006

Well, the coffee provided is Nescafe gravy granules. Which isn't nice.
Thankfully Starbucks haven't closed down.

Useful info coming out of this though, this time about data alignment on our SAN.

A 'feature' of Windows, and 'intel' based systems, is the MBR (master boot record), and what it does to partition alignment on a SAN.
Basically, your default stripe element size on your SAN will be about 64k, but Windows uses 63 sectors (31.5k) for the MBR, so the first partition starts misaligned and a whole bunch of your IOs end up crossing a stripe boundary. For those, the array has to do twice the work.
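To convince myself of the maths, here's a rough sketch (assuming a 64k stripe element and 64k host IOs; those sizes are assumptions for the example, not something I've measured on ours):

# Why the default 63-sector (31.5k) partition offset hurts: with a 64k
# stripe element, every 64k IO from the host straddles two stripe elements
# instead of landing neatly in one.

SECTOR = 512                 # bytes per sector
STRIPE = 64 * 1024           # 64k stripe element size (assumed)

def crossings(offset_sectors, io_size=64 * 1024, n_ios=1000):
    """Count how many sequential IOs straddle a stripe element boundary."""
    start = offset_sectors * SECTOR
    crossed = 0
    for i in range(n_ios):
        first = (start + i * io_size) // STRIPE
        last = (start + (i + 1) * io_size - 1) // STRIPE
        if first != last:
            crossed += 1
    return crossed

print(crossings(63))    # default MBR offset: all 1000 IOs cross a boundary
print(crossings(128))   # realigned to 64k: none do

With the default 63-sector offset every single IO straddles two stripe elements; start the partition at 128 sectors and none of them do.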

The trick is to use diskpar/diskpart to re-align the partition (128 sectors on diskpar = 64k; with diskpart you align on 64, because it works in k already).
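For my own reference, the unit conversion between the two tools (assuming diskpar takes its offset in 512-byte sectors and diskpart's align value is in kilobytes, which is how I noted it down; worth double-checking against the versions we actually have):

SECTOR = 512

def diskpar_sectors(align_kb):
    """Offset in 512-byte sectors for diskpar, given a target alignment in k."""
    return align_kb * 1024 // SECTOR

print(diskpar_sectors(64))    # 128 sectors -> the 64k case above
print(diskpar_sectors(256))   # 512 sectors -> a full stripe on 4+1 RAID 5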

However, one of the other cool features on a clariion is 'full stripe writes' on RAID 5. It's a bit of a cheat: normally a RAID 5 write requires a read to recalculate parity, but if it can write the whole stripe at once (or most of it; it'll still read a little if you don't quite fill it), it doesn't need the read, which actually makes it more efficient than RAID 1 for sequential writing.

With RAID 1 you need two physical writes per host write. With RAID 5 using MR3 'full stripe writes' the only overhead is parity, so you might need just 5 physical writes per 4 host writes (on a 4+1 RAID 5).
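The back-of-the-envelope arithmetic, as I scribbled it down (the 4+1 geometry and whole-stripe writes are assumed; the function names are just mine for the sketch):

def raid1_writes(host_writes):
    # RAID 1 mirrors every host write: 2 physical writes per write
    return host_writes * 2

def raid5_full_stripe_writes(host_writes, data_disks=4):
    # assumes writes arrive as whole stripes, so no parity read is needed:
    # each full stripe costs data_disks + 1 (parity) physical writes
    stripes = host_writes / data_disks
    return stripes * (data_disks + 1)

print(raid1_writes(4))                 # 8 physical writes for 4 host writes
print(raid5_full_stripe_writes(4))     # 5.0 physical writes for the same 4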

But this also works notably better if properly aligned, so the suggestion is to align on your full stripe width: e.g. 256k on a 4+1 RAID 5, 192k on a 3+3 RAID 10, that kind of thing.
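The full stripe width is just data disks times the element size (64k assumed throughout), so I don't have to re-derive the numbers later:

def full_stripe_kb(data_disks, element_kb=64):
    """Full stripe width in k: data disks x stripe element size."""
    return data_disks * element_kb

print(full_stripe_kb(4))   # 4+1 RAID 5  -> 256k
print(full_stripe_kb(3))   # 3+3 RAID 10 -> 192k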

Better yet, VMware ESX has this problem too, but has no built-in utility to fix it. So you get to mount your volumes on a Windows machine _first_, set up an aligned partition there, and then mount them on VMware and format.
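No clever tooling for that, but here's a quick way to sanity-check whether the partition you end up with is actually aligned (the 512-byte sector size and the example boundaries are assumptions; plug in whatever your layout reports):

SECTOR = 512

def is_aligned(start_sector, boundary_kb=64):
    """True if the partition's starting byte offset sits on the boundary."""
    return (start_sector * SECTOR) % (boundary_kb * 1024) == 0

print(is_aligned(63))        # False - the default MBR layout
print(is_aligned(128))       # True  - realigned to 64k
print(is_aligned(512, 256))  # True  - aligned to a 256k full stripe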

Joy.

Looks like I'm getting some work when I get back.
Other useful snippets from training:
Engineering mode on clariion disk arrays is enabled by pressing ctrl+shift+f12; the password is 'messner'. However, this is one of those immensely powerful things that can lead to disaster, and to mockery from tech support, if you screw up.

Also, the queue on a clariion SP is limited to about 1600 outstanding items per service processor. Which is quite a lot. However, if you've just accepted the default settings for your LUNs on Windows hosts, each LUN has a queue depth of 32, 64 or sometimes 128, which means at about 50-100 LUNs you've got the possibility of overloading that queue. This will trigger SCSI QFULL errors back to your hosts. Which is bad. With PowerPath, or on UNIX, a QFULL leads to a 60 second 'throttle' on the queue, which basically means severely degraded performance.

Without PowerPath, the 'throttle' doesn't get managed automatically, so your Windows host will just stop queueing at all. This is bad. Either way, if you've got the potential for more than 1600 'queue items' per service processor, you can end up with transient, 60 second 'chugs' on your servers, with a fairly random distribution.

I have a feeling I've seen this happening. (100 LUNs x a queue depth of 16 brings you close to this limit, and I'm fairly sure we're about that mark on our clariion.)
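The same sum in code form, so I can plug real numbers in when I'm back at work (the 1600 figure is from the course and our LUN counts are from memory, so treat them as assumptions):

SP_QUEUE_LIMIT = 1600   # rough per-service-processor limit from the course

def sp_queue_demand(luns, queue_depth_per_lun):
    """Worst-case outstanding commands if every LUN fills its queue."""
    return luns * queue_depth_per_lun

demand = sp_queue_demand(luns=100, queue_depth_per_lun=16)
print(demand, demand >= SP_QUEUE_LIMIT)   # 1600 True - right on the limit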
