Pertinent Info
Mark Lloyd
mlloyd@cbis.com
Fri, 6 Jun 1997 15:27:45 -0400 (EDT)
Obviously 2.3 should follow the first occurrence of VxVM in my note below.
I apologize for the confusion.
Mark E. Lloyd
Cincinnati Bell Information Systems (CBIS)
voice : (513) 784 - 7455
email : mlloyd@cbis.com
> From owner-ssa-managers@Eng.Auburn.EDU Fri Jun 6 15:10:36 1997
> From: Mark Lloyd <mlloyd@cbis.com>
> Date: Fri, 6 Jun 1997 15:06:46 -0400 (EDT)
> To: ssa-managers@Eng.Auburn.EDU
> Subject: Re: Pertinent Info
> Mime-Version: 1.0
> Content-Transfer-Encoding: 7bit
> Content-Md5: WWFCEsfWaCyGOgv8gYJl9w==
> X-Info: To unsubscribe, send 'unsubscribe ssa-managers' to
>   majordom@Eng.Auburn.EDU in message body
>
> Am I the only one confused by this Alert? Certain parts seem to imply that
> if you are running VxVM then everything should be OK. However, another
> line states that they are working on a patch for VxVM version 2.3.
>
> What's reality ?
>
> Mark E. Lloyd
> Cincinnati Bell Information Systems (CBIS)
>
> voice : (513) 784 - 7455
> email : mlloyd@cbis.com
>
>
>
> > From owner-ssa-managers@Eng.Auburn.EDU Fri Jun 6 14:34:02 1997
> > From: Doug Hughes <Doug.Hughes@Eng.Auburn.EDU>
> > Date: Fri, 6 Jun 1997 13:30:44 -0500
> > To: ssa-managers@Eng.Auburn.EDU
> > Subject: Pertinent Info
> > X-Info: To unsubscribe, send 'unsubscribe ssa-managers' to
> >   majordom@Eng.Auburn.EDU in message body
> >
> > Just got this:
> >
> > (We've already fixed this here, by the way. We had been bitten by it
> > many, many times in the past, and submitted many, many bug reports before
> > it finally got fixed. It's nefarious: you go to move some subdisks.
> > Everything goes fine. Then, at some later time (usually within a week),
> > your computer crashes on "freeing free frag", or bad inode, or FS
> > corruption of some other kind. We're REALLY GLAD they finally fixed it.
> > It saves having to get up at 3am to fsck 40GB of data. Thank goodness
> > for remote consoles. Enough rambling.)
> >
> >
> >
> > ==========================================================================
> > SunService
> >
> > SUNSOLVE EARLYNOTIFIER(SM) ALERT
> >
> >
> > SunSolve EarlyNotifier Alert is published periodically to provide
> > SunService customers with the latest and most important technical
> > information regarding Sun hardware and software.
> >
> >
> > **************************************************************************
> > **************************************************************************
> >
> > DATE: May/22/97
> >
> > SYNOPSIS: "Bad parity" with RAID-5 and SPARCstorage Array Volume Manager
> > Software version 2.x.
> >
> > PRODUCT CATEGORY: Storage/Software
> >
> > PRODUCTS AFFECTED:
> >
> > Any SPARCstorage Array Volume Manager Version 2.X Software releases,
> > patched and unpatched, that support RAID-5 configurations.
> >
> > PART NUMBERS AFFECTED:
> >
> > N/A
> >
> > REFERENCES:
> >
> > BUGID#s 1223482, 1242923, 4010911, 4043658
> > ESC #s 509297, 508222, 506310, 504099, 504031
> >
> > PROBLEM DESCRIPTION:
> >
> > In all Veritas releases that supported RAID-5 prior to VxVM 2.3, some
> > maintenance functions of RAID-5 volumes under Veritas control can create
> > bad parity. This could occur during maintenance of the RAID-5 volume when
> > growing the volume or file system, moving subdisks, running vxevac, etc.
> >
> > Data corruption can be present in non-typical RAID-5 configurations
> > that have both of the following characteristics:
> >
> > - a RAID-5 column is split into multiple subdisks, where the
> >   subdisks do NOT end and begin on a stripe-unit aligned
> >   boundary
> >
> > - and a RAID-5 reconstruction operation was performed.
> >
> > The corruption occurs in the region of the RAID-5 column where the
> > split subdisks meet.
> >
> > The following is an example of a RAID-5 column in which the split
> > between two subdisks falls within a stripe-unit rather than on a
> > stripe-unit boundary. (Note the split in column 2, composed of
> > subdisk 2 and subdisk 4.)
> >
> > e.g.:
> > 3 column RAID-5 (using defaults):
> >
> >     Column 1              Column 2           Column 3
> >     =========             =========          =========
> >
> >     subdisk 1             subdisk 2          subdisk 3
> > +---------------+     +===============+  +---------------+
> > |stripe-unit    |     | (subdisk 2)   |  |               | stripe-width
> > |is 16k         |     |               |  |               | is 48k
> > +---------------+     +===============+  +---------------+
> > |               |     | (subdisk 2)   |  |               |
> > |               |     |               |  |               |
> > +---------------+     +===============+  +---------------+<- stripe-unit
> >        ...                   ...                ...          /  / boundaries
> > +---------------+     +===============+  +---------------+<-/  /
> > |               |     | (subdisk 2)   |  |               |    /
> > |               |     |               |  |               |   /
> > +---------------+     +=======+++++++++  +---------------+<-/
> > |               | +-->| (sd 2)+ (sd 4)+  |               |
> > |               | |   |       +       +  |               |
> > +---------------+ |   +=======+++++++++  +---------------+
> > |               | |   + (subdisk 4)   +  |               |
> > |               | |   +               +  |               |
> > +---------------+ |   +++++++++++++++++  +---------------+
> > |               | |   + (subdisk 4)   +  |               |
> > |               | |   +               +  |               |
> > +---------------+ |   +++++++++++++++++  +---------------+
> >        ...        |          ...                ...
> >        ...        |          ...                ...
> >        ...        |          ...                ...
> >                   |
> >                   |
> >                   |
> >                   Note that this stripe-unit for this RAID-5 volume
> >                   is "split" by 2 subdisks (i.e. the end region of
> >                   subdisk 2 and the beginning region of subdisk
> >                   4). Note also that subdisk 2 does not end on a
> >                   stripe-unit boundary, and that subdisk 4 does
> >                   not start on a stripe-unit boundary, but rather
> >                   somewhere within the stripe-unit.
> >
> >
> > If a RAID-5 column geometry has multiple subdisks where the subdisk
> > boundaries are not stripe-unit aligned, you may see data corruption
> > after a RAID-5 reconstruction operation in the RAID-5 volume at the
> > point of the split subdisks. If this RAID-5 volume contains a
> > filesystem, this could manifest itself as a failed "FSCK" pass.
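
The alignment condition described above comes down to simple modular
arithmetic. What follows is only a minimal sketch in Python, not anything
taken from the alert or from VxVM: the function names and the subdisk
lengths are hypothetical, lengths are given in KB within a single column,
and the 16k stripe-unit is the default quoted in the example.

```python
STRIPE_UNIT_KB = 16  # default stripe-unit size quoted in the alert's example


def is_stripe_unit_aligned(offset_kb, stripe_unit_kb=STRIPE_UNIT_KB):
    """Return True if a column offset falls exactly on a stripe-unit boundary."""
    return offset_kb % stripe_unit_kb == 0


def unaligned_splits(subdisk_lengths_kb, stripe_unit_kb=STRIPE_UNIT_KB):
    """Return the column offsets (in KB) where one subdisk ends and the next
    begins somewhere inside a stripe-unit instead of on a boundary.

    subdisk_lengths_kb lists the lengths of the subdisks that make up ONE
    column, in order (hypothetical input, not read from a real configuration).
    """
    splits = []
    offset = 0
    # The end of the last subdisk is the end of the column, not a split.
    for length in subdisk_lengths_kb[:-1]:
        offset += length
        if not is_stripe_unit_aligned(offset, stripe_unit_kb):
            splits.append(offset)
    return splits


# Hypothetical column: a 56 KB subdisk followed by a 40 KB subdisk.
# 56 is not a multiple of 16, so the split falls inside a stripe-unit.
print(unaligned_splits([56, 40]))
# Hypothetical column split on a boundary: 64 KB followed by 32 KB.
print(unaligned_splits([64, 32]))
```

A column for which this returns an empty list has every split on a
stripe-unit boundary, so it does not match the first characteristic
listed in the alert.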
> >
> >
> > CORRECTIVE ACTION:
> >
> > Upgrade the customer to VxVM 2.3 and execute the following procedure on
> > those RAID-5 volumes you suspect may be bad and for which no backup
> > exists. If a current backup exists, then restore any RAID-5 volume
> > suspected of having bad parity.
> >
> > How to repair the parity if suspected bad:
> >
> > A raid parity resync can be forced by doing the following:
> >
> > 1) Unmount the file system on the RAID-5 volume
> > 2) vxvol -g <diskgroup> stop <volume>
> > 3) vxmend -g <diskgroup> fix empty <volume>
> > 4) vxvol -g <diskgroup> start <volume>
> >
> > This will take a while and will resync the parity.
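
The four steps above could be scripted roughly as follows. This is a
sketch only: it builds and prints the commands rather than running them.
The vxvol and vxmend invocations are copied from the alert; the umount
call, the function name, and the disk group, volume, and mount point
names are assumptions made for illustration.

```python
def parity_resync_commands(diskgroup, volume, mount_point):
    """Return the command lines for a forced RAID-5 parity resync,
    in the order given in the alert. Nothing is executed here."""
    return [
        # Step 1: unmount the file system (umount is an assumed way to do this).
        ["umount", mount_point],
        # Steps 2-4: taken verbatim from the alert's procedure.
        ["vxvol", "-g", diskgroup, "stop", volume],
        ["vxmend", "-g", diskgroup, "fix", "empty", volume],
        ["vxvol", "-g", diskgroup, "start", volume],
    ]


# Hypothetical names; substitute your own disk group, volume, and mount point.
for cmd in parity_resync_commands("datadg", "vol01", "/export/data"):
    print(" ".join(cmd))
```

Printing the commands first makes it possible to review them before
running anything against a live volume.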
> >
> >
> > COMMENTS:
> >
> > Timing tests took 58 minutes to resync the parity on a 10 GB volume with
> > enough I/O load on the system to cause all other disks to be an average
> > of 75% busy.
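
For planning purposes, the quoted timing can be scaled to other volume
sizes. This is a rough sketch that assumes resync time grows linearly
with volume size under a comparable I/O load, which the alert does not
state; the function name is hypothetical.

```python
# Figures quoted in the alert: 58 minutes to resync a 10 GB volume.
MEASURED_MINUTES = 58
MEASURED_GB = 10


def estimate_resync_minutes(volume_gb):
    """Linearly scale the alert's timing to another volume size.

    Assumption: time is proportional to size at a similar system load.
    """
    return volume_gb * MEASURED_MINUTES / MEASURED_GB


# Example: the 40 GB of data mentioned earlier in this thread.
minutes = estimate_resync_minutes(40)
print(f"about {minutes:.0f} minutes ({minutes / 60:.1f} hours)")
```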
> >
> > A patch (for VxVM 2.3 only) is in test at this time and will be released
> > in the near future with a utility that will allow the checking of the
> > RAID-5 volumes to determine if they need to have their parity synced.
> >
> >
> > **************************************************************************
> > **************************************************************************
> >
> > Other SunService Information Resources
> > --------------------------------------
> > SunSolve Online including SunSolve Bulletin Board(SM)
> >
> > World Wide Web:
> > North America: http://sunsolve.sun.com
> > UK: http://online.sunsolve.sun.co.uk
> > France: http://sunsolve.sun.fr
> > Germany: http://sunsolve.sun.de/sunsolve
> > Switzerland: http://sunsolve.sun.ch
> > Japan: http://sunsolve.sun.co.jp/sunsolve
> > Australia: http://sunsolve1.sun.com.au
> >
> > Telnet access is available on each of the above servers
> > (except in Germany, Telnet to suninfo.sun.de).
> >
> > SunSolve CD-ROM(TM):
> > Updated and distributed every six weeks to SunService contract customers.
> >
> > For a complete list of patches, check the patch reports in the EarlyNotifier
> > data collection on SunSolve.
> >
> >
>