Multiple snapshot problem.
Richard Yates SPG
R.J.Yates@open.ac.uk
Tue, 20 Jul 1999 17:37:09 +0100 (BST)
Please use a fixed-width font to read this.
We have three independent systems, abc, def and ghi, with local file
systems abc1, def1 and ghi1, which contain Ingres databases. System def is
connected to the SSA of abc and the A5200 of ghi, each of which contains
sufficient extra disks to mirror def's local file system def1:
         abc                     ---def---                    ghi
       /     \        .---------' /     \ '----------.         |
     /         \     /          /         \           \        |
+------+     +------+      +------+     +------+        +------+------+
| Root=========Root |      | Root=========Root |        | Root===Root |
|      |     |      |      |      |     |      |        |             |
| abc1=========abc1 |   =====def1=========def1=====     | ghi1===ghi1 |
|      |     |      |  //  |      |     |      |  \\    |             |
|      |     | def1====    | def2=========def2 |    ======def1        |
+------+     +------+      +------+     +------+        |             |
   SSA          SSA           SSA          SSA          | def1===def1 |
                                                        +------+------+
                                                             A5200
The aim of our procedure is to copy and transfer def1 onto abc and ghi
for read access. The picture above shows the situation at transfer
time. The == denote mirrors or snapshots (N.B. UFS). def1 on ghi exists
as a mirrored volume for about 23hrs 59 mins of the day. def1 on abc
exists for about 22 1/2 hours of each day - in the picture, it belongs
to def.
So:
At transfer time, the disks on abc's SSA which comprise a snapshot plex
of def1 are snapshotted and transferred to the control of abc, where
they are mounted and used. At a certain time the disks go back to def
for snapstart.
Just after, the snapshot plex of def1 on the A5200 of ghi is snapshotted,
and control of these disks is passed to ghi; the database server running
on the already existing volume of def1 is stopped, the existing volume
sidelined, the new snapshot is mounted, and the database server started.
The old mirrors of def1 on ghi are then used to mirror the new snapshot
volume of def1; when that operation is finished, the snapshot disks are
removed again and "sent back" to system def for snapstart.
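
To make the ownership changes in the cycle above easier to follow, here
is a minimal Python sketch that only tracks which host controls each
snapshot plex at each step. It does not call the volume manager; the
class, the labels and the order in which the disks return to def are
assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SnapshotPlex:
    label: str      # illustrative label, not a real plex name
    owner: str      # host currently controlling the disks
    attached: bool  # True while the plex is attached to def1 on def


def transfer(plex, new_owner):
    """Split the plex off def1 and hand its disks to new_owner."""
    if plex.owner != "def" or not plex.attached:
        raise RuntimeError(plex.label + " is not available on def")
    plex.attached = False
    plex.owner = new_owner


def send_back(plex):
    """Return the disks to def, where snapstart re-attaches them."""
    plex.owner = "def"
    plex.attached = True


def daily_cycle():
    for_abc = SnapshotPlex("snapshot plex on abc's SSA", "def", True)
    for_ghi = SnapshotPlex("snapshot plex on ghi's A5200", "def", True)
    log = []
    transfer(for_abc, "abc")
    log.append(("abc mounts def1 snapshot", for_abc.owner))
    transfer(for_ghi, "ghi")
    log.append(("ghi swaps in new def1 snapshot", for_ghi.owner))
    # Assumed order: ghi's disks return first, abc's much later in the day.
    send_back(for_ghi)
    log.append(("ghi snapshot disks back on def", for_ghi.owner))
    send_back(for_abc)
    log.append(("abc snapshot disks back on def", for_abc.owner))
    return log


for step, owner in daily_cycle():
    print(step, "->", owner)
```

The point of the sketch is that each transfer assumes a particular plex
is attached to def1 on def at that moment; anything that hands over a
different plex breaks that assumption.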
But:
There is a problem. def1 runs with two mirrors and two snapshots. At
snapshot time, sometimes the procedure that transfers the disks between
def and ghi grabs the snapshot destined for abc, and breaks. The procedure
that transfers the snapshot between def and abc then can't access its
disks, and breaks as well.
Is it possible to predict which snapshot plex will be taken by a snapshot
when a volume has more than one snapshot plex? It seems to be first in,
first out, but something still goes awry! Could this be because of the
difference in disks? The SSAs use 4.2GB drives (10 per plex), the A5200
uses 9GB drives (five per mirror/snapshot of def1).
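
The failure can be pictured with a small Python model. It assumes the
first in, first out rule guessed at above, i.e. that a snapshot request
is given whichever snapshot plex became ready earliest. That rule is
only a guess, and the plex names and ready times below are made up for
illustration.

```python
def pick_fifo(ready_times):
    """Return the name of the plex that became ready earliest (assumed rule)."""
    return min(ready_times, key=ready_times.get)


def run_transfers(ready_times, request_order):
    """Each procedure asks for 'a snapshot' and receives the FIFO pick.

    Returns {host: (wanted_plex, received_plex)}.
    """
    remaining = dict(ready_times)
    result = {}
    for host in request_order:
        wanted = "plex-on-" + host
        got = pick_fifo(remaining)
        del remaining[got]
        result[host] = (wanted, got)
    return result


def mismatches(result):
    """Hosts whose procedure received a plex on the wrong enclosure."""
    return [host for host, (wanted, got) in result.items() if wanted != got]


# Hypothetical ready times, in minutes after midnight.
ready = {"plex-on-abc": 60, "plex-on-ghi": 90}

print(mismatches(run_transfers(ready, ["abc", "ghi"])))
print(mismatches(run_transfers(ready, ["ghi", "abc"])))
```

Under this assumption the outcome depends only on the order in which
the plexes became ready and the order in which the procedures ask, so a
change in either would be enough to hand ghi's procedure the plex meant
for abc. As for the disks, the nominal raw capacity is 10 x 4.2GB = 42GB
per plex on the SSAs against 5 x 9GB = 45GB on the A5200, so the two
snapshot plexes differ in size and layout; whether that influences the
choice is the open question.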
I'm testing on an old SSA on a single system with a mixture of disks, but
any ideas will be welcome.
Richard.
--
The Open University is not responsible for content herein, which may
be incorrect and is used at reader's own risk.