HP3000-L Archives

August 2001, Week 4

HP3000-L@RAVEN.UTC.EDU

Date: Mon, 27 Aug 2001 15:32:22 -0400

Define "not taking down the network"...

What we used to do was stop the network, without performing an ;ABORT, so
existing sessions could stay on, and we benighted users could still sign on
;HIPRI. Our quiesce job would abort all remaining sessions except for
operator.sys and our production support (PS) / scheduler users, and politely
take down various known jobs, then abort anything that was left. IIRC, we
would put messages to the console for any users and jobs that had to be
aborted, so that the operators could ignore them (I guess I'm still bitter
over the on-going disagreement about what Ops and PS were actually supposed
to do). We also left our third-party scheduler up, since it ran and
"monitored" the backup job for catastrophic failure. I wanted to leave up at
least some inetd services, such as Apache. We only used Apache to keep system
run info available online, and backup time was exactly when much of that info
might be needed. That never became a high enough priority for it to actually
happen.
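
As a rough illustration of the quiesce ordering described above (keep an allowlist, stop known jobs politely, abort the rest and tell the console), here is a minimal Python sketch. It is not the original job stream: the Process record, the KEEP_USERS set, the polite_stop flag, and the message wording are all hypothetical, and the code only plans the actions rather than issuing any MPE commands.

```python
from dataclasses import dataclass

# Hypothetical allowlist. The post names operator.sys plus the production
# support (PS) / scheduler users; only operator.sys is spelled out there.
KEEP_USERS = {"OPERATOR.SYS"}


@dataclass(frozen=True)
class Process:
    ident: str                 # hypothetical job/session id, e.g. "#S12"
    user: str                  # user.account
    is_job: bool               # True for a job, False for a session
    polite_stop: bool = False  # True if a clean shutdown is known for this job


def plan_quiesce(processes, keep_users=KEEP_USERS):
    """Split processes into (keep, stop politely, abort), in that priority."""
    keep, polite, abort = [], [], []
    for p in processes:
        if p.user.upper() in keep_users:
            keep.append(p)
        elif p.is_job and p.polite_stop:
            polite.append(p)
        else:
            abort.append(p)
    return keep, polite, abort


def console_messages(aborted):
    """One console line per aborted job or session, so Ops can see why."""
    return [f"QUIESCE: aborting {p.ident} ({p.user})" for p in aborted]


if __name__ == "__main__":
    procs = [
        Process("#S1", "OPERATOR.SYS", is_job=False),
        Process("#S2", "MGR.PROD", is_job=False),
        Process("#J3", "BATCH.PROD", is_job=True, polite_stop=True),
        Process("#J4", "ADHOC.PROD", is_job=True),
    ]
    keep, polite, abort = plan_quiesce(procs)
    for line in console_messages(abort):
        print(line)
```

Planning the three groups first, and acting on them afterwards, keeps the "who gets aborted" decision easy to review before anything is actually taken down.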

So, our backup job excluded those files that we already knew would not get
backed up. For instance, several of the job scheduler's files would be
locked; and we would switch the logs as part of the quiesce job, so that
most of what got logged made it to tape. And, I had some hackery to
dynamically determine our various log files and the stdlists of our backups
and other remaining jobs, so those would be excluded, and the number of
files not backed up was zero, unless there were problems.
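
The idea in the paragraph above can be sketched in a few lines of Python: merge the files known in advance to be locked with the ones discovered at run time, store everything else, and treat any file reported as not stored as a problem. This is only an illustration under assumptions. The function names and the file names are made up, and the real job would have obtained these lists from the system rather than from Python lists.

```python
def normalize(name):
    """Compare file names case-insensitively (an assumption of this sketch)."""
    return name.strip().upper()


def build_exclusions(known_locked, current_logs, job_stdlists):
    """Merge the static and dynamically determined exclusion lists."""
    merged = set()
    for group in (known_locked, current_logs, job_stdlists):
        merged.update(normalize(f) for f in group)
    return sorted(merged)


def files_to_store(all_files, exclusions):
    """Everything on the system except what we already know cannot be stored."""
    excluded = set(exclusions)
    return sorted(f for f in map(normalize, all_files) if f not in excluded)


def backup_is_clean(not_stored):
    """With known-busy files excluded up front, any miss signals a problem."""
    return len(not_stored) == 0


if __name__ == "__main__":
    # All file names below are invented for the example.
    exclusions = build_exclusions(
        known_locked=["SCHEDDB.DATA.SCHED"],
        current_logs=["LOG0042.PUB.SYS"],
        job_stdlists=["O123.OUT.HPSPOOL"],
    )
    todo = files_to_store(
        ["payroll.data.prod", "LOG0042.PUB.SYS", "SCHEDDB.DATA.SCHED"],
        exclusions,
    )
    print(todo)
    print(backup_is_clean([]))
```

The point of the design is that "zero files not stored" becomes a meaningful pass/fail signal, instead of a count the operators have to interpret each night.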

Now, our third-party scheduler recommended against this approach. So, we had
to add a few steps to our DRP, for how to get the scheduler working again
after executing the DRP.

These are, as I see them, probably the most important considerations. What
access will you allow or not allow, and how? What will not get backed up
because of these choices? How do you mitigate that? How do you recover a
fully functional system, should you need to restore or recover from a
disaster?

Then again, there are both Turbo-STORE's online options and Orbit's ZDT
backup. Those are designed for "always-online" systems.

Greg Stigers
http://www.cgiusa.com

-----Original Message-----
From: Carl McNamee [mailto:[log in to unmask]]
Sent: Monday, August 27, 2001 2:58 PM
To: [log in to unmask]
Subject: Backups and the network?


I've been wondering to myself lately "What are the consequences of not
taking down the network during our full backup?"  Having the network up
would be a big help with some of the automation I'm doing.

Thoughts?  Comments?  Criticism?

Carl McNamee
Systems Administrator
Billing Concepts
(210) 949-7282

* To join/leave the list, search archives, change list settings, *
* etc., please visit http://raven.utc.edu/archives/hp3000-l.html *
