5/25 I just installed a dedicated Linux software RAID server with five ATA
133 drives. It seems to run decently. I expect to get heavier usage
as people start using it (It's a CVS repository for our office).
Will I get much more out of a hardware RAID and/or switching to
SATA? I also plan to migrate people into the box with NFS/automount
for home directories. Also, what's a good backup system for a
terabyte RAID? DVD-RWs? 8 gig tapes?
\_ SATA: not really.
Hardware raid: possibly because it's nice to have all those XORs
done for you in hw but 3ware is the major ide raid hw card maker
and 3ware sucks.
Backup system? These days multi-TB systems aren't backed up unless
the owner has big bucks. Tapes are almost as expensive as
getting a second unit and mirroring across systems. DVDRW?
Do you really want each backup to take 3 dozen or more dvds?
What slave is doing that job?
\_ that's ridiculous; anyone who has important data backs it up.
Mirroring doesn't help you if a file is corrupted or accidentally
deleted. Yes, tapes are expensive, but so what? -tom
\_ Ah yes, we once again hear from the loud but ignorant and
illiterate as well. There are lots of small companies that
have >1tb of data that cannot afford to back it up. They
do not get funding from taxpayer dollars. They do not make
grant proposals for the perfect system which is then paid
for by someone else. When you have some real world
experience with budgets and risk come back to the motd and
we'll talk about your childish notion of "tapes are
expensive, so what?" idiocy. If the guy had infinite money
from the tax payers he'd be doing 1+0 on all his data, have
an offsite location to copy snapshots at internet2 speeds,
and do full backups at both sites every day. But he doesn't
'work' for the university and isn't sucking off the tax
payer's teat, like some ignoramuses around here.
\_ gee, the university has such infinite money that half
of the network infrastructure is still shared 10 megabit!
In the Math department they still have 50 Sparc 2s in
service. Let me know where I can go to run these
"perfect systems". -tom
\_ I see, so how does a poor person afford all of these
tbs worth of tapes and where did a poor uni worker
like yourself get the idea that money is no object
when it comes to data integrity? Which way is it?
\_ Presumably if you have data, it's worth something.
For anyone with important data, the cost of
tapes is less than the cost of losing the data.
MTBF on human errors is much smaller than on
disks--the majority of requests we get for
file restores are due to user error, not disk
failure. Unless your data are read-only, you
need tape backup. And it's really not that
expensive. -tom
\_ You're missing out on cost/risk. Not all data
is worth 100% reliability on backups. If some
student loses their homework answers is it
really worth adding 80% of your costs to your
file system purchase to recover their hw from
last week for them? Only to the student who
isn't paying for it. Also, tapes *are*
expensive if you're doing enough backups to make
them worth doing. Tapes wear and break. Drives
do also but quality drives get a 3 year
warranty and even crap drives get a 1 year so
you're ok with DOAs and early deaths. And the
thing you keep ignoring is doing tape backups on
multi terabyte systems takes a fucking long
time, restores are even more painful, and you
need some *very* expensive robots and software
to keep track of all that. This isn't your
granddaddy's world of dump/restore anymore.
In short, you just don't know what you're
talking about which is understandable since you
don't have to do real budgeting or cost/benefit
analysis or risk assessment. Finally, all data
is not necessarily worth backing up. Some data
is your entire life and must be, other data
should be but it's worth the risk or doing some
kludge, and other data can be recreated or it's
ok to lose it.
\_ Dude, if you can't figure out why tape backups are a
good thing, get out of the business. Or your CTO will
just kick your ass out for being an idiot.
\_ I know all about tape backups. Can you figure out why
you'd spend precious money on tapes when they cost
almost as much as drives and it could take days or
even weeks to do a tape restore? Have you ever dealt
with multiple tb of data before? If you had you
wouldn't see tapes as some backup panacea. They have
high cost and other issues on large data sets since
hard drive sizes and speeds have grown by leaps and
bounds while tapes have done very little relatively
speaking in the last 20 years.
\_ I remember, just a couple of years ago, when it was
the VC backed firms that blew all their cash on
over engineered systems. How the worm turns.
\_ Over engineered? Naw, it all went to salaries for
stupid useless people on sales and marketing so they
could keep up their coke habits.
\_ Indeed. Just clone the box and backup to that.
http://rdiff-backup.stanford.edu
\_ Why would I use this instead of rsync? What are the
advantages?
\_ It's basically an incremental rsync. All the
advantages of rsync with the ability to keep a
week's worth of changes in just a bit more space.
I think it even uses rsync internally.
\- hey does anybody know if rsync or some other
tool can make sort of an "incremental blob" ...
say i rsync from A to B at t0. i would
like to compute the "incremental" at t1 and
store that to some "blob" C, which could
be "applied" to B [say via tar or some rsync
merge option] to turn B into an image of
A at t1. basically like a patch diff. so this
way you could say rsync A->B on sunday and then
just store the "diff blobs" for mon, tue etc.--psb
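\_ rsync does have batch options (--only-write-batch to save the
changes to a file, --read-batch to apply them later) that are
roughly this. If you just want to see the idea, here is a minimal
sketch in Python, not a replacement for rsync: it hashes every
file under A, tars up whatever changed since the last snapshot
plus a list of deletions, and replays that onto B. The names
snapshot, make_blob and apply_blob are made up for illustration,
whole files are stored rather than rsync-style deltas, and
permissions and mtimes are ignored.

```python
import hashlib
import io
import json
import os
import tarfile


def snapshot(root):
    """Map each relative file path under root to a SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            with open(full, "rb") as fh:
                digests[rel] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def make_blob(src_root, old_snapshot, blob_path):
    """Write a tar 'blob' of files changed since old_snapshot, plus a delete list.

    blob_path must live outside src_root. Returns the new snapshot so the
    caller can use it as the baseline for the next day's blob.
    """
    new_snapshot = snapshot(src_root)
    changed = [p for p, d in new_snapshot.items() if old_snapshot.get(p) != d]
    deleted = [p for p in old_snapshot if p not in new_snapshot]
    with tarfile.open(blob_path, "w:gz") as tar:
        for rel in sorted(changed):
            tar.add(os.path.join(src_root, rel),
                    arcname="files/" + rel.replace(os.sep, "/"))
        manifest = json.dumps({"deleted": sorted(deleted)}).encode()
        info = tarfile.TarInfo("manifest.json")
        info.size = len(manifest)
        tar.addfile(info, io.BytesIO(manifest))
    return new_snapshot


def apply_blob(blob_path, dst_root):
    """Replay a blob onto dst_root: overwrite changed files, remove deleted ones.

    Only use this on blobs you created yourself; member names are trusted.
    """
    with tarfile.open(blob_path, "r:gz") as tar:
        manifest = json.load(tar.extractfile("manifest.json"))
        for member in tar.getmembers():
            if member.isfile() and member.name.startswith("files/"):
                rel = member.name[len("files/"):]
                target = os.path.join(dst_root, *rel.split("/"))
                os.makedirs(os.path.dirname(target), exist_ok=True)
                with open(target, "wb") as out:
                    out.write(tar.extractfile(member).read())
    for rel in manifest["deleted"]:
        path = os.path.join(dst_root, rel)
        if os.path.exists(path):
            os.remove(path)
```

So you'd take snapshot(A) right after the sunday rsync, then call
make_blob each night with the previous snapshot, and apply the
mon, tue, ... blobs to B in order to roll it forward.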
\_ It depends on your cpu power and usage. linux software raid does
use up a fair amount of cpu. But unless you're running some
cpu intensive services as well on some older hardware, you'll be
fine with software raid. Just a few years back, the fastest IDE
RAID performance you could get was out of a software RAID. As the
above has mentioned, 3ware sucks ass, especially in linux. I've
wasted a lot of time on drives dropping out and data corruption
on a 3ware, but have yet to have any problem with any of the
software RAIDs I've done. To make sure you get maximum disk
bandwidth, make sure each IDE channel has only one drive attached
to it.
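\_ For anyone wondering what "all those XORs" are: assuming the
five drives are set up as RAID 5 style parity (the original post
doesn't say which level), the parity block is just the XOR of the
data blocks in a stripe, and a lost block is rebuilt by XORing
the survivors with the parity. A toy sketch in Python, with
made-up 4 byte blocks and a hypothetical xor_blocks helper; the
real thing is done in the kernel on much larger chunks, with
parity rotated across the drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte column at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four hypothetical data blocks, one per data drive; a fifth block holds parity.
data = [b"cvs-", b"repo", b"home", b"dirs"]
parity = xor_blocks(data)

# Simulate losing the third drive and rebuilding its block
# from the three surviving data blocks plus the parity block.
survivors = data[:2] + data[3:]
rebuilt = xor_blocks(survivors + [parity])
```

Every write has to recompute that parity, which is where the cpu
goes in software raid and what a hardware card would take off
the host.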
\_ You should be able to get 20-30 GB tapes running off a robot. For
1TB, you may need multiple tape drives to streamline backups. How
big are your filesystems? How vital is the data in each FS? What is
your data turnover (how much is altered daily)? What is your network
like? The answers will help you analyze your backup needs.
\_ What's a 128 tape multi drive tape robot go for these days?
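\_ Before pricing a robot, it's worth running the numbers on how
many pieces of media one full backup takes. A quick sketch in
Python, assuming the array is full at 1 TB (1000 GB decimal) and
no compression; the 8 GB and 20-30 GB tape sizes are the ones
mentioned in this thread, and 4.7 GB is the nominal single-layer
DVD-RW capacity.

```python
import math

DATA_GB = 1000  # assumed: a full 1 TB array, decimal gigabytes, no compression

# Media capacities in GB. Tape sizes are the ones named in the thread;
# 4.7 GB is the nominal single-layer DVD-RW capacity.
MEDIA_GB = {
    "DVD-RW (4.7 GB)": 4.7,
    "8 GB tape": 8,
    "20 GB tape": 20,
    "30 GB tape": 30,
}


def media_needed(data_gb, capacity_gb):
    """Number of whole discs or tapes needed to hold data_gb."""
    return math.ceil(data_gb / capacity_gb)


counts = {name: media_needed(DATA_GB, cap) for name, cap in MEDIA_GB.items()}
for name, n in counts.items():
    print(f"{name}: {n} per full backup")
```

Under those assumptions a full dump is 213 DVD-RWs, 125 of the
8 gig tapes, or 34 to 50 of the 20-30 GB tapes, and that is for a
single full backup with no rotation.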