Berkeley CSUA MOTD 2011/09/14

Berkeley CSUA MOTD:2011:September:14 Wednesday


2011/9/14-12/28 [Computer/SW/Unix] UID:54172 Activity:nil

9/12    We've restored CSUA NFS to something vaguely resembling normal
        functionality -- plus, with some luck, we should now have something
        vaguely resembling normal uptime, too!  Ping root@csua.org if you
        notice any problems.  --jordan
--------------------------------------------------------------------------------
        \_  Oh, and http://irc.CSUA.Berkeley.EDU is online again.
--------------------------------------------------------------------------------

2011/9/14-10/25 [Computer/HW/Drives] UID:54173 Activity:nil

9/13    Thanks to Jordan, our disk server is no longer virtualized. Our long
        nightmare of poor IO performance should hopefully be over. Prepare for
        another long nightmare of poor hardware reliability!
        ...
        Just kidding! (I hope)
        In any case, this means that cooler was taken out back and shot, and
        replaced with Keg, a real machine with real disks. Right now it's not
        running at 100%, but already you should notice that soda's not only
        fast, it's a fucking miracle compared to the past few years. I
        personally blame unforeseen edge cases in a poisonous combination of
        ZFS+NFS+OpenSolaris+1000s of users with a system too big to fail.
        Indeed, syncing the data away from cooler took two continuous weeks.
        It's no wonder it's taken until now for a very capable VP to be up to
        the task of partially unbreaking the setup. Note - we no longer have
        any VMs running off virtualized disks stored on an NFS-mounted disk
        that was itself virtualized. Hmmmmmmmm. Though those were mostly
        useless VMs you never saw. :P

        So anyways, as mentioned earlier, Keg isn't at 100%, but it's up. It
        looks good enough to keep for a bit, but it originally had a bunch of
        Raptors or some such. The disks are still there, but the RAID cards are
        most likely broken. We'll leave it to jordan to evaluate the server's
        needs and fix it accordingly. As it is, RAIDing fifteenish 10000RPM disks
        so you can edit motd SUPER-EXTRA FAST!!! is probably not a great use of
        time. We'll see where our less-shaky infrastructure takes us in the
        future. --toulouse
        \_ cooler is dead. Long live KEG!
        \_ Good work guys, thanks! #1 lesson here: don't virtualize disk i/o
           intensive applications. -ausman
           \_ That is a good lesson but definitely not the #1 lesson.
              * Exporting thousands of filesystems: bad idea, no matter how
                easy it makes backups and ZFS snapshotting.
              * Using an OS with superior filesystem support is a bad long-term
                solution if nobody but the original installer knows much of
                anything about it.
              * Choosing ZFS...the jury's still out.
              * Maintaining FreeBSD 7 and 8 and OpenSolaris and Debian...kinda
                hard.
              * All of this, on top of virtualized disk i/o - bad news.
              \_ Even after I collapsed NFS down to one filesystem, when our
                 FreeBSD boxes came back online and started automounting
                 thousands of filesystems apiece, the NFS server again ground
                 to a screeching halt (taking soda and friends with it).
                 Switching to one /home mount per server restored NFS's
                 snappiness; I suspect that even a virtualized NFS server could
                 perform well without the filesystem woes.  --jordan
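
A rough sketch of the arithmetic behind jordan's point, assuming each client
mounts either one filesystem per user or a single /home. The client and user
counts, the export path, and the helper names are hypothetical illustrations,
not CSUA's actual configuration; only the hostname keg comes from the post
above.

```python
def mount_count(clients, users, per_user_filesystems):
    """Return how many NFS mounts the server has to service in total."""
    # With one exported filesystem per user, every client mounts all of them.
    # With a single /home export, every client holds exactly one mount.
    mounts_per_client = users if per_user_filesystems else 1
    return clients * mounts_per_client


def fstab_line(server, export, mountpoint):
    """Build an fstab-style entry for a single NFS mount (illustrative only)."""
    return f"{server}:{export} {mountpoint} nfs rw 0 0"


if __name__ == "__main__":
    clients, users = 4, 2000  # hypothetical numbers
    print("per-user filesystems:", mount_count(clients, users, True))
    print("single /home mount:  ", mount_count(clients, users, False))
    # Assumed export path; the real one is not given in the post.
    print(fstab_line("keg", "/home", "/home"))
```

The total grows with clients times users in the first layout and with clients
alone in the second, which fits the report that dropping to one /home mount
per server restored NFS's snappiness.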

2011/9/14-21 [Computer/SW/Unix] UID:54174 Activity:nil

9/13    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
        eiusmod tempor incididunt ut labore et dolore magna aliqua.
