2/7 I have a little multi-threaded server I'm writing, and I log at
the start of each call and at the end of each call. I log by
having a global lock file, lock, write, flush, unlock. This
seems like a bottleneck, is there a better way to log from a
multi-threaded server? Perhaps something like syslogd where I
could send messages to another process that would log for me?
(This would avoid the flush because it could keep running even if
the server crashes) Order in the log is not terribly important,
since a quick run through sort on the date will order that for me.
\_ What about a separate thread to manage the log? Put the msgs
in a shared queue, and have the logging thread write out the
messages later on.
\_ Not a bad idea, but if the server crashes I won't get the
messages just before the crash.
\_ There isn't really a way to avoid this problem in a
threaded app, except to move logging into a separate
process, such as syslog. If you are going to use
syslog(3), then you probably should openlog(3) prior
to starting your threads.
\_ How about each thread keeps its own log file, as well as writing
into a shared buffer which is flushed periodically to the common
log. That way in normal operation you have just one log to look
at, but when the server goes down, you can examine the per-thread
logs. In normal operation, other threads will not have to wait
while one thread flushes its own log.
\_ I don't think there is a problem with just using write(2), it is
atomic and writes to buffer cache so a crash of the process won't
be a problem. With some applications, mmap(2) is better, but
depends on what you are doing. --jwm
\_ Note that write(2) is only atomic if nbytes is less than
PIPE_BUF (which is at least 512 according to POSIX). That
said, if you know your log messages will be reasonably short,
this is the way I'd go. -gm
\_ I was assuming he wanted to write a file, and in that case
I suspect that the atomicity extends at least to the page
size, though I may be wrong. --jwm
\_ The POSIX guarantee is only for nbytes less than
PIPE_BUF (which is at least 512 according to POSIX). You
can also get short writes, even if nbytes is less than 512,
if you're writing to a pipe or other space-limited fd. -gm
\_ Really? I didn't know that. Related, why is it called
write(2) rather than just write()?
\_ That means it's in section 2 of the man pages, which is
the system call section. "man write" shows you the page
for the write utility to send messages to someone's tty.
"man 2 write" gets you the write system call.
\_ On Slowlaris: man -s 2 write
\_ Unix needs its own diversity day.
\_ Stop geeking and find a hot gf during undergrad before it's
too late!
\_ I'll add to my previous comment some speculation. If you're
using stdio or iostreams, I suspect your flush is not syncing to
disk, but merely calling write(2) to flush its internal buffer.
To test this you could use ktrace to see the calls it is making.
Of course you may still need a lock to protect the library
you are using, as I suspect these are not thread safe. And if
you don't like locks, you could use sprintf() and write(2) with
no locks. --jwm
\_ Tried using sprintf and write(2) with no locks, and it
doesn't quite work. The log file is a bit messed up. It
appears that concurrent calls to write can screw things
up. (Not threadsafe) But, it seems like in that case
the lock shouldn't be costing as much as I had supposed
anyway, since the write is just being buffered somewhere.
\_ Having every thread in your application serially accessing
a piece of code that does I/O is a really bad thing, you
really don't want to do this. Grabbing a lock and
sticking data on some list that another thread comes and
consumes will be a lot faster (just make sure that other
thread doesn't hold the lock while writing the data, or
you've lost everything you gained in the first place). As
for needing to get everything logged in case your
application crashes, you are never going to get that
anyway. If you have 10 threads waiting to grab the lock
and the server crashes those 10 logging statements will be
lost no matter what.
How often you log should influence your choices here: are
we talking tons of debugging logging or just one or two
lines a second? If a lot of logging, keep in mind that
things like gettimeofday are syscalls and those are more
expensive. Is it really that important that you get the
date of the log statement exactly right? Can you get the
time before or after you get the lock? If you really need
to order your logs, maybe just use a long that you
increment per statement?
Getting a lock is done entirely in user space (and
normally in just a few asm instructions) AS LONG AS THERE
IS NO QUEUE FOR THE LOCK. Locks get a lot more expensive
to use (by orders of magnitude) when any blocking has to
occur. If you are worried that a lot of threads are going
to be blocking at a time keep the critical section as
small as possible, it really helps.
Finally, while logging can be more complex than it looks
at first, it is also a pretty solved problem. There are
tons of free logging libraries out there that do all this,
do it well, and do it fast. It might be worth your while
to just use existing code.
\_ Interesting. Thanks. What could I search for to find some
of these logging libraries? I didn't have much luck last
time I tried. (The critical section was already tiny, just
the write and the flush. The string is all built outside
the critical section.) -op
\_ You do understand that I/O is THE most expensive thing
you can do, don't you? Just because it is only 2 lines
of C code doesn't mean the critical section is fast.