www.kegel.com/c10k.html
It's time for web servers to handle ten thousand clients simultaneously, don't you think? You can buy a 1000MHz machine with 2 gigabytes of RAM and a 1000Mbit/sec Ethernet card for $1200 or so. Let's see - at 20000 clients, that's 50KHz, 100Kbytes, and 50Kbits/sec per client. It shouldn't take any more horsepower than that to take four kilobytes from the disk and send them to the network once a second for each of twenty thousand clients. In 1999, one of the busiest ftp sites, cdrom.com, actually handled 10000 clients simultaneously through a Gigabit Ethernet pipe.
That kind of bandwidth is now being offered by several ISPs, who expect it to become increasingly popular with large business customers. And the thin client model of computing appears to be coming back in style -- this time with the server out on the Internet, serving thousands of clients. With that in mind, here are a few notes on how to configure operating systems and write code to support thousands of clients. The discussion centers around Unix-like operating systems, as that's my personal area of interest, but Windows is also covered a bit.
presentation about network scalability, complete with benchmarks comparing various networking system calls and operating systems. One of his observations is that the 2.6 Linux kernel really does beat the 2.4 kernel, but there are many, many good graphs that will give the OS developers food for thought for some time.
Unix Network Programming: Networking APIs: Sockets and XTI (Volume 1) by the late W. Richard Stevens. It describes many of the I/O strategies and pitfalls related to writing high-performance servers.
ACE, a heavyweight C++ I/O framework, contains object-oriented implementations of some of these I/O strategies and many other useful things. In particular, its Reactor is an OO way of doing nonblocking I/O, and its Proactor is an OO way of doing asynchronous I/O.
libevent is a lightweight C I/O framework by Niels Provos. It supports kqueue and select, and soon will support poll and epoll. It's level-triggered only, I think, which has both good and bad sides.
Poller is a lightweight C++ I/O framework that implements a level-triggered readiness API using whatever underlying readiness API you want (poll, select, /dev/poll, kqueue, or sigio).
benchmarks that compare the performance of the various APIs. This document links to Poller subclasses below to illustrate how each of the readiness APIs can be used.
rn is a lightweight C I/O framework that was my second try after Poller. It's LGPL (so it's easier to use in commercial apps) and C (so it's easier to use in non-C++ apps).
Matt Welsh wrote a paper in April 2000 about how to balance the use of worker thread and event-driven techniques when building scalable servers. The paper describes part of his Sandstorm I/O framework.
library - an async socket, file, and pipe I/O library for Windows

I/O Strategies

Designers of networking software have many options. Here are a few:

* Whether and how to issue multiple I/O calls from a single thread
  + Don't; use blocking/synchronous calls throughout, and possibly use multiple threads or processes to achieve concurrency
  + Use nonblocking calls (e.g. write() on a socket set to O_NONBLOCK) to start I/O, and readiness notification (e.g. poll() or /dev/poll) to know when it's OK to start the next I/O on that channel.
  + Build the server code into the kernel

1. Serve many clients with each thread, and use nonblocking I/O and level-triggered readiness notification

... set nonblocking mode on all network handles, and use select() or poll() to tell which network handle has data waiting. With this scheme, the kernel tells you whether a file descriptor is ready, whether or not you've done anything with that file descriptor since the last time the kernel told you about it.
That's why it's important to use nonblocking mode when using readiness notification. An important bottleneck in this method is that read() or sendfile() from disk blocks if the page is not in core at the moment; setting nonblocking mode on a disk file handle has no effect. The first time a server needs disk I/O, its process blocks, all clients must wait, and that raw nonthreaded performance goes to waste. This is what asynchronous I/O is for, but on systems that lack AIO, worker threads or processes that do the disk I/O can also get around this bottleneck. One approach is to use memory-mapped files, and if mincore() indicates I/O is needed, ask a worker to do the I/O, and continue handling network traffic.
In November 2003 on the freebsd-hackers list, Vivek Pai et al. reported very good results using system-wide profiling of their Flash web server to attack bottlenecks. One bottleneck they found was mincore (guess that wasn't such a good idea after all). Another was the fact that sendfile blocks on disk access; they improved performance by introducing a modified sendfile() that returns something like EWOULDBLOCK when the disk page it's fetching is not yet in core.

There are several ways for a single thread to tell which of a set of nonblocking sockets are ready for I/O:

* The traditional select()

Unfortunately, select() is limited to FD_SETSIZE handles. This limit is compiled into the standard library and user programs.
benchmarks) for an example of how to use poll() interchangeably with other readiness notification schemes. The idea behind /dev/poll is to take advantage of the fact that often poll() is called many times with the same arguments. With /dev/poll, you get an open handle to /dev/poll, and tell the OS just once what files you're interested in by writing to that handle; from then on, you just read the set of currently ready file descriptors from that handle.
according to Sun, at 750 clients, this has 10% of the overhead of poll(). Various implementations of /dev/poll were tried on Linux, but none of them performed as well as epoll, and they were never really completed.
kqueue() can specify either edge triggering or level triggering. With edge-triggered (readiness change) notification, the kernel tells you when a file descriptor transitions to ready, then assumes you know the file descriptor is ready, and will not send any more readiness notifications of that type for that file descriptor until you do something that causes the file descriptor to no longer be ready (e.g. until you receive the EWOULDBLOCK error on a send, recv, or accept call, or a send or recv transfers less than the requested number of bytes). When you use readiness change notification, you must be prepared for spurious events, since one common implementation is to signal readiness whenever any packets are received, regardless of whether the file descriptor was already ready.
It's a bit less forgiving of programming mistakes, since if you miss just one event, the connection that event was for gets stuck forever. Nevertheless, I have found that edge-triggered readiness notification made programming nonblocking clients with OpenSSL easier, so it's worth trying.
There are several APIs which let the application retrieve 'file descriptor became ready' notifications: * kqueue() This is the recommended edge-triggered poll replacement for FreeBSD (and, soon, NetBSD).
To change the events you are listening for, or to get the list of current events, you call kevent() on the descriptor returned by kqueue(). It can listen not just for socket readiness, but also for plain file readiness, signals, and even for I/O completion. Note: as of October 2000, the threading library on FreeBSD does not interact well with kqueue(); evidently, when kqueue() blocks, the entire process blocks, not just the calling thread.
This is just like the realtime signal readiness notification, but it coalesces redundant events, and has a more efficient scheme for bulk event retrieval. A patch for the older version of epoll is available for the 2.4 kernel.
unifying epoll, aio, and other event sources on the linux-kernel mailing list around Halloween 2002. It may yet happen, but Davide is concentrating on firming up epoll in general first.
some recent discussion

* Drepper's New Network Interface (proposal for Linux 2.6+)

At OLS 2006, Ulrich Drepper proposed a new high-speed asynchronous networking API.
LWN article from July 22

* Realtime Signals

This is the recommended edge-triggered poll replacement ...