tangentsoft.net/wskfaq/articles/io-strategies.html
One of Winsock 2's major features is that it ties sockets into Win32's unified I/O mechanism. In particular, you can now use overlapped I/O on sockets, which is intrinsically more efficient than the above options. Further confusing the issue are threads, because each of the above mechanisms changes in nature when used with threads.

In trying to find an answer to the "which I/O strategy" question, it becomes apparent that there are only a few major kinds of programs, and the successful ones follow the same patterns. From those patterns and practical experience -- some personal and some borrowed -- I have derived the following set of heuristics.

None of these heuristics are absolute laws, no one isolated heuristic is sufficient, and the heuristics sometimes conflict. When two heuristics conflict, you need to decide which is more important to your application and ignore the other. However, beware of ignoring a heuristic simply because violating it does not create noticeable consequences for your program. If you get into the habit of ignoring a certain heuristic, it becomes useless.

The heuristics are ordered in terms of compatibility, then speed, and finally functionality. Compatibility is first, because if a given I/O strategy won't work on the platforms you need to support, it doesn't matter how fast or functional it is. Speed is next because performance requirements are easy to determine, and often important. Functionality is last, because once you decide the compatibility and speed issues, your choices become much more subjective.

Heuristic 1: Narrow your choices by deciding which operating systems you need to support.

Your code may also need to be compatible with POSIX-based systems. Although there are a few different network and threading APIs used by the various POSIX-based systems, I'll only talk about BSD sockets and POSIX threads in this article.

No two of these operating systems have exactly the same set of networking features. You can exploit this fact to rule out I/O strategies that not all of your target operating systems support.

Where overlapped I/O calls work on Win9x, it is because the mechanism is emulated at the API layer. If, on the other hand, you stray into functionality that only WinNT 4+ provides, your application will fail on Win9x. One example of this is calling ReadFile() with a socket: this works fine on NT4+, but will fail on Win9x.

If you only need scatter/gather I/O support, BSD sockets provide this functionality in the readv() and writev() calls. There is no standard Unix mechanism that provides similar efficiencies to Win32's overlapped I/O. Some Unixes provide the aio_*() family of functions (called asynchronous I/O, but not related to Winsock's asynchronous I/O), but these are not widely implemented at the moment.

Although all current Unixes support POSIX threads, there are still a lot of older Unix machines out there with broken, nonstandard or nonexistent threading. You will have to choose a subset of all the Unixes if you want to use the same threading code on all of them. You'll definitely be writing different threading code for Windows, since its threading API is completely different.

Most of select()'s overhead is a linear function of the number of connections: double the number of connections, and you double the processing time. About the only time you should use select() is for compatibility reasons: it's the only non-blocking I/O strategy that works on all versions of Windows (including CE) and on virtually all POSIX-based systems. If your program only needs to work on non-CE versions of Windows, there are better alternatives.

Heuristic 3: Asynchronous sockets work best with low volumes of data.

Asynchronous Winsock I/O (WSAAsyncSelect()) isn't the most efficient I/O strategy, but it's not the least efficient, either. It's a fine way to go in a program that deals with low volumes of data. As the volume of data goes up, the overhead becomes more significant.
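As a rough illustration of the select() strategy described above as the most portable option, here is a minimal sketch in C for POSIX systems. The helper names wait_readable() and demo_wait_readable() are made up for this example and do not come from any library. The sketch uses socketpair(), which is POSIX-only, to get a connected pair of sockets without a network. A Winsock build would include <winsock2.h> instead of the POSIX headers, call WSAStartup() first, and obtain its sockets some other way.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper (not from the article): wait up to timeout_ms for fd
 * to become readable. Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int rc;

    /* A POSIX fd_set can only hold descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* POSIX wants the highest descriptor plus one as the first argument;
     * Winsock accepts the argument but ignores it. */
    rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}

/* Small self-check on a connected local socket pair. Returns 1 if select()
 * reported "not ready" before the write and "ready" after it. */
int demo_wait_readable(void)
{
    int sv[2];
    int before, after;
    ssize_t sent;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return 0;

    before = wait_readable(sv[0], 0);     /* nothing has been sent yet */
    sent = write(sv[1], "x", 1);
    after = wait_readable(sv[0], 1000);   /* one byte is now pending */

    close(sv[0]);
    close(sv[1]);
    return sent == 1 && before == 0 && after == 1;
}
```

A real server would add every connection's socket to the fd_set before each call and then test each one afterwards with FD_ISSET(). That per-socket work on every pass is one source of the linear overhead mentioned above.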
Heuristic 4: For high-performance servers, prefer overlapped I/O.

Of all the various I/O strategies, overlapped I/O has the highest performance. No other I/O strategy comes close to the scalability of overlapped I/O.

Heuristic 5: To support a moderate number of connections, consider asynchronous sockets and event objects.

If your server only has to support a moderate number of connections -- up to 100 or so -- you may not need overlapped I/O. Overlapped I/O is not easy to program, so if you don't need its efficiencies, you can save yourself a lot of trouble by using a simpler I/O strategy.

Programmed correctly, asynchronous sockets are a reasonable choice for a dedicated server supporting a moderate number of connections. The main problem with doing this is that many servers don't have a user interface, and thus no message loop. A server without a UI using asynchronous sockets would have to create an invisible window solely to support its asynchronous sockets. If your program already has a user interface, though, asynchronous sockets can be the least painful way to add a network server feature to it.

Another reasonable choice for handling a moderate number of connections is event objects. The main problem you run into with them is that you cannot block on more than 64 event objects at a time. To block on more, you need to create multiple threads, each of which blocks on a subset of the event objects. Before choosing this method, consider that handling 1024 sockets requires 16 threads (1024 / 64 = 16). Any time you have many more active threads than you have processors in the system, you start causing serious performance problems.

One caution: it's very easy to underestimate the number of simultaneous connections you will get on a public Internet server. It may make sense to design for massive scalability even if your estimates don't currently predict thousands of simultaneous clients.

Heuristic 6: Low-traffic servers can use most any I/O strategy.

For low-traffic servers, there isn't much call to be super-efficient.
Perhaps your server just doesn't see high traffic, or perhaps it's running a Windows 95 derivative and so is limited to 100 sockets at a time by the OS. Suitable strategies for 1-100 connections are event objects, non-blocking sockets with select(), asynchronous sockets, and threads with blocking sockets.

We've covered the first three methods already, so let's consider threads with blocking sockets. This is often the simplest way by far to write a server. You just have a main loop that accepts connections and spins each new connection off to its own thread, where it's handled with blocking sockets. This approach is efficient, because when a thread blocks, the operating system immediately lets other threads run. Also, synchronous code is more straightforward than equivalent non-synchronous code.

There are two main problems with thread-per-connection servers. First, threads often require a lot of synchronization work, which is hard to get right. Second, threads don't scale well at all: as the number of threads increases, the operating system overhead associated with context switches between the threads becomes significant. This method is only suitable for a fairly small number of connections, or a greater number of connections that are mostly idle.

Heuristic 7: Do not block inside a user interface thread.

This heuristic sounds more like a straightforward rule of Windows programming, but I bring it up because most programs are single-threaded.

Heuristic 8: For GUI client programs, prefer asynchronous sockets.

Asynchronous sockets were designed from the start to work well with GUI programs. You already have a message loop going, and you already have window management code in the rest of the program. Adding asynchronous network I/O is about as easy as adding a dialog to your program.

All of the alternatives require at least one additional thread to handle the networking in order to satisfy the previous heuristic. With asynchronous sockets, you can handle both the network and the UI with a single thread. Since window messages are handled one at a time in the order they arrive, everything is automatically synchronized.

Heuristic 9: Threads are rarely helpful in client programs...