How can I force a socket to send the data in its buffer?

  From Richard Stevens (rstevens@noao.edu):

  You can't force it.  Period.  TCP makes up its own mind as to when it
  can send data.  Now, normally when you call write() on a TCP socket,
  TCP will indeed send a segment, but there's no guarantee and no way to
  force this.  There are lots of reasons why TCP will not send a
  segment: a closed window and the Nagle algorithm are two things to
  come immediately to mind.

  (Snipped suggestion from Andrew Gierth to use TCP_NODELAY)

  Setting this only disables one of the many tests, the Nagle algorithm.
  But if the original poster's problem is this, then setting this socket
  option will help.

  A quick glance at tcp_output() shows around 11 tests TCP has to make
  as to whether to send a segment or not.

  Now from Dr. Charles E. Campbell Jr.  (cec@gryphon.gsfc.nasa.gov):

  As you've surmised, I've never had any problem with disabling Nagle's
  algorithm.  It's basically a buffering method; there's a fixed overhead
  for all packets, no matter how small.  Hence, Nagle's algorithm
  collects small packets together (no more than 0.2 sec delay) and thereby
  reduces the amount of overhead bytes being transferred.  This approach
  works well for rcp, for example: the 0.2 second delay isn't humanly
  noticeable, and multiple users have their small packets more
  efficiently transferred.  Helps in university settings where most
  folks using the network are using standard tools such as rcp and ftp,
  and programs such as telnet may use it, too.

  However, Nagle's algorithm is pure havoc for real-time control and not
  much better for keystroke interactive applications (control-C,
  anyone?).  It has seemed to me that the types of new programs using
  sockets that people write usually do have problems with small packet
  delays.  One way to bypass Nagle's algorithm selectively is to use
  "out-of-band" messaging, but that is limited in its content and has
  other effects (such as a loss of sequentiality).  By the way,
  out-of-band is often used for that ctrl-C, too.

  More from Vic:

  So to sum it all up, if you are having trouble and need to flush the
  socket, setting the TCP_NODELAY option will usually solve the problem.
  If it doesn't, you will have to use out-of-band messaging, but
  according to Andrew, "out-of-band data has its own problems, and I
  don't think it works well as a solution to buffering delays (haven't
  tried it though).  It is not 'expedited data' in the sense that exists
  in some other protocols; it is transmitted in-stream, but with a
  pointer to indicate where it is."
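
  For reference, setting TCP_NODELAY could look roughly like the sketch
  below.  The helper name disable_nagle() is made up for illustration;
  setsockopt(), IPPROTO_TCP and TCP_NODELAY are the standard interface.
  As noted above, this only removes the Nagle test; it does not force
  TCP to send.

```c
#include <netinet/in.h>    /* IPPROTO_TCP */
#include <netinet/tcp.h>   /* TCP_NODELAY */
#include <sys/socket.h>

/* Hypothetical helper: disable the Nagle algorithm on a TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int disable_nagle(int sockfd)
{
    int on = 1;
    return setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof on);
}
```

  Call it once after socket() or accept(), before the first write().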

  I asked Andrew something to the effect of "What promises does TCP make
  about when it will get around to writing data to the network?"  I
  thought his reply should be put under this question:

  Not many promises, but some.

  I'll try and quote chapter and verse on this:

  References:

       RFC 1122, "Requirements for Internet Hosts" (also STD 3)
       RFC  793, "Transmission Control Protocol"   (also STD 7)

  1. The socket interface does not provide access to the TCP PUSH flag.

  2. RFC1122 says (4.2.2.2):

     A TCP MAY implement PUSH flags on SEND calls.  If PUSH flags are
     not implemented, then the sending TCP: (1) must not buffer data
     indefinitely, and (2) MUST set the PSH bit in the last buffered
     segment (i.e., when there is no more queued data to be sent).

  3. RFC793 says (2.8):

     When a receiving TCP sees the PUSH flag, it must not wait for more
     data from the sending TCP before passing the data to the receiving
     process.

     [RFC1122 supports this statement.]

  4. Therefore, data passed to a write() call must be delivered to the
     peer within a finite time, unless prevented by protocol
     considerations.

  5. There are (according to a post from Stevens quoted in the FAQ
     [earlier in this answer - Vic]) about 11 tests made which could
     delay sending the data. But as I see it, there are only 2 that are
     significant, since things like retransmit backoff are a) not under
     the programmer's control and b) must either resolve within a finite
     time or drop the connection.

  The first of the interesting cases is "window closed" (i.e., there is
  no buffer space at the receiver; this can delay data indefinitely, but
  only if the receiving process is not actually reading the data that is
  available).

  Vic asks:

  OK, it makes sense that if the client isn't reading, the data isn't
  going to make it across the connection.  I take it this causes the
  sender to block after the receive queue is filled?

  The sender blocks when the socket send buffer is full, so buffers will
  be full at both ends.

  While the window is closed, the sending TCP sends window probe
  packets. This ensures that when the window finally does open again,
  the sending TCP detects the fact. [RFC1122, section 4.2.2.17]

  The second interesting case is "Nagle algorithm" (small segments, e.g.
  keystrokes, are delayed to form larger segments if ACKs are expected
  from the peer; this is what is disabled with TCP_NODELAY)

  Vic asks:

  Does this mean that my tcpclient sample should set TCP_NODELAY to
  ensure that the end-of-line code is indeed put out onto the network
  when sent?

  No. tcpclient.c is doing the right thing as it stands; trying to write
  as much data as possible in as few calls to write() as is feasible.
  Since the amount of data is likely to be small relative to the socket
  send buffer, then it is likely (since the connection is idle at that
  point) that the entire request will require only one call to write(),
  and that the TCP layer will immediately dispatch the request as a
  single segment (with the PSH flag, see point 2 above).

  The Nagle algorithm only has an effect when a second write() call is
  made while data is still unacknowledged. In the normal case, this data
  will be left buffered until either: a) there is no unacknowledged
  data; or b) enough data is available to dispatch a full-sized segment.
  The delay cannot be indefinite, since condition (a) must become true
  within the retransmit timeout or the connection dies.
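
  The two conditions above can be written down as a small predicate.
  This is only a simplified model of the rule as described here, not
  code from any real TCP implementation; the function name and
  parameters are made up for illustration, and a real stack applies
  further tests (such as the window check) before sending.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified model of the Nagle decision described above.
 * unacked_bytes: data sent but not yet acknowledged by the peer
 * queued_bytes:  data waiting in the send buffer
 * mss:           maximum segment size for the connection
 * nodelay:       true if TCP_NODELAY has been set */
bool nagle_allows_send(size_t unacked_bytes, size_t queued_bytes,
                       size_t mss, bool nodelay)
{
    if (queued_bytes == 0)
        return false;            /* nothing to send */
    if (nodelay)
        return true;             /* Nagle test is skipped entirely */
    if (unacked_bytes == 0)
        return true;             /* (a) no unacknowledged data */
    return queued_bytes >= mss;  /* (b) a full-sized segment is ready */
}
```

  With 100 bytes unacknowledged and a 10-byte write queued, the model
  holds the data back; once the ACK arrives (unacked_bytes == 0) the
  same 10 bytes are allowed out.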

  Since this delay has negative consequences for certain applications,
  generally those where a stream of small requests is being sent
  without response, e.g. mouse movements, the standards specify that an
  option must exist to disable it. [RFC1122, section 4.2.3.4]

  Additional note: RFC1122 also says:

     [DISCUSSION]:
        When the PUSH flag is not implemented on SEND calls, i.e., when
        the application/TCP interface uses a pure streaming model,
        responsibility for aggregating any tiny data fragments to form
        reasonable sized segments is partially borne by the application
        layer.

  So programs should avoid calls to write() with small data lengths
  (small relative to the MSS, that is); it's better to build up a
  request in a buffer and then do one call to sock_write() or
  equivalent.
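
  As a rough sketch of that advice: format the whole request into one
  buffer, then hand it to the socket in a single call.  Both function
  names and the "verb arg CRLF" request format are assumptions made up
  for this example; write_all() merely stands in for whatever
  "write everything" helper your program already has.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes, retrying on short writes
 * and EINTR.  Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical example: send "<verb> <arg>\r\n" as one buffer instead
 * of three small write() calls. */
int send_request(int fd, const char *verb, const char *arg)
{
    char req[512];
    int len = snprintf(req, sizeof req, "%s %s\r\n", verb, arg);

    if (len < 0 || (size_t)len >= sizeof req)
        return -1;               /* formatting error or request too long */
    return write_all(fd, req, (size_t)len);
}
```

  On an idle connection the request will normally fit in the send
  buffer, so the loop in write_all() runs only once.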

  The other possible sources of delay in the TCP are not really
  controllable by the program, but they can only delay the data
  temporarily.

  Vic asks:

  By temporarily, you mean that the data will go as soon as it can, and
  I won't get stuck in a position where one side is waiting on a
  response, and the other side hasn't received the request?  (Or at
  least I won't get stuck forever)

  You can only deadlock if you somehow manage to fill up all the buffers
  in both directions... not easy.

  If it is possible to do this (can't think of a good example though),
  the solution is to use nonblocking mode, especially for writes. Then
  you can buffer excess data in the program as necessary.
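
  A minimal sketch of the nonblocking approach, assuming a POSIX
  system.  The helper names are made up for illustration.  The idea is
  that write() returns EAGAIN (or EWOULDBLOCK) instead of blocking when
  the send buffer is full, and the program keeps the unsent remainder
  in its own buffer until select() or poll() reports the descriptor
  writable again.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put a descriptor into nonblocking mode.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: write as much of buf as the descriptor accepts
 * without blocking.  Returns the number of bytes accepted (possibly 0
 * when the buffer is already full), or -1 on a real error.  The caller
 * keeps buf[returned .. len-1] and retries later. */
ssize_t try_write(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n > 0) {
            done += (size_t)n;
            continue;
        }
        if (n == 0)
            break;                               /* nothing accepted */
        if (errno == EINTR)
            continue;                            /* interrupted: retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;                               /* buffer full: stop */
        return -1;
    }
    return (ssize_t)done;
}
```

  Because the program never blocks in write(), it stays free to read
  from the peer, which is what breaks the both-buffers-full deadlock.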


