Tcl Source Code

View Ticket
Login
Ticket UUID: 1945538
Title: Stalling async reads with package TLS
Type: Bug Version: obsolete: 8.5.2
Submitter: eriklns Created on: 2008-04-18 09:39:31
Subsystem: 26. Channel Transforms Assigned To: andreas_kupries
Priority: 8 Severity:
Status: Open Last Modified: 2013-01-14 21:38:25
Resolution: None Closed By:
    Closed on:
Description:
This report regards two issues regarding the usage of the TLS package with Tcl:

- empty reads when channels are readable according to [fileevent];
- stalling data transport for specific cases;

This ticket is a reference to the TLS tracker issue #1945535. It is also reported here since I can't tell which party (Tcl or TLS) can be held repsonsible.
User Comments: eriklns added on 2013-01-14 21:38:25:
Repeated the stalling exercises with Tcl8.6 (that has never been done before).

Conclusion: the stalling happens with 8.6 too.

ferrieux added on 2013-01-14 01:14:26:
Updating the title to only stalls. False positives are now documented, closing 3436609.

dkf added on 2011-11-13 05:11:09:
The [zlib] code consumes 1 byte at a time. Comments in that code indicate that this is to avoid problems with EOF handling (I don't know if it could be improved upon).

ferrieux added on 2011-11-12 00:30:40:
Created bug 3436609 to keep track of the more general context (all multibytes encodings and transforms).

ferrieux added on 2011-11-11 21:52:08:
15:33]ferrieuxdkf -- btw, how do *you* tackle the false-positive-fileevent issue in zlib ?
[15:40]ferrieuxIOW, do you "read ahead" ?
[15:40]dkfno
[15:42]ferrieuxso, zlib can encounter false positive just like tls
....
[15:47]ferrieuxso, do you confirm that on a low-level readable, you're *not* trying to swallow a full payload unit to see whether an upper [read 1] would succeed ?
[15:47]dkfcorrect
[15:48]ferrieuxokay, so my previous assumption holds
[15:49]ferrieuxQ: should we fix that in all multi-byte transforms (zlib, tls, but also -encoding utf8), or say that false positives are a fact of life ?
[15:50]dkfdon't know
[15:50]dkf:)

ferrieux added on 2011-11-11 20:44:25:
Hm, it just occurred to me that "false positive" readable fileevents are unavoidable as soon as a the two following conditions hold:
   (a) the transmission unit is >=2 bytes
   (b) no read-ahead is done in the notifier 

As an example, the very same situation lets an utf-8-encoded channel generate the same false positives in vanilla Tcl. 

To solve it, since (a) is a given for both utf-8 and TLS, there are two options:

 (1) attack (b) by letting the notifier do a (nonblocking) 1-unit read-ahead (into the existing input buffer) when triggered by a low-level readable fileevent, and passing it to the upper layer only when the unit is complete.

 (2) decide that this is a Fact Of Life and warn all developers that a fileevent readable does *not* guarantee an immediate return of [read 1]. Look into the source of TLS where this assumption is made.

eriklns added on 2010-06-16 23:39:47:
The stalling reads issue was explored at the script level. The exploration revealed a consistent pattern.

The conclusion (under reserve) reads:

Stalling occurs upon those specific occasions that a successful, script-level read command caused an internal buffer to become filled up or emptied exactly.

The complete report and the tools that come with this exploration have been added to the cross-filed bug report for the TLS extension: #1945535.

eriklns added on 2009-06-08 21:14:28:
Four more files have been found which exhibit the stalling behaviour.  They have been uploaded to the TLS tracker to which this issue is linked.

taylor_venable added on 2009-02-17 08:20:57:
At the recommendation of Alexandre Ferrieux, I'm adding my own experience to this problem.  When trying to use a tls::socket (version 1.6) with the http package (version 2.7.2) in Tcl 8.5.6 on OpenBSD-current (i686) the http status is often reported as "eof" after a request.  No data can be read in this situation, I simply have to retry until it works.  Failures occur usually between 2 and 8 times in a row for any given request.  I cannot tell what may be going wrong, but my code, along with some debugging output, can be found in this thread on comp.lang.tcl: http://groups.google.com/group/comp.lang.tcl/browse_thread/thread/b25695b10274b303

eriklns added on 2008-05-31 19:44:47:
Logged In: YES 
user_id=113903
Originator: YES

About the scope of this issue:

Please note that the scripts that exercise the issue are protocol-agnostic. The issue applies more generally to Tcl channels that have been "secured" using the TLS extension[*].

So, not only https, but also secured implementations of ftp, smtp, ... whatever, can be expected to suffer from this issue (maybe UDP based communications like DTLS are free of it).

[*] Note that I'm not blaming TLS yet !

ferrieux added on 2008-05-31 17:38:14:
Logged In: YES 
user_id=496139
Originator: NO

Andreas, Jeff, don't you think this one is scary ?
It cripples use of https in some applications.
What about prio 9 ?
What can we do to help you more on this ?

eriklns added on 2008-05-13 04:29:03:
Logged In: YES 
user_id=113903
Originator: YES

Responding to Kevin Kenny's request for a test against Tcl 8.5:
See below for the output of a run of strace.tcl (as it was already included in the previously uploaded .tar.gz).

Conclusion: both issues occur likewise with Tcl 8.5.2 (under Linux).

$ tclsh8.5
% source strace.tcl
CLIENT: 12300
SERVER: 12301
Process 12301 attached - interrupt to quit
Process 12300 attached - interrupt to quit
# REPORT
#
# Client choked on the gif data (total size = 61403 bytes).
# Bytes left in the channel: 4059
# Log details:
#    sock = sock3
#    totalRecv = 57344
#    emptyReads = 2
#    recvs = 0 0 8192 8192 8192 8192 8192 8192 8192
#
# Ways to get the remaining bytes from the socket:
# 1. close the socket at the server side (or kill the server)
# 2. do an extra [read sock3] regardless the absence of a readable event
#
# Invoking method 2 yielded 4059 extra bytes
# The complete gif data have been read, you may repeat [getgif] safely.
# REPORT
#
# Client choked on the gif data (total size = 61403 bytes).
# Bytes left in the channel: 4059
# Log details:
#    sock = sock3
#    totalRecv = 57344
#    emptyReads = 0
#    recvs = 8192 8192 8192 8192 8192 8192 8192
#
# Ways to get the remaining bytes from the socket:
# 1. close the socket at the server side (or kill the server)
# 2. do an extra [read sock3] regardless the absence of a readable event
#
# Invoking method 2 yielded 4059 extra bytes
# The complete gif data have been read, you may repeat [getgif] safely.
Process 12301 detached
Process 12300 detached
DONE
% info patch
8.5.2
% parray tcl_platform
tcl_platform(byteOrder)   = littleEndian
tcl_platform(machine)     = i686
tcl_platform(os)          = Linux
tcl_platform(osVersion)   = 2.6.18.8-0.9-default
tcl_platform(platform)    = unix
tcl_platform(pointerSize) = 4
tcl_platform(user)        = erik
tcl_platform(wordSize)    = 4

-- end of message

andreas_kupries added on 2008-05-06 00:37:21:
Logged In: YES 
user_id=75003
Originator: NO

I have general knowledge regarding channel stacking in the core.
I have no knowledge of TLS internals in particular.

kennykb added on 2008-05-05 20:25:48:
Logged In: YES 
user_id=99768
Originator: NO

I'll be happy to dive in and have a look as needed, but Andreas
is really the right one to look at this first; he's a lot
more familiar with TLS and channel stacking than I am.
(Since the problem appears to be specific to TLS, I
suspect that it's in the transform layer rather than the
notifier itself.)

One useful thing would be for the original submitter to
retest with a Tcl release in the 8.5 series; that would
save us some effort against the possibility of chasing
an already-fixed bug.

ferrieux added on 2008-05-05 13:30:15:
Logged In: YES 
user_id=496139
Originator: NO

Moving this to category "Notifier" instead of "None".
Also took the liberty to assign to Kevin, by analogy with other items of the same kind.
(Kevin -- hope you don't mind ;-)
And bumped to prio 8 because wrecking https is not good on Tcl's business card.

Here is a direct link to the TLS tracker entry:
http://sourceforge.net/tracker/index.php?func=detail&aid=1945535&group_id=13248&atid=113248