Tcl Source Code

Ticket UUID: ed29c4da21f4c0a7858a0f5ae4aae20dd42fc818
Title: http::get -channel broken on MacOS
Type: Bug
Version: 8.5.16
Submitter: anonymous
Created on: 2014-10-09 19:29:48
Subsystem: 16. Commands A-H
Assigned To: dgp
Priority: 5 Medium
Severity: Severe
Status: Closed
Last Modified: 2014-10-10 19:45:06
Resolution: Fixed
Closed By: dgp
Closed on: 2014-10-10 19:45:06
Description:
The command http::geturl -channel is broken on MacOS 10.9.5 (and likely other versions). Transfers fail while writing to the channel with "resource temporarily unavailable". Transfers seem to work if -channel is not specified. Here is an example:

package require http

set outname "saturn.jpg"
set url "http://apod.nasa.gov/apod/image/1409/saturnequinox_cassini_7227.jpg"
set out [open $outname w]
set token [::http::geturl $url -channel $out]
close $out

On MacOS with ActiveState Tcl 8.5.16 I see:
localhost$ wish tcl\ http\ example.tcl
Error in startup script: error reading "sock8": resource temporarily unavailable
    while executing
"::http::geturl $url -channel $out"
    invoked from within
"set token [::http::geturl $url -channel $out]"
    (file "tcl http example.tcl" line 6)
User Comments:

dgp added on 2014-10-10 19:45:06:
Completed fix committed to trunk.

dgp added on 2014-10-10 18:24:54:
Fix committed to the 8.5 branch.

dgp added on 2014-10-10 15:42:37:
Spoke too soon.

New test io-53.15 committed to both core-8-5-branch and trunk.

Demonstrates the bug in both branches.  MBRead() copied the
same bug over.

The http script doesn't demo the problem in 8.6.  I have to
guess that all of the [socket] channel changes (IPv6, etc.)
have had the effect that sockets just do not raise EAGAIN
in the same circumstances.

dgp added on 2014-10-10 13:42:30:
In the demo script, the bug is in [chan copy] aka [fcopy].
That's the easiest route for scripts to get to where the
problem is.  Most other script-level channel read operations
do not go there.

The problem doesn't show in Tcl 8.6 with this script because
the same conditions that route to the bug in 8.5 route to the
new optimized "MoveBytes" version of [chan copy] in 8.6, so
the bug is dodged there, at least in this instance.
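
For readers who want to see the script-level route described above, here is a minimal sketch of a background [fcopy] from a non-blocking socket into a file. It is not the io-53.15 test and is not claimed to reproduce the failure; it only shows the kind of call that http::geturl -channel ends up making. The proc names serve and copyDone, the output file name, the payload sizes and the timings are all assumptions made for illustration.

```tcl
# Sketch only: a local server trickles data to a non-blocking client socket,
# and [fcopy] moves the bytes into a file. All names here are illustrative.
proc serve {chan addr port} {
    fconfigure $chan -translation binary -buffering none
    # Send the payload in pieces with pauses in between, so the reader
    # repeatedly finds the socket empty before more data arrives.
    after 10  [list puts -nonewline $chan [string repeat a 4096]]
    after 60  [list puts -nonewline $chan [string repeat b 4096]]
    after 110 [list close $chan]
}

# Callback for fcopy: receives the byte count and, on failure, an error message.
proc copyDone {in out bytes {err {}}} {
    close $in
    close $out
    set ::result [list $bytes $err]
}

# Port 0 asks the OS for any free port; read the real port back afterwards.
set server [socket -server serve -myaddr 127.0.0.1 0]
set port [lindex [fconfigure $server -sockname] 2]

set in [socket 127.0.0.1 $port]
fconfigure $in -blocking 0 -translation binary
set out [open fcopy-demo.out w]
fconfigure $out -translation binary

# Background copy; the event loop drives it until EOF or an error.
fcopy $in $out -command [list copyDone $in $out]
vwait ::result
close $server
puts "copied [lindex $result 0] bytes, error: '[lindex $result 1]'"
```

An empty error string in the final line means the copy ran to EOF without the channel layer reporting a failure.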

dgp added on 2014-10-10 01:07:30:
Thanks for isolating the cause.

Key to making progress is to look at [5180649ac5].
Comment there is "Same results; simpler logic" but now
we know that's false, because that's the checkin that
broke this demo script.

Anyone who needs a fix *NOW* can apply the backout
of that patch, and it should work.

Mystery why this fails on 8.5, but doesn't seem to affect
8.6.2.  After I further analyze the detailed cause, it may
be that some other demo script can demo similar troubles there.

aku added on 2014-10-09 22:48:17:
Looking at the quite short delta I suspect that the removal of the GotFlag check for CHANNEL_BLOCKED is the problem. That is triggered by EWOULDBLOCK IIRC, and that matches the issue we now have.

aku added on 2014-10-09 22:39:33:

Bisection result:

bisect complete
  1 BAD     2014-10-09 14:52:48 b14bb38f76f7ee7a
  9 BAD     2014-09-01 08:11:52 abaf2748e223200a
 11 BAD     2014-08-25 15:36:16 06a91f777f9dc176
 12 BAD     2014-08-22 13:23:04 70e97884f0a0517b
 13 CURRENT 2014-08-22 13:23:04 [70e97884f0a0517b] -- Here it came in
 10 GOOD    2014-08-20 18:59:31 ff52fbb4ac3eb0a8
  8 GOOD    2014-08-13 09:04:55 a0fa37f70e4b3310
  7 GOOD    2014-07-10 12:52:10 6526c143035aba08
  6 GOOD    2014-05-14 09:11:42 57d9dfa4e6b7ca2a
  5 GOOD    2014-03-27 21:35:57 b846182cdc2e35ab
  4 GOOD    2013-06-12 10:24:52 e2b60a9a55e05f48
  3 GOOD    2012-07-28 22:54:19 288d3e72e58232c0
  2 GOOD    2010-09-08 17:38:32 7f1e1062ab6ea681

That checkin was made by dgp for Eric Boudaillier. Assigning the ticket now.

Will also attach my test script. Derived from the original one.


aku added on 2014-10-09 22:14:18:
Confirmed that 8.5.9 works with the same script.

aku added on 2014-10-09 22:03:43:
Can confirm the error on a Snow Leopard box with 8.5.16. Somebody said that 8.5.9 works, so I will try that and if that is true bisection can run.

(I modified the script a bit, i.e. I added the missing 'fconfigure -translation binary $out', and the missing '-binary 1' for the geturl call).
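
The two additions mentioned above might look roughly like the sketch below. The actual modified test script is not reproduced in this ticket, so this is a reconstruction under assumptions: the helper name fetchToFile and its error handling are hypothetical, and only the 'fconfigure -translation binary' call and the '-binary 1' option come from the comment. These changes make the download binary-safe; they do not work around the bug, which was still confirmed on 8.5.16 with them in place.

```tcl
package require http

# Hypothetical helper; the name and argument order are illustrative only.
proc fetchToFile {url outname} {
    set out [open $outname w]
    # Keep the image bytes untouched when they are written to disk.
    fconfigure $out -translation binary
    # -binary 1 tells the http package to treat the body as binary data.
    if {[catch {::http::geturl $url -binary 1 -channel $out} token]} {
        close $out
        error $token
    }
    close $out
    set status [::http::status $token]
    ::http::cleanup $token
    return $status
}
```

It would be called with the values from the original report, for example: fetchToFile $url saturn.jpg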

aku added on 2014-10-09 21:32:08:
"writing the file" - Then why is the message
   'error reading "sock8"'
That is definitely reading from the socket.
A file handle would be something like 'fileXXX'.

Still, with it 'working' for any site it should be possible to bisect where the problem started.

anonymous added on 2014-10-09 20:37:25:
I see the problem on every site I have tried (I used APOD as a public place to get a large image).

I'm pretty sure the error is in writing the file, based on error messages I see when running similar code in Python using Tkinter (which is where I first ran into the problem; I primarily code Python, hence some of the clumsiness in my code example).

aku added on 2014-10-09 19:46:21:
From the description it seems that a 'read' on the socket fails.

The underlying error code for the message ('resource temporarily unavailable')
usually is EAGAIN or EWOULDBLOCK, pointing to some mishandling in that area ?!

Do you see the issue only with the APOD site, or others as well ?
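
As background for the EAGAIN/EWOULDBLOCK remark: when a non-blocking channel has no data yet, Tcl is expected to report that condition to scripts through [fblocked] rather than raise an error. The sketch below shows that expected behaviour on a local socket pair. The proc name accept and the variable names are assumptions for illustration; nothing here is taken from the failing script.

```tcl
# Sketch only: read from a non-blocking socket on which nothing has been sent.
proc accept {chan addr port} { set ::peer $chan }

set server [socket -server accept -myaddr 127.0.0.1 0]
set port [lindex [fconfigure $server -sockname] 2]
set sock [socket 127.0.0.1 $port]
vwait ::peer                    ;# let the server side accept the connection
fconfigure $sock -blocking 0

# No data has been written by the peer, so no complete line is available.
set n [gets $sock line]
set blocked [fblocked $sock]    ;# 1 means "would block", not a failure
set atEof [eof $sock]           ;# 0 because the peer has not closed
puts "gets returned $n, fblocked=$blocked, eof=$atEof"

close $sock
close $::peer
close $server
```

The error in this ticket is the case where that "would block" condition escapes as a read error instead of being absorbed by the channel layer.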

Attachments: