Tcl Source Code

Ticket UUID: 597970
Title: windows tcl server drops connections
Type: Bug
Version: None
Submitter: davidw
Created on: 2002-08-20 21:55:18
Subsystem: 27. Channel Types
Assigned To: dkf
Priority: 3 Low
Severity:
Status: Closed
Last Modified: 2002-09-02 17:05:01
Resolution: Invalid
Closed By: dkf
Closed on: 2002-09-02 10:05:01
Description:
I am running the following on Windows XP:

---
set MyCounter 0

proc Cntr {args} {
    incr ::MyCounter
    puts "$args Counter ::MyCounter"
}

socket -server Cntr 8090

vwait forever
---

And connecting to it with this:

---
set numcon 50

set goodsockets 0
set badsockets 0

set i 0
while { $i < $numcon } {
    if { [catch {
        set sk [socket dwelton 80]
        fconfigure $sk -buffering line
        puts $sk {GET / HTTP/1.0



}
    } err] } {
        incr ::badsockets
        puts "$::badsockets bad sockets"
    } else {
        incr ::goodsockets
        puts "$::goodsockets good sockets"
    }
    incr i
}
---

The GET stuff is an artifact of this originally being
used with tclhttpd, but doesn't really matter.  The
problem appears to be at a very low level.  When I run
tcpdump on the connection, it shows that the Windows
box is sending RSTs after the other box sends a SYN -
it's not even trying to connect.
User Comments:

dkf added on 2002-09-02 17:05:01:

Hmm.  That's mighty odd. The server is on port 8090, but the
client talks to port 80.  Hence, there's something going on
behind the scenes; mixing a webserver into this changes
*everything*, since they might well have all sorts of extra
policies in place to stop things like resource
over-consumption.

To me, the problem looks ever more like it is not a Tcl
fault.  Does the problem go away if you close the client
sockets handed to the Cntr procedure?  (If the problem
persists even with that change, please resubmit as another
bug with *just* sufficient code that it can be tested
without additional packages...)
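
A minimal sketch of the change suggested above, assuming the accept procedure takes the three arguments that [socket -server] appends (channel, client address, client port). It is not code from the ticket: the loopback client, the use of port 0 to let the OS pick a free port, and the two-second timeout are all assumptions added so that the example runs on its own and terminates.

```tcl
set MyCounter 0

# Accept callback: Tcl appends the new channel, the client address and
# the client port to the -server command.
proc Cntr {chan addr port} {
    incr ::MyCounter
    puts "$addr:$port Counter $::MyCounter"
    # Close the accepted channel so the server does not accumulate
    # open sockets.
    close $chan
}

# Port 0 asks the OS for any free port (assumption; the report uses 8090).
set server [socket -server Cntr 0]
set port [lindex [fconfigure $server -sockname] 2]

# Loopback client, standing in for the second script in the report.
set numcon 5
set clients {}
for {set i 0} {$i < $numcon} {incr i} {
    lappend clients [socket 127.0.0.1 $port]
}

# Run the event loop until every connection has been accepted,
# or give up after two seconds.
set timer [after 2000 {set ::MyCounter timeout}]
while {$::MyCounter ne "timeout" && $::MyCounter < $numcon} {
    vwait ::MyCounter
}
after cancel $timer

foreach sk $clients {
    close $sk
}
close $server
```

Whether this makes the reported resets go away is exactly the open question in the comment above; the sketch only shows where the close would go.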

davidw added on 2002-08-30 01:29:24:

Really?  That's ugly and complicated, and makes the simple
case that much more difficult.  What's more, it's not
documented in the socket man page.  If I get some time, I
will try modifying things and see what I can see.

davygrvy added on 2002-08-29 07:32:30:

File Added - 30008: accept1_server.tcl

davygrvy added on 2002-08-29 07:31:15:

File Added - 30007: accept_client.tcl


proc Cntr isn't 1) handling the newly accepted channel with a [fileevent],
or 2) declaring ::MyCounter as global.

In Tcl 8.0, as I remember, you could do a read/gets directly from the
accept proc, but I don't think this is the case anymore.  The missing
global ::MyCounter is probably just a little typo, and I'll infer it
should be there.

Using the attached scripts, I get this:

server ->
D:\>tclsh84 accept1_server.tcl
Counter: 1
got "GET / HTTP/1.0" on 127.0.0.1:4001
Counter: 2
got "GET / HTTP/1.0" on 127.0.0.1:4002
Counter: 3
got "GET / HTTP/1.0" on 127.0.0.1:4003
Counter: 4
got "GET / HTTP/1.0" on 127.0.0.1:4004
Counter: 5
got "GET / HTTP/1.0" on 127.0.0.1:4005
Counter: 6
got "GET / HTTP/1.0" on 127.0.0.1:4006
Counter: 7
got "GET / HTTP/1.0" on 127.0.0.1:4007
Counter: 8
got "GET / HTTP/1.0" on 127.0.0.1:4008
Counter: 9
got "GET / HTTP/1.0" on 127.0.0.1:4009
Counter: 10
got "GET / HTTP/1.0" on 127.0.0.1:4010
Counter: 11
got "GET / HTTP/1.0" on 127.0.0.1:4011
Counter: 12
got "GET / HTTP/1.0" on 127.0.0.1:4012
Counter: 13
got "GET / HTTP/1.0" on 127.0.0.1:4013
Counter: 14
got "GET / HTTP/1.0" on 127.0.0.1:4014
Counter: 15
got "GET / HTTP/1.0" on 127.0.0.1:4015
Counter: 16
got "GET / HTTP/1.0" on 127.0.0.1:4016
Counter: 17
got "GET / HTTP/1.0" on 127.0.0.1:4017
Counter: 18
got "GET / HTTP/1.0" on 127.0.0.1:4018
Counter: 19
got "GET / HTTP/1.0" on 127.0.0.1:4019
Counter: 20
got "GET / HTTP/1.0" on 127.0.0.1:4020
Counter: 21
got "GET / HTTP/1.0" on 127.0.0.1:4021
Counter: 22
got "GET / HTTP/1.0" on 127.0.0.1:4022
Counter: 23
got "GET / HTTP/1.0" on 127.0.0.1:4023
Counter: 24
got "GET / HTTP/1.0" on 127.0.0.1:4024
Counter: 25
got "GET / HTTP/1.0" on 127.0.0.1:4025
Counter: 26
got "GET / HTTP/1.0" on 127.0.0.1:4026
Counter: 27
got "GET / HTTP/1.0" on 127.0.0.1:4027
Counter: 28
got "GET / HTTP/1.0" on 127.0.0.1:4028
Counter: 29
got "GET / HTTP/1.0" on 127.0.0.1:4029
Counter: 30
got "GET / HTTP/1.0" on 127.0.0.1:4030
Counter: 31
got "GET / HTTP/1.0" on 127.0.0.1:4031
Counter: 32
got "GET / HTTP/1.0" on 127.0.0.1:4032
Counter: 33
got "GET / HTTP/1.0" on 127.0.0.1:4033
Counter: 34
got "GET / HTTP/1.0" on 127.0.0.1:4034
Counter: 35
got "GET / HTTP/1.0" on 127.0.0.1:4035
Counter: 36
got "GET / HTTP/1.0" on 127.0.0.1:4036
Counter: 37
got "GET / HTTP/1.0" on 127.0.0.1:4037
Counter: 38
got "GET / HTTP/1.0" on 127.0.0.1:4038
Counter: 39
got "GET / HTTP/1.0" on 127.0.0.1:4039
Counter: 40
got "GET / HTTP/1.0" on 127.0.0.1:4040
Counter: 41
got "GET / HTTP/1.0" on 127.0.0.1:4041
Counter: 42
got "GET / HTTP/1.0" on 127.0.0.1:4042
Counter: 43
got "GET / HTTP/1.0" on 127.0.0.1:4043
Counter: 44
got "GET / HTTP/1.0" on 127.0.0.1:4044
Counter: 45
got "GET / HTTP/1.0" on 127.0.0.1:4045
Counter: 46
got "GET / HTTP/1.0" on 127.0.0.1:4046
Counter: 47
got "GET / HTTP/1.0" on 127.0.0.1:4047
Counter: 48
got "GET / HTTP/1.0" on 127.0.0.1:4048
Counter: 49
got "GET / HTTP/1.0" on 127.0.0.1:4049
Counter: 50
got "" on 127.0.0.1:4050
got "" on 127.0.0.1:4049
got "" on 127.0.0.1:4048
got "" on 127.0.0.1:4047
got "" on 127.0.0.1:4046
got "" on 127.0.0.1:4045
got "" on 127.0.0.1:4044
got "" on 127.0.0.1:4043
got "" on 127.0.0.1:4042
got "" on 127.0.0.1:4041
got "" on 127.0.0.1:4040
got "" on 127.0.0.1:4039
got "" on 127.0.0.1:4038
got "" on 127.0.0.1:4037
got "" on 127.0.0.1:4036
got "" on 127.0.0.1:4035
got "" on 127.0.0.1:4034
got "" on 127.0.0.1:4033
got "" on 127.0.0.1:4032
got "" on 127.0.0.1:4031
got "" on 127.0.0.1:4030
got "" on 127.0.0.1:4029
got "" on 127.0.0.1:4028
got "" on 127.0.0.1:4027
got "" on 127.0.0.1:4026
got "" on 127.0.0.1:4025
got "" on 127.0.0.1:4024
got "" on 127.0.0.1:4023
got "" on 127.0.0.1:4022
got "" on 127.0.0.1:4021
got "" on 127.0.0.1:4020
got "" on 127.0.0.1:4019
got "" on 127.0.0.1:4018
got "" on 127.0.0.1:4017
got "" on 127.0.0.1:4016
got "" on 127.0.0.1:4015
got "" on 127.0.0.1:4014
got "" on 127.0.0.1:4013
got "" on 127.0.0.1:4012
got "" on 127.0.0.1:4011
got "" on 127.0.0.1:4010
got "" on 127.0.0.1:4009
got "" on 127.0.0.1:4008
got "" on 127.0.0.1:4007
got "" on 127.0.0.1:4006
got "" on 127.0.0.1:4005
got "" on 127.0.0.1:4004
got "" on 127.0.0.1:4003
got "" on 127.0.0.1:4002
got "" on 127.0.0.1:4001


client ->
D:\>tclsh84 accept_client.tcl
1 good sockets
2 good sockets
3 good sockets
4 good sockets
5 good sockets
6 good sockets
7 good sockets
8 good sockets
9 good sockets
10 good sockets
11 good sockets
12 good sockets
13 good sockets
14 good sockets
15 good sockets
16 good sockets
17 good sockets
18 good sockets
19 good sockets
20 good sockets
21 good sockets
22 good sockets
23 good sockets
24 good sockets
25 good sockets
26 good sockets
27 good sockets
28 good sockets
29 good sockets
30 good sockets
31 good sockets
32 good sockets
33 good sockets
34 good sockets
35 good sockets
36 good sockets
37 good sockets
38 good sockets
39 good sockets
40 good sockets
41 good sockets
42 good sockets
43 good sockets
44 good sockets
45 good sockets
46 good sockets
47 good sockets
48 good sockets
49 good sockets
50 good sockets

I think the problem is that the original script assumes old 8.0
behavior, and not a problem with the core.  IMO, of course.
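
For readers without the attachments, here is a rough sketch of the [fileevent] pattern described above. It is not the attached accept1_server.tcl, whose contents are not shown in this ticket; the procedure names Accept and ReadLine, the loopback client, port 0, and the timeout are all assumptions made for illustration. Unlike the output above, this version closes the channel at end of file instead of printing an empty line.

```tcl
set MyCounter 0
set received {}
set closed 0

# Accept callback: only register a readable handler; do not read here.
proc Accept {chan addr port} {
    incr ::MyCounter
    puts "Counter: $::MyCounter"
    fconfigure $chan -blocking 0 -buffering line
    fileevent $chan readable [list ReadLine $chan $addr $port]
}

# Readable handler: read one line per event, close the channel on EOF.
proc ReadLine {chan addr port} {
    if {[gets $chan line] < 0} {
        if {[eof $chan]} {
            close $chan
            incr ::closed
        }
        return
    }
    puts "got \"$line\" on $addr:$port"
    lappend ::received $line
}

# Port 0 asks the OS for any free port (assumption for this sketch).
set server [socket -server Accept 0]
set port [lindex [fconfigure $server -sockname] 2]

# Loopback client: send one request line per connection, then close.
set numcon 3
for {set i 0} {$i < $numcon} {incr i} {
    set sk [socket 127.0.0.1 $port]
    fconfigure $sk -buffering line
    puts $sk "GET / HTTP/1.0"
    close $sk
}

# Run the event loop until the server has seen EOF on every
# connection, or give up after two seconds.
set timer [after 2000 {set ::closed timeout}]
while {$::closed ne "timeout" && $::closed < $numcon} {
    vwait ::closed
}
after cancel $timer
close $server
```

The point of the design is that the accept procedure returns immediately and all reading happens from the event loop, so one slow client does not hold up the acceptance of the next one.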

davidw added on 2002-08-22 01:11:33:

dkf - you may be right, but the number is so low that it
seems suspicious.  Also worth noting is that on the same
server, there is a 'service' running that is written in C.
It can handle a lot of open connections with no problems.  I
was reaching 30 or more without one bad one, whereas with
the Tcl server I start getting bad ones at around 7 or 8,
which seems way too low.  The Windows guys here don't seem
to think it's an OS limit, either.

dkf added on 2002-08-21 16:42:20:

My initial suspicion is that your server is hitting a limit
on the number of simultaneous open connections, given that
you are not even attempting to close the connected sockets. 
Once you reach that point (which is generally an OS limit,
though I don't know exactly the details of that) then all
further connections will be closed (with a RST) because
there is no hope of them successfully connecting.  :^/

IIRC, the limit on Unix systems is higher (256 total FDs per
process - which'd translate to 252 connections once you've
allowed for stdio and the server socket - is a common limit,
because of constraints on the size of the select() buffer) but
still exists.

I'll let AKu close this if my suspicions are close enough to
the truth...
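
The arithmetic behind the 252 figure above, spelled out as a small sketch. The 256 limit is the one quoted in the comment, not a measured value, and the variable names are made up for illustration.

```tcl
# Descriptor limit quoted in the comment above (assumed, not measured).
set fdLimit 256

# Descriptors already in use before any client connects:
# the three stdio channels plus the listening server socket.
set reserved [llength {stdin stdout stderr server}]

set maxConnections [expr {$fdLimit - $reserved}]
puts "$maxConnections simultaneous connections"
```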

davidw added on 2002-08-21 04:59:48:

I should mention: I ran the second script from a second
(Linux) box.  When run locally, everything works as
expected.  When trying 50 connections from the Linux box, I
get about 15 good sockets and 35 bad ones.  I also investigated
this being a Windows limit, but it is apparently not the case
with this OS.
