Tcl Source Code

Ticket UUID: 3530533
Title: io.test failures on IRIX64
Type: Bug
Version: obsolete: 8.6b2
Submitter: dgp
Created on: 2012-05-29 14:16:56
Subsystem: 25. Channel System
Assigned To: dgp
Priority: 2
Severity:
Status: Closed
Last Modified: 2012-06-08 20:27:41
Resolution: Fixed
Closed By: dgp
Closed on: 2012-06-08 13:27:41
Description:
---- Result was:
file size only 24576
---- Result should have been (exact matching):
ok
==== io-27.6 FAILED


---- Result was:
file size only 12288
---- Result should have been (exact matching):
ok
==== io-29.32 FAILED
User Comments:

dgp added on 2012-06-08 18:57:05:
In that case, it seems putting the #include back into
tclUnixPort.h is the right thing to fix matters on IRIX.
Only worry is that it was taken out for a reason, and I'll
be regressing on that.

Most likely reason I can imagine would be OSX, but I've
just tested that and it works fine.

ferrieux added on 2012-06-08 03:58:17:
In not so old days (AIX, Solaris), including <pthread.h> in front of all the rest was required for all modules of a multithreaded application (not only the ones calling into libpthread). So the minimization seems dangerous.

dgp added on 2012-06-08 03:31:01:
Putting the <pthread.h> include back in
tclUnixPort.h avoids those problems, but
it's not clear what we were aiming for in minimizing
the reach of that header file.

dgp added on 2012-06-08 02:42:39:
Adding to the top of tclUnixPipe.c

#ifdef TCL_THREADS
#  include <pthread.h>
#endif

is enough to fix these failing tests on core-8-5-branch.

However, that change causes new test failures:

==== exec-9.1 commands returning errors FAILED
==== Contents of test case:

    set x [catch {exec gorp456} msg]
    list $x [string tolower $msg] [string tolower $errorCode]

---- Result was:
1 {couldn't execute "gorp456": socket operation on non-socket} {posix enotsock {socket operation on non-socket}}
---- Result should have been (exact matching):
1 {couldn't execute "gorp456": no such file or directory} {posix enoent {no such file or directory}}
==== exec-9.1 FAILED



==== exec-9.5 commands returning errors FAILED
==== Contents of test case:

    list [catch {exec gorp456 | [interpreter] echo a b c} msg] [string tolower $msg]

---- Result was:
1 {couldn't execute "gorp456": socket operation on non-socket}
---- Result should have been (exact matching):
1 {couldn't execute "gorp456": no such file or directory}
==== exec-9.5 FAILED



==== exec-14.4 -- switch FAILED
==== Contents of test case:

    list [catch {exec -- -gorp} msg] [string tolower $msg]

---- Result was:
1 {couldn't execute "-gorp": socket operation on non-socket}
---- Result should have been (exact matching):
1 {couldn't execute "-gorp": no such file or directory}
==== exec-14.4 FAILED

==== iocmd-11.4 I/O to command pipelines FAILED
==== Contents of test case:

    list [catch {open "| no_such_command_exists" rb} msg] $msg $::errorCode

---- Result was:
1 {couldn't execute "no_such_command_exists": invalid argument} {POSIX EINVAL {invalid argument}}
---- Result should have been (exact matching):
1 {couldn't execute "no_such_command_exists": no such file or directory} {POSIX ENOENT {no such file or directory}}
==== iocmd-11.4 FAILED

dgp added on 2012-06-08 00:13:35:
The specific triggering cause is the loss of the

#include <pthread.h>

in tclUnixPort.h

dgp added on 2012-06-07 22:16:27:
With the patch in place, the start of these test
failures gets tracked to checkin
eac3630f7292db37c55a 2005-11-27 02:33:48

dgp added on 2012-06-07 20:31:35:
Thanks for the patch.

Applying it does not make the failing tests pass
on the core-8-5-branch tip.  Instead, applying it
makes the failing tests fail even back on 8.5.2, where
they were passing.  Some strange instance of bug
masking bug, I guess.  Continuing to investigate...

jenglish added on 2012-06-06 04:36:16:
Archaeology: this was introduced in Tcl 7.4, relevant changes entry:

| 5/5/95 (portability improvement) Changed to use BSDgettimeofday on
| IRIX machines, to avoid compilation problems with the gettimeofday
| declaration.

This *might* have been necessary on IRIX 3 (released circa 1988).  To the best of my recollection, it was _not_ needed on IRIX 4.* (circa 1991), and it is *definitely* not necessary today.  This is safe to remove.

jenglish added on 2012-06-06 04:25:00:

File Added - 445309: no-bsdgettimeofday.patch

jenglish added on 2012-06-06 04:21:17:
Probable cause of problem: configure.in checks if BSDgettimeofday() is present, and tclUnixPort.h says:

-#   ifdef HAVE_BSDGETTIMEOFDAY
-#define gettimeofday BSDgettimeofday
-#   endif

This is not right.  We don't want BSDgettimeofday() on IRIX, we want the regular old gettimeofday().

On my machine (IRIX 6.5.<something>, circa 2002), compiling with -D_BSD_COMPAT makes sys/time.h #define gettimeofday as BSDgettimeofday *and in addition* select a different definition of 'struct timeval', but only on 64-bit builds.  Tcl's autogoo is doing the former but not the latter, which would lead to an ABI mismatch, as suspected.
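
To make the suspected mismatch concrete, here is a minimal sketch in C. The two struct layouts, their field widths, and the demo_* names are hypothetical stand-ins, not copied from the IRIX headers. The only point is that a macro can redirect the call without changing the type the caller allocates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts, for illustration only (not the real IRIX ones). */
struct tv_default { int32_t tv_sec; int32_t tv_usec; };  /* what the caller declares */
struct tv_bsd     { int64_t tv_sec; int64_t tv_usec; };  /* what the BSD variant fills */

/* Stand-ins that just report how many bytes each variant would write. */
size_t demo_gettimeofday(void)    { return sizeof(struct tv_default); }
size_t demo_BSDgettimeofday(void) { return sizeof(struct tv_bsd); }

/* The rename alone, in the style of the removed tclUnixPort.h lines:
 * the function changes, the struct the caller uses does not. */
#define demo_gettimeofday demo_BSDgettimeofday

/* Bytes the callee would write beyond the storage the caller reserved. */
size_t demo_overrun_bytes(void)
{
    size_t reserved = sizeof(struct tv_default);
    size_t written  = demo_gettimeofday();  /* expands to demo_BSDgettimeofday() */

    assert(written >= reserved);
    return written - reserved;
}
```

With these assumed layouts, demo_overrun_bytes() returns 8: the callee writes eight bytes more than the caller set aside. Which direction the real mismatch runs on IRIX is not established in this ticket.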

jenglish added on 2012-06-06 02:20:32:
Best guess as to why this is causing a problem: IRIX has several variants of gettimeofday(), and also has variants of 'struct timeval'.  Which one gets picked depends on various #ifdefs that are or are not in effect when <sys/time.h> is #included.  This *might* be a stack smash, caused by a timeval/gettimeofday mismatch.  Checkin ec37817fc47e removed a local variable, which may have been masking the smash.
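
The masking effect guessed at here can be sketched without real undefined behaviour by modelling the stack frame as a byte array. Everything in the sketch is an assumption made for illustration: the struct layouts, the 0xA5 canary, and the names fill_large and canary_clobbered are hypothetical, and a real frame layout depends on the compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts: the caller reserves small_tv, the callee fills large_tv. */
struct small_tv { int32_t tv_sec; int32_t tv_usec; };
struct large_tv { int64_t tv_sec; int64_t tv_usec; };

#define FRAME_BYTES 64
#define CANARY 0xA5

/* Stand-in for a library call that believes it was handed a large_tv. */
static void fill_large(unsigned char *dst)
{
    struct large_tv tv = { 1338300000, 0 };
    memcpy(dst, &tv, sizeof tv);
}

/* Models a frame laid out as [small_tv][pad bytes][canary ...].
 * Returns 1 if the byte right after the padding was overwritten, else 0. */
int canary_clobbered(size_t pad)
{
    unsigned char frame[FRAME_BYTES];
    size_t canary_at = sizeof(struct small_tv) + pad;

    assert(canary_at < sizeof frame);
    memset(frame, CANARY, sizeof frame);  /* every byte starts as the canary */
    fill_large(frame);                    /* writes sizeof(struct large_tv) bytes */
    return frame[canary_at] != CANARY;
}
```

With no padding the canary sits directly after the small struct and canary_clobbered(0) returns 1. With 8 bytes of padding, standing in for the removed local variable, canary_clobbered(8) returns 0 even though the overrun still happens.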

dgp added on 2012-06-05 20:05:06:
Comments left in failing tests.

dgp added on 2012-06-01 23:01:36:
Cross reference Bug 1942197

dgp added on 2012-06-01 22:57:25:
On trunk, failures start with checkin
b5d4ae05ac619c8fa9c572b80ee2b4729f42634a 2008-04-14 17:54:57 UTC
which is the merge of the same changes.

dgp added on 2012-06-01 22:40:34:
On the core-8-5-branch, failures start with checkin
ec37817fc47ea5584aeac174db469b5ba9f05484 2008-04-14 17:49:59 UTC

dgp added on 2012-06-01 19:51:18:
Thread-enabled release 8.5.2 handles these tests just fine.

dgp added on 2012-06-01 02:00:49:
These test failures are entirely about the switch
to a --enable-threads build by default.  They fail
on thread-enabled Tcl on this platform.
