Tcl Source Code

View Ticket
Login
Ticket UUID: 3d3124d01dabb46584e39877d44802bf5e8f4952
Title: failures with -ftrapv compiler option
Type: Bug Version: 8.6
Submitter: anonymous Created on: 2014-03-06 15:27:31
Subsystem: 11. Conversions from String Assigned To: kbk
Priority: 5 Medium Severity: Important
Status: Pending Last Modified: 2023-09-15 13:24:24
Resolution: Fixed Closed By: nobody
    Closed on: 2023-09-14 22:44:19
Description:
If tcl is compiled with the "-ftrapv" option (which adds checks for integer overflow) the string to integer conversion code in tclStrToD.c fails when parsing the most negative 64-bit integer (-9223372036854775808).

This manifests itself in the clock command:

% clock format [clock seconds] -format "%d-%b-%Y"
Process 4807 stopped
* thread #1: tid = 0x4e9b38, 0x000000010009ebcc libtcl8.5.dylib`TclParseNumber(interp=0x0000000000000000, objPtr=<unavailable>, expected=0x00000001000ca82e, bytes=<unavailable>, numBytes=-1, endPtrPtr=0x0000000000000000, flags=<unavailable>) + 5628 at tclStrToD.c:560, queue = 'com.apple.main-thread, stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
    frame #0: 0x000000010009ebcc libtcl8.5.dylib`TclParseNumber(interp=0x0000000000000000, objPtr=<unavailable>, expected=0x00000001000ca82e, bytes=<unavailable>, numBytes=-1, endPtrPtr=0x0000000000000000, flags=<unavailable>) + 5628 at tclStrToD.c:560
   557 		    } else if (flags & TCL_PARSE_OCTAL_ONLY) {
   558 			goto zeroo;
   559 		    } else if (isdigit(UCHAR(c))) {
-> 560 			significandWide = c - '0';
   561 			numSigDigs = 1;
   562 			state = DECIMAL;
   563 			break;

because the clock command at some point gets to here in clock.tcl:

   3596     set r {}
   3597     set lastTime $MINWIDE
   3598     foreach t $times c $codes {
   3599         if { $t < $lastTime } {
   3600             return -code error "$fileName has times out of order"
   3601         }
   3602         set lastTime $t

The crash happens in line 3599 when $t is compared with $MINWIDE. This code doesn't look quite right to me as $t can never be less than $lastTime the first time through as $lastTime is the most negative number. Shouldn't lastTime be initialized with the first time in the list? Or given that lastTime is presumably seconds the arbitrary choice of $MINWIDE to seed it could be somewhat different?

I'm not sure whether the bug is in tclStrToD (because it deliberately overflows and -ftrapv should never be used) but there surely does seem to be a bug in clock.tcl as well.

For reference this also fails:

% set last -9223372036854775808
-9223372036854775808
% incr last
Illegal instruction: 4

This is on OS X Mavericks with clang but colleagues report a similar problem on linux with gcc.

Apologies if this is a duplicate as I've had trouble finding how to search the ticket database.
User Comments: chrstphrchvz added on 2023-09-15 13:24:24:

(Disregard previous comment.) Regarding the overflow in tclExecute.c, I suggest looking at [176de86b3b] in case that warrants taking a different approach.


chrstphrchvz added on 2023-09-15 13:23:24:

Regarding the overflow in tclExecute.c, I suggest looking at [ba731752d345] in case that warrants taking a different approach.


chrstphrchvz added on 2023-09-15 11:39:49:

I agree that building with -ftrapv as the reporter did would have encountered various other instances of signed integer overflow. However, there remains possibility of that throughout Tcl/Tk codebases, and often beyond existing test suite coverage. So I expect this to remain an ongoing and gradually uncovered issue, rather than something a single ticket can be designated for and addressed in a reasonable time period.

I would suggest having a CI build with UBSan (-fsanitize=undefined) enabled. That is what I have encountered other major software projects doing, and will likely be much more useful than -ftrapv.


jan.nijtmans added on 2023-09-15 10:34:24:

Found 3 more places where -ftrapv errors are triggered, running the standard testsuite. See: [237894f65ad976db]. Easiest fix is just cast to (unsigned)


jan.nijtmans added on 2023-09-15 06:21:07:

@chrstphrchvz, Many Thanks! As usual, you are right!

Let's reserve this ticket for the -ftrapv option then, and let's make sure it continues working: [fbfde51043fdf463]

I think it's OK to generalize tickets: Even though the original report was very specific (mentioning tclStrToD.c), it can be used to fix similar related issues which fall outside the original ticket specification but belong to the same idea.


chrstphrchvz added on 2023-09-14 22:44:19:

Pardon me obsessing over this… I still feel that clock-61.3 likely does not belong in the title of this ticket, as the reporter’s issue involved parsing -9223372036854775808, and shown much more succinctly by their incr example than by the $MINWIDE$lastTime example in clock.tcl.

I did not spot any relevant changes to the C code mentioned by the reporter, so maybe it is better to close this as “works for me” rather than “fixed” or “outdated”.

I have also encountered reports of false negatives in GCC -ftrapv (and reportedly not affecting -fsanitize=signed-integer-overflow).


jan.nijtmans added on 2023-09-14 21:21:37:

The fix [ee677ecc84710dfd|here] indeed fixes all testcases when using -ftrapv.

Thanks!


chrstphrchvz added on 2023-09-14 08:27:18:

Jan, I believe what you encountered in clock-61.3 is what I reported in [b9a41897a8]. There are likely other instances of signed integer overflow in the clock code which I have encountered but not reported.

When I said I could not reproduce this issue, I meant the issue specifically at the location in TclParseNumber() described by the reporter. Sorry for the confusion.


jan.nijtmans added on 2023-09-14 07:42:37:

Well, I compiled Tcl 8.6 with the -ftrapv flag, the result was:

$ make test-tcl TESTFLAGS="-file clock.test"
...
Tests began at Thu Sep 14 09:35:58 CEST 2023
clock.test
Test file error: child killed: illegal instruction

Tests ended at Thu Sep 14 09:36:19 CEST 2023 all.tcl: Total 0 Passed 0 Skipped 0 Failed 0 Sourced 1 Test Files.

Test files exiting with errors:

clock.test

All other test-cases pass, and the examples in this ticket pass to.

The test-case failing was:

test clock-61.3 {near-miss overflow of a wide integer on output} {
    clock format 0x7fffffffffffffff -format %s -gmt true
} [expr 0x7fffffffffffffff]

So it looks like not everything is solved in current 8.6. It's just the maximum number, not the minimum one, still causing problems.

For 8.5 this would be a Wont-Fix, but if Kevin (or anyone else) wants to dig into this .... it still can be fixed for 8.6


chrstphrchvz added on 2023-08-15 07:41:07:

c - '0' only overflows when c < -0x50.

Sorry, this is wrong. c - '0' cannot overflow at all, because '0' by itself is int which causes c to be promoted, and c regardless of signedness cannot be in the range [INT_MIN,INT_MIN+'0') which are the only values that would lead to overflow. The mystery remains.


chrstphrchvz added on 2023-08-15 07:26:51:

I am not certain which operation overflowed here. c - '0' only overflows when c < -0x50. If isdigit(UCHAR(c)) is true, then c is between 0x30 and 0x39, so c - '0' cannot overflow. Something else seems wrong. I do not know if declaring c as unsigned char (since char is signed by default on Darwin amd64) to avoid the -ftrapv exception is a good idea or just hides the problem. But I have not spotted any relevant changes since this issue was reported which may have resolved it.

So far I cannot reproduce this issue with either -ftrapv or -fsanitize=signed-integer-overflow when I run these examples.