Tcl Source Code

Ticket UUID: eb816947b39bcab21dfb538c10029ee4901d7341
Title: Issue with binary string and split command
Type: Patch
Version: 8.6.9
Submitter: crn
Created on: 2019-09-12 09:38:22
Subsystem: 12. ByteArray Object
Assigned To: dgp
Priority: 5 Medium
Severity: Important
Status: Open
Last Modified: 2019-09-13 14:09:20
Resolution: None
Closed By: nobody
Closed on:
Description:
There is a problem in Tcl_SplitObjCmd when splitting a binary string on a single character: the code relies on strchr, which expects a NUL character at the end of the string. In some special cases this character is not present, and Tcl crashes. The string size is known, so we should use it by replacing strchr with memchr.
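To illustrate the difference the proposed change relies on, here is a minimal sketch. It is not the actual Tcl_SplitObjCmd code, and the helper name `contains_byte` is hypothetical. It only shows that memchr takes an explicit length and therefore never reads past the end of a buffer that lacks a terminating NUL, whereas strchr keeps scanning until it finds one.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the Tcl sources: report whether
 * the byte `ch` occurs within the first `len` bytes of `buf`.
 *
 * memchr() examines at most `len` bytes, so `buf` does not need to be
 * NUL-terminated. strchr() has no length argument and would continue
 * reading until it encounters a '\0'.
 */
int contains_byte(const char *buf, size_t len, char ch)
{
    return memchr(buf, (unsigned char) ch, len) != NULL;
}
```

For example, with a three-byte buffer `{'a', 'b', 'c'}` that has no terminator, `contains_byte(buf, 3, 'b')` is safe to call, while `strchr(buf, 'x')` would read beyond the buffer.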
User Comments:
dgp added on 2019-09-13 14:09:20:
Tcl_SetByteArrayObj(objPtr, bytes, length) sets objPtr->bytes to NULL.

It also fills another array objPtr->internalRep.twoPtrValue.ptr1->bytes[] and
does not terminate that data with '\0' (by design of that internal rep), but
that has nothing to do with what [split] examines.

dgp added on 2019-09-13 14:04:38:
Thanks! Will take a look.

crn added on 2019-09-13 13:52:02:
In Tcl_SetObjLength:

    objPtr->bytes[length] = 0;

But in Tcl_SetByteArrayObj (no '\0'):

    byteArrayPtr = ckalloc(BYTEARRAY_SIZE(length));
    memcpy(byteArrayPtr->bytes, bytes, (size_t) length);

I don't see where Tcl_NewByteArray adds '\0'.

My program uses Tk and TableList.

dgp added on 2019-09-13 13:29:49:
In that scenario, the right thing to do is to fix the bug, not to make the [split] command more tolerant of malformed inputs.

In any valid Tcl_Obj struct where objPtr->bytes is not NULL, it must be that

    (objPtr->bytes[objPtr->length] == '\0')

at any time when control over that struct changes hands, as in passing as
an argument. Whatever code has failed to maintain that invariant is where
the bug is.
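As a rough aid for hunting down the offending code, the invariant can be written as a small check. This is only a sketch: `StringRep` is a hypothetical stand-in for the `bytes` and `length` fields of a Tcl_Obj, and `string_rep_is_valid` is not part of the Tcl API.

```c
#include <stddef.h>

/* Hypothetical stand-in for the string-rep fields of a Tcl_Obj. */
typedef struct {
    char *bytes;   /* string rep, or NULL if there is none */
    int length;    /* number of bytes, not counting the terminator */
} StringRep;

/*
 * Returns 1 if the invariant holds, 0 otherwise.
 * A NULL `bytes` pointer means there is no string rep to check.
 */
int string_rep_is_valid(const StringRep *rep)
{
    if (rep->bytes == NULL) {
        return 1;
    }
    return rep->length >= 0 && rep->bytes[rep->length] == '\0';
}
```

A check of this kind, placed at the points where a value is handed from one piece of code to another, could help narrow down which code leaves the terminator missing.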

Any further tips you can offer on finding that bug are welcome. When you encounter it, are any extensions in use?

crn added on 2019-09-13 13:03:44:
Sorry, I am not able to reproduce it with a simple script. It happens in a complex program; the string we wanted to split comes from a read command issued while the file is being written (not sure if this matters). Looking at the C code in a debugger, we saw that the Tcl_StringObj length was OK, but the "bytes" pointer was pointing at a previous string (longer than the current one; we clearly saw older values after the current string), with no '\0' at the end of the current string.
So the scenario involves buffer reuse and reading a file with a variable chunk size.

dgp added on 2019-09-13 12:33:07:
Can you provide a demonstration script?

Attachments: