Tcl Source Code

Ticket UUID: da16d155744c55d5a769a81b57b0e80045e8391a
Title: unacceptable long file read time by fix of [db4f2843cd]
Type: Bug Version: core-8-6-branch
Submitter: pointsman Created on: 2024-01-07 23:57:34
Subsystem: 25. Channel System Assigned To: nobody
Priority: 5 Medium Severity: Important
Status: Closed Last Modified: 2024-01-22 15:11:28
Resolution: Fixed Closed By: oehhar
    Closed on: 2024-01-22 15:11:28
Description:
Paul Obermeier reported on tcl-core:

Using the latest 9.0b1 release, I get unacceptably long read times when reading a large file.
This happens only on Windows and only with the default channel settings, i.e. without using fconfigure.

The attached script demonstrates the issue by creating a large file and reading the file using 3 different modes:
1. Read the file in small chunks and build up the data string.
2. Use plain "read $fp"
3. Use "read $fp [file size]"

Using Tcl 8.6 or 8.7 this works just fine, but with Tcl 9 modes 2 and 3 take forever.
If configuring the file channel with "-translation binary", read times are normal.

=== (end of Paul's report)

I have confirmed this on Windows 10 with msys/mingw, gcc 8.1.

Bisecting points to [f4c0d899fb4534f4].
User Comments: oehhar added on 2024-01-22 15:11:28:

The fix is reverted for now by commit [7a188dd1c9]. Please see ticket [db4f2843cd] for the reasons.

Thank you all ! Harald


oehhar added on 2024-01-22 13:54:23:

I made some tracing sessions with the two versions:

Old:

Tcl_AppendToObj(objPtr, NULL, dstLimit);

and New:

Tcl_SetObjLength(objPtr, numBytes + dstLimit);

IMHO, the performance difference comes from the different allocation strategies for the 4096 bytes read by each read command. Each command grows the buffer by 4096 bytes (plus 2 bytes to allow for a possibly incomplete UTF sequence, on 8.6 with TCL_UTF_MAX=3).

The old method uses GrowStringBuffer(), which doubles the buffer on each run. As a consequence, far fewer realloc calls are made, though possibly at the cost of some wasted memory.

The new method uses ckrealloc() for each chunk of 4096 bytes.

The test string size is 1000000000 bytes, so about 1000000000 / 4096 ≈ 244141 reallocs happen, each on a larger buffer.

The factor-of-two growth logic of GrowStringBuffer() is probably not used by Tcl_SetObjLength(), as that call does not signal the potential for future growth.

The second difference between the two methods is whether a Unicode string representation is preferred.

The called routine is "Tcl_AppendLimitedToObj()". Here is the Unicode preference, as noted by Sergey:

    if (stringPtr->hasUnicode && stringPtr->numChars > 0) {
        AppendUtfToUnicodeRep(objPtr, bytes, toCopy);
    } else {
        AppendUtfToUtfRep(objPtr, bytes, toCopy);
    }

In the tracing I did, the Unicode branch is never taken.

In consequence, I would suggest the following:

- unwind the patch
- add a test that Unicode never appears in the buffer at this position

As in both cases "SetStringFromAny(NULL, objPtr)" is called, any existing Unicode representation may be invalidated after this call. Something like:

SetStringFromAny(NULL, objPtr);
stringPtr = GET_STRING(objPtr);
stringPtr->hasUnicode = 0;
stringPtr->numChars = -1;

Is this a plan?

Thank you and take care, Harald


oehhar added on 2024-01-14 18:27:26:

This is the ticket of the fix causing the long read time: [db4f2843cd]

The issue is within the ReadChars function to read blocks from a channel.

    Tcl_Size numBytes;

    int dstLimit = TCL_UTF_MAX - 1 + toRead * factor / UTF_EXPANSION_FACTOR;

    (void) Tcl_GetStringFromObj(objPtr, &numBytes);
    Tcl_SetObjLength(objPtr, numBytes + dstLimit);

*Tcl_SetObjLength* operates on the string value of the object, even though the word "string" does not appear in the function name. The documentation does not say whether the string is stored as Unicode (16-bit) or UTF-8 (8-bit).

The former command was:

Tcl_AppendToObj(objPtr, NULL, dstLimit);

which did not have the slow-down.

The documentation of that command states that it prefers to operate on Unicode (16-bit) strings.

I don't yet really understand Sergey's comment. He states that channel data is always UTF-8 anyway. How can it happen that the buffer gets converted to 16-bit Unicode?

This spot should be found.

I will try to run a debug session during the week to find this spot.

Ticket renamed and version set to core-8-6-branch.

Any help appreciated.

Thank you and take care, Harald


pointsman added on 2024-01-08 11:54:21:
core-8-6-branch tip also shows the bug.

oehhar added on 2024-01-08 10:01:27:

Discussed today on the release manager support meeting. Results:

  • Not a blocker for b1
  • Should be in 8.6 and on all platforms
  • The supposed reason is that the new call transforms the data to 32-bit Unicode and back to UTF-8 for each chunk.

Take care, Harald


oehhar added on 2024-01-08 08:02:26:

Thank you for catching this. I hope Sergey has a comment on it.

Take care, Harald


Attachments: