Tcl Source Code

View Ticket
Login
Ticket UUID: c9e68eb6ca40bb844b6c5d172a8e203f4b419857
Title: tclEncoding.c: undefined behavior involving NULL pointers
Type: Patch Version: 8.6.12
Submitter: chrstphrchvz Created on: 2021-12-20 18:53:45
Subsystem: 52. Portability Support Assigned To: nobody
Priority: 5 Medium Severity: Minor
Status: Open Last Modified: 2023-03-26 09:48:01
Resolution: None Closed By: nobody
    Closed on:
Description:

Passing NULL as the src argument to conversion functions UtfToUtfProc() (used by UtfIntToUtfExtProc() and UtfExtToUtfIntProc()), UtfToUnicodeProc(), UnicodeToUtfProc(), Iso88591ToUtfProc(), Iso88591FromUtfProc(), EscapeToUtfProc(), EscapeFromUtfProc(), TableToUtfProc(), and TableFromUtfProc() in generic/tclEncoding.c triggers the following undefined behaviors:

  • NULL + 0 pointer arithmetic when computing srcEnd = src + srcLen
  • NULL < NULL pointer comparison in the loop condition src < srcEnd
  • NULL - NULL pointer subtraction when computing *srcReadPtr = src - srcStart

Additionally, BinaryProc() uses memcpy() and will call it with NULL as the source, which is undefined even for copy length of 0.

UBSan currently only complains for pointer arithmetic using NULL (at least when compiled with Clang -fsanitize=pointer-overflow), and for memcpy() with NULL source (when compiled with GCC -fsanitize=nonnull-attribute). I do not observe Tcl behaving unexpectedly in any of these cases.

Example during startup when the TCL_LIBRARY environment variable is not set, TclpInitLibraryPath() (unix/tclUnixInit.c) calls Tcl_ExternalToUtfDString() with src == NULL:

$ unset TCL_LIBRARY
$ tclsh8.6 /dev/null
tcl/generic/tclEncoding.c:2300:18: runtime error: applying zero offset to null pointer

More complete example using C (not aware how to accomplish this from Tcl; I put this near the end of Tcltest_Init() for lack of a better place):

    if (1) {
#define MYENCODINGS_LEN 6
	const char * myencodings[MYENCODINGS_LEN] = {
		"identity",	/* BinaryProc */
		"utf-8",	/* UtfToUtfProc (UtfExtToUtfIntProc, UtfIntToUtfExtProc) */
		"unicode",	/* UnicodeToUtfProc, UtfToUnicodeProc */
		"iso8859-1",	/* Iso88591ToUtfProc, Iso88591FromUtfProc */
		"shiftjis",	/* TableToUtfProc, TableFromUtfProc */
		"iso2022"	/* EscapeToUtfProc, EscapeFromUtfProc */
	};
	int i;
	for (i = 0; i < MYENCODINGS_LEN; i++) {
	    Tcl_DString ds;
	    Tcl_DStringInit(&ds);
	    Tcl_Encoding e = Tcl_GetEncoding(interp, myencodings[i]);
	    fprintf(stderr, "\e[32m'%s'\e[0m\n", Tcl_GetEncodingName(e));
	    Tcl_ExternalToUtfDString(e, NULL, 0, &ds);
	    Tcl_DStringFree(&ds);
	    Tcl_UtfToExternalDString(e, NULL, 0, &ds);
	    Tcl_DStringFree(&ds);
	    Tcl_FreeEncoding(e);
	}
    }

Output:

$ TCL_LIBRARY= ./tcltest /dev/null
'identity'
tcl/generic/tclEncoding.c:2138:5: runtime error: null pointer passed as argument 2, which is declared to never be null
'utf-8'
tcl/generic/tclEncoding.c:2299:18: runtime error: applying zero offset to null pointer
'unicode'
tcl/generic/tclEncoding.c:2475:18: runtime error: applying zero offset to null pointer
tcl/generic/tclEncoding.c:2562:18: runtime error: applying zero offset to null pointer
'iso8859-1'
tcl/generic/tclEncoding.c:2913:18: runtime error: applying zero offset to null pointer
tcl/generic/tclEncoding.c:2995:18: runtime error: applying zero offset to null pointer
'shiftjis'
tcl/generic/tclEncoding.c:2683:18: runtime error: applying zero offset to null pointer
tcl/generic/tclEncoding.c:2797:18: runtime error: applying zero offset to null pointer
'iso2022'
tcl/generic/tclEncoding.c:3154:18: runtime error: applying zero offset to null pointer
tcl/generic/tclEncoding.c:3364:18: runtime error: applying zero offset to null pointer

Some possible styles for suggested fixes: either have functions exit early with dedicated block duplicating necessary final statements (not as feasible for EscapeFromUtfProc()); use a single NULL check and goto to skip over anything which assumes src is non-NULL; or add inline checks for NULL wherever src is assumed to be non-NULL. The last approach seems not too invasive: no required rearranging or duplication of existing code, only thing duplicated is NULL check; this is what the attached patch uses.

My understanding is that these conversion functions are only ever called by Tcl_ExternalToUtf(), Tcl_UtfToExternal(), Tcl_ExternalToUtfDString(), and Tcl_UtfToExternalDString(); notably they already force srcLen to 0 when src is NULL and to a nonnegative amount otherwise. So I do not expect any proposed fixes to change the actual behavior.

User Comments: chrstphrchvz added on 2023-03-26 09:48:01:

Reuploaded patch to apply cleanly following changes in [be99e0b905a6].


Attachments: