Tcl Source Code: View Ticket

Ticket UUID:	2089279
Title:	StringObj.3 Tcl_ObjPrintf inaccuracies
Type:	Bug	Version:	obsolete: 8.6a2
Submitter:	georgeps	Created on:	2008-09-02 16:27:27
Subsystem:	10. Objects	Assigned To:	jan.nijtmans
Priority:	8	Severity:	Minor
Status:	Closed	Last Modified:	2024-02-05 15:30:27
Resolution:	Fixed	Closed By:	jan.nijtmans
		Closed on:	2024-02-05 15:30:27
Description:	StringObj.3 gives this example: long x = 5; Tcl_Obj *objPtr = Tcl_ObjPrintf("Value is %d", x); This is incorrect, because the Tcl_ObjPrintf code will expect an int, and use va_arg with int. As can be seen in the AppendPrintfToObjVA: case 'd': ... seekingConversion = 0; switch (size) { case -1: case 0: Tcl_ListObjAppendElement(NULL, list, Tcl_NewLongObj((long int)va_arg(argList, int))); The issue is that if the user uses %d for a long, the va_list will have the wrong offset. The example will work in the 1 argument case, and even the multiple argument case if sizeof(int) = sizeof(long). It however will fail in the case of sizeof(long) > sizeof(int), because the va_list will have the wrong offset. Thus you'd be getting parts of the integer. This can have disasterous consequences if the offset is wrong, because the next va_arg call becomes wrong, which if it's a pointer may lead to segfaults. Tcl works with some platforms where a long is 64-bit and an int is 32-bit. #ifdef TCL_WIDE_INT_IS_LONG as I understand it refers to this. The next issue is that the format specifiers are not exactly the same as the [format] command, and probably shouldn't be. The format command uses (as documented) %ld for truncation to a 64-bit integer. %lld for no truncation (BigNum) and %d for a 32-bit integer. This is incomptible with AppendPrintfToObjVA() (used by Tcl_ObjPrintf), because it currently accepts a long for %ld which may or may not be 64-bit depending on the platform. The wide and BigNum cases are also not handled by AppendPrintfToObjVA(). Thus the suggestion in the documentation that Tcl_ObjPrintf has the same syntax as the format command are wrong.
User Comments:	jan.nijtmans added on 2024-02-05 15:30:27: Should _really_ be [0e4c7187e7d373cf\|fixed] now, both the documentation and the actual behavior. jan.nijtmans added on 2024-02-05 14:18:38: Actually .... on 32-bit platforms it's still not correct in the unsigned case. Example: Tcl_ObjPrintf("%lu", 4294967295UL) This will print 18446744073709551615, even on 32-bit platforms. Expected: 4294967295 jan.nijtmans added on 2024-02-01 10:16:23: > The discrepancy between Tcl_ObjPrintf() and Tcl_Format() should be noted in the doc, and %l / %ll specified. Documentation is updated now, so I think we can close this. oehhar added on 2024-02-01 09:33:26: Here are related messages from the core list for documentation: -------- Weitergeleitete Nachricht -------- Betreff: [TCLCORE] Backport %ll in Tcl_ObjPrintf Datum: Wed, 31 Jan 2024 16:15:45 +0100 Von: Eric Boudaillier <[email protected]> In Tcl 8.6, the documentation says that Tcl_ObjPrintf() works like [format]. However this is not true in 8.6.13. A fix has been done here: https://core.tcl-lang.org/tcl/info/eac13870dca5d4e7 <https://core.tcl-lang.org/tcl/info/eac13870dca5d4e7> for Tcl 9. Is it possible to backport this for the next 8.6.14? I'm trying to use Tcl_ObjPrintf("%lu") to create Tcl_Obj for unsigned 64 bits. Or is there a better way to do this? What I am really looking for is in this commit https://core.tcl-lang.org/tcl/info/4c2ed52f8f2e2e50, where a single "l" modifier creates a wideInt. OK, I should have looked at existing tickets before the code :-) There is already some (very old) tickets about this: * https://core.tcl-lang.org/tcl/tktview/2089279fffffffffffff this one, still open, already says that [format] use 64 bits for %l modifier, %lld for BigNum. * https://core.tcl-lang.org/tcl/tktview/2b6a4fa5ed30b44364b1 this one is what I observe, saying it is fixed in 8.7. So what about 8.6? * https://core.tcl-lang.org/tcl/tktview/3fbaa4e23bfa711c4340 similar to previous one, also for 8.7 only. Since we'll have no 8.7, could that be backported in 8.6? But I'll probably create a new ticket, it seems that the code in tip is still not aligned with what does [format]: case 1: Tcl_ListObjAppendElement(NULL, list, Tcl_NewWideIntObj( va_arg(argList, long))); break; case 2: Tcl_ListObjAppendElement(NULL, list, Tcl_NewWideIntObj( va_arg(argList, Tcl_WideInt))); break; -------- Weitergeleitete Nachricht -------- Datum: Wed, 31 Jan 2024 17:34:43 +0100 Von: Jan Nijtmans <[email protected]> Op wo 31 jan 2024 om 17:19 schreef Eric Boudaillier: > Since we'll have no 8.7, could that be backported in 8.6? It's a little bit tricky, but possible. If we implement "ll", it's tempting to use TCL_LL_MODIFIER, which results in "I64" on Win64. So we should implement that too. And because "format" uses the same functionality .... I think this should do it: <https://core.tcl-lang.org/tcl/info/c6795d89c8467204> -------- Weitergeleitete Nachricht -------- Datum: Wed, 31 Jan 2024 18:01:26 +0100 Von: Eric Boudaillier <[email protected]> I'm not sure... What you wrote for case 2 (%ll) is what I'd like for case 1 (%l), and what does [format]. case 2 needs to pass a bignum (same as case 3 for 'L' modifier - not in tip? where did I see that...). I32 and I64 are not specified in 8.6 format.n, not needed in my mind, I don't want to change the doc, just fix the code. It's not clear now if Tcl_ObjPrintf() is supposed to do what print() does or what [format] does, but the doc says [format]. I have used it because printf() was different across windows/linux/32/64 bits. -------- Weitergeleitete Nachricht -------- Datum: Wed, 31 Jan 2024 21:33:33 +0100 Von: Jan Nijtmans <[email protected]> Op wo 31 jan 2024 om 18:01 schreef Eric Boudaillier: > > I'm not sure... > What you wrote for case 2 (%ll) is what I'd like for case 1 (%l), and what does [format]. I'm sure. Tcl_WideInt stands for "long" on 64-bit platforms, "long long" on 32-bit platforms and __int64 on Windows. So, the portable way to format a Tcl_WideInt is using TCL_LL_MODIFIER, which is "l" on most platforms (like you), "ll" on 32-bit platforms and "I64" on Windows. If we want to support Tcl_WideInt in a portable way, we need to implement all of them, even though _you_ probably only need "l". case 1 is for "long" ("l"), case 2 is for "long long" ("ll") -------- Weitergeleitete Nachricht -------- Datum: Wed, 31 Jan 2024 20:39:47 +0000 Von: Donal Fellows <[email protected]> Jan Nijtmans <[email protected]> wrote: > I'm sure. Tcl_WideInt stands for "long" on 64-bit platforms, "long long" on 32- > bit platforms and __int64 on Windows. So, the portable way to format a > Tcl_WideInt is using TCL_LL_MODIFIER, which is "l" on most platforms (like > you), "ll" on 32-bit platforms and "I64" on Windows. If we want to support > Tcl_WideInt in a portable way, we need to implement all of them, even though > _you_ probably only need "l". It's a shame that it's currently the thing that's making Tcl builds noisy, with messages like this coming up rather often: .../tcl/generic/tclHash.c:636:21: warning: format '%u' expects argument of type unsigned int', but argument 5 has type 'size_t' {aka 'long long unsigned int'} [-Wformat=] 636 \| snprintf(p, 60, "number of buckets with %d or more entries: %" TCL_Z_MODIFIER "u\n", \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 637 \| NUM_COUNTERS, overflow); \| ~~~~~~~~ \| \| \| size_t {aka long long unsigned int} .../tcl/generic/tclHash.c:636:84: note: format string is defined here 636 \| snprintf(p, 60, "number of buckets with %d or more entries: %" TCL_Z_MODIFIER "u\n", \| ~~~~~~~~~~~~~~~~~~~^ \| \| \| unsigned int \| %" TCL_Z_MODIFIER "llu I'm not picking on that specific site at all; it's just the first one I found in my local build logs. -------- Weitergeleitete Nachricht -------- Datum: Thu, 1 Feb 2024 05:55:22 +0100 Von: Eric Boudaillier <[email protected]> OK, let's separate the problems. I solved my unsigned 64 bits generation by using Tcl_Format("%lu"), which works as expected: %l for wide ints, for 32 bits / 64 bits builds. I wrongly used Tcl_ObjPrintf(), because there is a discrepancy on how it manages %l / %ll, even in Tcl 9.0b1 (windows 32 bits builds): Tcl_ObjPrintf("%lu") truncates to 32 bits, Tcl_Format("%lu").always formats 64 bits. The discrepancy between Tcl_ObjPrintf() and Tcl_Format() should be noted in the doc, and %l / %ll specified. Even the implementation is misleading, both are using Tcl_AppendFormatToObj(). jan.nijtmans added on 2024-01-31 16:41:57: Proposed fix [c6795d89c8467204\|here] dgp added on 2017-07-09 22:09:45: For Tcl_ObjPrintf(), the format %lld DOES NOT MEAN (long long int). If you program as if it does, you will get crashes. Don't do that. The %d format means pull an (int) which is then pushed through losslessly as a Tcl_NewLongObj() to be formatted by [format %d] which expects to act on a (long int). (I know. That's a confusing mess, but [format] was already designed that way by the time I got to touch it.) This allows Tcl_ObjPrintf() to be similar to sprintf() in what C programmers expect to pass when using the %d specifier, while still re-using the same formatting engine already existing for [format]. The %ld format means pull a (long int) which is then pushed through losslessly as a Tcl_NewWideIntObj() to be formatted by [format %ld] which expects to act on a (Tcl_WideInt). There is no way to pass a (long long int) or a (Tcl_WideInt) or a (size_t) argument to Tcl_ObjPrintf(). There's always been a /* TODO / comment in the code noting this incompleteness. More recently we've recorded the need to correct this in an RFE ticket. https://core.tcl.tk/tcl/tktview?name=3fbaa4e23b Jan Nijtmans has related work on the branch z_modifier and might have more on other branches I cannot find quickly right now. stu added on 2017-07-08 18:37:03: Just stumbled across this. See also http://core.tcl.tk/tcl/tktview/2b6a4fa5ed30b44364b1 On 32-bit systems: Tcl_ObjPrintf("%lld %ld\n", (long long) 3, (long) 4)) Yields "3 0" Tcl_ObjPrintf("%lld %s\n", (long long) 3, "a")) Segfaults Right now it's hard to have confidence in Tcl_ObjPrintf. Even if it was fixed today the broken version will persist in the wild for a long time possibly requiring autoconfery for those who wish to use it with confidence. Too bad. dkf added on 2009-04-10 20:15:24: Corrected the example in the docs, but the real problems persist. ferrieux added on 2009-03-20 06:33:18: After a second dive, it looks like the doc is currently an outright lie: - The %h prefix for [ciudoxX] is completely ignored (va_arg(int) in all cases) (I also don't understand the absence of a "p++" in the %h case) - As George mentioned, generalization to %ll et al is not a good idea (not really suited for vararg handling) So I'd advocate to fix the doc so that it admits taking printf() as a model and diverging only on %h. (maybe an explicit list of specifiers and modifiers should be added, since printf() has a variable ambition on different unix flavours). ferrieux added on 2009-03-05 06:10:35: Indeed, there's a trace of more ambition in APTOVA: / TODO: support for wide (and bignum?) arguments */ at present, what's implemented is closer to sprintf. Could someone review to see how exactly it differs ? Is the %h (length -1) case correct ?