Tcl Source Code

View Ticket
Login
Ticket UUID: 7371b6270b727521cf0c7044a882d16ef5c4e96b
Title: AddressSanitizer use-after-return detection breaks NRE tests, coroutines
Type: Patch Version: 8.6.13
Submitter: chrstphrchvz Created on: 2023-08-04 04:36:43
Subsystem: 60. NRE and coroutines Assigned To: jan.nijtmans
Priority: 5 Medium Severity: Important
Status: Closed Last Modified: 2023-10-22 16:24:40
Resolution: Accepted Closed By: chrstphrchvz
    Closed on: 2023-10-22 16:24:40
Description:

AddressSanitizer (ASan) can detect use-after-return errors. Depending on the platform, this is enabled either by default, or by using the appropriate compiler flags or runtime environment variables (e.g. ASAN_OPTIONS=detect_use_after_return=1).

But enabling this can completely break Tcl coroutines: there will be unexpected “cannot yield: C stack busy” errors, and numerous tests will fail or even hang (e.g. http-3.10.0, io-28.6). This is because the real stack gets replaced with a heap-allocated fake stack, and there is no guarantee that the address of a stack variable will be consistent for a given C stack depth, as assumed by the coroutines implementation.

Rather than requiring AddressSanitizer use-after-return detection to be disabled, I suggest that Tcl use AddressSanitizer-friendly approaches instead. For GCC and Clang, there is __builtin_frame_address(0); and for MSVC, there is _AddressOfReturnAddress(). And I suggest always using these when available, for consistency between builds with and without AddressSanitizer, and in case of other use-after-return detection tools with similar limitations. See attached patch. https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0-rc1/clang/lib/Basic/Stack.cpp#L24 shows a very similar example.

User Comments: chrstphrchvz added on 2023-10-22 16:24:40:

I am not aware of anything else to do for this ticket; closing. Thanks for committing and revising this, Jan!


chrstphrchvz added on 2023-10-11 10:59:44:

…in other words, I consider the change to be justified.


chrstphrchvz added on 2023-10-11 10:47:52:

Note to self regarding #if __GNUC__ being changed to #if defined(__GNUC__) in [df7a3fd9e118]: although the former is what the LLVM example used, and would seem more robust in case of someone/something else defining __GNUC__ to 0, the latter is more consistent with the rest of core Tcl.


chrstphrchvz added on 2023-10-05 08:14:33:

I notice the CI failure for GCC on Windows. It attempted to use _AddressOfReturnAddress() which is MSVC-specific, but the condition is HAVE_INTRIN_H even though intrin.h is not MSVC-specific. I am thinking that the condition for _AddressOfReturnAddress() should depend on defined(_MSC_VER) (whether or not in addition to HAVE_INTRIN_H); and/or the GCC/Clang __builtin_frame_address(0) case should go first (as done by LLVM).


chrstphrchvz added on 2023-09-30 08:37:43:

See revised attached patch. It seems that factoring out code for use in both tclBasic.c and tclTest.c requires reserving a stub in tclInt.decls.


chrstphrchvz added on 2023-08-04 15:30:24:

I now notice the same issue affecting NRE test functions. I will look into patching those as well.


Attachments: