Tcl Source Code

Ticket UUID: 5d170b5ca5e12743006d737c79f959f3efabc644
Title: checkin 9f8b7bea5344f1b0 broke netbsd's thread notifier
Type: Bug
Version: trunk
Submitter: emiliano
Created on: 2015-08-06 01:21:25
Subsystem: 01. Notifier
Assigned To: nobody
Priority: 5 Medium
Severity: Critical
Status: Closed
Last Modified: 2015-09-24 11:20:04
Resolution: Fixed
Closed By: jan.nijtmans
Closed on: 2015-09-24 11:20:04
Description:
Starting from checkin 9f8b7bea5344f1b0 ( http://core.tcl.tk/tcl/info/9f8b7bea5344f1b0746587d660c81fa4f7109f00 ), the test suite on NetBSD suffers massive slowdowns and several failing tests (previous versions had 0 failures).

fossil bisect indicates that the above-mentioned checkin introduces the problem on the default build. Building with --disable-threads shows no problems.
User Comments:
jan.nijtmans added on 2015-09-24 11:20:04:
Fixed in core-8-5-branch and trunk now.

jan.nijtmans added on 2015-09-24 10:49:03:

In answer to Joe: Bug [a27ee06c65] (which is now marked as a duplicate of this one) shows that the problem is not limited to NetBSD.


jan.nijtmans added on 2015-09-07 18:53:24:

Gustaf's explanation and performance measures can be found here: http://code.activestate.com/lists/tcl-core/14779/

Quoting (partially): This morning I've got a lock-free version of "auto-lazy" working (based on the low-level pthread_*lock interface to avoid the problem with Tcl's mega-locks). One reason for the locks in the *Prepare callback is to have the mutex variables in a defined state in the child. This can be addressed as well by reinitializing the mutexes (and condition variables in question).

With this, we reach essentially the same performance as without the at-fork callbacks:
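
A note on the reinitialization idea in the quoted passage: it can be sketched in plain pthreads as below. This is only an illustration, not the code from either branch. The names demo_lock, demo_atfork_child, demo_install_atfork and demo_child_can_lock are hypothetical. Note also that POSIX leaves re-initializing an already initialized mutex undefined, so the sketch relies on the common implementation behavior of simply resetting the mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock standing in for one of the notifier's mutexes. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Child-side at-fork handler: give the mutex a fresh, unlocked state
 * instead of relying on a prepare handler having locked it beforehand.
 * Handlers cannot return an error, so failure is caught with assert.
 */
static void demo_atfork_child(void)
{
    int rc = pthread_mutex_init(&demo_lock, NULL);
    assert(rc == 0);
    (void) rc;
}

/* Register the child handler only; no prepare or parent handler is used. */
int demo_install_atfork(void)
{
    return pthread_atfork(NULL, NULL, demo_atfork_child);
}

/*
 * Fork while the parent holds demo_lock and report whether the child
 * could take it. Returns 1 if the child locked it, 0 if not, -1 on error.
 */
int demo_child_can_lock(void)
{
    int status = 0;
    pid_t pid;

    pthread_mutex_lock(&demo_lock);
    pid = fork();
    if (pid == 0) {
        /* In the child the handler has already reinitialized demo_lock. */
        _exit(pthread_mutex_trylock(&demo_lock) == 0 ? 0 : 1);
    }
    pthread_mutex_unlock(&demo_lock);
    if (pid < 0 || waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Calling demo_install_atfork() once and then demo_child_can_lock() exercises the handler: without the child handler, the trylock in the child would find the mutex still marked as held.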


mistachkin added on 2015-09-07 18:00:41:
I'm not sure what the changes on the associated branch actually do.

Also, does the original TIP #435 implementation have issues on any platform
__other__ than NetBSD?

jan.nijtmans added on 2015-09-07 08:38:28:

Branch bug-5d170b5ca5 now open for widespread testing.


dkf added on 2015-09-06 12:56:18:

xref [76d79516f1]


jan.nijtmans added on 2015-08-28 12:54:48:

It appears that Joe Mistachkin is working on a fix in the bug-57945b574a branch ( http://core.tcl.tk/tcl/info/f31817e84132dd8c ), and Gustaf Neumann is working on a more fundamental fix (with performance improvements) in the experimental branch http://core.tcl.tk/tcl/info/992fb42ccd03439d

Let's hope one of those will lead to a good result in the near future.


emiliano added on 2015-08-06 22:48:15:
* Setting TCL_MUTEX_LOCK_SLEEP_TIME to 0, that is

emiliano added on 2015-08-06 22:36:28:
Setting TCL_MUTEX_LOCK_SLEEP_TIME has limited effect (fewer failing tests) but doesn't seem to be a solution. Attaching test results with the modification.

mistachkin added on 2015-08-06 17:03:37:
One thing that could have a significant impact on this is
the TCL_MUTEX_LOCK_SLEEP_TIME constant in "tclUnixThrd.c"...

Maybe try setting it to a lower value, perhaps zero, if the
platform will consider that as something like yield() to the
other threads.
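
For readers unfamiliar with the pattern being discussed, here is a rough sketch of a lock that polls with a sleep between attempts. It is an assumption about the general shape of such a loop, not the actual code in "tclUnixThrd.c". DEMO_LOCK_SLEEP_MS and demo_polling_lock are hypothetical names, and the explicit sched_yield() for a zero sleep time is a choice made for this sketch rather than something the platform is guaranteed to do on its own.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <time.h>

/* Hypothetical stand-in for the sleep-time constant, in milliseconds. */
#ifndef DEMO_LOCK_SLEEP_MS
#define DEMO_LOCK_SLEEP_MS 0
#endif

/*
 * Acquire m by polling with trylock. Between failed attempts, either
 * sleep for DEMO_LOCK_SLEEP_MS or, when that is zero, yield the CPU.
 * Returns the number of failed attempts before the lock was acquired.
 */
int demo_polling_lock(pthread_mutex_t *m)
{
    int retries = 0;

    assert(m != NULL);
    while (pthread_mutex_trylock(m) != 0) {
        retries++;
        if (DEMO_LOCK_SLEEP_MS > 0) {
            struct timespec ts;
            ts.tv_sec = DEMO_LOCK_SLEEP_MS / 1000;
            ts.tv_nsec = (DEMO_LOCK_SLEEP_MS % 1000) * 1000000L;
            nanosleep(&ts, NULL);
        } else {
            /* Zero sleep time: hand the CPU to other runnable threads. */
            sched_yield();
        }
    }
    return retries;
}
```

With a non-zero sleep time every contended acquisition costs at least one full sleep interval, which is one way a test suite could slow down massively. A zero value trades that latency for busier polling.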

mistachkin added on 2015-08-06 03:36:34:
Could you please attach the log files generated by ./configure?

emiliano added on 2015-08-06 02:32:47:
Finally figured out how to attach files :-)

emiliano added on 2015-08-06 01:24:31:
I have logs of full test suite runs from checkins 9f8b7bea5344f1b0 (bad) and its parent 41d3e42ddaf7d287 (good), both with and without threads, but fossil seems not to offer the possibility to attach those files.

Attachments: