Tk Source Code

View Ticket
Login
Ticket UUID: 06c14339060ba9aebc3c090b4d7524c8abd5a5d4
Title: Text widget crash during delete
Type: Bug Version: trunk
Submitter: fvogel Created on: 2016-01-30 13:51:52
Subsystem: 18. [text] Assigned To: fvogel
Priority: 7 High Severity: Critical
Status: Closed Last Modified: 2016-02-08 22:57:39
Resolution: Fixed Closed By: fvogel
    Closed on: 2016-02-08 22:57:39
Description:
Report on TCLCORE list from Brian Griffin about the text widget crashing in a large application in both 8.6.4 and trunk.

fossil bisect points at:
  http://core.tcl.tk/tk/ci/486c2c231615967b?sbs=1
  http://core.tcl.tk/tk/ci/9badac93d2e79f06?sbs=1
  http://core.tcl.tk/tk/ci/575b376065334692?sbs=1

However those commits look OK to me.

Brian's feeling is that "in the failing test case, there are some lines of text in the text widget, a whole lot of lines get added to the end, then a whole lot of lines get deleted from the beginning.  By the time the idle task gets called, not a single line of text present was present when the idle task was scheduled.  Something somewhere has cached something that has long since been free'd."
I completely agree with his opinion.

Following Brian's delivery of the sequence of text widget calls with timing information I came up with the following seemingly minimal script:

  pack [text .txt]
  .txt insert end \n
  .txt mark set mymark 1.0
  .txt insert mymark {Line1 
  Line2
  }
  after 5 {.txt insert mymark "Line3"}
  after 5 {.txt delete 1.0 2.0}

Placing the above in a file "crashit.tcl" and then in tclsh:

  package require Tk
  source crashit.tcl

This produces a text widget crash (in core-8-5-branch and trunk).

(note: directly pasting the script at once in tclsh or wish will not crash. To get the crash, one can paste everything but the last two lines and then the last two lines)
User Comments: fvogel added on 2016-02-08 22:57:39:
And BTW, running:

valgrind --leak-check=full make test TESTFLAGS="-file text.test"

does not show any remaining issue with the text widget, neither with the test suite nor with memory checks from valgrind.

fvogel added on 2016-02-08 22:09:35:
Merged to core-8-5-branch and trunk, "so more people will be triggered to test it".

Closing.

fvogel added on 2016-02-08 21:52:09:

Correct fix is in FindDLine, see [717e12ee].

Emergency fix [c3c09f82] reverted.

See branch bug-06c1433906.


jan.nijtmans added on 2016-02-08 15:58:10:
emergency-fix committed to core-8-5-branch and trunk.

Still keeping this ticket open, according to François' suggestion, in case something better still is found.

Great work (so far), appears to be good enough for 8.5.19/8.6.5 !!!!!!!!!!

fvogel added on 2016-02-07 13:51:14:
Despite I still think TextChanged() is OK as is, I have committed an emergency fix in TextChanged().

Backing out DGP's change (but it was very close!) and with my fix in, everything I'm aware of now passes sucessfully:
  - The full test suite, including the new textDisp-8.13, shows zero test failure
  - The "reduced demo" I have poseted belon on 2016-01-31 14:25:16 does no longer crash
  - Sourcing the "record_crash2.tcl" from Brian (second and latest version sent on TCLCORE on 29 Jan. 2016) does no longer crash.

Please confirm on your full large app, Brian.

As stated above, this is an emergency fix because the root thing is not yet addressed: FindDLine's mission is to provide the DLine corresponding to a given index, but this is not *always* possible and the cases when it is not possible are not consistently addressed. There a situations where the searched DLine has been unlinked previously, and FindDLine in that case does not exhibit fully consistent behaviour. This is apparent in several places in the text widget code where the DLine returned by FindDLine gets processed in different ways depending on the caller's needs. I believe FindDLine could/should be improved to be instructed about what DLine to return when the searched DLine does not exist (i.e. was unlinked). An added input parameter to FindDLine could tell whether to return NULL, or the previous DLine, or the next one, or whatever depending on the needs of the caller. But this is a non-negligible project which implies some amount of careful investment if we want to avoid regressions and new crashed (and I believe we want ;-) !)

Also still <TODO>:
  - analysis of valgrind output (is there any remaining issue with the fix now?)
  - MODEL-VIEW-CONTROLLER pattern checks suggested by Brian on TCLCORE

fvogel added on 2016-02-06 10:18:15:
I think TextChanged() is correctly written (I mean: without [e1540e0f]).

However, within the delete operation, when TextChanged() calls FindDLine this call returns the wrong display line because the line we're looking for (for TextChanged()'s lastPtr) was unlinked previously because of the insert operation on that same line.

fvogel added on 2016-02-06 09:22:56:
IMHO the fix you propose is unfortunately wrong.

Running:

  package require Tk
  pack [text .t]
  .t insert 1.0 "\nLine1\nLine2\n"
  after 10 {.t insert 3.0 "" ; .t delete 1.0 2.0}

indeed there is no crash.

But then, I get a crash just by randomly clicking inside the text widget with button-1 down. I can't select anymore for instance: it crashes instantaneously.

I strongly suggest this change be reverted in both core-8-5-branch and trunk.

dgp added on 2016-02-05 19:22:18:
Committed test and patch to core-8-5-branch.
At least one crash is fixed, even if not the prompting one.

dgp added on 2016-02-04 17:50:47:
See branch bug-06c1433906 for possible fix.

That patch stops the short demo from crashing for
me and doesn't create any new crashes in the Tk test
suite.

Does it solve the issue in the large app as well?

dgp added on 2016-02-04 17:43:13:
Before that checkin, TextChanged() always deleted
all DLines that contained index2Ptr.

After, it could leave one behind to cause trouble.

Looking into fix.

dgp added on 2016-02-04 16:47:44:
Bisecting with the shortened demo shows
the trouble to originate in

http://core.tcl.tk/tk/info/04d70b6fab0770dd

dgp added on 2016-02-04 15:07:12:
The comment describing the TkTextIndex struct
appears relevant:

/*
 * Data structures of the type defined below are used during the execution of
 * Tcl commands to keep track of various interesting places in a text. An
 * index is only valid up until the next modification to the character
 * structure of the b-tree so they can't be retained across Tcl commands.
 * However, mods to marks or tags don't invalidate indices.
 */

The DLine struct copies an entire TkTextIndex struct into it,
and appears to have no/insufficient constraints to prevent
that data from living beyond the next mod to the b-tree.

dgp added on 2016-02-04 14:58:03:
Very good to have a short demo.  I reproduce
the failure.

It's not always exactly the same failure, but
I do see one failure mode that I can partially
explain.  The trouble is that DisplayText()
eventually calls FindDLine() and passes in 
a dlPtr value with dlPtr->index->linePtr
that has already been freed.

That linePtr gets freed during the [.t delete]
operation at about line 1540 of tkTextBTree.c.
There's a merging of two lines followed by
free of the latter one.  This leaves dangling the 
dlPtr->index->linePtr value that was copied into
dlPtr earlier.

fvogel added on 2016-01-31 14:25:16:
Some news:

I have an even shorter script triggering the crash. Paste this is tclsh (in two steps, see below):

  package require Tk
  pack [text .t]
  .t insert 1.0 "\nLine1\nLine2\n"

  # paste this separately from the previous lines
  .t insert 3.0 "" ; .t delete 1.0 2.0

I have started analysis of the crash and I'm progressing despite not yet understanding it fully. I need more time to devote to this, unfortunately I will not be able to work on this this week.

Nevertheless a workaround exists: By inserting  update idletasks  between the insert and delete, the crash disappears:

  # paste this separately from the previous lines
  .t insert 3.0 "" ; update idletasks ; .t delete 1.0 2.0

Attachments: