Tcl Source Code: View Ticket

Ticket UUID:	2318906
Title:	order of insertion seems to have large performance impact
Type:	RFE	Version:	None
Submitter:	nobody	Created on:	2008-11-21 00:33:03
Subsystem:	15. Dict Object	Assigned To:	dkf
Priority:	3 Low	Severity:
Status:	Open	Last Modified:	2009-06-15 16:17:52
Resolution:	None	Closed By:
		Closed on:
Description:	See attached tarball which has a script showing the problem. This seems to be related to nested dicts. I'm doing something like this in a loop: dict set $d "key1" $x $y dict set $d "key2" $a $u dict set $d "key2" $b $v dict set $d "key2" $c $w ... The "key2" inserts can occur multiple times per iterations. It's much faster to collect the "key2" stuff into a separate dict and copy them in one shot at the end of all the loops into $d instead of interleaving "key1" and "key2".
User Comments:	dkf added on 2009-06-15 16:17:52: data_type - 360894 dkf added on 2009-01-06 18:41:11: On my machine (using [time] to do the measurements, which uses microsecond precision) I get a relative speed difference of a bit over 15 times with Tcl 8.5.2. That seems to be acceptable to me; we never warranted that nested dicts would be amazingly efficiently updated and accessed, and it is no surprise that there are ways to achieve performance-sensitive stuff that are faster. (You have a speed issue, true, but you have a trivial work around.) It would be possible to do something more efficient for this case, but working out how and when to enable the efficiency saving mechanism (i.e. caching the fact that a key is used on a path) is difficult and I'm not sure that the gains would be worth it. Keeping open as something to consider. [email protected] added on 2008-11-22 03:45:02: Original submitter here. Thanks for looking into this. I'm sorry I did not attach the result my run in the original filing. Interesting results. On my machine, I see the following after changing to 'clock milliseconds'. Fast method gives: 68 ms, Slow method gives: 1565 ms. uname -a: Linux hostname 2.6.9-11.ELsmp #1 SMP Fri May 20 18:26:27 EDT 2005 i686 i686 i386 GNU/Linux Running the pre-built binaries: ActiveTcl8.5.5.0.287690-linux-ix86-threaded. I'll try building my own Tcl. fast method. 1000 lines done. 2000 lines done. 3000 lines done. 4000 lines done. 5000 lines done. lines: 5000 filelines: 3267 commitlines: 1733 elapsed: 68 slow method. 1000 lines done. 2000 lines done. 3000 lines done. 4000 lines done. 5000 lines done. lines: 5000 filelines: 3267 commitlines: 1733 elapsed: 1565 nijtmans added on 2008-11-21 14:16:27: Oops, forgot to save the file before running it. When changing "clock seconds" to "clock milliseconds", it gives: $ tclsh85 dict_test.tcl fast method. 1000 lines done. 2000 lines done. 3000 lines done. 4000 lines done. 5000 lines done. lines: 5000 filelines: 3267 commitlines: 1733 elapsed: 62 slow method. 1000 lines done. 2000 lines done. 3000 lines done. 4000 lines done. 5000 lines done. lines: 5000 filelines: 3267 commitlines: 1733 elapsed: 42 When separating the key "filesincommit" in another "gitinfo3" variable, the elapsed time drops to 40. The top dict only has 3 entries, that's what causes this inefficiency. nijtmans added on 2008-11-21 13:50:59: I don't think this is a bug. In the code above, every "dict set" has to search two keys in two hash tables. Separating the "key2" stuff, means you only have to seach one key for each "dict set", and only do a single search for "key2". A hash search depends much on the size of the hash table, but small hash-tables are relatively inefficient. It is simply the nature of hash-tables. Running your code: $ tclsh85 dict_test.tcl fast method. 1000 lines done. 2000 lines done. 3000 lines done. 4000 lines done. 5000 lines done. lines: 5000 filelines: 3267 commitlines: 1733 elapsed: 0 slow method. 1000 lines done. 2000 lines done. 3000 lines done. 4000 lines done. 5000 lines done. lines: 5000 filelines: 3267 commitlines: 1733 elapsed: 1 That's fast, less than 1 second. Even when changing "clock seconds" to "clock milliseconds", the answer is the same, which means that it is less than the resolution of the windows clock (something like 1/15 sec) Collecting the "key2" stuff into a separate dict is indeed a good optimization. Regards, Jan Nijtmans [email protected] added on 2008-11-21 07:33:03: File Added - 302311: dict_test.tgz

Attachments:

dict_test.tgz [download] added by [email protected] on 2008-11-21 07:33:03. [details]