Ticket UUID: | a1bd37b7198f7bf51e8f1f554435045bf6cc8b68 | |||
Title: | clock (free)scan of ISO 8601 timestamp with literal T behaves strange | |||
Type: | Bug | Version: | any | |
Submitter: | sebres | Created on: | 2020-05-20 13:17:00 | |
Subsystem: | - New Builtin Commands | Assigned To: | sebres | |
Priority: | 5 Medium | Severity: | Minor | |
Status: | Open | Last Modified: | 2020-07-08 15:55:02 | |
Resolution: | None | Closed By: | nobody | |
Closed on: | ||||
Description: |
It looks like clock free scan recognizes literal "T" in (undocumented?) ISO 8601 timestamp incorrectly, I guess as a legacy (military?) timezone "T". # ISO: % clock format [clock scan "20200520 14:40:00" -gmt 1] -gmt 1 Wed May 20 14:40:00 GMT 2020 % clock format [clock scan "20200520T14:40:00" -gmt 1] -gmt 1 Wed May 20 14:40:00 GMT 2020This seems not correct in my opinion (despite the documentation says about literal "T" by ISO without "-" separator in date only, recognize "T" as timezone looks weird to me). I think either we must aggressively strict the timezones bison/yacc rules (must consider word boundaries or something similar) or ISO format must be extended with YYYY-MM-DD date. | |||
User Comments: |
sebres added on 2020-07-08 15:55:02:
Sure, I think to have a better solution, but since it is based on my newest improved branch, this were a bit "larger" change (but as already said - better)...
I'm still unsure about the fix using a cherry-pick from clock-speedup branch, because that would also imply several things (e. g. pre-requirements for positive time zone fix [5019748c73], etc). Surely I was also planning to prepare clock-speedup to merge it at some point (I still hope to fulfill that), but not yet now, since as discussed in tclchat, TCT (Kevin) waiting for a description about all the changes. But such partial cherry-pick could indeed have some pros for me (for instance free-scan would have few conflicts by scanning as well as few differences to speedup-branch in code hereafter). jan.nijtmans added on 2020-07-08 14:46:17: I would like to see this fix in Tcl 8.6.11. Any objections? Otherwise I'll do the merge in a few days! @sebres, if you have additional suggestions, don't wait too long. I like chw's solution. jan.nijtmans added on 2020-06-23 06:38:20: > by rebase I noticed that several changes, you've made at some point directly in "tclDate.c", are inconsistent with current "tclGetDate.y" Indeed, Thanks! I think I corrected that now. sebres added on 2020-06-22 18:52:20:
I rebased my tclclockmod with fix for that [4f064d306d] to the tcl-core. @Jan: by rebase I noticed that several changes, you've made at some point directly in "tclDate.c", are inconsistent with current "tclGetDate.y" (and so will be lost if you rebuilt it with yacc/bison, e. g. using chw added on 2020-05-27 16:03:51: Or alternatively, use my variant which deals with "T" vs other +7 timezones by numerically classifying military tz names vs others in the second attachment, which is a patch relative to bug-a1bd37b719 branch. sebres added on 2020-05-27 15:38:05: Well, in-between I found basically more things there, that must be fixed now:
I'll try to rebase the patch to tcl-core, if I'd get it ready there. jan.nijtmans added on 2020-05-27 14:01:16: Proposed fix committed in a branch here: [f0ea6d62c12fcc40]. With test-case and documentation, what more do we want... I like it! chw added on 2020-05-27 09:34:19: sebres, you're right. What about adding a 4th free scan format CCyy-mm-ddThh:mm:ss like in the attached patch against core-8-6-branch? oehhar added on 2020-05-26 06:25:00: Sebres, Christian, thank you for constantly caring about TCL. I can say that the format "2020-05-20T14:40:00" is a format I have often to deal with, as some SQL servers return this format for timestamps. I always scanned it with a detailed pattern and I am happy that there is a native support. I would call it a bug, when the T is interpreted as a time zone indicator and not as the fix ISO separator. Thank you, Harald sebres added on 2020-05-25 10:11:40:
> Citing the "FREE FORM SCAN" section in clock(n) from core-8-6-branch What you're trying to tell me with? I wrote already that despite it is documented, one can not say that the behavior by format "YYYY-MM-DDThh:mm:ss" (that is also described as part of ISO 8601) is correct. There are only two possibility as I suggested above, either to accept it as ISO (to extend the format), or to restrict TZ recognition rules (like word boundaries), so scan would reject it as invalid input. > it seems to be some undefined behavior you're observing here regarding the presence of the -gmt option. No, gmt option is only to illustrate that normally no TZ conversions are used by format and scan (and it is ignored if TZ specified in input string). And it has basically nothing with this bug. As already said the free scan considering T literal as a military time zone. chw added on 2020-05-21 11:18:14: Citing the "FREE FORM SCAN" section in clock(n) from core-8-6-branch ISO 8601 point-in-time An ISO 8601 point-in-time specification, such as “CCyymmddThh‐ mmss,” where T is the literal “T”, “CCyymmdd hhmmss”, or “CCyym‐ mddThh:mm:ss”. Note that only these three formats are accepted. The command does not accept the full range of point-in-time specifications specified in ISO8601. Other formats can be rec‐ ognized by giving an explicit -format option to the clock scan command. it seems to be some undefined behavior you're observing here regarding the presence of the -gmt option. |