Overview
Comment: Added long comment explaining history and work in progress making bytearray interfaces usable.
SHA1: d42a114238e3e3c7ee99986099b6805a
User & Date: dgp 2016-12-02 18:18:59
Context
2016-12-02
21:08 | Avoid "warning: format '%llu' expects argument of type 'long long unsigned int', but argument 2 has ... check-in: 871fcf85c4 user: andy tags: trunk
21:00 | Merge trunk check-in: 4ab23eded7 user: andy tags: amg-array-enum-c-api
19:02 | merge trunk check-in: 237dd7902f user: dgp tags: novem
19:01 | merge trunk check-in: 33041ae916 user: dgp tags: tip-445
18:18 | Added long comment explaining history and work in progress making bytearray interfaces usable. check-in: d42a114238 user: dgp tags: trunk
15:39 | Relocate the cmdEpoch bumps during command deletion so that command resolution caching works properl... check-in: 71fa1f9c91 user: dgp tags: trunk
Changes
Changes to generic/tclBinary.c.
︙
    { "hex", BinaryDecodeHex, TclCompileBasic1Or2ArgCmd, NULL, NULL, 0 },
    { "uuencode", BinaryDecodeUu, TclCompileBasic1Or2ArgCmd, NULL, NULL, 0 },
    { "base64", BinaryDecode64, TclCompileBasic1Or2ArgCmd, NULL, NULL, 0 },
    { NULL, NULL, NULL, NULL, NULL, 0 }
};

/*
 * The following object types represent an array of bytes. The intent is
 * to allow arbitrary binary data to pass through Tcl as a Tcl value
 * without loss or damage. Such values are useful for things like
 * encoded strings or Tk images to name just two.
 *
 * It's strange to have two Tcl_ObjTypes in place for this task when
 * one would do, so a bit of detail and history how we got to this point
 * and where we might go from here.
 *
 * A bytearray is an ordered sequence of bytes. Each byte is an integer
 * value in the range [0-255]. To be a Tcl value type, we need a way to
 * encode each value in the value set as a Tcl string. The simplest
 * encoding is to represent each byte value as the same codepoint value.
 * A bytearray of N bytes is encoded into a Tcl string of N characters
 * where the codepoint of each character is the value of corresponding byte.
 * This approach creates a one-to-one map between all bytearray values
 * and a subset of Tcl string values.
 *
 * When converting a Tcl string value to the bytearray internal rep, the
 * question arises what to do with strings outside that subset? That is,
 * those Tcl strings containing at least one codepoint greater than 255?
 * The obviously correct answer is to raise an error! That string value
 * does not represent any valid bytearray value. Full Stop. The
 * setFromAnyProc signature has a completion code return value for just
 * this reason, to reject invalid inputs.
 *
 * Unfortunately this was not the path taken by the authors of the
 * original tclByteArrayType. They chose to accept all Tcl string values
 * as acceptable string encodings of the bytearray values that result
 * from masking away the high bits of any codepoint value at all. This
 * meant that every bytearray value had multiple accepted string
 * representations.
 *
 * The implications of this choice are truly ugly. When a Tcl value has
 * a string representation, we are required to accept that as the true
 * value. Bytearray values that possess a string representation cannot
 * be processed as bytearrays because we cannot know which true value
 * that bytearray represents. The consequence is that we drag around
 * an internal rep that we cannot make any use of. This painful price
 * is extracted at any point after a string rep happens to be generated
 * for the value. This happens even when the troublesome codepoints
 * outside the byte range never show up. This happens rather routinely
 * in normal Tcl operations unless we burden the script writer with the
 * cognitive burden of avoiding it. The price is also paid by callers
 * of the C interface. The routine
 *
 *     unsigned char *Tcl_GetByteArrayFromObj(objPtr, lenPtr)
 *
 * has a guarantee to always return a non-NULL value, but that value
 * points to a byte sequence that cannot be used by the caller to
 * process the Tcl value absent some sideband testing that objPtr
 * is "pure". Tcl offers no public interface to perform this test,
 * so callers either break encapsulation or are unavoidably buggy. Tcl
 * has defined a public interface that cannot be used correctly. The
 * Tcl source code itself suffers the same problem, and has been buggy,
 * but progressively less so as more and more portions of the code have
 * been retrofitted with the required "purity testing". The set of values
 * able to pass the purity test can be increased via the introduction of
 * a "canonical" flag marker, but the only way the broken interface itself
 * can be discarded is to start over and define the Tcl_ObjType properly.
 * Bytearrays should simply be usable as bytearrays without a kabuki
 * dance of testing.
 *
 * The Tcl_ObjType "properByteArrayType" is (nearly) a correct
 * implementation of bytearrays. Any Tcl value with the type
 * properByteArrayType can have its bytearray value fetched and
 * used with confidence that acting on that value is equivalent to
 * acting on the true Tcl string value. This still implies a side
 * testing burden -- past mistakes will not let us avoid that
 * immediately, but it is at least a conventional test of type, and
 * can be implemented entirely by examining the objPtr fields, with
 * no need to query the intrep, as a canonical flag would require.
 *
 * Until Tcl_GetByteArrayFromObj() and Tcl_SetByteArrayLength() can
 * be revised to admit the possibility of returning NULL when the true
 * value is not a valid bytearray, we need a mechanism to retain
 * compatibility with the deployed callers of the broken interface.
 * That's what the retained "tclByteArrayType" provides. In those
 * unusual circumstances where we convert an invalid bytearray value
 * to a bytearray type, it is to this legacy type. Essentially any
 * time this legacy type gets used, it's a signal of a bug being ignored.
 * A TIP should be drafted to remove this connection to the broken past
 * so that Tcl 9 will no longer have any trace of it. Prescribing a
 * migration path will be the key element of that work. The internal
 * changes now in place are the limit of what can be done short of
 * interface repair. They provide a great expansion of the histories
 * over which bytearray values can be useful in the meanwhile.
 */

static const Tcl_ObjType properByteArrayType = {
    "bytearray",
    FreeByteArrayInternalRep,
    DupByteArrayInternalRep,
    UpdateStringOfByteArray,
︙
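The encoding argument made in the comment can be illustrated outside Tcl with a minimal standalone sketch. It is not the code in tclBinary.c and uses no Tcl API: a "string" is modelled here as a plain array of codepoints, and the names `strict_to_bytes`, `masking_to_bytes`, and `bytes_equal` are hypothetical helpers invented for this illustration. The strict conversion rejects any codepoint above 255, as the comment argues a bytearray type should; the masking conversion mimics the legacy behaviour of discarding the high bits, which is what lets several distinct strings collapse onto one byte sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Strict conversion (hypothetical helper): succeeds only when every
 * codepoint lies in [0, 255]. Returns 0 on success, -1 when the input
 * does not encode any bytearray. On failure, out may be partially written.
 */
int strict_to_bytes(const uint32_t *codepoints, size_t n, unsigned char *out)
{
    assert(codepoints != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (codepoints[i] > 255) {
            return -1;          /* outside the byte range: reject */
        }
        out[i] = (unsigned char) codepoints[i];
    }
    return 0;
}

/*
 * Masking conversion (hypothetical helper): never fails. The high bits of
 * each codepoint are masked away, so distinct inputs can yield equal bytes.
 */
void masking_to_bytes(const uint32_t *codepoints, size_t n, unsigned char *out)
{
    assert(codepoints != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = (unsigned char) (codepoints[i] & 0xFF);
    }
}

/* Returns 1 when the two byte sequences of length n are identical, else 0. */
int bytes_equal(const unsigned char *a, const unsigned char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            return 0;
        }
    }
    return 1;
}
```

For instance, the codepoint sequences {0x41, 0xFF, 0x00} and {0x141, 0xFF, 0x00} are different strings, yet `masking_to_bytes` maps both to the bytes 41 FF 00, while `strict_to_bytes` accepts the first and rejects the second. Under the masking rule, holding the bytes alone is therefore not enough to know which string value was the true one.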