This component-private class holds delimiter information. Each Tokenizer object
holds an object of this class as a private data member and passes the address of
that member to the (private) constructor of each TokenizerIterator object it issues:
+--------------------------------------+
|   ,--------------.                   |
|  ( Tokenizer_Data )                  |
|   `--------------'\                  |
|          |         \                 |
|          |     ,----*------------.   |
|          |    ( TokenizerIterator )  |
|          |    /`-----------------'   |
|          |   /                       |
|     ,----*--o-.                      |
|    ( Tokenizer )                     |
|     `---------'                      |
+--------------------------------------+
bdlb_tokenizer
See bdlb_tokenizer
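The ownership relationship described above can be illustrated with a minimal sketch. All names here (SketchData, SketchIterator, SketchTokenizer, data, dataAddress, sketchUsage) are hypothetical stand-ins, not the real bdlb classes or their interfaces. The sketch only shows the general pattern of a tokenizer that owns its delimiter data and hands the address of that data to an iterator whose constructor is private.

```cpp
#include <cassert>

// Hypothetical stand-in for the delimiter-information class.
class SketchData {
    // Delimiter information would live here.
};

class SketchTokenizer;

// Hypothetical stand-in for the iterator type.
class SketchIterator {
    const SketchData *d_data_p;  // delimiter data (held, not owned)

    friend class SketchTokenizer;

    // Private: only 'SketchTokenizer' can create iterators.
    explicit SketchIterator(const SketchData *data) : d_data_p(data) {}

  public:
    // Return the address of the delimiter data this iterator refers to.
    const SketchData *data() const { return d_data_p; }
};

// Hypothetical stand-in for the tokenizer type.
class SketchTokenizer {
    SketchData d_data;  // private data member, owned by the tokenizer

  public:
    // Issue an iterator that refers to this tokenizer's delimiter data.
    SketchIterator begin() const { return SketchIterator(&d_data); }

    // Return the address of the owned delimiter data (for illustration).
    const SketchData *dataAddress() const { return &d_data; }
};

// Every iterator issued by one tokenizer refers to the same data object.
inline void sketchUsage()
{
    SketchTokenizer tokenizer;
    SketchIterator  first  = tokenizer.begin();
    SketchIterator  second = tokenizer.begin();
    assert(first.data() == tokenizer.dataAddress());
    assert(first.data() == second.data());
}
```

Because the iterator stores only an address, the tokenizer in such a design would need to outlive every iterator it issues.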
Create a Tokenizer_Data object and load the d_charTypes data member such that it has the same value as if this (overly prescriptive) algorithm were used:
(I) initialize each entry in the d_charTypes array to a value indicating that the character having that index as its (e.g., ASCII) representation is a token character;
(II) then, for each character in the specified softDelimiters sequence, overwrite the element at the corresponding index in d_charTypes with a value that indicates that the character is a soft delimiter character;
(III) finally, for each character in the specified hardDelimiters sequence, overwrite the element at the corresponding index with a distinct value that indicates that the character is a hard delimiter character.
Note that duplicate delimiter characters in the respective inputs are naturally ignored, and that a character that appears in both sets would naturally be considered hard. Also note that it is entirely reasonable to state, in any public interface, that the behavior is undefined unless the characters in the union of the two delimiter sequences are unique.
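Steps (I) through (III) above can be sketched as follows. This is an illustration under stated assumptions, not the component's implementation: the type SketchTokenizerData, the enumerators e_TOKEN, e_SOFT, and e_HARD, the accessor charType, and the use of std::string for the two delimiter sequences are all hypothetical. Only the names d_charTypes, softDelimiters, and hardDelimiters come from the description.

```cpp
#include <climits>
#include <string>

// Hypothetical character classes; the real values are not documented here.
enum CharType { e_TOKEN = 0, e_SOFT = 1, e_HARD = 2 };

// Hypothetical stand-in for the delimiter-information class.
struct SketchTokenizerData {
    char d_charTypes[UCHAR_MAX + 1];  // one entry per possible 'char' value

    SketchTokenizerData(const std::string& softDelimiters,
                        const std::string& hardDelimiters)
    {
        // (I) Every character starts out as a token character.
        for (int i = 0; i <= UCHAR_MAX; ++i) {
            d_charTypes[i] = e_TOKEN;
        }

        // (II) Mark each soft delimiter; duplicates just rewrite the same
        // entry with the same value.
        for (std::string::size_type i = 0; i < softDelimiters.size(); ++i) {
            d_charTypes[static_cast<unsigned char>(softDelimiters[i])] =
                                                                       e_SOFT;
        }

        // (III) Mark each hard delimiter last, so a character that appears
        // in both sequences ends up hard.
        for (std::string::size_type i = 0; i < hardDelimiters.size(); ++i) {
            d_charTypes[static_cast<unsigned char>(hardDelimiters[i])] =
                                                                       e_HARD;
        }
    }

    // Return the class of the specified character 'c'.
    CharType charType(char c) const
    {
        return static_cast<CharType>(
                               d_charTypes[static_cast<unsigned char>(c)]);
    }
};
```

Indexing through unsigned char keeps the lookup within the array even on platforms where plain char is signed. With soft delimiters " \t" and hard delimiters ",\t", the tab character is classified as hard because step (III) runs after step (II).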