Component bdlb_chartype
[Package bdlb]

Supply locale-independent version of <ctype.h> functionality.

Namespaces

namespace  bdlb

Detailed Description

Outline
Purpose:
Supply locale-independent version of <ctype.h> functionality.
Classes:
bdlb::CharType namespace for pure (read-only) procedures on characters
See also:
Component bdlb_string
Description:
This component defines a utility class, bdlb::CharType, that provides an efficient, locale-independent alternative to the standard functionality found in <ctype.h>. The following character categories are supported (note that ODIGIT, IDENT, ALUND, ALL, and NONE are extensions with no counterpart in <ctype.h>):
    ============================================================
    Category   Description
    --------   -------------------------------------------------
    UPPER      [A-Z]
    LOWER      [a-z]
    ALPHA      [A-Za-z]
    ODIGIT     [0-7]
    DIGIT      [0-9]
    XDIGIT     [0-9A-Fa-f]
    ALNUM      [0-9A-Za-z]
    SPACE      [space|tab|CR|NL|VT|FF]
    PRINT      any printable character including SPACE
    GRAPH      any printable character except SPACE
    PUNCT      any printable character except SPACE or ALNUM
    CNTRL      [\0-\37] and \177 (in standard ASCII, see below)
    ASCII      [\0-\177]
    IDENT      [ALNUM|_]
    ALUND      [ALPHA|_]
    ALL        any 8-bit value
    NONE       []
    ============================================================
Supported functionality includes determining whether a character is a member of a given character category, and providing a null-terminated, contiguous sequence (and character count) of the members of each category. The standard conversion methods toUpper and toLower are also provided.
Note that this component assumes the ASCII character set with standard encodings, which is sufficient for all currently supported platforms.
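To make the locale independence concrete, here is a minimal sketch of how a few of the categories above reduce to fixed range checks under the ASCII assumption. The sketch namespace and every function in it are hypothetical stand-ins written for this illustration; they are not the bdlb::CharType implementation, which this document does not show.

```cpp
#include <cassert>

namespace sketch {  // hypothetical stand-ins, not part of bdlb

// ODIGIT: [0-7]
inline bool isOdigit(char c) { return c >= '0' && c <= '7'; }

// XDIGIT: [0-9A-Fa-f]
inline bool isXdigit(char c)
{
    return (c >= '0' && c <= '9')
        || (c >= 'A' && c <= 'F')
        || (c >= 'a' && c <= 'f');
}

// SPACE: space, tab, NL, VT, FF, CR (code 32 and codes 9 through 13)
inline bool isSpace(char c) { return c == ' ' || (c >= '\t' && c <= '\r'); }

// Map [a-z] to [A-Z]; return every other character unchanged.
inline char toUpper(char c)
{
    return (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c;
}

// Map [A-Z] to [a-z]; return every other character unchanged.
inline char toLower(char c)
{
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Spot checks against the category table.
inline void demo()
{
    assert( isOdigit('7'));
    assert(!isOdigit('8'));
    assert( isXdigit('f'));
    assert(!isXdigit('g'));
    assert( isSpace('\v'));
    assert(!isSpace('\b'));
    assert(toUpper('q') == 'Q');
    assert(toLower('Q') == 'q');
    assert(toUpper('_') == '_');
}

}  // close namespace sketch
```

Because none of these checks consults the global locale, the results do not change when the program calls setlocale.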
ASCII Character Set:
The following table provides a reference for the ASCII character set:
      Decimal      Hexadecimal    Key        Meaning
      -------      -----------    ---        -------
      0            0x00           ^@         NULL
      1            0x01           ^A         Start Heading
      2            0x02           ^B         Start Text
      3            0x03           ^C         End Text
      4            0x04           ^D         End of transmission
      5            0x05           ^E         Enquiry
      6            0x06           ^F         Acknowledge
      7            0x07           ^G         Bell
      8            0x08           ^H         Backspace
      9            0x09           ^I         Horizontal Tab
      10           0x0A           ^J         Newline (Linefeed)
      11           0x0B           ^K         Vertical Tab
      12           0x0C           ^L         Form Feed
      13           0x0D           ^M         Carriage Return
      14           0x0E           ^N         Shift Out
      15           0x0F           ^O         Shift In
      16           0x10           ^P         Data Link Escape
      17           0x11           ^Q         Device Control 1
      18           0x12           ^R         Device Control 2
      19           0x13           ^S         Device Control 3
      20           0x14           ^T         Device Control 4
      21           0x15           ^U         Negative Acknowledgement
      22           0x16           ^V         Synchronous Idle
      23           0x17           ^W         End of transmission Block
      24           0x18           ^X         Cancel
      25           0x19           ^Y         End of Medium
      26           0x1A           ^Z         Substitute
      27           0x1B           ^[         Escape
      28           0x1C           ^\         File Separator
      29           0x1D           ^]         Group Separator
      30           0x1E           ^^         Record Separator
      31           0x1F           ^_         Unit Separator
      32           0x20           (space)
      33           0x21           !
      34           0x22           "
      35           0x23           #
      36           0x24           $
      37           0x25           %
      38           0x26           &
      39           0x27           '
      40           0x28           (
      41           0x29           )
      42           0x2A           *
      43           0x2B           +
      44           0x2C           ,
      45           0x2D           -
      46           0x2E           .
      47           0x2F           /
      48-57        0x30-0x39      0-9
      58           0x3A           :
      59           0x3B           ;
      60           0x3C           <
      61           0x3D           =
      62           0x3E           >
      63           0x3F           ?
      64           0x40           @
      65-90        0x41-0x5A      A-Z
      91           0x5B           [
      92           0x5C           \          backslash
      93           0x5D           ]
      94           0x5E           ^
      95           0x5F           _
      96           0x60           `
      97-122       0x61-0x7A      a-z
      123          0x7B           {
      124          0x7C           |
      125          0x7D           }
      126          0x7E           ~
      127          0x7F           ^?         Delete (Rubout)
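The Key column uses caret notation, in which ^X names the code obtained by flipping bit 0x40 of the character X. A small sketch of that mapping follows; caretToCode is a hypothetical helper written for this illustration and is not part of bdlb.

```cpp
#include <cassert>

// Return the ASCII code that the caret notation '^key' denotes.  The result
// is meaningful only for 'key' in the range ['@' .. '_'] and for '?'.
inline int caretToCode(char key)
{
    return key ^ 0x40;  // flip bit 0x40
}

// Spot checks against the reference table.
inline void caretDemo()
{
    assert(caretToCode('@')  ==   0);  // ^@  NULL
    assert(caretToCode('I')  ==   9);  // ^I  Horizontal Tab
    assert(caretToCode('[')  ==  27);  // ^[  Escape
    assert(caretToCode('\\') ==  28);  // ^\  File Separator
    assert(caretToCode('?')  == 127);  // ^?  Delete (Rubout)
}
```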
Category Definitions:
The following table defines the members of each category:
                 UPPER
                 :  LOWER
                 :  :  ALPHA
                 :  :  :  ODIGIT
                 :  :  :  :  DIGIT
                 :  :  :  :  :  XDIGIT
                 :  :  :  :  :  :  ALNUM
                 :  :  :  :  :  :  :  SPACE
                 :  :  :  :  :  :  :  :  PRINT
                 :  :  :  :  :  :  :  :  :  GRAPH
                 :  :  :  :  :  :  :  :  :  :  PUNCT
                 :  :  :  :  :  :  :  :  :  :  :  CNTRL
                 :  :  :  :  :  :  :  :  :  :  :  :  ASCII
                 :  :  :  :  :  :  :  :  :  :  :  :  :  IDENT
                 :  :  :  :  :  :  :  :  :  :  :  :  :  :  ALUND
                 :  :  :  :  :  :  :  :  :  :  :  :  :  :  :  ALL
                 :  :  :  :  :  :  :  :  :  :  :  :  :  :  :  :  NONE
    Dec   Hex    :  :  :  :  :  :  :  :  :  :  :  :  :  :  :  :  :       Char
    ---   ---    -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -       ----
      0     0    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^@
      1     1    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^A
      2     2    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^B
      3     3    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^C
      4     4    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^D
      5     5    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^E
      6     6    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^F
      7     7    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^G
      8     8    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^H
      9     9    _  _  _  _  _  _  _  S  _  _  _  C  A  _  _  A  _        ^I
     10     A    _  _  _  _  _  _  _  S  _  _  _  C  A  _  _  A  _        ^J
     11     B    _  _  _  _  _  _  _  S  _  _  _  C  A  _  _  A  _        ^K
     12     C    _  _  _  _  _  _  _  S  _  _  _  C  A  _  _  A  _        ^L
     13     D    _  _  _  _  _  _  _  S  _  _  _  C  A  _  _  A  _        ^M
     14     E    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^N
     15     F    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^O
     16    10    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^P
     17    11    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^Q
     18    12    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^R
     19    13    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^S
     20    14    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^T
     21    15    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^U
     22    16    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^V
     23    17    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^W
     24    18    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^X
     25    19    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^Y
     26    1A    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^Z
     27    1B    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^[
     28    1C    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^\
     29    1D    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^]
     30    1E    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^^
     31    1F    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^_
     32    20    _  _  _  _  _  _  _  S  P  _  _  _  A  _  _  A  _
     33    21    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        !
     34    22    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        "
     35    23    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        #
     36    24    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        $
     37    25    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        %
     38    26    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        &
     39    27    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        '
     40    28    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        (
     41    29    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        )
     42    2A    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        *
     43    2B    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        +
     44    2C    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        ,
     45    2D    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        -
     46    2E    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        .
     47    2F    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        /
     48    30    _  _  _  O  D  X  A  _  P  G  _  _  A  I  _  A  _        0
     49    31    _  _  _  O  D  X  A  _  P  G  _  _  A  I  _  A  _        1
     50    32    _  _  _  O  D  X  A  _  P  G  _  _  A  I  _  A  _        2
     51    33    _  _  _  O  D  X  A  _  P  G  _  _  A  I  _  A  _        3
     52    34    _  _  _  O  D  X  A  _  P  G  _  _  A  I  _  A  _        4
     53    35    _  _  _  O  D  X  A  _  P  G  _  _  A  I  _  A  _        5
     54    36    _  _  _  O  D  X  A  _  P  G  _  _  A  I  _  A  _        6
     55    37    _  _  _  O  D  X  A  _  P  G  _  _  A  I  _  A  _        7
     56    38    _  _  _  _  D  X  A  _  P  G  _  _  A  I  _  A  _        8
     57    39    _  _  _  _  D  X  A  _  P  G  _  _  A  I  _  A  _        9
     58    3A    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        :
     59    3B    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        ;
     60    3C    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        <
     61    3D    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        =
     62    3E    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        >
     63    3F    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        ?
     64    40    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        @
     65    41    U  _  A  _  _  X  A  _  P  G  _  _  A  I  A  A  _        A
     66    42    U  _  A  _  _  X  A  _  P  G  _  _  A  I  A  A  _        B
     67    43    U  _  A  _  _  X  A  _  P  G  _  _  A  I  A  A  _        C
     68    44    U  _  A  _  _  X  A  _  P  G  _  _  A  I  A  A  _        D
     69    45    U  _  A  _  _  X  A  _  P  G  _  _  A  I  A  A  _        E
     70    46    U  _  A  _  _  X  A  _  P  G  _  _  A  I  A  A  _        F
     71    47    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        G
     72    48    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        H
     73    49    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        I
     74    4A    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        J
     75    4B    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        K
     76    4C    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        L
     77    4D    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        M
     78    4E    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        N
     79    4F    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        O
     80    50    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        P
     81    51    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        Q
     82    52    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        R
     83    53    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        S
     84    54    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        T
     85    55    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        U
     86    56    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        V
     87    57    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        W
     88    58    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        X
     89    59    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        Y
     90    5A    U  _  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        Z
     91    5B    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        [
     92    5C    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _       '\'
     93    5D    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        ]
     94    5E    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        ^
     95    5F    _  _  _  _  _  _  _  _  P  G  P  _  A  I  A  A  _        _
     96    60    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        `
     97    61    _  L  A  _  _  X  A  _  P  G  _  _  A  I  A  A  _        a
     98    62    _  L  A  _  _  X  A  _  P  G  _  _  A  I  A  A  _        b
     99    63    _  L  A  _  _  X  A  _  P  G  _  _  A  I  A  A  _        c
    100    64    _  L  A  _  _  X  A  _  P  G  _  _  A  I  A  A  _        d
    101    65    _  L  A  _  _  X  A  _  P  G  _  _  A  I  A  A  _        e
    102    66    _  L  A  _  _  X  A  _  P  G  _  _  A  I  A  A  _        f
    103    67    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        g
    104    68    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        h
    105    69    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        i
    106    6A    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        j
    107    6B    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        k
    108    6C    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        l
    109    6D    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        m
    110    6E    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        n
    111    6F    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        o
    112    70    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        p
    113    71    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        q
    114    72    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        r
    115    73    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        s
    116    74    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        t
    117    75    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        u
    118    76    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        v
    119    77    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        w
    120    78    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        x
    121    79    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        y
    122    7A    _  L  A  _  _  _  A  _  P  G  _  _  A  I  A  A  _        z
    123    7B    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        {
    124    7C    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        |
    125    7D    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        }
    126    7E    _  _  _  _  _  _  _  _  P  G  P  _  A  _  _  A  _        ~
    127    7F    _  _  _  _  _  _  _  _  _  _  _  C  A  _  _  A  _        ^?
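One way to cross-check the table is to recompute each column from the range definitions and count its members. The sketch below does that with one bit per category. The enumerator names, categoryMask, and countMembers are all hypothetical, chosen for this illustration rather than taken from bdlb. The sketch also assumes that values 128 through 255 belong to ALL only, which follows the category summary but is not shown in the table, since the table stops at 127.

```cpp
#include <cassert>

namespace sketch {  // hypothetical names, not part of bdlb

enum Category {
    k_UPPER  = 1 << 0,
    k_LOWER  = 1 << 1,
    k_ALPHA  = 1 << 2,
    k_ODIGIT = 1 << 3,
    k_DIGIT  = 1 << 4,
    k_XDIGIT = 1 << 5,
    k_ALNUM  = 1 << 6,
    k_SPACE  = 1 << 7,
    k_PRINT  = 1 << 8,
    k_GRAPH  = 1 << 9,
    k_PUNCT  = 1 << 10,
    k_CNTRL  = 1 << 11,
    k_ASCII  = 1 << 12,
    k_IDENT  = 1 << 13,
    k_ALUND  = 1 << 14,
    k_ALL    = 1 << 15,
    k_NONE   = 0          // no character is ever a member
};

// Return the set of categories containing the character code 'c', which
// must be in the range [0 .. 255].
inline unsigned categoryMask(int c)
{
    unsigned m = k_ALL;

    if (c >= 'A' && c <= 'Z')      { m |= k_UPPER | k_ALPHA; }
    if (c >= 'a' && c <= 'z')      { m |= k_LOWER | k_ALPHA; }
    if (c >= '0' && c <= '7')      { m |= k_ODIGIT; }
    if (c >= '0' && c <= '9')      { m |= k_DIGIT | k_XDIGIT; }
    if ((c >= 'A' && c <= 'F') || (c >= 'a' && c <= 'f')) { m |= k_XDIGIT; }
    if (c == 32 || (c >= 9 && c <= 13)) { m |= k_SPACE; }
    if (c >= 32 && c <= 126)       { m |= k_PRINT; }
    if (c >= 33 && c <= 126)       { m |= k_GRAPH; }
    if (c <= 31 || c == 127)       { m |= k_CNTRL; }
    if (c <= 127)                  { m |= k_ASCII; }

    // Categories defined in terms of the ones above.
    if (m & (k_ALPHA | k_DIGIT))              { m |= k_ALNUM; }
    if ((m & k_GRAPH) && !(m & k_ALNUM))      { m |= k_PUNCT; }
    if ((m & k_ALPHA) || c == '_')            { m |= k_ALUND; }
    if ((m & k_ALNUM) || c == '_')            { m |= k_IDENT; }

    return m;
}

// Return the number of 8-bit values that belong to 'category'.
inline int countMembers(unsigned category)
{
    int n = 0;
    for (int c = 0; c < 256; ++c) {
        if (categoryMask(c) & category) {
            ++n;
        }
    }
    return n;
}

// Column totals that can be read off the table by hand.
inline void tableDemo()
{
    assert(countMembers(k_XDIGIT) ==  22);
    assert(countMembers(k_SPACE)  ==   6);
    assert(countMembers(k_PRINT)  ==  95);
    assert(countMembers(k_PUNCT)  ==  32);
    assert(countMembers(k_CNTRL)  ==  33);
    assert(countMembers(k_IDENT)  ==  63);
    assert(countMembers(k_ALUND)  ==  53);
    assert(countMembers(k_ALL)    == 256);
    assert(countMembers(k_NONE)   ==   0);
}

}  // close namespace sketch
```

Note that the underscore (95) is the one character that is in PUNCT, IDENT, and ALUND at the same time, which matches its row in the table.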
Usage:
This section illustrates intended use of this component.
Example 1: Validating C-Style Identifiers:
The character category extensions IDENT and ALUND are particularly useful for parsing C-style identifier names as described by the following regular expression:
  [A-Za-z_]([A-Za-z0-9_])*
The first character is required and must be in category ALUND. All subsequent characters are optional and must be in category IDENT:
  bool isIdentifier(const char *token)
      // Return 'true' if the specified 'token' conforms to the requirements
      // of a C-style identifier, and 'false' otherwise.
  {
      assert(token);

      if (!bdlb::CharType::isAlund(*token)) {
          return false; // bad required first character             // RETURN
      }

      for (const char *p = token + 1; *p; ++p) {
          if (!bdlb::CharType::isIdent(*p)) {
              return false; // bad optional subsequent character    // RETURN
          }
      }

      return true;
  }
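To see how this check behaves on a few inputs without linking against BDE, the sketch below pairs the same logic with local stand-ins for the two predicates. The standin namespace and the name isIdentifierSketch are hypothetical and exist only so that the example compiles on its own; real code would call bdlb::CharType::isAlund and bdlb::CharType::isIdent as in the example above.

```cpp
#include <cassert>

namespace standin {  // hypothetical replacements for bdlb::CharType methods

// ALUND: [A-Za-z_]
inline bool isAlund(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
}

// IDENT: [A-Za-z0-9_]
inline bool isIdent(char c)
{
    return isAlund(c) || (c >= '0' && c <= '9');
}

}  // close namespace standin

// Return 'true' if the specified 'token' is a C-style identifier, and
// 'false' otherwise.
inline bool isIdentifierSketch(const char *token)
{
    assert(token);

    if (!standin::isAlund(*token)) {
        return false;  // bad first character; also rejects ""
    }

    for (const char *p = token + 1; *p; ++p) {
        if (!standin::isIdent(*p)) {
            return false;  // bad subsequent character
        }
    }

    return true;
}

inline void identifierDemo()
{
    assert( isIdentifierSketch("foo_bar"));
    assert( isIdentifierSketch("_x1"));
    assert(!isIdentifierSketch("9lives"));  // leading digit is not in ALUND
    assert(!isIdentifierSketch("a-b"));     // '-' is not in IDENT
    assert(!isIdentifierSketch(""));        // '\0' is not in ALUND
}
```

The empty string is rejected because its first character is the terminating null, which is not a member of ALUND.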