Supply locale-independent version of <ctype.h>
functionality.
Detailed Description
- Purpose:
- Supply locale-independent version of
<ctype.h>
functionality.
-
- Classes:
bdlb::CharType: namespace for pure (read-only) procedures on characters
- See also:
- Component bdlb_string
-
- Description:
- This component defines a utility class, bdlb::CharType, that provides an efficient, locale-independent alternative to the standard functionality found in <ctype.h>. The following character categories are supported (note that ODIGIT, IDENT, ALUND, ALL, and NONE are new):
============================================================
Category  Description
--------  -------------------------------------------------
UPPER     [A-Z]
LOWER     [a-z]
ALPHA     [A-Za-z]
ODIGIT    [0-7]
DIGIT     [0-9]
XDIGIT    [0-9A-Fa-f]
ALNUM     [0-9A-Za-z]
SPACE     [space|tab|CR|NL|VT|FF]
PRINT     any printable character including SPACE
GRAPH     any printable character except SPACE
PUNCT     any printable character except SPACE or ALNUM
CNTRL     [\0-\37] and \177 (in standard ASCII, see below)
ASCII     [\0-\177]
IDENT     [ALNUM|_]
ALUND     [ALPHA|_]
ALL       any 8-bit value
NONE      []
============================================================
Supported functionality includes determining whether a character is a member of a given bdlb::CharType category and providing a null-terminated, contiguous sequence (and character count) for each character category. The standard conversion methods toUpper and toLower are also provided.
- Note that this component assumes the ASCII character set with standard encodings, which is sufficient for all currently supported platforms.
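To illustrate what locale independence means in practice, here is a minimal sketch, assuming ASCII as stated above. The function names are hypothetical stand-ins chosen for illustration and are not the component's API; each one depends only on character code values and never consults the C locale.

```cpp
// Hypothetical helpers for illustration only (not part of bdlb::CharType).
// Every test is a plain range check on ASCII code values.
inline bool isUpperAscii(char c)  { return c >= 'A' && c <= 'Z'; }
inline bool isLowerAscii(char c)  { return c >= 'a' && c <= 'z'; }
inline bool isOdigitAscii(char c) { return c >= '0' && c <= '7'; }

// In ASCII, an upper-case letter and its lower-case counterpart differ by
// exactly 0x20, so case conversion is a fixed offset.
inline char toUpperAscii(char c)
{
    return isLowerAscii(c) ? static_cast<char>(c - 0x20) : c;
}

inline char toLowerAscii(char c)
{
    return isUpperAscii(c) ? static_cast<char>(c + 0x20) : c;
}
```

Because nothing here reads global locale state, the results are the same regardless of any call to setlocale elsewhere in the program.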
-
- ASCII Character Set:
- The following table provides a reference for the ASCII character set:
Decimal Hexadecimal Key Meaning
------- ----------- --- -------
0 0x00 ^@ NULL
1 0x01 ^A Start Heading
2 0x02 ^B Start Text
3 0x03 ^C End Text
4 0x04 ^D End of transmission
5 0x05 ^E Enquiry
6 0x06 ^F Acknowledge
7 0x07 ^G Bell
8 0x08 ^H Backspace
9 0x09 ^I Horizontal Tab
10 0x0A ^J Newline (Linefeed)
11 0x0B ^K Vertical Tab
12 0x0C ^L Form Feed
13 0x0D ^M Carriage Return
14 0x0E ^N Shift Out
15 0x0F ^O Shift In
16 0x10 ^P Data Link Escape
17 0x11 ^Q Device Control 1
18 0x12 ^R Device Control 2
19 0x13 ^S Device Control 3
20 0x14 ^T Device Control 4
21 0x15 ^U Negative Acknowledgement
22 0x16 ^V Synchronous Idle
23 0x17 ^W End of transmission Block
24 0x18 ^X Cancel
25 0x19 ^Y End of Medium
26 0x1A ^Z Substitute
27 0x1B ^[ Escape
28 0x1C ^\ File Separator
29 0x1D ^] Group Separator
30 0x1E ^^ Record Separator
31 0x1F ^_ Unit Separator
32 0x20 (space)
33 0x21 !
34 0x22 "
35 0x23 #
36 0x24 $
37 0x25 %
38 0x26 &
39 0x27 '
40 0x28 (
41 0x29 )
42 0x2A *
43 0x2B +
44 0x2C ,
45 0x2D -
46 0x2E .
47 0x2F /
48-57 0x30-0x39 0-9
58 0x3A :
59 0x3B ;
60 0x3C <
61 0x3D =
62 0x3E >
63 0x3F ?
64 0x40 @
65-90 0x41-0x5A A-Z
91 0x5B [
92 0x5C \ backslash
93 0x5D ]
94 0x5E ^
95 0x5F _
96 0x60 `
97-122 0x61-0x7A a-z
123 0x7B {
124 0x7C |
125 0x7D }
126 0x7E ~
127 0x7F ^? Delete (Rubout)
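The Key column for the control characters follows caret notation: the character shown after the caret is the control code with bit 0x40 flipped. A small sketch of that relationship follows; the function name is hypothetical and not part of the component.

```cpp
// Illustrative helper (hypothetical name): return the character that follows
// '^' in caret notation for the specified control 'code' (0-31 or 127).
inline char caretKey(unsigned char code)
{
    return static_cast<char>(code ^ 0x40);
}
```

For example, code 0 maps to '@', code 27 (Escape) maps to '[', and code 127 (Delete) maps to '?'.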
-
- Category Definitions:
- The following table defines the members of each category:
UPPER
: LOWER
: : ALPHA
: : : ODIGIT
: : : : DIGIT
: : : : : XDIGIT
: : : : : : ALNUM
: : : : : : : SPACE
: : : : : : : : PRINT
: : : : : : : : : GRAPH
: : : : : : : : : : PUNCT
: : : : : : : : : : : CNTRL
: : : : : : : : : : : : ASCII
: : : : : : : : : : : : : IDENT
: : : : : : : : : : : : : : ALUND
: : : : : : : : : : : : : : : ALL
: : : : : : : : : : : : : : : : NONE
Dec Hex : : : : : : : : : : : : : : : : : Char
--- --- - - - - - - - - - - - - - - - - - ----
0 0 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^@
1 1 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^A
2 2 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^B
3 3 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^C
4 4 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^D
5 5 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^E
6 6 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^F
7 7 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^G
8 8 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^H
9 9 _ _ _ _ _ _ _ S _ _ _ C A _ _ A _ ^I
10 A _ _ _ _ _ _ _ S _ _ _ C A _ _ A _ ^J
11 B _ _ _ _ _ _ _ S _ _ _ C A _ _ A _ ^K
12 C _ _ _ _ _ _ _ S _ _ _ C A _ _ A _ ^L
13 D _ _ _ _ _ _ _ S _ _ _ C A _ _ A _ ^M
14 E _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^N
15 F _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^O
16 10 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^P
17 11 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^Q
18 12 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^R
19 13 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^S
20 14 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^T
21 15 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^U
22 16 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^V
23 17 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^W
24 18 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^X
25 19 _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^Y
26 1A _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^Z
27 1B _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^[
28 1C _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^\
29 1D _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^]
30 1E _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^^
31 1F _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^_
32 20 _ _ _ _ _ _ _ S P _ _ _ A _ _ A _ (space)
33 21 _ _ _ _ _ _ _ _ P G P _ A _ _ A _ !
34 22 _ _ _ _ _ _ _ _ P G P _ A _ _ A _ "
35 23 _ _ _ _ _ _ _ _ P G P _ A _ _ A _ #
36 24 _ _ _ _ _ _ _ _ P G P _ A _ _ A _ $
37 25 _ _ _ _ _ _ _ _ P G P _ A _ _ A _ %
38 26 _ _ _ _ _ _ _ _ P G P _ A _ _ A _ &
39 27 _ _ _ _ _ _ _ _ P G P _ A _ _ A _ '
40 28 _ _ _ _ _ _ _ _ P G P _ A _ _ A _ (
41 29 _ _ _ _ _ _ _ _ P G P _ A _ _ A _ )
42 2A _ _ _ _ _ _ _ _ P G P _ A _ _ A _ *
43 2B _ _ _ _ _ _ _ _ P G P _ A _ _ A _ +
44 2C _ _ _ _ _ _ _ _ P G P _ A _ _ A _ ,
45 2D _ _ _ _ _ _ _ _ P G P _ A _ _ A _ -
46 2E _ _ _ _ _ _ _ _ P G P _ A _ _ A _ .
47 2F _ _ _ _ _ _ _ _ P G P _ A _ _ A _ /
48 30 _ _ _ O D X A _ P G _ _ A I _ A _ 0
49 31 _ _ _ O D X A _ P G _ _ A I _ A _ 1
50 32 _ _ _ O D X A _ P G _ _ A I _ A _ 2
51 33 _ _ _ O D X A _ P G _ _ A I _ A _ 3
52 34 _ _ _ O D X A _ P G _ _ A I _ A _ 4
53 35 _ _ _ O D X A _ P G _ _ A I _ A _ 5
54 36 _ _ _ O D X A _ P G _ _ A I _ A _ 6
55 37 _ _ _ O D X A _ P G _ _ A I _ A _ 7
56 38 _ _ _ _ D X A _ P G _ _ A I _ A _ 8
57 39 _ _ _ _ D X A _ P G _ _ A I _ A _ 9
58 3A _ _ _ _ _ _ _ _ P G P _ A _ _ A _ :
59 3B _ _ _ _ _ _ _ _ P G P _ A _ _ A _ ;
60 3C _ _ _ _ _ _ _ _ P G P _ A _ _ A _ <
61 3D _ _ _ _ _ _ _ _ P G P _ A _ _ A _ =
62 3E _ _ _ _ _ _ _ _ P G P _ A _ _ A _ >
63 3F _ _ _ _ _ _ _ _ P G P _ A _ _ A _ ?
64 40 _ _ _ _ _ _ _ _ P G P _ A _ _ A _ @
65 41 U _ A _ _ X A _ P G _ _ A I A A _ A
66 42 U _ A _ _ X A _ P G _ _ A I A A _ B
67 43 U _ A _ _ X A _ P G _ _ A I A A _ C
68 44 U _ A _ _ X A _ P G _ _ A I A A _ D
69 45 U _ A _ _ X A _ P G _ _ A I A A _ E
70 46 U _ A _ _ X A _ P G _ _ A I A A _ F
71 47 U _ A _ _ _ A _ P G _ _ A I A A _ G
72 48 U _ A _ _ _ A _ P G _ _ A I A A _ H
73 49 U _ A _ _ _ A _ P G _ _ A I A A _ I
74 4A U _ A _ _ _ A _ P G _ _ A I A A _ J
75 4B U _ A _ _ _ A _ P G _ _ A I A A _ K
76 4C U _ A _ _ _ A _ P G _ _ A I A A _ L
77 4D U _ A _ _ _ A _ P G _ _ A I A A _ M
78 4E U _ A _ _ _ A _ P G _ _ A I A A _ N
79 4F U _ A _ _ _ A _ P G _ _ A I A A _ O
80 50 U _ A _ _ _ A _ P G _ _ A I A A _ P
81 51 U _ A _ _ _ A _ P G _ _ A I A A _ Q
82 52 U _ A _ _ _ A _ P G _ _ A I A A _ R
83 53 U _ A _ _ _ A _ P G _ _ A I A A _ S
84 54 U _ A _ _ _ A _ P G _ _ A I A A _ T
85 55 U _ A _ _ _ A _ P G _ _ A I A A _ U
86 56 U _ A _ _ _ A _ P G _ _ A I A A _ V
87 57 U _ A _ _ _ A _ P G _ _ A I A A _ W
88 58 U _ A _ _ _ A _ P G _ _ A I A A _ X
89 59 U _ A _ _ _ A _ P G _ _ A I A A _ Y
90 5A U _ A _ _ _ A _ P G _ _ A I A A _ Z
91 5B _ _ _ _ _ _ _ _ P G P _ A _ _ A _ [
92 5C _ _ _ _ _ _ _ _ P G P _ A _ _ A _ '\'
93 5D _ _ _ _ _ _ _ _ P G P _ A _ _ A _ ]
94 5E _ _ _ _ _ _ _ _ P G P _ A _ _ A _ ^
95 5F _ _ _ _ _ _ _ _ P G P _ A I A A _ _
96 60 _ _ _ _ _ _ _ _ P G P _ A _ _ A _ `
97 61 _ L A _ _ X A _ P G _ _ A I A A _ a
98 62 _ L A _ _ X A _ P G _ _ A I A A _ b
99 63 _ L A _ _ X A _ P G _ _ A I A A _ c
100 64 _ L A _ _ X A _ P G _ _ A I A A _ d
101 65 _ L A _ _ X A _ P G _ _ A I A A _ e
102 66 _ L A _ _ X A _ P G _ _ A I A A _ f
103 67 _ L A _ _ _ A _ P G _ _ A I A A _ g
104 68 _ L A _ _ _ A _ P G _ _ A I A A _ h
105 69 _ L A _ _ _ A _ P G _ _ A I A A _ i
106 6A _ L A _ _ _ A _ P G _ _ A I A A _ j
107 6B _ L A _ _ _ A _ P G _ _ A I A A _ k
108 6C _ L A _ _ _ A _ P G _ _ A I A A _ l
109 6D _ L A _ _ _ A _ P G _ _ A I A A _ m
110 6E _ L A _ _ _ A _ P G _ _ A I A A _ n
111 6F _ L A _ _ _ A _ P G _ _ A I A A _ o
112 70 _ L A _ _ _ A _ P G _ _ A I A A _ p
113 71 _ L A _ _ _ A _ P G _ _ A I A A _ q
114 72 _ L A _ _ _ A _ P G _ _ A I A A _ r
115 73 _ L A _ _ _ A _ P G _ _ A I A A _ s
116 74 _ L A _ _ _ A _ P G _ _ A I A A _ t
117 75 _ L A _ _ _ A _ P G _ _ A I A A _ u
118 76 _ L A _ _ _ A _ P G _ _ A I A A _ v
119 77 _ L A _ _ _ A _ P G _ _ A I A A _ w
120 78 _ L A _ _ _ A _ P G _ _ A I A A _ x
121 79 _ L A _ _ _ A _ P G _ _ A I A A _ y
122 7A _ L A _ _ _ A _ P G _ _ A I A A _ z
123 7B _ _ _ _ _ _ _ _ P G P _ A _ _ A _ {
124 7C _ _ _ _ _ _ _ _ P G P _ A _ _ A _ |
125 7D _ _ _ _ _ _ _ _ P G P _ A _ _ A _ }
126 7E _ _ _ _ _ _ _ _ P G P _ A _ _ A _ ~
127 7F _ _ _ _ _ _ _ _ _ _ _ C A _ _ A _ ^?
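The table can be cross-checked by deriving some categories from others. The sketch below, assuming ASCII, builds the member sequence of a category by scanning all 256 byte values with a predicate; the predicate and function names are illustrative stand-ins, not the component's API.

```cpp
#include <string>

// Illustrative predicates (hypothetical names), written from the table above.
inline bool isAlnumAscii(unsigned char c)
{
    return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z')
        || (c >= 'a' && c <= 'z');
}

// GRAPH: codes 33 through 126.
inline bool isGraphAscii(unsigned char c) { return c > 0x20 && c < 0x7F; }

// PUNCT: GRAPH members that are not ALNUM.
inline bool isPunctAscii(unsigned char c)
{
    return isGraphAscii(c) && !isAlnumAscii(c);
}

// IDENT: ALNUM plus underscore.
inline bool isIdentAscii(unsigned char c)
{
    return isAlnumAscii(c) || c == '_';
}

// Return the members of the category described by 'pred', in ascending
// code order, by testing every 8-bit value.
template <class Pred>
std::string members(Pred pred)
{
    std::string result;
    for (int i = 0; i < 256; ++i) {
        if (pred(static_cast<unsigned char>(i))) {
            result += static_cast<char>(i);
        }
    }
    return result;
}
```

Counting the rows marked in the table gives 94 GRAPH members, 32 PUNCT members, and 63 IDENT members, which matches members(isGraphAscii).size(), members(isPunctAscii).size(), and members(isIdentAscii).size() respectively.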
-
- Usage:
- This section illustrates intended use of this component.
-
- Example 1: Validating C-Style Identifiers:
- The character category extensions IDENT and ALUND are particularly useful for parsing C-style identifier names, as described by the following regular expression: [A-Za-z_][A-Za-z0-9_]*
The first character is required and must be in category ALUND. All subsequent characters are optional and must be in category IDENT:
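As a standalone sketch of this rule (assuming ASCII, with local helper functions standing in for the component's ALUND and IDENT tests; the helper names are illustrative and not the component's API):

```cpp
// Local stand-in for the ALUND category: [A-Za-z_].
inline bool isAlundChar(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
}

// Local stand-in for the IDENT category: [A-Za-z0-9_].
inline bool isIdentChar(char c)
{
    return isAlundChar(c) || (c >= '0' && c <= '9');
}

// Return true if the specified null-terminated 'token' has the layout of a
// C-style identifier, and false otherwise.  'token' must not be null.
inline bool isIdentifier(const char *token)
{
    if (!isAlundChar(*token)) {
        return false;  // required first character is missing or invalid
    }
    for (const char *p = token + 1; *p; ++p) {
        if (!isIdentChar(*p)) {
            return false;  // invalid subsequent character
        }
    }
    return true;
}
```

With these definitions, "_foo1" and "x" are accepted, while "", "1abc", and "a-b" are rejected. The empty string fails because its first character is the terminating null, which is not in ALUND.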