Component bdlde_hexencoder
[Package bdlde]

Provide automata converting to hex encodings.

Namespaces

namespace  bdlde

Detailed Description

Outline
Purpose:
Provide automata converting to hex encodings.
Classes:
bdlde::HexEncoder automaton for hex encoding
See also:
Component bdlde_hexdecoder
Description:
This component provides a class, bdlde::HexEncoder, for encoding plain text into its hexadecimal representation.
bdlde::HexEncoder and bdlde::HexDecoder provide a pair of template functions (each parameterized separately on both input and output iterators) that can be used respectively to encode and to decode byte sequences of arbitrary length into and from the printable Hex representation.
Each instance of either the encoder or decoder retains the state of the conversion from one supplied input to the next, enabling the processing of segmented input -- i.e., processing resumes where it left off with the next invocation on new input. Instance methods are provided for both the encoder and decoder to (1) assert the end of input, (2) determine whether the input so far is currently acceptable, and (3) indicate whether a non-recoverable error has occurred.
Hex Encoding:
The data stream is processed one byte at a time from left to right. Each byte
      7 6 5 4 3 2 1 0
     +-+-+-+-+-+-+-+-+
     |               |
     +-+-+-+-+-+-+-+-+
      `------v------'
            Byte
is segmented into two intermediate 4-bit quantities.
      3 2 1 0 3 2 1 0
     +-+-+-+-+-+-+-+-+
     |       |       |
     +-+-+-+-+-+-+-+-+
      `--v--' `--v--'
       char0   char1
Each 4-bit quantity is in turn used as an index into the following character table to generate an 8-bit character.
     =================
  Hex Alphabet *
     -----------------
     Val Enc  Val Enc
     --- ---  --- ---
       0 '0'    8 '8'
       1 '1'    9 '9'
       2 '2'   10 'A'
       3 '3'   11 'B'
       4 '4'   12 'C'
       5 '5'   13 'D'
       6 '6'   14 'E'
       7 '7'   15 'F'
     =================
Depending on its settings, the encoder represents the values 10 through 15 as either uppercase ('A'-'F') or lowercase ('a'-'f') letters.
Input values of increasing length along with their corresponding Hex encodings are illustrated below:
        Data: /* nothing */
    Encoding: /* nothing */

        Data: "0"     (0011 0000)
    Encoding: 30

        Data: "01"    (0011 0000 0011 0001)
    Encoding: 3031

        Data: "01A"   (0011 0000 0011 0001 0100 0001)
    Encoding: 303141

        Data: "01A?"  (0011 0000 0011 0001 0100 0001 0011 1111)
    Encoding: 3031413F
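The encoding steps above (split each byte into two 4-bit values, then look each value up in the alphabet) can be sketched in a few lines. The function hexEncodeSketch and its upperCase parameter are hypothetical names chosen for this illustration; this is a whole-buffer simplification and not the interface of bdlde::HexEncoder.

```cpp
#include <string>

// Return the hex representation of the specified 'data'.  Use uppercase
// letters for the values 10-15 if 'upperCase' is 'true', and lowercase
// letters otherwise.  Hypothetical helper, for illustration only.
std::string hexEncodeSketch(const std::string& data, bool upperCase = true)
{
    static const char UPPER[] = "0123456789ABCDEF";
    static const char LOWER[] = "0123456789abcdef";
    const char *alphabet = upperCase ? UPPER : LOWER;

    std::string encoded;
    encoded.reserve(data.size() * 2);  // two characters per input byte

    for (std::string::size_type i = 0; i < data.size(); ++i) {
        const unsigned char byte = static_cast<unsigned char>(data[i]);
        encoded += alphabet[byte >> 4];    // high 4 bits -> first character
        encoded += alphabet[byte & 0x0F];  // low 4 bits  -> second character
    }
    return encoded;
}
```

With this sketch, the input "01A?" yields "3031413F", matching the last row of the table above; passing false for upperCase yields "3031413f".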
Hex Decoding:
The data stream is processed two bytes at a time from left to right. Each sequence of two 8-bit quantities
      7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |               |               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      `------v------' `------v------'
           Byte0           Byte1
is segmented into four intermediate 4-bit quantities.
      3 2 1 0 3 2 1 0 3 2 1 0 3 2 1 0
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |       |       |       |       |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      `--v--' `--v--' `--v--' `--v--'
      chunk0   chunk1  chunk2  chunk3
The second and fourth chunks are combined to get the resulting 8-bit character.
Whitespace characters are ignored. On any non-alphabet character the decoder reports an error. In order for a Hex encoding to be valid the length of the input data (excluding any whitespace characters) must be a multiple of two.
Input values of increasing length, along with the output the decoder produces for each, are illustrated below (note that the encoded whitespace character is skipped and the resulting string does not contain it):
        Data: /* nothing */
    Encoding: /* nothing */

        Data: "4"       (0000 0100)
    Encoding: /* nothing */

        Data: "41"      (0000 0100 0000 0001)
    Encoding: A

        Data: "412"     (0000 0100 0000 0001 0000 0010)
    Encoding: A

        Data: "4120"    (0000 0100 0000 0001 0000 0010 0000 0000)
    Encoding: A

        Data: "41203"   (0000 0100 0000 0001 0000 0010 0000 0000
                         0000 0011)
    Encoding: A

        Data: "41203F"  (0000 0100 0000 0001 0000 0010 0000 0000
                         0000 0011 0000 1111)
    Encoding: A?
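The validity rules stated above (whitespace in the input is ignored, any other non-alphabet character is an error, and the number of hex characters must be even) can be sketched as follows. The names hexValue and hexDecodeSketch, the return codes, and the acceptance of lowercase letters are assumptions made for this illustration, not the behavior or interface of bdlde::HexDecoder. The sketch skips only literal whitespace in its input and makes no attempt to filter decoded bytes.

```cpp
#include <cctype>
#include <string>

// Return the 4-bit value of the specified hex character 'c', or -1 if 'c'
// is not in the hex alphabet.  Hypothetical helper, for illustration only.
int hexValue(char c)
{
    if (c >= '0' && c <= '9') { return c - '0'; }
    if (c >= 'A' && c <= 'F') { return c - 'A' + 10; }
    if (c >= 'a' && c <= 'f') { return c - 'a' + 10; }
    return -1;
}

// Decode the specified hex 'input' into 'result'.  Return 0 on success, -1
// if a non-alphabet, non-whitespace character is found, and -2 if the
// number of hex characters is odd (assumed return codes).
int hexDecodeSketch(std::string *result, const std::string& input)
{
    result->clear();
    int high = -1;  // value of the first character of a pair, or -1 if none

    for (std::string::size_type i = 0; i < input.size(); ++i) {
        const char c = input[i];
        if (std::isspace(static_cast<unsigned char>(c))) {
            continue;  // whitespace in the input is ignored
        }
        const int value = hexValue(c);
        if (value < 0) {
            return -1;  // non-alphabet character
        }
        if (high < 0) {
            high = value;  // remember the first character of the pair
        }
        else {
            *result += static_cast<char>((high << 4) | value);
            high = -1;
        }
    }
    return high < 0 ? 0 : -2;  // a dangling character means odd length
}
```

For example, both "413F" and "41 3f" decode to "A?" in this sketch, while "4" alone is rejected because it leaves a pair incomplete.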
Usage:
This section illustrates intended use of this component.
Example 1: Basic Usage of bdlde::HexEncoder:
The following example shows how to use a bdlde::HexEncoder object to implement a function, streamEncoder, that reads text from a bsl::istream, encodes that text into its hex representation, and writes the encoded text to a bsl::ostream. streamEncoder returns 0 on success and a negative value if the input data could not be successfully encoded or if there is an I/O error.
  int streamEncoder(bsl::ostream& os, bsl::istream& is)
      // Read the entire contents of the specified input stream 'is', convert
      // the input plain text to hex representation, and write the encoded
      // text to the specified output stream 'os'.  Return 0 on success, and
      // a negative value otherwise.
  {
      enum {
          SUCCESS      =  0,
          ENCODE_ERROR = -1,
          IO_ERROR     = -2
      };
First, we create an encoder object, create buffers for storing data, and start a loop that runs while the input stream contains some data:
      bdlde::HexEncoder converter;

      const int INBUFFER_SIZE  = 1 << 10;
      const int OUTBUFFER_SIZE = 1 << 10;

      char inputBuffer[INBUFFER_SIZE];
      char outputBuffer[OUTBUFFER_SIZE];

      char *output    = outputBuffer;
      char *outputEnd = outputBuffer + sizeof outputBuffer;

      while (is.good()) {  // input stream not exhausted
On each iteration we read some data from the input stream:
          is.read(inputBuffer, sizeof inputBuffer);

          const char *input    = inputBuffer;
          const char *inputEnd = input + is.gcount();

          while (input < inputEnd) { // input encoding not complete

              int numOut;
              int numIn;
Then, we convert the text just read using bdlde::HexEncoder:
              int status = converter.convert(
                                       output,
                                       &numOut,
                                       &numIn,
                                       input,
                                       inputEnd,
                                       static_cast<int>(outputEnd - output));
              if (status < 0) {
                  return ENCODE_ERROR;                              // RETURN
              }

              output += numOut;
              input  += numIn;
And we write the encoded text to the output stream whenever the output buffer fills up:
              if (output == outputEnd) {  // output buffer full; write data
                  os.write(outputBuffer, sizeof outputBuffer);
                  if (os.fail()) {
                      return IO_ERROR;                              // RETURN
                  }
                  output = outputBuffer;
              }
          }
      }

      while (1) {
          int numOut = 0;
Then, we store any character that has not yet been written (if there is one) in the output buffer and complete the work of our encoder:
          int more = converter.endConvert(
                                      output,
                                      &numOut,
                                      static_cast<int>(outputEnd - output));
          if (more < 0) {
              return ENCODE_ERROR;                                  // RETURN
          }

          output += numOut;

          if (!more) { // no more output
              break;
          }

          assert(output == outputEnd);  // output buffer is full

          os.write(outputBuffer, sizeof outputBuffer);  // write buffer
          if (os.fail()) {
              return IO_ERROR;                                      // RETURN
          }
          output = outputBuffer;
      }

      if (output > outputBuffer) {
          os.write(outputBuffer, output - outputBuffer);
      }

      return is.eof() && os.good() ? SUCCESS : IO_ERROR;
  }
Next, to demonstrate how our function works, we need to create a stream with data to encode. Assume that we have a character buffer, BLOOMBERG_NEWS, and a function, streamDecoder, that mirrors the work of streamEncoder:
  bsl::istringstream inStream(bsl::string(BLOOMBERG_NEWS,
                                          sizeof(BLOOMBERG_NEWS)));
  bsl::stringstream  outStream;
  bsl::stringstream  backInStream;
Then, we use our function to encode text:
  assert(0 == streamEncoder(outStream,    inStream));
Now, we decode this text back using the mirror function:
  assert(0 == streamDecoder(backInStream, outStream));
Finally, we observe that the output fully matches the original text:
  assert(0 == strcmp(BLOOMBERG_NEWS, backInStream.str().c_str()));