bdljsn::Tokenizer Class Reference

#include <bdljsn_tokenizer.h>

Public Types

enum  TokenType {
  e_BEGIN = 1, e_ELEMENT_NAME, e_START_OBJECT, e_END_OBJECT,
  e_START_ARRAY, e_END_ARRAY, e_ELEMENT_VALUE, e_ERROR,
  BAEJSN_BEGIN = e_BEGIN, BAEJSN_ELEMENT_NAME = e_ELEMENT_NAME, BAEJSN_START_OBJECT = e_START_OBJECT, BAEJSN_END_OBJECT = e_END_OBJECT,
  BAEJSN_START_ARRAY = e_START_ARRAY, BAEJSN_END_ARRAY = e_END_ARRAY, BAEJSN_ELEMENT_VALUE = e_ELEMENT_VALUE, BAEJSN_ERROR = e_ERROR
}
 

This enum lists all the possible token types.

enum  { k_EOF = +1 }
typedef bsls::Types::IntPtr IntPtr
typedef bsls::Types::Uint64 Uint64

Public Member Functions

 Tokenizer (bslma::Allocator *basicAllocator=0)
 ~Tokenizer ()
int advanceToNextToken ()
void reset (bsl::streambuf *streambuf)
int resetStreamBufGetPointer ()
void setAllowHeterogenousArrays (bool value)
void setAllowNonUtf8StringLiterals (bool value)
void setAllowStandAloneValues (bool value)
void setAllowTrailingTopLevelComma (bool value)
bool allowHeterogenousArrays () const
bool allowNonUtf8StringLiterals () const
bool allowStandAloneValues () const
bool allowTrailingTopLevelComma () const
bsls::Types::Uint64 currentPosition () const
bsls::Types::Uint64 readOffset () const
int readStatus () const
TokenType tokenType () const
int value (bsl::string_view *data) const

Detailed Description

This class provides a mechanism for traversing JSON data stored in a bsl::streambuf one node at a time and allows clients to access the data associated with that node, including its type and data value.

See Component bdljsn_tokenizer


Member Typedef Documentation


Member Enumeration Documentation

Enumerator:
e_BEGIN 

starting token

e_ELEMENT_NAME 

element name

e_START_OBJECT 

start of an object ({)

e_END_OBJECT 

end of an object (})

e_START_ARRAY 

start of an array ([)

e_END_ARRAY 

end of an array (])

e_ELEMENT_VALUE 

element value of a simple type

e_ERROR 

error token

BAEJSN_BEGIN 
BAEJSN_ELEMENT_NAME 
BAEJSN_START_OBJECT 
BAEJSN_END_OBJECT 
BAEJSN_START_ARRAY 
BAEJSN_END_ARRAY 
BAEJSN_ELEMENT_VALUE 
BAEJSN_ERROR 

anonymous enum

Enumerator:
k_EOF 

Constructor & Destructor Documentation

bdljsn::Tokenizer::Tokenizer ( bslma::Allocator *  basicAllocator = 0  )  [explicit]

Create a Tokenizer object. Optionally specify a basicAllocator used to supply memory. If basicAllocator is 0, the currently installed default allocator is used.

bdljsn::Tokenizer::~Tokenizer (  ) 

Destroy this object.


Member Function Documentation

int bdljsn::Tokenizer::advanceToNextToken (  ) 

Move to the next token in the data stream. Return 0 on success and a non-zero value otherwise. Each call to advanceToNextToken invalidates the string references returned by the value accessor for prior nodes. Note that on malformed JSON, this function may, but will not always, return a non-zero value before the end of the token stream is reached.
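
The loop below is a minimal sketch of how a client might drive advanceToNextToken; it is not code taken from the component. The function name collectTokenTypes is hypothetical, and the tokenizer type is left as a template parameter only so that the sketch compiles without BDE headers. It is meant to be instantiated with bdljsn::Tokenizer after reset has been called.

```cpp
#include <vector>

// Advance the specified 'tokenizer' until 'advanceToNextToken' returns a
// non-zero value, and return the type of every token visited, in order.
// 'TOKENIZER' is assumed to be 'bdljsn::Tokenizer' (or a type with the same
// 'advanceToNextToken' and 'tokenType' methods).
template <class TOKENIZER>
std::vector<int> collectTokenTypes(TOKENIZER& tokenizer)
{
    std::vector<int> types;
    while (0 == tokenizer.advanceToNextToken()) {
        // Store the enumerator as an 'int' so the sketch does not depend on
        // the 'TokenType' declaration.
        types.push_back(static_cast<int>(tokenizer.tokenType()));
    }
    return types;
}
```

The loop ends on any non-zero return, which does not by itself say why tokenizing stopped; a caller that needs the reason would have to inspect the tokenizer afterwards.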

void bdljsn::Tokenizer::reset ( bsl::streambuf *  streambuf  ) 

Reset this tokenizer to read data from the specified streambuf. Note that the tokenizer will not be on a valid node until advanceToNextToken is called. Note that this function does not change the value of the allowStandAloneValues, allowHeterogenousArrays, or allowNonUtf8StringLiterals options.

int bdljsn::Tokenizer::resetStreamBufGetPointer (  ) 

If the streambuf held by this object supports seeking, reset its get pointer to refer to the byte following the last processed byte; otherwise, return an error and leave this object unchanged. Return 0 on success, and a non-zero value otherwise. Note that after a successful return, users can read data from the streambuf that was specified during reset, starting from where this object stopped. Also note that this call implies the end of processing for this object: any subsequent methods should be invoked on this object only after calling reset and specifying a new streambuf.
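
As a hedged illustration of the hand-off described above, the following sketch reads whatever follows the tokenized data once the get pointer has been repositioned. The helper name readRemainder is hypothetical, the tokenizer type is a template parameter only so the sketch compiles without BDE headers, and std::streambuf is used on the assumption that bsl::streambuf names the same type.

```cpp
#include <streambuf>
#include <string>

// Load into the specified 'rest' every byte that remains unread in the
// specified 'streambuf' after 'tokenizer' gives the stream back.  Return
// 'true' on success, and 'false' (leaving 'rest' unmodified) if the get
// pointer could not be repositioned.  'streambuf' is assumed to be the one
// previously passed to 'tokenizer.reset'.
template <class TOKENIZER>
bool readRemainder(std::string    *rest,
                   TOKENIZER&      tokenizer,
                   std::streambuf *streambuf)
{
    if (0 != tokenizer.resetStreamBufGetPointer()) {
        return false;  // for example, the streambuf does not support seeking
    }

    typedef std::char_traits<char> Traits;

    std::string remainder;
    for (Traits::int_type c = streambuf->sbumpc();
         !Traits::eq_int_type(c, Traits::eof());
         c = streambuf->sbumpc()) {
        remainder.push_back(Traits::to_char_type(c));
    }
    rest->swap(remainder);
    return true;
}
```

After a successful call the tokenizer should not be used again until reset has been called with a new streambuf, as noted above.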

void bdljsn::Tokenizer::setAllowHeterogenousArrays ( bool  value  ) 

Set the allowHeterogenousArrays option to the specified value. If the allowHeterogenousArrays value is true this tokenizer will successfully tokenize heterogeneous values within an array. If the option's value is false then the tokenizer will return an error for arrays having heterogeneous values. By default, the value of the allowHeterogenousArrays option is true.

void bdljsn::Tokenizer::setAllowNonUtf8StringLiterals ( bool  value  ) 

Set the allowNonUtf8StringLiterals option to the specified value. If the allowNonUtf8StringLiterals value is false this tokenizer will check string literal tokens for invalid UTF-8, enter an error mode if it encounters a string literal token that has any content that is not UTF-8, and fail to advance to subsequent tokens until reset is called. By default, the value of the allowNonUtf8StringLiterals option is true.

void bdljsn::Tokenizer::setAllowStandAloneValues ( bool  value  ) 

Set the allowStandAloneValues option to the specified value. If the allowStandAloneValues value is true this tokenizer will successfully tokenize stand-alone JSON values (strings and numbers), that is, values not enclosed in an object or array. If the option's value is false then the tokenizer will only tokenize complete JSON documents (JSON objects and arrays) and return an error for stand-alone JSON values. By default, the value of the allowStandAloneValues option is true.

void bdljsn::Tokenizer::setAllowTrailingTopLevelComma ( bool  value  ) 

Set the allowTrailingTopLevelComma option to the specified value. If the allowTrailingTopLevelComma value is true this tokenizer will successfully tokenize JSON values where a comma follows the top-level JSON element. If the option's value is false then the tokenizer will reject documents with such trailing commas, such as {},. By default, the value of the allowTrailingTopLevelComma option is true for backwards compatibility. Note that a document without any JSON elements is invalid whether or not it contains commas.
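
The four options are independent of one another, so a client typically sets them together before the first call to advanceToNextToken. The sketch below shows one possible combination: it keeps heterogeneous arrays and stand-alone values enabled but turns on UTF-8 checking and rejects a trailing top-level comma. The function name configureStricterInput and the particular choice of values are illustrative assumptions, not a recommendation made by the component; the tokenizer type is a template parameter only so the sketch compiles without BDE headers.

```cpp
// Configure the specified 'tokenizer' to validate string literals as UTF-8
// and to reject a comma after the top-level element, leaving the other two
// options at their documented default of 'true'.  'TOKENIZER' is assumed to
// be 'bdljsn::Tokenizer'.
template <class TOKENIZER>
void configureStricterInput(TOKENIZER *tokenizer)
{
    tokenizer->setAllowHeterogenousArrays(true);      // documented default
    tokenizer->setAllowStandAloneValues(true);        // documented default
    tokenizer->setAllowNonUtf8StringLiterals(false);  // check UTF-8
    tokenizer->setAllowTrailingTopLevelComma(false);  // reject '{},'
}
```

Because reset is documented as leaving options unchanged, a configuration like this would normally need to be applied only once per tokenizer object.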

bool bdljsn::Tokenizer::allowHeterogenousArrays (  )  const

Return the value of the allowHeterogenousArrays option of this tokenizer.

bool bdljsn::Tokenizer::allowNonUtf8StringLiterals (  )  const

Return the value of the allowNonUtf8StringLiterals option of this tokenizer.

bool bdljsn::Tokenizer::allowStandAloneValues (  )  const

Return the value of the allowStandAloneValues option of this tokenizer.

bool bdljsn::Tokenizer::allowTrailingTopLevelComma (  )  const

Return the value of the allowTrailingTopLevelComma option of this tokenizer.

bsls::Types::Uint64 bdljsn::Tokenizer::currentPosition (  )  const

Return the offset of the current octet being tokenized in the stream supplied to reset, or if an error occurred, the position where the failed attempt to tokenize a token occurred. Note that this operation is intended to provide additional information in the case of an error.

bsls::Types::Uint64 bdljsn::Tokenizer::readOffset (  )  const

Return the last read position relative to when reset was called. Note that readOffset() >= currentPosition() -- the readOffset is the offset of the last octet read from the stream supplied to reset, and is at or beyond the current position being tokenized.
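
Since currentPosition and readOffset are intended mainly for error reporting, a small diagnostic helper is a natural use. The sketch below combines them with readStatus to build a message after advanceToNextToken has returned a non-zero value. The name describeStop and the message wording are hypothetical, and treating a status of 0 as a generic stop is an assumption made for illustration; the tokenizer type is a template parameter only so the sketch compiles without BDE headers.

```cpp
#include <sstream>
#include <string>

// Return a human-readable description of where and why the specified
// 'tokenizer' stopped.  'TOKENIZER' is assumed to be 'bdljsn::Tokenizer',
// which provides 'k_EOF', 'readStatus', 'currentPosition', and 'readOffset'.
template <class TOKENIZER>
std::string describeStop(const TOKENIZER& tokenizer)
{
    std::ostringstream oss;

    const int status = tokenizer.readStatus();
    if (status == static_cast<int>(TOKENIZER::k_EOF)) {
        oss << "no more data";
    }
    else if (status < 0) {
        // Negative values are documented as UTF-8 error codes.
        oss << "invalid UTF-8 (status " << status << ")";
    }
    else {
        oss << "tokenizing stopped";
    }

    oss << " at offset " << tokenizer.currentPosition()
        << ", last octet read at offset " << tokenizer.readOffset();
    return oss.str();
}
```

The two offsets can differ because the tokenizer may read ahead of the octet it is currently tokenizing.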

int bdljsn::Tokenizer::readStatus (  )  const

Return the status of the last call to reloadStringBuffer():

  • 0 if reloadStringBuffer() has not been called or if a token was successfully read.
  • k_EOF (which is positive) if no data could be read before reaching EOF.
  • a negative value if the allowNonUtf8StringLiterals option is false and a UTF-8 error occurred. The specific value returned will be one of the enumerators of the bdlde::Utf8Util::ErrorStatus enum type indicating the nature of the UTF-8 error.

TokenType bdljsn::Tokenizer::tokenType (  )  const

Return the token type of the current token.

int bdljsn::Tokenizer::value ( bsl::string_view *  data  )  const

Load into the specified data the value of the current token if the current token's type is e_ELEMENT_NAME or e_ELEMENT_VALUE, or leave data unmodified otherwise. Return 0 on success and a non-zero value otherwise. Note that the returned data is only valid until the next manipulator call on this object.
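
Because the returned view is valid only until the next manipulator call, a client that needs the text later has to copy it first. The following sketch shows one way to do that. The helper name copyCurrentValue is hypothetical, and std::string_view is used on the assumption of a C++17 build in which bsl::string_view names the same type; the tokenizer type is a template parameter only so the sketch compiles without BDE headers.

```cpp
#include <string>
#include <string_view>

// Load into the specified 'result' a copy of the text of the current token
// of the specified 'tokenizer'.  Return 'true' on success, and 'false'
// (leaving 'result' unmodified) if 'value' reports an error.  'TOKENIZER' is
// assumed to be 'bdljsn::Tokenizer'.
template <class TOKENIZER>
bool copyCurrentValue(std::string *result, const TOKENIZER& tokenizer)
{
    std::string_view view;
    if (0 != tokenizer.value(&view)) {
        return false;
    }
    // Copy now: 'view' refers to storage owned by the tokenizer.
    result->assign(view.data(), view.size());
    return true;
}
```

The copy remains usable after further calls to advanceToNextToken or reset, whereas the view itself would not.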


The documentation for this class was generated from the following file: