January 18, 2023

Introducing bdljsn::Json

BDE 3.111.0 introduced a new package, bdljsn, that provides an in-memory document type, bdljsn::Json, modeled on the JSON specification, as well as utilities to read JSON text into a bdljsn::Json object and to write a bdljsn::Json object as JSON text.

How does bdljsn fit into the universe of other JSON facilities?

Comparison With baljsn

The bdljsn package serves a different purpose from baljsn. bdljsn::Json provides a value-semantic representation of arbitrary JSON documents (which can be encoded and decoded), while baljsn provides JSON support for other (non-JSON-specific) vocabulary types.

baljsn, BDE’s existing JSON support package, effectively has two key features:

  • Encoding and decoding bdlat message objects (i.e., BAS messages) to and from JSON

  • Encoding and decoding bdld::Datum objects to and from JSON

Note

baljsn::Encoder, baljsn::Decoder, and the bdlat framework (i.e., BAS messages) remain BDE’s recommended facility for encoding messages with a well-defined schema!

Notably, though, baljsn provides neither facilities nor an in-memory representation naturally suited to working with arbitrary JSON documents; instead, it focuses on establishing a JSON mapping for other important document representations.

Comparison with Other JSON Libraries

There are many high-quality open-source JSON libraries. Many of these libraries cater to specific sets of engineering trade-offs (some emphasize speed, some emphasize usability, etc.). bdljsn provides a unique set of design choices.

The notable features of bdljsn include:

  • Integration with common BDE vocabulary (like bsl::string, bdldfp::Decimal64, bslh::hashAppend)

  • Scoped memory allocator support

  • Careful treatment of Numbers

  • An intuitive design, with support for legacy platforms

Most of those features are hopefully self-explanatory.

The careful treatment of numbers has two aspects. The first is support for bdldfp::Decimal64, which can more precisely represent base-10 numeric data (as is common in financial applications). The second is that bdljsn defers evaluating numbers into a native number representation until the user requests a specific numeric type. Most JSON libraries make a best effort, when parsing JSON text, to decide the best numeric type to represent a number. Determining the expected number type at parse time works most of the time, but can hide errors, particularly around integer conversion. For example, the JSON document 10000000000000001.0 will often convert to the uint64_t value 10000000000000000 (because the library has coerced the text to a double), the JSON document 2147483648 (the absolute value of INT_MIN for a 32-bit int) will convert to the int -2147483648, and the JSON document 1.5 will return the int 1.

bdljsn::JsonNumber defers coercing the JSON text for a number into a numeric type until the numeric type the user desires is known. This allows providing status for whether the number converted is within the representable range of the resulting type, and whether rounding was required – including whether a bdldfp::Decimal64 precisely represents the JSON text. Further, users can obtain the original JSON text for the number, which may be useful for error reporting (for example).
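
To make the difference concrete, the following is a minimal sketch using only the C++ standard library. It is not bdljsn's implementation. It contrasts an eager conversion through double with a conversion deferred until the target type is known. The names ConvertStatus, eagerToUint64, and deferredToInt32 are hypothetical and exist only for this illustration. The sketch assumes the text has already been validated as a JSON number and, for simplicity, reports any fraction or exponent as not being an integer.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <string>

// Hypothetical status codes for this sketch; these are not bdljsn's names.
enum ConvertStatus { e_OK, e_NOT_INTEGER, e_OUT_OF_RANGE };

// Eager approach: coerce the text to 'double' at parse time, then cast.
// Digits beyond the precision of 'double' are silently lost.
inline std::uint64_t eagerToUint64(const std::string& text)
{
    return static_cast<std::uint64_t>(std::strtod(text.c_str(), nullptr));
}

// Deferred approach: keep the text and convert only once the target type is
// known, reporting a status instead of silently truncating or wrapping.
// Simplification: any fraction or exponent is reported as 'e_NOT_INTEGER'.
inline ConvertStatus deferredToInt32(std::int32_t      *result,
                                     const std::string&  text)
{
    assert(result);

    errno = 0;
    char *end = nullptr;
    const long long value = std::strtoll(text.c_str(), &end, 10);

    if (end == text.c_str() || *end != '\0') {
        return e_NOT_INTEGER;   // nothing parsed, or trailing ".5", "e2", ...
    }
    if (errno == ERANGE
     || value < std::numeric_limits<std::int32_t>::min()
     || value > std::numeric_limits<std::int32_t>::max()) {
        return e_OUT_OF_RANGE;  // does not fit in a 32-bit signed integer
    }
    *result = static_cast<std::int32_t>(value);
    return e_OK;
}
```

With these helpers, eagerToUint64("10000000000000001.0") cannot return 10000000000000001, because that value has no exact double representation, whereas deferredToInt32 reports e_OUT_OF_RANGE for 2147483648 and e_NOT_INTEGER for 1.5 rather than returning a misleading value.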

Note

See bdljsn_jsonnumber for details on the treatment of numbers.

Using bdljsn

A simple example of reading JSON text:

const char *exampleJSON = R"(
   {
     "number": 3.14,
     "boolean": true,
     "string": "text",
     "null": null,
     "array": [ 2.76, true ],
     "object": { "boolean": false }
   })";
bdljsn::Json result;
bdljsn::Error error;
int rc = bdljsn::JsonUtil::read(&result, &error, exampleJSON);
if (0 != rc) {
    bdljsn::JsonUtil::printError(bsl::cout, exampleJSON, error);
    return rc;
}

using namespace bdldfp::DecimalLiterals;

assert(3.14     == result["number"].asDouble());
assert(true     == result["boolean"].theBoolean());
assert("text"   == result["string"].theString());
assert(true     == result["null"].isNull());
assert(2.76_d64 == result["array"][0].asDecimal64());
assert(false    == result["object"]["boolean"].theBoolean());

bsl::cout << result << bsl::endl;

A simple example of creating a bdljsn::Json object in memory:

using namespace bdldfp::DecimalLiterals;

bdljsn::Json json;

json.makeObject();
json["number"]  = 3.14;
json["boolean"] = true;
json["string"]  = "text";
json["array"].makeArray();
json["array"].theArray().pushBack(bdljsn::Json(2.76_d64));
json["array"].theArray().pushBack(bdljsn::Json(true));
json["object"].makeObject()["boolean"] = false;

bdljsn::WriteOptions options;
options.setStyle(bdljsn::WriteStyle::e_PRETTY).setSortMembers(true);
bdljsn::JsonUtil::write(bsl::cout, json, options);

A simple example of initializing a bdljsn::Json object with a literal:

using namespace bdljsn::JsonLiterals;
bdljsn::Json value = R"({ "number": 4, "array": [0, 2, null] })"_json;

assert(bdljsn::JsonType::e_NUMBER == value["number"].type());
assert(bdljsn::JsonType::e_ARRAY  == value["array"].type());

Notice that supplying invalid JSON text to a _json literal will result in a call to the currently installed bsls::Assert failure handler. Literals use the currently installed global allocator to supply memory.

Additional examples can be found in bdljsn.

Understanding bdljsn

bdljsn::Json has a close structural similarity to the JSON grammar itself, which looks like the following, at a high level:

JSON ::= Object
       | Array
       | String
       | Number
       | Boolean
       | null

Note that the Object and Array alternatives can recursively contain JSON. Just like this grammar, a bdljsn::Json is a variant of object, array, string, number, boolean, or the null value. The types that can be contained by a bdljsn::Json are represented as follows:

There is a 1-1 mapping between JSON documents and bdljsn::Json objects.

Note

Every bdljsn::Json object represents a valid JSON document!

This means writing a bdljsn::Json to a string can only fail if an out-of-memory condition occurs.
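
The variant structure described above can be sketched with std::variant from C++17. This is an illustration of the shape only, not how bdljsn is implemented: Value, Member, Number, Array, Object, typeName, and countValues are all hypothetical names, the object is simplified to a list of members that does not reject duplicate names, and nothing here enforces the UTF-8 or number-text requirements.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// All names below are hypothetical stand-ins, not bdljsn types.
struct Value;
struct Member;

struct Number { std::string text; };  // number kept as its JSON text
using Array  = std::vector<Value>;    // ordered list of values
using Object = std::vector<Member>;   // simplified: allows duplicate names

// One alternative per production of the grammar: null, Boolean, Number,
// String, Array, Object.  A default-constructed 'Value' holds null.
struct Value {
    std::variant<std::monostate, bool, Number, std::string, Array, Object>
        data;
};

struct Member {
    std::string name;
    Value       value;
};

// Return the name of the grammar alternative currently held by 'value'.
inline const char *typeName(const Value& value)
{
    static const char *const names[] = {
        "null", "boolean", "number", "string", "array", "object"
    };
    assert(value.data.index() < sizeof names / sizeof *names);
    return names[value.data.index()];
}

// Count 'value' plus every value nested inside it, following the recursion
// through the Array and Object alternatives.
inline std::size_t countValues(const Value& value)
{
    std::size_t count = 1;
    if (const Array *array = std::get_if<Array>(&value.data)) {
        for (const Value& element : *array) {
            count += countValues(element);
        }
    }
    else if (const Object *object = std::get_if<Object>(&value.data)) {
        for (const Member& member : *object) {
            count += countValues(member.value);
        }
    }
    return count;
}
```

Because each alternative corresponds to one production of the grammar, a function such as countValues only needs to handle the two recursive cases, Array and Object; every other alternative is a leaf.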

To ensure that all bdljsn::Json objects represent valid JSON documents, several operations (like constructors) have notable pre-conditions on their input:

Note that while a supplied string must be valid UTF-8, bdljsn will escape characters as needed when writing a bdljsn::Json to JSON text.
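
As an illustration of what escaping on output involves, here is a minimal standard-library sketch. It is not bdljsn's writer. It applies only the escapes that RFC 8259 requires (the quotation mark, the backslash, and control characters below U+0020). The helper name appendQuoted is hypothetical, and the sketch assumes its input is already valid UTF-8, so multi-byte sequences are copied through unchanged.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Append to the specified 'out' the specified 'utf8' text as a quoted JSON
// string.  Hypothetical helper: assumes 'utf8' is already valid UTF-8.
inline void appendQuoted(std::string *out, const std::string& utf8)
{
    assert(out);

    out->push_back('"');
    for (const char ch : utf8) {
        const unsigned char c = static_cast<unsigned char>(ch);
        if (ch == '"' || ch == '\\') {
            out->push_back('\\');        // escape the character itself
            out->push_back(ch);
        }
        else if (ch == '\n') {
            out->append("\\n");          // short form for a common control
        }
        else if (c < 0x20) {
            char buffer[8];              // "\uXXXX" plus terminator
            std::snprintf(buffer,
                          sizeof buffer,
                          "\\u%04x",
                          static_cast<unsigned>(c));
            out->append(buffer);
        }
        else {
            out->push_back(ch);          // includes multi-byte UTF-8 bytes
        }
    }
    out->push_back('"');
}
```

The caller remains responsible for supplying valid UTF-8; the helper changes only how that text is spelled in the output, not which strings are acceptable.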

In addition to the classes that model JSON concepts, bdljsn provides:

Possible Next Steps

There are also several long-term extensions BDE is considering:

  • Support for JSON Schema. BDE is a strong believer in schemas. Schema validation is fairly complex, and we are trying to assess the value of such an extension.

  • Support for JSON Pointer (IETF RFC 6901) objects.

  • An option to convert to/from bdlat-compatible types.

  • Support for initializer-list syntax (similar to nlohmann/json).