BDE 4.14.0 Production release
Provide command line parsing, validation, and access.
This component provides a value-semantic class, balcl::CommandLine, used to represent the command-line arguments passed to a process. Also provided is balcl::CommandLineOptionsHandle, an optionally-used class that provides an alternate means of access to the options (and associated values, if any) found in a balcl::CommandLine object in a "parsed" state.
The constructor of balcl::CommandLine takes a specification describing the command-line arguments. Once created, printUsage can be invoked to print the usage syntax. The parse method takes command-line arguments and validates them against the specification supplied at construction, printing suitable messages on an optionally-specified stream in case of a parsing error. Once parsed, options and values can be accessed using various access methods. The class provides a set of theType access methods (for example, theString, theInt) that return the value of the specified option name. It is also possible to link a variable with an option in the specification; doing so will cause the variable to be loaded with the option value once parse has been invoked successfully.
This component offers the following features:
Linked variables can be bsl::optional objects for each of the scalar option types except bool.
A lower bound on the number of multi-valued non-option arguments (e.g., two or more values) can be achieved by explicitly specifying the required number of single-valued non-option arguments of the same type before the unrestricted multi-valued non-option argument of that same type.
This section provides background on Unix-style command-line arguments, as well as definitions of terms used frequently in this documentation (such as "option", "flag", "non-option", "tag", "short tag", "long tag"). Readers familiar with Unix command lines can skim this section or skip it entirely.
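Command-line arguments can be classified as the command name, options (a tag with an associated value), flags (boolean options), and non-option arguments.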
For example, in the following command line:
the command name is mybuildcommand. There is one option, described by -c CC64: c is the tag name, and CC64 is the option value. There is also one boolean option (flag): -e is a flag, e is the flag name. The last parameter, myproject, is a non-option argument.
Sometimes "option" is also used where "flag" or "non-option" would be more accurate. What is actually intended should be clear from context.
A user specifies an option on a command line by entering one of the tag values configured for that option. Each option has a mandatory long tag and an optional short tag. The short tag, if specified, must be a single character; the long tag generally must follow the same rules applicable to C/C++ identifiers, except that - is allowed (but not as the leading character). When a short tag is used on a command line, it must be preceded by -, and when a long tag is used it must be preceded by --. Flags have no corresponding values; they are either present or absent. Option tags must be followed by a corresponding option value. An option can have multiple values (such options are called multi-valued options). When multiple values are provided for an option, the tag must appear with each value (see the section {Multi-Valued Options and How to Specify Them}). Arguments that are not the command name, options, or flags are called "non-option" arguments and can be either single-valued or multi-valued. They do not have any tag associated with them.
Consider the syntax of a typical Unix-style command whose options are described by the usage string:
Here:
The command can be invoked as follows:
and an equivalent command line is:
Note that short tags must be prefixed with - and long tags with --. To specify a non-option argument beginning with -, use a single -- separator (not followed by a long tag).
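The prefix rules above can be illustrated with a small, standard-library-only sketch. It is not how balcl::CommandLine is implemented; the names TokenKind and classifyArguments are hypothetical, and the treatment of a lone - is an assumption. Because the sketch does not consult a specification, it cannot tell an option's value from a non-option argument, so both are reported as the same kind.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical names for this sketch; they are not part of balcl.
enum class TokenKind { ShortTag, LongTag, Separator, ValueOrNonOption };

// Classify each argument that follows the command name.  Once a bare "--"
// has been seen, every later argument is treated as a non-option, even if
// it begins with '-'.
std::vector<TokenKind> classifyArguments(const std::vector<std::string>& args)
{
    std::vector<TokenKind> kinds;
    bool afterSeparator = false;
    for (const std::string& arg : args) {
        if (afterSeparator) {
            kinds.push_back(TokenKind::ValueOrNonOption);
        } else if (arg == "--") {
            afterSeparator = true;
            kinds.push_back(TokenKind::Separator);
        } else if (arg.size() > 2 && arg.compare(0, 2, "--") == 0) {
            kinds.push_back(TokenKind::LongTag);
        } else if (arg.size() > 1 && arg[0] == '-') {
            // A short tag, possibly a group of flags or a tag with a value.
            kinds.push_back(TokenKind::ShortTag);
        } else {
            // Assumption: a lone "-" is treated like any other plain token.
            kinds.push_back(TokenKind::ValueOrNonOption);
        }
    }
    return kinds;
}
```

For instance, classifying the arguments -o, myoutfile, --, -weirdfilename yields a short tag, a plain token, the separator, and a plain token, which matches the rule that a non-option argument beginning with - must follow --.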
This component supports a variety of forms for specifying option values. They are best described by example. Consider the command-line specification described by the following usage string:
The following (valid) forms can be used with equivalent meaning:
Note that if =13 is desired as an option value, then whitespace must be used as in:
All of the following are invalid:
Flags can be grouped (i.e., expressed more succinctly like -ctv instead of -c -t -v). When grouping flags, short tags must be used. For example, given the command-line specification described by the following usage string:
the following command lines are valid and equivalent:
Note that the last character in a group need not be a flag; it could be an option. Any character that is the short tag of an option signals the end of the flag group, and it must be followed by the value of the option. For example, given the command-line specification described by the following usage string:
the following command lines are valid and equivalent:
Options can have several values. For example, in the command-line specification described by the following usage string, * denotes a multi-valued option, and + denotes a multi-valued option that must occur at least once.
multiple values can be given as follows:
They need not be supplied contiguously. For example, the following command line is valid and equivalent to the above:
Note that the tag needs to be repeated for every option value. For example, the following command line is invalid (because -l must be repeated before both lib2 and lib3):
Short and long forms can be used in mixed fashion, however:
Command-line arguments can appear in any order. For example, given the command-line specification described by the following usage string:
all the following command lines are valid (and equivalent):
There are three exceptions to the above rule on argument order:
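(1) An option tag must be immediately followed by its value.
(2) If a non-option argument starts with -, then it must not appear before any option or flag, and a -- must be put on the command line to indicate the end of all options and flags.
(3) The values of multi-valued non-option arguments are taken in the order in which they appear on the command line.
For example, the following is invalid because of rule (1) above (because -o should be followed by myoutfile):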
and the following is incorrect because of rule (2) (because -weirdfilename, which starts with -, must appear after --):
The previous examples can be corrected in either of the following ways:
Note that the order of values within the sequence of multi-valued non-option arguments differs between the two examples, as per rule (3). In the first example, the non-option arguments have as their value the (ordered) sequence:
while in the second example, the non-option argument value is the sequence:
This order may or may not matter to the application.
Sometimes users may wish to supply configuration via environment variables, which is common practice, for example, for services deployed via Docker containers in cloud environments.
By default, parsed command line option values are unaffected by the environment. If an environment variable name is specified for an option, then, during parsing, if an option is not set on the command line, and the specified environment variable has a value, that value is used for the command line option. If a value is not supplied, either via the command line, or via an environment variable, then the default value for the option, if any, is used.
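The lookup order described above can be sketched with standard C++17 facilities. This is only an illustration of the stated precedence, not the component's implementation; readEnvironment and resolveOptionValue are hypothetical names, and values are kept as strings for simplicity.

```cpp
#include <cassert>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper: return the value of the named environment variable,
// or an empty optional if no name was configured or the variable is unset.
std::optional<std::string> readEnvironment(const std::string& variableName)
{
    if (variableName.empty()) {
        return std::nullopt;  // no environment variable name was specified
    }
    const char *raw = std::getenv(variableName.c_str());
    if (!raw) {
        return std::nullopt;  // the variable has no value in the environment
    }
    return std::string(raw);
}

// Hypothetical helper: choose the effective value of one option.  The
// command line wins over the environment, which wins over the default.
std::optional<std::string> resolveOptionValue(
                        const std::optional<std::string>& fromCommandLine,
                        const std::optional<std::string>& fromEnvironment,
                        const std::optional<std::string>& defaultValue)
{
    if (fromCommandLine) {
        return fromCommandLine;
    }
    if (fromEnvironment) {
        return fromEnvironment;
    }
    return defaultValue;      // may be empty: the option has no value at all
}
```

An empty result corresponds to the case in which no value is supplied by the command line or the environment and the option has no default.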
Boolean options when set via the environment variable must be associated with a string.
Any other text, including an empty string, is treated as invalid input.
Array options can be set by environment variable. When an array option is supplied via an environment variable string, the discrete values must be separated using the space character. For example, if the program takes an integer array of years as an option, and that option is associated with the environment variable name "MYPROG_YEARS", to supply a value via the command line one could specify:
The '\' is used as an escape-character if an element of an array contains a space. For example:
would configure an array containing "C:\Program Files\MyProg" and "D:\Another\Path".
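The splitting rule (space-separated elements, with '\' escaping the next character) can be sketched as follows. This is a standard-library illustration, not the parser used by balcl::CommandLine; splitEnvironmentArray is a hypothetical name, and two behaviors are assumptions of the sketch: runs of unescaped spaces produce no empty elements, and a trailing lone '\' is kept literally. Conversion of each element to the option's type (e.g., int) is omitted.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: split the text of an environment variable into array
// elements.  An unescaped space ends the current element; a backslash makes
// the following character (a space or another backslash) part of the element.
std::vector<std::string> splitEnvironmentArray(const std::string& text)
{
    std::vector<std::string> elements;
    std::string              current;
    bool                     inElement = false;

    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (c == '\\' && i + 1 < text.size()) {
            current += text[++i];          // escaped character, taken as-is
            inElement = true;
        } else if (c == ' ') {
            if (inElement) {               // assumption: skip empty elements
                elements.push_back(current);
                current.clear();
                inElement = false;
            }
        } else {
            current += c;                  // includes a trailing lone '\'
            inElement = true;
        }
    }
    if (inElement) {
        elements.push_back(current);
    }
    return elements;
}
```

With this sketch, the text 2010 2014 2018 splits into three elements, and an environment value in which the space of "Program Files" is written as "\ " and each path backslash is doubled yields the two paths given above.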
Environment variables exist in a single namespace accessed by all programs run in the shell, and many environment variable names are used by many different systems for different purposes. For that reason, we strongly encourage selecting an (ideally) unique prefix for the environment variables used by your application, to reduce the likelihood of collisions with environment variables used by other applications for different purposes.
For example, if someone were writing a trading application tradesvc, they might choose to use "TRADESVC_" as a common prefix for the environment variables.
A command line is described by an option table (supplied as an array of balcl::OptionInfo). Each entry (row) of the table describes an option (i.e., an option, flag, or non-option argument). Each entry has several fields, specified in the following order:
The first three fields must be specified. The type-and-constraint field can be omitted (meaning no constraint), and the occurrence information field likewise can be omitted (meaning that the option is not required on the command line).
The following sections provide a more detailed description of each field, including example values for each field.
In some applications, command-line specifications must be defined using a statically-initialized array. For that reason, there are two classes that serve the same purpose: balcl::OptionInfo is a statically-initializable class but it does not conform to the bslma allocator protocol, while balcl::Option is convertible from balcl::OptionInfo, takes allocators, and is suitable for storing into containers.
The tag field specifies the (optional) short tag and long tag for the corresponding option or flag, except that non-option arguments are indicated by an empty string for a tag field. There can only be one multi-valued entry for non-option arguments, and it must be listed last among the non-options.
The general format is one of:
"" (the empty string), for non-option arguments;
"<s>|<long>", for options and flags, where <s> is the short tag, and <long> is the long tag; or
"<long>", for options and flags for which no short tag is specified.
Note that for short tags (<s>), s must be a single character (different from - and |); for long tags (<long>), long must have 2 or more characters (which may contain -, but not as the first character, and cannot contain |). Also note that either no tag (empty string), both short and long tags, or only a long tag, may be specified.
The tag field cannot be omitted, but it can be the empty string.
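The tag rules above can be expressed as a small check, shown here as a hedged sketch using only the standard library. It covers only the rules stated in this section; the authoritative check is the isTagValid method of balcl::Option, which may enforce more. The names isLongTagValid and isTagFieldValid, and the "r|reverse" style of sample tags, are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: a long tag has 2 or more characters, does not start
// with '-', and does not contain '|'.
bool isLongTagValid(const std::string& longTag)
{
    return longTag.size() >= 2
        && longTag[0] != '-'
        && longTag.find('|') == std::string::npos;
}

// Hypothetical helper: a tag field is "" (non-option argument), "<long>",
// or "<s>|<long>" where <s> is one character other than '-' and '|'.
bool isTagFieldValid(const std::string& tagField)
{
    if (tagField.empty()) {
        return true;                       // non-option argument
    }
    const std::string::size_type bar = tagField.find('|');
    if (bar == std::string::npos) {
        return isLongTagValid(tagField);   // long tag only
    }
    if (bar != 1 || tagField[0] == '-') {
        return false;                      // short tag is not one valid char
    }
    return isLongTagValid(tagField.substr(2));
}
```

Under these rules "r|reverse", "reverse", and "" are accepted, while "r" (a lone short tag), "-|reverse", and "r|-reverse" are rejected.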
The name field specifies the name through which the option value can be accessed through one of the theType methods.
The general format is any non-empty string. In most cases, the name will be used as-is. Note that any suffix starting at the first occurrence of =, if any, is removed from the name before storing in the balcl::Option. Thus, if a name having such a suffix is specified in a balcl::OptionInfo (e.g., "nameOption=someAttribute"), the correct name to use for querying this option by name does not include the suffix (e.g., cmdLine.numSpecified("nameOption=someAttribute") will always return 0, but cmdLine.numSpecified("nameOption") will return the appropriate value).
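The suffix rule amounts to truncating the name at its first =. A minimal sketch, assuming nothing beyond the rule as stated (storedOptionName is a hypothetical name, not a balcl function):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: return the name under which an option would be
// queried, i.e., the specified name with any suffix starting at the first
// '=' removed.  A name without '=' is returned unchanged.
std::string storedOptionName(const std::string& specifiedName)
{
    // 'find' returns 'npos' when there is no '=', and 'substr(0, npos)'
    // then yields the whole string.
    return specifiedName.substr(0, specifiedName.find('='));
}
```

So "nameOption=someAttribute" is queried as "nameOption", and a name without = is used as-is.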
This field cannot be omitted, and it cannot be an empty string.
The description field is used when printing the usage string.
The general format is any non-empty string.
This field cannot be omitted, and it cannot be an empty string.
The type-and-constraint field specifies the type and constraints for the option values. Flags are identified by having the boolean type; note that flags cannot have constraints. Multiple values (for multi-valued options and multi-valued non-option arguments) can be specified by using array types. The list of the supported types is provided in the section Supported Types below.
Other constraints can be placed on individual value(s). When the type is an array type, then those constraints are placed on the individual value(s) held in the array and not on the entire array. A list of useful constraints is provided in the section Supported Constraint Values. Also see the section Building New Constraints to see how new constraints can be built so that they can be used in the same manner as the available constraints.
Additionally, this field allows a specified variable to be linked to the option. In that case, after parsing, the variable is loaded with the option value specified on the command line (or its default value, if any, if the option is absent from the command line). Occurrence Information Field describes how to configure a default value.
The general format can be one of either:
This field can be omitted. If so, the type is assumed to be of string type with no constraints and no variable is linked to the option. No occurrence information field can then be specified; if such a field is desired, then the type-and-constraint field needs to be set explicitly.
Linked variables are updated by the parse method of balcl::CommandLine should that method determine a value for an option; otherwise, the linked variable is left unchanged. The value for an option is determined either from the command-line arguments passed to parse or obtained from a pre-configured default value, if any (see Occurrence Information Field).
Linked variables can be bsl::optional objects that wrap any of the non-array option types except for bool (see Supported Types). Also, a link to a bsl::optional object is disallowed if the option is "required" or has a default value (see Occurrence Information Field).
The occurrence information field is used to specify a default value for an option, and whether an option is required on the command line or is optional. An option may also be "hidden" (i.e., not displayed by printUsage).
The general format of this field is one of the following:
If a default value is specified, the option is assumed to be optional; in addition, the default value must satisfy the type and constraint indicated by the specified type-and-constraint field.
This field can be omitted, and is always omitted if the type-and-constraint field is not specified. If omitted, the option is not required on the command line and has no default value; furthermore, if the option is not present on the command line, the linked variable, if any, is unaffected.
The environment variable name field is used to specify a string which is the name of an environment variable which, if set, allows the environment to be searched for a value of the option. If an option is specified on the command line, the environment is never searched for that option. If an option has a default value, and a value is specified in the environment, the value in the environment takes precedence over the default value.
The following tables give examples of field values.
The tag field may be declared using the following forms:
The name field may be declared using the following form:
Suppose, for example, that our application has the following parameters:
The type and constraint fields may be declared using the following values:
The following values may be used for this field:
Note: If an option is optional and no value is provided on the command line or through an environment variable, isSpecified will return false and numSpecified will return 0. If no default value is provided and if the variable is a linked variable, it will be unmodified by the parsing. If the variable is accessed through CommandLineOptionsHandle::value, it will be in a null state (defined type but no defined value).
The name field may be declared using the following form:
The following types are supported. The type is specified by an enumeration value (see balcl_optiontype) supplied as the first argument to:
which is used to create the type-and-constraint field value in the command-line specification. When the constraint need only specify the type of the option value (i.e., no linked variable or programmatic constraint), one can supply any of the public data members of balcl::OptionType shown below:
The ASCII representation of these values (i.e., the actual format of the values on command lines) depends on the type:
This component supports constraint values for each supported type except bool. Specifically, the utility struct balcl::Constraint defines TYPEConstraint types (for instance, StringConstraint, IntConstraint) that can be used to define a constraint suitable for the balcl::TypeInfo class.
A constraint is simply a function object that takes as its first argument the (address of the) data to be constrained and as its second argument the stream that should be written to with an appropriate error message when the data does not follow the constraint. The functor should return a bool value indicating whether or not the data abides by the constraint (with true indicating success). A constraint for a given option whose value has the given type must be convertible to one of the TYPEConstraint types defined in the utility struct balcl::Constraint. Note that when passing a function as a constraint, the address of the function must be passed.
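The shape of a constraint described above (address of the data, error stream, bool result) can be sketched without the library. In this illustration std::function stands in for a TYPEConstraint type; IntConstraintSketch, InRange, isPositive, and checkValue are hypothetical names, and the real constraint types are those defined in balcl::Constraint.

```cpp
#include <cassert>
#include <functional>
#include <ostream>
#include <sstream>

// Stand-in for an integer constraint type: any callable taking the address
// of the value and an error stream, and returning 'true' on success.
typedef std::function<bool(const int *, std::ostream&)> IntConstraintSketch;

// A constraint written as a function object with state.
struct InRange {
    int d_low;
    int d_high;

    bool operator()(const int *value, std::ostream& stream) const
    {
        if (*value < d_low || *value > d_high) {
            stream << "value " << *value << " is outside [" << d_low << ", "
                   << d_high << "]\n";
            return false;
        }
        return true;
    }
};

// A constraint written as a plain function; its address is what gets passed.
bool isPositive(const int *value, std::ostream& stream)
{
    if (*value <= 0) {
        stream << "value " << *value << " is not positive\n";
        return false;
    }
    return true;
}

// Apply 'constraint' to 'value', writing any error message to 'errors'.
bool checkValue(const IntConstraintSketch& constraint,
                int                        value,
                std::ostream&              errors)
{
    return constraint(&value, errors);
}
```

Both InRange{1, 10} and &isPositive convert to IntConstraintSketch, mirroring the requirement that a constraint be convertible to the appropriate constraint type and that a function be passed by address.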
The balcl::CommandLine class has a complex set of preconditions on the option specification table (array of balcl::OptionInfo objects) passed to each of its constructors. There are requirements on individual elements, on elements relative to each other, and on the entire set of elements. If these preconditions are not met, the behavior of the constructor is undefined.
The preconditions (some previously mentioned) are given in their entirety below. Moreover, an overloaded class method, balcl::CommandLine::isValidOptionSpecification, is provided to allow programmatic checking without risk of incurring undefined behavior.
The tag, name, and description fields must pass the isTagValid, isNameValid, and isDescriptionValid methods of balcl::Option, respectively.
Collectively, each non-empty short tag, each long tag, and each name must be unique in the specification.
Options having the bool type (also known as "flags") are distinguished from the other supported option types in several ways:
-x=true and -xFlag false are disallowed when setting booleans on the command line). The presence or absence of the tag name (either long or short form) on the command line determines the value of the option.
BoolArray option type).
-x -x -x). For other option types, specifying multiple tags for a non-array option is an error.
true; however, one can determine the number of appearances by using the position accessor.
bsl::optional<bool> variable.
theBool method returns the same value (true or false) as the isSpecified method. In contrast, the the* accessor methods for the other option types have a precondition such that either isSpecified() must be true or the option must have a default value.
This section illustrates intended use of this component.
Suppose we want to design a sorting utility named mysort that has the following syntax:
The <fileList> argument is a non-option, meaning that its value or values appear on the command line unannounced by tags. In this case, the + following the argument means that it is an array type of argument where at least one element is required, so its values are stored in a bsl::vector.
First, we define our variables to be initialized from the command line. All values must be initialized to their default state:
Then, we define our OptionInfo table of attributes to be set. The fields of the OptionInfo are:
parse will set any variables that were specified on the command line, return 0 and there will be no output.
Finally, we show what will happen if mysort is called with invalid arguments. We will call without specifying an input file to fileList, which will be an error. parse streams a message describing the error and then returns non-zero, so our program will call cmdLine.printUsage, which prints a detailed usage message.
Imagine we defined the same mysort program with the same options. After a successful parse, balcl::CommandLine makes the state of every option available through accessors (in addition to setting external variables as shown in example 1).
For every type that is supported, there is a the<TYPE> accessor which takes a single argument, the name of the option. In the above program, if parsing was successful, the following asserts will always pass:
The next accessors we'll discuss are isSpecified and numSpecified. Here, we use isSpecified to determine whether "fieldSeparator" was specified on the command line, and we use numSpecified to determine the number of times the "fieldSeparator" option appeared on the command line:
Suppose we are implementing mysort (from examples 1 & 2) again, but here we want to make use of default option values and the ability to supply options via the environment.
In this example, we have decided not to link local variables, and instead access the option values via the balcl::CommandLine object. Since we are not linking local variables, we specify OptionType::k_<TYPE> in the specification table below for each TypeInfo field.
To specify default values, we pass the default value to the OccurrenceInfo field. Boolean options always have a default value of false.
We also choose to allow these options to be supplied through the environment. To enable this, we specify an environment variable name as the 5th (optional) element of the OptionInfo specification for the option. If no name is supplied for the environment variable, the option cannot be set via the environment.
First, in main, we define our spec table:
Then, we declare our cmdLine object. This time, we pass it a stream, and messages will be written to that stream rather than cerr (the default).
Next, we call parse (just like in Example 1):
balcl::CommandLine uses the following precedence to determine the value of a command line option:
1. Command line argument
2. Environment variable (if an environment variable name is specified for the option)
3. Default value (false for booleans)
Finally, if an option value is not supplied by either the command line or environment, and there is no default value, any linked variable will be unmodified, cmdLine.hasValue for the option will return false, and the behavior is undefined if cmdLine.the<TYPE> for the option is called.
Note that cmdLine.isSpecified will be true only if an option was supplied by the command line or the environment.
If an array option is set by an environment variable, the different elements of the array are separated by spaces by default.
All these calling sequences are equivalent:
or
or
or the user can specify arguments through environment variables:
or as a combination of command line arguments and environment variables:
The '\' character is used as an escape character for array values provided via an environment variable. So, for example, if we needed to encode file names that contain a space (' '), which is the element separator (by default), we would use "\ ":
Notice we used a single tick to avoid requiring a double escape when supplying the string to the shell (e.g., avoiding "C:\\\\file\\ name\\ 1").
Suppose we are again implementing mysort, and we want to introduce some constraints on the values supplied for the variables. In this example, we will ensure that the supplied input files exist and are not directories, and that fieldSeparator is appropriate.
First, we write a validation function for the file name. A validation function supplied to balcl::CommandLine takes a const pointer to the input option type (holding the user-provided value) and a stream on which to write an error message, and returns a bool that is true if the option is valid, and false otherwise.
Here, we implement a function to validate a file name that returns true if the file exists and is a regular file, and false otherwise (writing a description of the error to the stream):
Then, we also want to make sure that the specified fieldSeparator is a non-whitespace printable ASCII character, so we write a function for that:
Next, we define main and declare the variables to be configured:
Notice that fieldSeparator is in automatic storage with no constructor or initial value. We can safely use an uninitialized variable in the specTable below because the specTable provides a default value for it, which will be assigned to the variable if an option value is not provided on the command line or through environment variables. reverse has to be initialized because no default for it is provided in specTable.
Then, we declare our specTable, providing function pointers for our constraint functions as the second argument of the TypeInfo constructor.
If the constraint functions return false, cmdLine.parse will return non-zero, and the output will contain the message from the constraint function followed by the usage message.
We can use bsl::optional variables when providing optional command line parameters. Suppose we want to write a command line that takes an optional input file, and if the file is not supplied, takes input from a bsl::stringstream.
To represent the optional file name parameter, we link a variable of type bsl::optional<bsl::string>. In general, when linking a variable to an option, we can choose to use bsl::optional<TYPE> in place of TYPE, for any option type other than bool.
Finally, we test whether optionalFileName has been set, and if it has not been set, take input from a prepared stringstream.