clang-metatool
A framework for reusing code in clang tools
When we first started writing clang tools, we realized that there was a lot of life-cycle management that we had to repeat. Some people advocate using global variables to manage the life cycle of that data, but that makes code reuse across tools even harder.
We also learned that a tool benefits from having its code split into two phases: first a data collection phase, then a post-processing phase that performs the bulk of the tool's logic.
Essentially you will only need to write a class like:
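As an illustration of that shape, here is a minimal sketch that compiles without clang. Everything in the `sketch` namespace is an assumption made for the example: `CompilerInstance`, `MatchFinder` and `Replacements` are stand-ins for the corresponding clang types, and `VisitedFiles` and `MyTool` are hypothetical names. The point is only the structure: collectors are members wired up in the constructor, and all analysis is deferred to `postProcessing`.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

namespace sketch {

// Stand-ins so the sketch builds without clang; a real tool would receive
// clang's CompilerInstance and MatchFinder here.
struct CompilerInstance {};
struct MatchFinder {
  std::vector<std::function<void(const std::string &)>> callbacks;
  // Simulates the front end reporting that it visited one file.
  void fire(const std::string &file) {
    for (auto &cb : callbacks) cb(file);
  }
};
using Replacements = std::vector<std::string>; // stand-in for real edits

// Hypothetical collector: registers its callback in the constructor.
class VisitedFiles {
public:
  VisitedFiles(CompilerInstance *ci, MatchFinder *f) {
    assert(ci != nullptr && f != nullptr);
    f->callbacks.push_back(
        [this](const std::string &file) { files_.push_back(file); });
  }
  // Not copyable: the registered callback captures `this`.
  VisitedFiles(const VisitedFiles &) = delete;
  VisitedFiles &operator=(const VisitedFiles &) = delete;

  const std::vector<std::string> &files() const { return files_; }

private:
  std::vector<std::string> files_;
};

// The class a tool author writes.
class MyTool {
public:
  // Phase one setup: the collector registers itself; nothing else to do.
  MyTool(CompilerInstance *ci, MatchFinder *f) : visited_(ci, f) {}

  // Phase two: use the collected data and record replacements per file.
  void postProcessing(std::map<std::string, Replacements> &replacementsMap) {
    for (const std::string &file : visited_.files())
      replacementsMap[file].push_back("add-header-comment");
  }

private:
  VisitedFiles visited_;
};

} // namespace sketch
```

In this sketch the callbacks only append to a vector; deciding what to replace happens in `postProcessing`, once all the data is available.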
And then you can use the clangmetatool::MetaToolFactory combined with the clangmetatool::MetaTool in your tool's main function:
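The sketch below models that wiring without depending on clang or on this library. `ActionSketch` and `FactorySketch` are hypothetical stand-ins that only imitate the roles described for MetaTool and MetaToolFactory (they are not the library's classes), and `CompilerInstance`, `MatchFinder` and `ReplacementsMap` are simplified placeholder types. It is meant to show the nesting of factory, action and tool class and the order in which the two phases run.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

namespace sketch {

// Placeholder types so the sketch builds without clang.
struct CompilerInstance {};
struct MatchFinder {};
using ReplacementsMap = std::map<std::string, std::vector<std::string>>;

// Hypothetical tool class with the constructor + postProcessing shape.
class MyTool {
public:
  MyTool(CompilerInstance *ci, MatchFinder *f) {
    assert(ci != nullptr && f != nullptr);
    // A real tool would construct its data collectors here.
  }
  void postProcessing(ReplacementsMap &replacementsMap) {
    replacementsMap["example.cpp"].push_back("rename foo -> bar");
  }
};

// Imitates the role described for MetaTool: build the wrapped tool before
// a file is processed, then call postProcessing once the file is done.
template <typename Tool> class ActionSketch {
public:
  explicit ActionSketch(ReplacementsMap &replacementsMap)
      : replacementsMap_(replacementsMap) {}

  void processFile(CompilerInstance &ci) {
    MatchFinder finder;
    Tool tool(&ci, &finder); // phase one: callbacks get registered
    // ... a real front end would parse the file here, firing callbacks ...
    tool.postProcessing(replacementsMap_); // phase two
  }

private:
  ReplacementsMap &replacementsMap_;
};

// Imitates the role described for MetaToolFactory: hand the shared
// replacements map to every action it creates.
template <typename Action> class FactorySketch {
public:
  explicit FactorySketch(ReplacementsMap &replacementsMap)
      : replacementsMap_(replacementsMap) {}
  Action create() { return Action(replacementsMap_); }

private:
  ReplacementsMap &replacementsMap_;
};

// What a main function boils down to in this model.
inline int runSketch(ReplacementsMap &replacementsMap) {
  FactorySketch<ActionSketch<MyTool>> factory(replacementsMap);
  CompilerInstance ci;
  factory.create().processFile(ci);
  return 0;
}

} // namespace sketch
```

The tool class itself never sees the factory or the action; it only receives the two pointers in its constructor and the replacements map in `postProcessing`, which is what keeps it reusable.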
One way in which our initial tools got hard to write and maintain was by trying to perform analysis or even replacements during the callbacks. It was not immediately obvious that this would lead to hard-to-maintain code. After we switched to the two-phase approach, we were able to reuse a lot more code across tools.
clangmetatool::MetaToolFactory
This provides the boilerplate for a refactoring tool action, since you need a factory that passes the replacementsMap to the frontend action class.
clangmetatool::MetaTool
This provides the boilerplate of a FrontendAction class that performs data gathering and then runs a post-processing phase that may do replacements. The code you write is reduced to a constructor that registers preprocessor callbacks or AST matchers, plus a post-processing method.
When building a clang tool you are expected to ship the compiler's builtin headers with the tool, otherwise the tool will fail to find headers like stdarg.h. Clang expects to find the builtin headers relative to the absolute path where the tool is installed. The cmake module in this project provides a function called clangmetatool_install which handles all of that for you; see skeleton/CMakeLists.txt for an example.
The library defines types that can be used as building blocks; those are in the clangmetatool::types namespace.
Another part of this library consists of a number of "Data Collectors". Those are in the clangmetatool::collectors namespace.
"Data Collector" is a "design pattern" for reusing code in clang tools. A collector is a class whose constructor takes the CompilerInstance object and the match finder, and registers all the callbacks required to collect its data.
The collector class also has a "getData" method that returns a pointer to a struct with the data. The "getData" method should only be called in the post-processing phase of the tool.
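A minimal sketch of the pattern follows, written so that it compiles without clang. `CompilerInstance` and `MatchFinder` here are simplified stand-ins for the clang types, and `FunctionNames`, `FunctionNamesData` and `countFunctions` are hypothetical names invented for the example; none of them are part of this library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

namespace sketch {

// Simplified stand-ins so the sketch builds without clang.
struct CompilerInstance {};
struct MatchFinder {
  std::vector<std::function<void(const std::string &)>> callbacks;
  // Simulates the front end reporting one matched function name.
  void fire(const std::string &name) {
    for (auto &cb : callbacks) cb(name);
  }
};

// The struct handed out by getData().
struct FunctionNamesData {
  std::vector<std::string> names;
};

// Hypothetical collector: all registration happens in the constructor.
class FunctionNames {
public:
  FunctionNames(CompilerInstance *ci, MatchFinder *f) {
    assert(ci != nullptr && f != nullptr);
    FunctionNamesData *data = &data_;
    f->callbacks.push_back(
        [data](const std::string &name) { data->names.push_back(name); });
  }
  // Not copyable: the registered callback points at this object's data.
  FunctionNames(const FunctionNames &) = delete;
  FunctionNames &operator=(const FunctionNames &) = delete;

  // Only meaningful once data gathering has finished.
  FunctionNamesData *getData() { return &data_; }

private:
  FunctionNamesData data_;
};

// Post-processing step of some hypothetical tool that reuses the collector.
inline std::size_t countFunctions(FunctionNames &collector) {
  return collector.getData()->names.size();
}

} // namespace sketch
```

Because the collector owns both the registration and the storage, a second tool can reuse it by declaring another `FunctionNames` member and reading `getData()` in its own post-processing step.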
Another part of this library consists of constant propagators to assist with analysis. Those are in the clangmetatool::propagation namespace.
More specifically, the current implementation provides propagation for the following types, so that variables may be queried for their true values anywhere within the control flow, so long as the value is deterministic:
int and int-like types
This could be useful for various purposes, but especially for identifying things like which database a function is actually calling out to.
clangmetatool::propagation::ConstantCStringPropagator
This provides infrastructure (utilizing clangmetatool::propagation::ConstantPropagator and clangmetatool::propagation::PropagationVisitor) to propagate constant C-style string values over the program. A query yields the true value of a variable wherever that value is deterministic and "<UNRESOLVED>" anywhere else.
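To make the "deterministic or <UNRESOLVED>" behaviour concrete, here is a toy model of the join rule commonly used in constant propagation. It is not this library's implementation, which works over clang's control-flow graph; `State`, `join`, `query` and `afterBranch` are hypothetical names, and the program being analysed is only described in a comment.

```cpp
#include <cassert>
#include <map>
#include <string>

namespace sketch {

const std::string kUnresolved = "<UNRESOLVED>";

// Known string value of each variable at one program point.
using State = std::map<std::string, std::string>;

// Join the states of two paths that meet at the same program point:
// a variable keeps its value only if both paths agree on it.
inline State join(const State &a, const State &b) {
  State out;
  for (const auto &kv : a) {
    auto it = b.find(kv.first);
    bool same = it != b.end() && it->second == kv.second;
    out[kv.first] = same ? kv.second : kUnresolved;
  }
  for (const auto &kv : b)
    if (a.find(kv.first) == a.end()) out[kv.first] = kUnresolved;
  return out;
}

// Look up a variable; anything unknown is reported as unresolved.
inline std::string query(const State &s, const std::string &var) {
  auto it = s.find(var);
  return it == s.end() ? kUnresolved : it->second;
}

// Models the state right after this fragment:
//   const char *table = "users";
//   const char *db = "prod";
//   if (cond) db = "test";
inline State afterBranch() {
  State entry{{"table", "users"}, {"db", "prod"}};
  State taken = entry;
  taken["db"] = "test";              // path where the branch is taken
  State merged = join(taken, entry); // the other path keeps the entry state
  assert(merged.size() == 2);
  return merged;
}

} // namespace sketch
```

In this model, querying `table` after the branch yields "users" because both paths agree, while `db` yields "<UNRESOLVED>" because its value depends on which path was taken.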
clangmetatool::propagation::ConstantPropagator and clangmetatool::propagation::PropagationVisitor
These two classes provide the boilerplate for propagating constants of arbitrary types through the control flow graph of the program, so that the value of a variable can be queried at any point where its constant value is deterministic.
These classes are private to the library, but additional propagators could be easily made using these facilities.
After you run "git init" in an empty directory, copy the contents of the skeleton directory into it. To build that project, do something like:
You need a full llvm+clang installation directory. Unfortunately, the Debian and Ubuntu packages are broken, so you may need to work around this by creating some symlinks (see .travis.Dockerfile in this repo for an example).