C Parser (Front End)

The C parser (front end) enables the construction of custom C compilers, analysis tools, and source transformation tools. It is a member of Semantic Designs' (SD's) family of language front ends, based on first-class infrastructure (DMS) for implementing such custom tools. The C front end includes:

  • Lexical analysis including ASCII, EBCDIC, ISO 8859-1, UTF-8 and UTF-16, and Japanese Shift-JIS
    • Conversion of literal values (numbers, escaped strings) into native values to enable easy computation over literal values
    • String literals represented internally in Unicode to support 16-bit characters
  • Explicit grammar that directly implements de facto and official standards and extensions
    • Full C (ISO 9899:1990) parser
    • Option for C99 (ISO 9899:1999) dialect
    • Option for C11 (ISO 9899:2011) dialect
    • Option for GNU C (GCC2/GCC3/GCC4/GCC5.0 including vector extensions) dialects
    • Option for Microsoft Visual6 C dialect
    • Easy extension for other dialects
  • Preprocessor support
    • Controllable include directory paths
    • Option to fully expand preprocessor directives
    • Option to parse include files for definitions
    • Option to parse preserving preprocessor conditional directives, macros and include directives
  • Automatic construction of complete abstract syntax tree
    • Capture of comments and formats (shape) of literal values
    • Capture of ambiguous parses during parsing
    • Ability to parse large systems of files into same workspace, enabling interprocedural and cross-file analysis/transformation
    • Ability to parse different languages into same workspace, enabling cross-language analysis/transformation
  • Facilities to process syntax trees
    • Complete procedural API to visit/query/update/construct/print syntax trees
    • Source regeneration by prettyprinting and/or fidelity printing of syntax trees with comments and lexical formats
    • Automatically generated source-to-source transformation system
    • Ability to define custom attribute-grammar-based analyzers
  • Name and Type resolution
    • Type representation system for all C types defined
    • All identifiers resolved to their C-defined type and stored in symbol tables
    • Automatic deletion of erroneous alternatives of ambiguous parses
    • Ability to condition transforms on identifier type
    • Ability to visit/query/update symbol tables
  • Control flow graph extraction for each compilation unit
    • Constructed for each function definition
    • Ties control flow nodes to ASTs
    • Exposes sequence points
    • Computes Post-dominators
    • Computes Control Dependences (Sample control flow graph)
  • Application call graph extraction across all compilation units
    • Qualifies function pointers by "address taken" and argument types
    • Computes Transitive "Has-side-effect" information
  • Data Flow Analysis support
    • Forward and Backward Iterative Flow analyzers
    • Reaching definitions for scalar values, struct members, array elements, structs and arrays
      (Sample reaching definitions graph)
    • Use-definition chains
    • Definition-use chains
    • Reachable-uses analysis
  • Available as source code to enable complete customization
    • Means to manage multiple language dialects with highly shared common core
  • Robustness due to careful testing and application across many customers

Many of these facilities follow directly from the front end being built on top of DMS.

Sample tools (many offered by SD as products) have been built using the C front end.

Your organization may use DMS with the C front end to implement and deploy your own custom tools. The sample tools can be obtained in source form as part of the C front end for customization. Semantic Designs is also willing to build custom tools under contract.

