PGG - Parser Generator Generator for Functional Languages


PGG is a tool that generates parser-generators for functional languages. It offers an easy way to implement a parser-generator for a new programming language. With only basic C programming skills, you can obtain a parser-generator written in your own language, provided that the language has basic syntactic constructs such as let-bindings, pattern matching, and recursive functions. See Requirements.

While a parser-generator takes an input-specification as input and generates a parser, PGG takes a translation module as input and generates a parser-generator. The translation module translates an abstracted parser into a concrete parser written in your language. Implementing the translation module may seem difficult, but PGG provides most of its implementation: you only have to implement a few simple functions and customize some variables using the provided information. See Implementing Translation module.

The input-specification format of a PGG-generated parser-generator (PGG parser-generator) is based on that of the existing parser-generator Ocamlyacc. If you are already familiar with Ocamlyacc, you will adapt easily to the input-specification format of a PGG parser-generator. There are only three differences between the input-specifications of Ocamlyacc and a PGG parser-generator: 1) the header and trailer sections are divided into two subsections, 2) the types of non-terminal symbols must be declared explicitly, and 3) a new keyword '@' is introduced to access information about token positions. See Input-specification of PGG parser-generator.

According to our experiments, parsers generated by PGG parser-generators are generally slower than parsers generated by existing parser-generators. We measured the elapsed time to parse all implementation files (*.ml; 597 files, 85384 lines) of the Ocaml source distribution (version 3.10.0) with two Ocaml parsers, one generated by a PGG parser-generator and one by Ocamlyacc. The PGG parser was 4 to 5 seconds slower than the Ocamlyacc parser. However, consider the effort required to implement a parser-generator. You can either struggle with complex algorithms for a week to implement your parser-generator by hand, or use PGG to implement it in just a few hours.


How to use PGG

Implementing Translation module

The translation module converts the abstracted parser into a concrete parser written in the programming language of your choice. It traverses the tree structure of the abstracted parser and translates each syntactic component into the target language. As distributed, translate.c is the translation module for Ocaml. You can implement your own translation module by modifying translate.c. Open it and go to line 146. You only have to customize some variables and implement several subfunctions. All examples are based on Ocaml.
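
To make the idea of a tree traversal concrete, here is a minimal sketch in C of a recursive translator that emits Ocaml text. It is not the code in translate.c: the Node type, the node kinds, and the functions append_text and translate_node are hypothetical names invented for this illustration, and the real abstracted parser in PGG may be structured quite differently.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical node kinds for an abstracted syntax tree.
   These are assumptions for illustration, not PGG's real types. */
typedef enum { NODE_VAR, NODE_APP, NODE_LET } NodeKind;

typedef struct Node {
    NodeKind kind;
    const char *name;          /* NODE_VAR: identifier; NODE_LET: bound name */
    const struct Node *left;   /* NODE_APP: function;   NODE_LET: bound value */
    const struct Node *right;  /* NODE_APP: argument;   NODE_LET: body */
} Node;

/* Append src to the NUL-terminated string in buf, never writing
   more than cap bytes in total (excess text is silently dropped). */
void append_text(char *buf, size_t cap, const char *src) {
    size_t used = strlen(buf);
    if (used + 1 < cap)
        strncat(buf, src, cap - used - 1);
}

/* Walk the tree recursively and append the Ocaml text of each node. */
void translate_node(const Node *n, char *buf, size_t cap) {
    assert(n != NULL && buf != NULL);
    switch (n->kind) {
    case NODE_VAR:
        append_text(buf, cap, n->name);
        break;
    case NODE_APP:
        append_text(buf, cap, "(");
        translate_node(n->left, buf, cap);
        append_text(buf, cap, " ");
        translate_node(n->right, buf, cap);
        append_text(buf, cap, ")");
        break;
    case NODE_LET:
        append_text(buf, cap, "let ");
        append_text(buf, cap, n->name);
        append_text(buf, cap, " = ");
        translate_node(n->left, buf, cap);
        append_text(buf, cap, " in ");
        translate_node(n->right, buf, cap);
        break;
    }
}
```

The buffer must hold an empty string before the first call. Porting such a translator to another language would mean changing only the literal strings emitted in each case, which is the kind of change the translation module is meant to isolate.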

(1) Customizing variables
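
The variables that translate.c actually exposes are not listed in this section. As a rough illustration of the idea only, the sketch below collects target-language syntax in a table of strings so that a single formatting function can serve more than one language. The TargetSyntax type, its fields, the two constants, and render_let are all hypothetical names and do not come from PGG.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-language settings (illustrative names only). */
typedef struct {
    const char *let_open;   /* text placed before the bound name */
    const char *let_close;  /* text placed after the body, "" if none */
} TargetSyntax;

/* Ocaml writes:  let x = e1 in e2          */
const TargetSyntax OCAML_SYNTAX = { "let ", "" };
/* SML writes:    let val x = e1 in e2 end  */
const TargetSyntax SML_SYNTAX = { "let val ", " end" };

/* Format a let-binding using the given settings.
   Returns the value of snprintf: the length the full text needs,
   which is >= cap when the output was truncated. */
int render_let(const TargetSyntax *s, const char *name,
               const char *value, const char *body,
               char *buf, size_t cap) {
    assert(s != NULL && name != NULL && value != NULL && body != NULL);
    return snprintf(buf, cap, "%s%s = %s in %s%s",
                    s->let_open, name, value, body, s->let_close);
}
```

With this layout, switching the generated code from Ocaml to SML is a matter of passing a different table, while the formatting function itself stays unchanged.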

(2) Implementing translation functions

In the same way, you can easily implement other translation functions.

Input-specification of PGG parser-generator

The format of the PGG parser-generator input-specification (PGG input-specification) is based on Ocamlyacc's input-specification format. This chapter discusses only the differences between the two rather than every detail, because the Ocamlyacc input-specification is already well documented. See Ocamlyacc Tutorial.

Invoking PGG parser


  • Translation module for Haskell [Download]

  • Translation module for SML [Download]

    Email us at .

    Programming Language Laboratory
    Department of Computer Science and Engineering
    Pohang University of Science and Technology
    Republic of Korea