Automatic Correction of Syntax Errors in Programming Languages
A very substantial fraction of the time and efforts required to develop a program is devoted to the removal of errors. In order to simplify this task, a model to automatize the correction of syntax errors is developed. It is the first model which is both formal and fairly realistic to appear in the literature. The notion of error is defined and studied formally. Then, using this definition, a systematic error-correction process is modelled. This process makes local corrections over clusters of errors, using the context around the errors to determine the corrections and to insure that the different local corrections performed on the string do not interfere with one another. The error-correction process can be naturally embedded in many left-to-right syntax checking processes. It uses the recognizer both to detect errors and to find possible corrections. The process has two modes: a ``standard mode'' used for syntax checking and an ``error-correction mode'' used for determining the context of a cluster of errors and for finding all possible corrections of these errors. In the ``standard mode'', the syntax is checked at the same speed as if no error-correction mechanism is implemented. Thus, for programs which contain no errors, no price is paid for the presence of this mechanism. The ``error-correction mode'' consist of two phases: the backward move which locates the left context of the cluster, and the forward move which construct possible corrections and locates the right context of the cluster. This process seems the most natural way to perform left-to-right syntax checking and error correction. Some techniques for efficiently finding the range of the backward move are developed. The formal model is not practical when using the conventional context-free description of programming languages. In order to make it more practical, the notion of bracketed context-free language is introduced and proposed as a model for the syntax fo programming languages. Then, heuristic restrictions on the type of errors corrected are discussed. They may lead to a simpler process. In particular, assuming that brackets are corrected only when no other correction is possible, and that errors in deep levels of nesting (with respect to the point where the errors are detected) are neglected, it is shown how the process can be used to correct syntax errors in programming languages.
computer science; technical report
Previously Published As