1. Overview

This chapter gives an overview of semantic and its goals.

In Emacs, regular expressions (and syntax tables) are the basis for identifying components in programming language source code, for purposes such as color highlighting. This approach has proved its usefulness, but it has limitations.

semantic provides a new infrastructure that goes far beyond text analysis based on regular expressions.

semantic uses parsers to analyze programming language sources. For languages that can be described using a context-free grammar, parsers can be based on the grammar of the language. Alternatively, they can be external parsers implemented by any means. This allows the use of a regular expression parser for non-regular languages, or external programs for speed.

semantic provides extensive tools to help support a new language. An original LL parser and a Bison-like LALR parser are included. So, for a regular language, all that the developer needs to do is write a grammar file along with appropriate semantic rules.

semantic allows a uniform representation of language components, and provides a common API so that programmers can develop applications that work for all languages. The distribution includes a good set of tools and examples for application writers that demonstrate the usefulness of semantic.

The following diagram illustrates the benefits of using semantic:

Please note: The words in all capitals are those that semantic itself provides. The others are current or future languages or applications that are not distributed along with semantic.

                                                             Applications
                                                                 and
                                                              Utilities
                                                                -------
                                                               /       \
               +---------------+    +--------+    +--------+
         C --->| C      PARSER |--->|        |    |        |
               +---------------+    |        |    |        |
               +---------------+    | COMMON |    | COMMON |<--- SPEEDBAR
      Java --->| JAVA   PARSER |--->|        |    |        |
               +---------------+    | PARSE  |    | PARSE  |<--- SENATOR
               +---------------+    |        |    |        |
    Python --->| PYTHON PARSER |--->| TREE   |    | TREE   |<--- DOCUMENT
               +---------------+    |        |    |        |
               +---------------+    | FORMAT |    | API    |<--- SEMANTICDB
    Scheme --->| SCHEME PARSER |--->|        |    |        |
               +---------------+    |        |    |        |<--- jdee
               +---------------+    |        |    |        |
   Texinfo --->| TEXI.  PARSER |--->|        |    |        |<--- ecb
               +---------------+    |        |    |        |

                    ...                ...           ...         ...

               +---------------+    |        |    |        |<--- app. 1
   Lang. A --->| A      Parser |--->|        |    |        |
               +---------------+    |        |    |        |<--- app. 2
               +---------------+    |        |    |        |
   Lang. B --->| B      Parser |--->|        |    |        |<--- app. 3
               +---------------+    |        |    |        |

                     ...        ...     ...          ...       ...

               +---------------+    |        |    |        |
   Lang. Y --->| Y      Parser |--->|        |    |        |<--- app. ?
               +---------------+    |        |    |        |
               +---------------+    |        |    |        |<--- app. ?
   Lang. Z --->| Z      Parser |--->|        |    |        |
               +---------------+    +--------+    +--------+
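The idea behind the diagram, many parsers feeding one common format that every application reads, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Tag` class, the two toy front ends, and `outline` are hypothetical names invented here, not part of semantic, and the front ends use regular expressions only to keep the sketch short.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Tag:
    """Hypothetical language-neutral parse tree node (not semantic's real format)."""
    name: str
    kind: str  # for example "function"
    children: list = field(default_factory=list)


def parse_python_like(text):
    """Toy front end: reports each 'def NAME(' line as a function tag."""
    pattern = re.compile(r"^def\s+(\w+)\s*\(", re.M)
    return [Tag(m.group(1), "function") for m in pattern.finditer(text)]


def parse_scheme_like(text):
    """Toy front end: reports each '(define (NAME ...' form as a function tag."""
    pattern = re.compile(r"^\(define\s+\(([^\s()]+)", re.M)
    return [Tag(m.group(1), "function") for m in pattern.finditer(text)]


def outline(tags):
    """An 'application' that knows only the common format, not the source language."""
    return [f"{t.kind}: {t.name}" for t in tags]
```

Because `outline` sees only `Tag` objects, it works unchanged on the output of either front end; supporting another language means writing one more front end, not touching the application.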

1.1 Semantic Components

This section gives an overview of the major components of semantic and how they interact with each other to do their job.

The first step of parsing is to break up the input file into its fundamental components. This step is called lexical analysis. The output of the lexical analyzer is a list of tokens that make up the file.

        syntax table, keywords list, and options
                         |
                         |
                         v
    input file  ---->  Lexer   ----> token stream
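The lexing step above can be sketched as follows. This is a minimal Python illustration, not semantic's own lexer: the keyword set, the token classes, and the `keep_whitespace` option are assumptions chosen to mirror the three inputs in the diagram (syntax description, keywords list, options).

```python
import re

# Hypothetical keywords list for a small C-like language.
KEYWORDS = {"if", "else", "return"}

# Hypothetical stand-in for a syntax table: token class name -> pattern.
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("symbol", r"[A-Za-z_]\w*"),
    ("punctuation", r"[-+*/=(){};]"),
    ("whitespace", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def lex(text, keep_whitespace=False):
    """Return the token stream as a list of (kind, value, start, end) tuples."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind, value = m.lastgroup, m.group()
        pos = m.end()
        if kind == "whitespace" and not keep_whitespace:
            continue  # an option decides whether whitespace tokens are reported
        if kind == "symbol" and value in KEYWORDS:
            kind = "keyword"  # the keywords list reclassifies matching symbols
        tokens.append((kind, value, m.start(), m.end()))
    return tokens
```

Each token records its class and its position in the input, so that later stages can refer back to the source text without re-scanning it.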

The next step is parsing, as shown below.

                    parser tables
                         |
                         v
    token stream --->  Parser  ----> parse tree

The end result, the parse tree, is created based on the parser tables, which are the internal representation of the language grammar used by semantic.

The semantic database caches parse trees by automatically saving them into files named `semantic.cache', then loading them when appropriate instead of re-parsing. This saves the time it takes to parse a file, which can be several seconds or more for large files.
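To make the role of parser tables concrete, here is a small table-driven LL(1) sketch in Python. The grammar (sums of numbers and symbols), the `TABLE` layout, and the tuple-based tree are all assumptions made for illustration; semantic's real tables and tree format are not shown in this section.

```python
# Hypothetical LL(1) table: (nonterminal, lookahead token kind) -> right-hand side.
TABLE = {
    ("expr", "number"): ["term", "rest"],
    ("expr", "symbol"): ["term", "rest"],
    ("rest", "plus"): ["plus", "term", "rest"],
    ("rest", "end"): [],  # empty production
    ("term", "number"): ["number"],
    ("term", "symbol"): ["symbol"],
}
NONTERMINALS = {"expr", "rest", "term"}


def parse(tokens, start="expr"):
    """Turn a list of (kind, value) tokens into a nested-tuple parse tree."""
    stream = list(tokens) + [("end", "")]
    pos = 0

    def expand(symbol):
        nonlocal pos
        kind, value = stream[pos]
        if symbol not in NONTERMINALS:
            # Terminal: it must match the lookahead token exactly.
            if kind != symbol:
                raise SyntaxError(f"expected {symbol}, got {kind}")
            pos += 1
            return (kind, value)
        # Nonterminal: the table alone decides which rule to apply.
        rhs = TABLE.get((symbol, kind))
        if rhs is None:
            raise SyntaxError(f"unexpected {kind} while parsing {symbol}")
        return (symbol, [expand(s) for s in rhs])

    tree = expand(start)
    if stream[pos][0] != "end":
        raise SyntaxError("trailing input after a complete expression")
    return tree
```

The driver function contains no knowledge of the language; replacing `TABLE` and `NONTERMINALS` is enough to parse a different grammar, which is the point of generating tables from a grammar file.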
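The caching idea can be sketched in Python as below. Only the file name `semantic.cache' comes from the text; the JSON layout, the use of the file modification time as the freshness check, and the `load_or_parse` helper are assumptions made for this example and say nothing about how semantic actually stores or validates its cache.

```python
import json
import os

CACHE_NAME = "semantic.cache"  # name from the text; the contents below are assumed


def load_or_parse(source_path, parse_file):
    """Return the parse tree for source_path, re-parsing only if the file changed.

    parse_file is any callable that maps a path to a JSON-serializable tree.
    """
    directory = os.path.dirname(os.path.abspath(source_path))
    cache_path = os.path.join(directory, CACHE_NAME)
    key = os.path.basename(source_path)
    mtime = os.path.getmtime(source_path)

    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as f:
            cache = json.load(f)

    entry = cache.get(key)
    if entry is not None and entry["mtime"] == mtime:
        return entry["tree"]  # cache hit: skip the expensive parse

    tree = parse_file(source_path)  # cache miss or stale entry
    cache[key] = {"mtime": mtime, "tree": tree}
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(cache, f)
    return tree
```

A second call for an unchanged file returns the stored tree without invoking `parse_file` at all, which is where the time saving for large files comes from.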