5. Writing Submode Classes

Sometimes (perhaps often) you may want to use MMM with a syntax for which it is suited, but for which no submode class is supplied. In such cases you may have to write your own submode class. This chapter briefly describes how to write a submode class, from the basic to the advanced, with examples.

5.1 Writing Basic Submode Classes		Writing a simple submode class.
5.2 Matching Paired Delimiters		Matching paired delimiters.
5.3 Placing Submode Regions Precisely		Placing the region more accurately.
5.4 Defining Groups of Submodes		Grouping several classes together.
5.5 Calculating the Correct Submode		Deciding the submode at run-time.
5.6 Calculating the Correct Highlight Face		Deciding the display face at run-time.
5.7 Specifying Insertion Commands		Inserting regions automatically.
5.8 Giving Names to Submode Regions for Grouping		Naming regions for syntax grouping.
5.9 Other Hooks into the Scanning Process		Running code at arbitrary points.
5.10 Controlling the Delimiter Regions and Forms		Controlling delimiter overlays.
5.11 Miscellaneous Other Keyword Arguments		Other miscellaneous options.

5.1 Writing Basic Submode Classes

Writing a submode class can become rather complex, if the syntax to match is complicated and you want to take advantage of some of MMM Mode's extra features. But a simple submode class is not particularly difficult to write. This section describes the basics of writing submode classes.

Submode classes are stored in the variable mmm-classes-alist. Each element of this list represents a single submode class. For convenience, the function mmm-add-classes takes a list of submode classes and adds them all to this alist. Each class is represented by a list containing the class name (a symbol such as mason or html-js) followed by pairs of keywords and arguments, collectively called a class specifier. For example, consider the specifier for the submode class embedded-css:

(mmm-add-classes
 '((embedded-css
    :submode css
    :face mmm-declaration-submode-face
    :front "<style[^>]*>"
    :back "</style>")))

The name of the submode class is embedded-css, the first element of the list. The rest of the list consists of pairs of keywords (symbols beginning with a colon) such as :submode and :front, and arguments, such as css and "<style[^>]*>". It is the keywords and arguments that specify how the submode class works. The order of keywords is not important; all that matters is the arguments that follow them.

The three most important keywords are :submode, :front, and :back. The argument following :submode names the major mode to use in submode regions. It can be either a symbol naming a major mode, such as text-mode or c++-mode, or a symbol to look up in mmm-major-mode-preferences (see section 3.2 Preferred Major Modes) such as css, as in this case.

The arguments following :front and :back are regular expressions (see section `Regexps' in The Emacs Manual) that should match the delimiter strings which begin and end the submode regions. In our example, CSS regions begin with a `<style>' tag, possibly with parameters, and end with a `</style>' tag.

The argument following :face specifies the face (background color) to use when mmm-submode-decoration-level is 2 (high coloring). See section 3.1 Customizing Region Coloring, for a list of canonical available faces.

There are many more possible keyword arguments. In the following sections, we will examine each of them and their uses in writing submode classes.

5.2 Matching Paired Delimiters

A simple pair of regular expressions does not always suffice to specify the beginning and end of submode regions exactly. For this reason, there are several other possible keyword/argument pairs which influence the matching process.

Many submode regions are marked by paired delimiters. For example, the tags used by Mason (see section 4.1 Mason: Perl in HTML) include `<%init>...</%init>' and `<%args>...</%args>'. It would be possible to write a separate submode class for each type of region, but there is an easier way: the keyword argument :save-matches. If supplied and non-nil, it causes the regular expression :back, before being searched for, to be formatted by replacing all strings of the form `~N' (where N is an integer) with the corresponding numbered subexpression of the match for :front. As an example, here is an excerpt from the here-doc submode class. See section 4.3 Here-documents, for more information about this submode.

:front "<<\\([a-zA-Z0-9_-]+\\)"
:back "^~1$"
:save-matches 1

The regular expression for :front matches `<<' followed by a string of one or more alphanumeric characters, underscores, and dashes. The latter string, which happens to be the name of the here-document, is saved as the first subexpression, since it is surrounded by `\(...\)'. Then, because the value of :save-matches is present and non-nil, the string `~1' is replaced in the value of :back by the name of the here-document, thus creating a regular expression to match the correct ending delimiter.
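
The same technique applies to the Mason-style paired tags mentioned at the start of this section. The following is only a sketch under stated assumptions: the class name my-paired-tags is hypothetical, and treating every such block as Perl is a simplification chosen for illustration, not how the supplied mason class is defined.

```elisp
(require 'mmm-mode)

;; Hypothetical class `my-paired-tags': a single class that covers
;; <%init>...</%init>, <%args>...</%args>, and any other
;; <%NAME>...</%NAME> pair.
(mmm-add-classes
 '((my-paired-tags
    ;; Assumption: every block is treated as Perl.
    :submode perl
    ;; Subexpression 1 captures the tag name, e.g. "init".
    :front "<%\\([a-z]+\\)>"
    ;; `~1' is replaced by the captured tag name before the search
    ;; for :back begins, so "<%init>" is closed only by "</%init>".
    :back "</%~1>"
    :save-matches 1)))
```

Without :save-matches, the `~1' in :back would be searched for literally and the closing tag would never be found.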

5.3 Placing Submode Regions Precisely

Normally, a submode region begins immediately after the end of the string matching the :front regular expression and ends immediately before the beginning of the string matching the :back regular expression. This can be changed with the keywords :include-front and :include-back. If their arguments are nil, or they do not appear, the default behavior is unchanged. But if the argument of :include-front (respectively, :include-back) is non-nil, the submode region will begin (respectively, end) immediately before (respectively, after) the string matching the :front (respectively, :back) regular expression. In other words, these keywords specify whether or not the delimiter strings are included in the submode region.

When :front and :back are regexps, the delimiter is normally considered to be the entire matched region. This can be changed using the :front-match and :back-match keywords. The value of each keyword is a number specifying the submatch to use as the delimiter. It defaults to zero (specifying the whole regexp).

Two more keywords which affect the placement of the region are :front-offset and :back-offset, which both take integers as arguments. The argument of :front-offset (respectively, :back-offset) gives the distance in characters from the beginning (respectively, ending) location specified so far, to the actual point where the submode region begins (respectively, ends). For example, if :include-front is nil or unsupplied and :front-offset is 2, the submode region will begin two characters after the end of the match for :front, and if :include-back is non-nil and :back-offset is -1, the region will end one character before the end of the match for :back.
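
As a rough illustration of integer offsets, here is a sketch of a class for a made-up syntax in which code is written as `{{ ... }}' with one padding character inside each delimiter. The class name my-braces and the syntax itself are assumptions invented for this example.

```elisp
(require 'mmm-mode)

;; Hypothetical class `my-braces' for text such as "{{ (+ 1 2) }}".
(mmm-add-classes
 '((my-braces
    :submode emacs-lisp-mode
    :front "{{"
    :back "}}"
    ;; :include-front is unsupplied, so the region would start right
    ;; after "{{"; the offset moves the start one character further.
    :front-offset 1
    ;; :include-back is unsupplied, so the region would end right
    ;; before "}}"; the offset moves the end one character earlier.
    :back-offset -1)))
```

With these settings the padding characters belong to neither the submode region nor the matched delimiter strings.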

In addition to integers, the arguments of :front-offset and :back-offset can be functions which are invoked to move the point from the position specified by the matches and inclusions to the correct beginning or end of the submode region, or lists whose elements are either functions or numbers and whose effects are applied in sequence. To help disentangle these options, here is another excerpt from the here-doc submode class:

:front "<<\\([a-zA-Z0-9_-]+\\)"
:front-offset (end-of-line 1)
:back "^~1$"
:save-matches 1

Here the value of :front-offset is the list (end-of-line 1), meaning that from the end of the match for :front, go to the end of the line, and then one more character forward (thus to the beginning of the next line), and begin the submode region there. This coincides with the normal behavior of here-documents: they begin on the following line and go until the ending flag.

If the text matching :back at the end of a region should not also be able to start a new submode region, set the :end-not-begin keyword to non-nil.

5.4 Defining Groups of Submodes

Sometimes more than one submode class is required to accurately reflect the behavior of a single type of syntax. For example, Mason has three very different types of Perl regions: blocks bounded by matched tags such as `<%perl>...</%perl>', inline output expressions bounded by `<%...%>', and single lines of code which simply begin with a `%' character. In cases like these, it is possible to specify an "umbrella" class, to turn all these classes on or off together.

Function: mmm-add-group group classes

Adds the submode classes classes, which should be a list of lists similar to what might be passed to mmm-add-classes, just as that function would. In addition, another class named group is added, which encompasses all the classes in classes.

Technically, a group class is specified with a :classes keyword argument, and the subsidiary classes are given a non-nil :private keyword argument to make them invisible. But in general, all you should ever need to know is how to invoke the function above.
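
A call to mmm-add-group might look like the following sketch. The group name my-templates, the member class names, and the delimiter syntax are all hypothetical and chosen only so that the two :front patterns cannot match the same text.

```elisp
(require 'mmm-mode)

;; Hypothetical group `my-templates' with two member classes.
;; The member classes are added as if by `mmm-add-classes', and the
;; class `my-templates' is added to cover both of them.
(mmm-add-group
 'my-templates
 '((my-templates-block
    :submode perl
    :front "<%perl>"
    :back "</%perl>")
   (my-templates-inline
    :submode perl
    :front "<%= "
    :back " %>")))
```

Turning on the class my-templates then enables both member classes together.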

Function: mmm-add-to-group group classes

Adds a list of classes to an already existing group. This can be used, for instance, to add a new quoting definition to html-js using this example to add the quote characters "%=%":

(mmm-add-to-group 'html-js '((js-html
			     :submode javascript
			     :face mmm-code-submode-face
			     :front "%=%"
			     :back "%=%"
			     :end-not-begin t)))

5.5 Calculating the Correct Submode

In most cases, the author of a submode class will know in advance what major mode to use, such as text-mode or c++-mode. If there are multiple possible modes that the user might desire, then mmm-major-mode-preferences should be used (see section 3.2 Preferred Major Modes). The function mmm-set-major-mode-preferences can be used, with a third argument, to ensure that the mode is present.

In some cases, however, the author has no way of knowing in advance even what language the submode region will be in. The here-doc class is one of these. In such cases, instead of the :submode keyword, the :match-submode keyword must be used. Its argument should be a function, probably written by the author of the submode class, which calculates what major mode each region should use.

It is invoked immediately after a match is found for :front, and is passed one argument: a string representing the front delimiter. Normally this string is simply whatever was matched by :front, but this can be changed with the keyword :front-form (see section 5.10 Controlling the Delimiter Regions and Forms). The function should then return a symbol that would be a valid argument to :submode: either the name of a mode, or that of a language to look up a preferred mode. If it detects an invalid match--for example, the user has specified a mode which is not available--it should (signal 'mmm-no-matching-submode nil).

Since here-documents can contain code in any language, the here-doc submode class uses :match-submode rather than :submode. The function it uses is mmm-here-doc-get-mode, defined in `mmm-sample.el', which inspects the name of the here-document for flags indicating the proper mode. For example, this code should probably be in perl-mode (or cperl-mode):

print <<PERL;
s/foo/bar/g;
PERL

This function is also a good example of proper elisp hygiene: when writing accessory functions for a submode class, they should usually be prefixed with `mmm-' followed by the name of the submode class, to avoid namespace conflicts.
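
To show the shape of such a function, here is a minimal sketch. It is not the definition of mmm-here-doc-get-mode; the class name my-fence, the function name mmm-my-fence-get-mode, and the two flags it recognizes are assumptions made for illustration.

```elisp
(require 'mmm-mode)

(defun mmm-my-fence-get-mode (front-string)
  "Return a submode symbol for FRONT-STRING, e.g. \"<<PERL\".
FRONT-STRING is the front delimiter form passed in by MMM Mode."
  (cond
   ;; `perl' is looked up in `mmm-major-mode-preferences'.
   ((string-match-p "PERL" front-string) 'perl)
   ;; `sql-mode' names a major mode directly.
   ((string-match-p "SQL" front-string) 'sql-mode)
   ;; No recognizable flag: report an invalid match.
   (t (signal 'mmm-no-matching-submode nil))))

;; Hypothetical class `my-fence' that uses the function above in
;; place of a fixed :submode.
(mmm-add-classes
 '((my-fence
    :front "<<\\([A-Z]+\\)"
    :front-offset (end-of-line 1)
    :back "^~1$"
    :save-matches 1
    :match-submode mmm-my-fence-get-mode)))
```

The function name follows the convention described above: `mmm-' followed by the class name.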

5.6 Calculating the Correct Highlight Face

As explained in 5.1 Writing Basic Submode Classes, the keyword :face should be used to specify which of the standard submode faces (see section 3.1 Customizing Region Coloring) a submode region should be highlighted with under high decoration. However, sometimes the function of a region can depend on the form of the delimiters as well. In this case, a more flexible alternative to :face is :match-face. Its value can be a function, which is called with one argument (the form of the front delimiter, as with :match-submode) and should return the face to use. A more common value for :match-face is an association list, a list of pairs (delim . face), each specifying that if the delimiter is delim, the corresponding region should be highlighted with face. For example, here is an excerpt from the embperl submode class:

:submode perl
:front "\\[\\([-\\+!\\*\\$]\\)"
:back "~1\\]"
:save-matches 1
:match-face (("[+" . mmm-output-submode-face)
             ("[-" . mmm-code-submode-face)
             ("[!" . mmm-init-submode-face)
             ("[*" . mmm-code-submode-face)
             ("[$" . mmm-special-submode-face))

Thus, regions beginning with `[+' are highlighted as output expressions, which they are, while `[-' and `[*' regions are highlighted as simple executed code, and so on. Note that mmm-submode-decoration-level must be set to 2 (high decoration) for different faces to be displayed.
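
When :match-face is a function instead of an association list, it could look something like this sketch. The names my-bracket and mmm-my-bracket-face are hypothetical. Because the front delimiter form may arrive regexp-quoted, the function only tests for the presence of a `+' rather than comparing the whole string; that choice is a precaution, not a documented requirement.

```elisp
(require 'mmm-mode)

(defun mmm-my-bracket-face (front-string)
  "Return the face for a region whose front delimiter is FRONT-STRING."
  ;; "\\+" matches a literal plus sign, whether FRONT-STRING is the
  ;; raw delimiter "[+" or a regexp-quoted version of it.
  (if (string-match-p "\\+" front-string)
      'mmm-output-submode-face
    'mmm-code-submode-face))

;; Hypothetical class `my-bracket' with two kinds of regions,
;; [+ ... +] and [- ... -].
(mmm-add-classes
 '((my-bracket
    :submode perl
    :front "\\[\\([-+]\\)"
    :back "~1\\]"
    :save-matches 1
    :match-face mmm-my-bracket-face)))
```

A function is mainly worthwhile when the choice of face cannot be expressed as a fixed table of delimiters.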

5.7 Specifying Insertion Commands

As described in 2.4 Inserting new submode regions, submode classes can specify key sequences which automatically insert submode regions, with delimiters already in place. This is done by the keyword argument :insert. Its value should be a list, each element of which specifies a single insertion key sequence. As an example, consider the following insertion key sequence specifier, from the embperl submode class:

(?p embperl "Region Type (Character): "
    @ "[" str @ " " _ " " @ str "]" @)

As you can see, the specifier is a list. The first element of the list is the character `p'. (The question mark tells Emacs that this is a character object, not a one-character symbol.) In general, the first element can be any key, including both characters such as `?p' and function keys such as `return'. It can also be a dotted pair in which the first element is a modifier symbol such as meta, and the second is a character or function key. The use of any other modifier than meta is discouraged, as `mmm-insert-modifiers' is sometimes set to (control), and other modifiers are not very portable. The second element is a symbol identifying this key sequence. The third element is a prompt string which is used to ask the user for input when this key sequence is invoked. If it is nil, the user is not prompted.

The rest of the list specifies the actual text to be inserted, where the submode region and delimiters should be, and where the point should end up. (Actually, these elements are simply passed to skeleton-insert; see the documentation string of that function for more details on the permissible elements of such a skeleton.) Strings and variable names are inserted and interpolated. The value entered by the user when prompted, if any, is available in the variable str. The final location of the point (or the text around which the region is to be wrapped) is marked with a single underscore `_'. Finally, the @-signs mark the delimiters and submode regions. There should be four @-signs: one at the beginning of the front delimiter, one at the beginning of the submode region, one at the end of the submode region, and one at the end of the back delimiter.

The above key sequence, bound by default to C-c % p, always prompts the user for the type of region to insert. It can also be convenient to have separate key sequences for each type of region to be inserted, such as C-c % + for `[+...+]' regions, C-c % - for `[-...-]' regions, and so on. So that the whole skeleton doesn't have to be written out half a dozen times, there is a shortcut syntax, as follows:

(?+ embperl+ ?p . "+")

If the key sequence specification is a dotted list with four elements, as this example is, it means to use the skeleton defined for the key sequence given as the third element (?p), but to pass it the fourth (dotted) element ("+") as the `str' variable; the user is not prompted.
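
To show where these specifiers live, here is a sketch of a complete class carrying an :insert list with one full specifier and one shortcut. The class name my-tags, the identifying symbols, and the key characters are assumptions made for this example.

```elisp
(require 'mmm-mode)

;; Hypothetical class `my-tags' for <%NAME> ... </%NAME> blocks.
(mmm-add-classes
 '((my-tags
    :submode perl
    :front "<%\\([a-z]+\\)>"
    :back "</%~1>"
    :save-matches 1
    :insert (;; Full specifier: prompt for the tag name, then insert
             ;; both delimiters and leave point on the middle line.
             ;; The four @-signs mark, in order: start of the front
             ;; delimiter, start of the region, end of the region,
             ;; and end of the back delimiter.
             (?t my-tags-block "Tag name: "
                 @ "<%" str ">" @ "\n" _ "\n" @ "</%" str ">" @)
             ;; Shortcut: reuse the skeleton bound to ?t with `str'
             ;; preset to "init", so the user is not prompted.
             (?i my-tags-init ?t . "init")))))
```
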

5.8 Giving Names to Submode Regions for Grouping

Submode regions can be given "names" which are used for grouping. Names are always strings and are compared as strings. Regions with the same name are considered part of the same chunk of code. This is used by the syntax and fontification functions. Unnamed regions are not grouped with any others.

By default, regions are nameless, but with the :match-name keyword argument a name can be supplied. This argument must be a string or a function. If it is a function, it is passed a string representing the front delimiter found, and must return the name to use. If it is a string, it is used as-is for the name, unless :save-name has a non-nil value, in which case expressions such as `~1' are substituted with the corresponding matched subexpression from :front. This is the same as how :back is interpreted when :save-matches is non-nil.

As a special optimization for region insertion (see section 5.7 Specifying Insertion Commands), the argument :skel-name can be set to a non-nil value, in which case the insertion code will use the user-prompted string value as the region name, instead of going through the normal matching procedure.
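
As an illustration of naming, the sketch below gives each region the name captured from its front delimiter, so that regions introduced by the same chunk name are grouped together. The class name my-chunks and the chunk syntax (loosely modeled on literate programming chunks) are assumptions, not the definition of any supplied class.

```elisp
(require 'mmm-mode)

;; Hypothetical class `my-chunks' for chunks of the form
;;   <<chunk-name>>=
;;   ...code...
;;   @
(mmm-add-classes
 '((my-chunks
    :submode emacs-lisp-mode
    ;; Subexpression 1 captures the chunk name.
    :front "^<<\\([a-z-]+\\)>>=\n"
    :back "^@$"
    ;; With :save-name non-nil, `~1' in :match-name is replaced by
    ;; the captured chunk name, so two chunks called "setup" end up
    ;; with the same region name.
    :match-name "~1"
    :save-name 1)))
```
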

5.9 Other Hooks into the Scanning Process

Sometimes, even the flexibility allowed by all the keyword arguments discussed so far is insufficient to correctly match submode regions. There are several other keyword arguments which accept custom functions to be invoked at various points in the MMM-ification process.

First of all, the arguments of :front and :back, in addition to regular expressions, can themselves be functions. Such functions should "act like" a regular expression search: they should start searching at point, take one argument as a limit for the search, and return their result by setting the match data (presumably by calling some regexp matching function).

This is rarely necessary, however, because often all that is needed is a simple regexp search, followed by some sort of verification. The keyword arguments :front-verify and :back-verify, if supplied, may be functions which are invoked after a match is found for :front or :back, respectively, and should inspect the match data (such as with match-string) and return non-nil if a submode region should be begun at this match, nil if this match should be ignored and the search continue after it.
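
A verification function can be quite short. The sketch below rejects a front delimiter that is preceded by a backslash. The class name my-code and the function name are hypothetical, and the sketch assumes the function is called with no arguments and with the match data for :front still in place.

```elisp
(require 'mmm-mode)

(defun mmm-my-code-verify-front ()
  "Return non-nil unless the match for :front is backslash-escaped."
  ;; `char-before' returns nil at the start of the buffer, in which
  ;; case the delimiter cannot be escaped and the match is accepted.
  (not (eq (char-before (match-beginning 0)) ?\\)))

;; Hypothetical class `my-code': "{{" starts a region, "\{{" does not.
(mmm-add-classes
 '((my-code
    :submode emacs-lisp-mode
    :front "{{"
    :front-verify mmm-my-code-verify-front
    :back "}}")))
```
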

The keyword argument :creation-hook, if supplied, should be a function that is invoked whenever a submode region of this class is created, with point at the beginning of the new region. This can be used, for example, to set local variables appropriately.

Finally, the entire MMM-ification process has a "back door" which allows class authors to take control of the entire thing. If the keyword argument :handler is supplied, it overrides any other processing and is called, and passed all other class keyword arguments, instead of mmm-ify to create submode regions. If you need to write a handler function, I suggest looking at the source for mmm-ify to get an idea of what must be done.

5.10 Controlling the Delimiter Regions and Forms

MMM also makes overlays for the delimiter regions, to keep track of their position and form. Normally, the front delimiter overlay starts at the beginning of the match for :front and ends at the beginning of the submode region overlay, while the back delimiter overlay starts at the end of the submode region overlay and ends at the end of the match for :back. You can supply offsets from these positions using the keyword arguments :front-delim and :back-delim, which take values of the same sort as :front-offset and :back-offset.

In addition, the delimiter regions can be in a major mode of their own. There are usually only two meaningful modes to use: the primary mode or a non-mode like fundamental-mode. These correspond to the following two situations:

If the delimiter syntax which specifies the submode regions is something added to the syntax of the primary mode by a pre-interpreter, then the delimiter regions should be in a non-mode. This is the case, for example, with all server-side HTML script extensions, such as Mason (see section 4.1 Mason: Perl in HTML), Embperl (see section 4.6 Embperl: More Perl in HTML), and ePerl (see section 4.7 ePerl: General Perl Embedding). It is also the case for literate programming, such as Noweb (see section 4.10 Noweb literate programming). This is the default behavior. The non-mode used is controlled by the variable mmm-delimiter-mode, which defaults to fundamental-mode.
If, on the other hand, the delimiter syntax and inclusion of different modes is an intrinsic part of the primary mode, then the delimiter regions should remain in the primary mode. This is the case, for example, with CSS embedded in HTML (see section 4.5 CSS embedded in HTML) and Javascript in HTML (see section 4.4 Javascript in HTML), since the <style> and <script> tags are perfectly valid HTML. In this case, you should give the keyword parameter :delimiter-mode with a value of nil, meaning to use the primary mode.

The keyword parameter :delimiter-mode can be given any major mode as an argument, but the above two situations should cover the vast majority of cases.
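
For the second situation, a class might be written along the lines of this sketch. The class name my-inline-css is hypothetical; it is a variation on the embedded-css example from 5.1 Writing Basic Submode Classes, not the definition of the supplied class.

```elisp
(require 'mmm-mode)

;; Hypothetical class `my-inline-css'.
(mmm-add-classes
 '((my-inline-css
    :submode css
    :front "<style[^>]*>"
    :back "</style>"
    ;; The <style> tags are valid HTML, so the delimiter regions
    ;; stay in the primary mode instead of `mmm-delimiter-mode'.
    :delimiter-mode nil)))
```
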

The delimiter regions can also be highlighted, if you wish. The keyword parameters :front-face and :back-face may be faces specifying how to highlight these regions under high decoration. Under low decoration, the value of the variable mmm-delimiter-face is used (by default, nothing), and of course under no decoration there is no coloring.

Finally, for each submode region overlay, MMM Mode stores the "form" of the front and back delimiters, which are regular expressions that match the delimiters. At present these are not used for much, but in the future they may be used to help with automatic updating of regions as you type. Normally, the form stored is the result of evaluating the expression (regexp-quote (match-string 0)) after each match is found.

You can customize this with the keyword argument :front-form (respectively, :back-form). If it is a string, it is used verbatim for the front (respectively, back) form. If it is a function, that function is called and should inspect the match data and return the regular expression to use as the form.

In addition, the form itself can be set to a function, by giving a one-element list containing only that function as the argument to :front-form or :back-form. Such a function should take one or two arguments. The first argument is the overlay to match the delimiter for. If the second is non-nil, it means to insert the delimiter and adjust the overlay; if nil it means to match the delimiter and return the result in the match data.

5.11 Miscellaneous Other Keyword Arguments

You can specify whether delimiter searches should be case-sensitive with the keyword argument :case-fold-search. It defaults to t, meaning that case should be ignored. See the documentation for the variable case-fold-search.
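
For instance, a class whose delimiters must be upper case could turn case folding off, as in this sketch. The class name my-upper-doc and its here-document-like syntax are assumptions made for illustration.

```elisp
(require 'mmm-mode)

;; Hypothetical class `my-upper-doc': "<<TEXT" starts a region,
;; but "<<text" does not.
(mmm-add-classes
 '((my-upper-doc
    :submode text-mode
    :front "<<\\([A-Z]+\\)"
    :front-offset (end-of-line 1)
    :back "^~1$"
    :save-matches 1
    ;; Without this, [A-Z] would also match lower-case letters,
    ;; because case is ignored by default.
    :case-fold-search nil)))
```
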

This document was generated by XEmacs shared group account on December, 19 2009 using texi2html 1.65.