COHERENT manpages
This page displays the COHERENT manpage for lex [Lexical analyzer generator].
lex -- Command

Lexical analyzer generator

lex [-t][-v][file]
cc lex.yy.c -ll

Many programs, e.g., compilers, process highly structured input according to rules. Two of the most complicated parts of such programs are lexical analysis and parsing (also called syntax analysis). The COHERENT system includes two powerful tools called lex and yacc to help you construct these parts of a program. lex converts a set of lexical rules into a lexical analyzer, and yacc converts a set of parsing rules into a parser. The output of lex may be used directly, or may be used by a parser generated by yacc.

lex reads a specification from the given file (or from the standard input if none is given), and generates a C function called yylex(). lex writes the generated function into the file lex.yy.c, or to the standard output if you use the -t option. The -v option prints some statistics about the generated tables.

The tutorial on lex that appears in this manual describes lex in detail. In brief, the generated function yylex() matches portions of its input to one pattern (sometimes called a regular expression) from a set of rules, or context, and executes the associated C commands. Unmatched portions of the input are copied to the output stream. yylex() returns EOF when input has been exhausted.

lex uses the following macros, which you may remove with the preprocessor directive #undef and replace if you wish: input() (read the standard input stream), and output(c) (write the character c to the standard output stream). You may also replace the following functions if you wish: main() (main function), error(...) (print error messages; takes the same arguments as printf), and yywrap() (handle events at the end of a file). If an action is desired on end of file, such as arranging for more input, yywrap() should perform it, returning zero to keep going.

A full lex specification has the following format:

-> Macro definitions, of the form:
        name pattern
-> Start condition declarations:
        %S NAME ...
-> Context declarations:
        %C NAME ...
-> Code to be included in the header section:
        %{
        anything
        %}
        <tab or space> anything
-> Rules section delimiter (must always be present):
        %%
-> Code to appear at the start of yylex():
        <tab or space> anything
-> Rules for initial context, in any of the forms:
        rule action;
        rule |          (means use next action)
        rule {
        <tab or space> action;
        <tab or space> }
-> For each additional context:
        %C NAME
        ...rules for this context...
-> End of rules section delimiter:
        %%
-> Code to be copied verbatim, such as user provided input(), output(), yywrap(), or other.

lex matches the longest string possible; if two rules match the same length string, the rule specified first takes precedence. lex puts the matched string, or token, in the char array yytext[], and sets the variable yyleng to its length.

Actions may use the following:

ECHO...........Output the token
REJECT.........Perform action for lower precedence match
BEGIN NAME.....Set start condition to NAME
BEGIN 0........Clear start condition
yyswitch(NAME).Switch to context NAME, return current
yyswitch(0)....Switch to initial context
yynext().......Steal next character from input
yyback(c)......Put character c back into input
yyless(n)......Reduce token length to n, put rest back
yymore().......Append next token to this one
yylook().......Returns number of chars in input buffer

lex rules are contiguous strings of the form

        [ <NAME,...> ][ ^ ] token [ /lookahead ][ $ ]

where brackets `[]' indicate optional items.

<NAME,...>.....Match only under given start conditions
^..............Match the beginning of a line
$..............Match the end of a line
token..........Pattern that a given token is to match
/lookahead.....Pattern that given trailing text is to match

Pattern elements:

a..............The character a
\a.............The character a, even if special
. .............Any character except newline
[abx-z]........Any of a, b, or x through z
[^abx-z].......Any except a, b, or x through z
"abc"..........The string abc, even if any are special
{name}.........The macro definition name
(exp)..........The pattern exp (grouping operator)

Optional operators on elements:

e?.............Zero or one occurrence of e
e*.............Zero or more consecutive es
e+.............One or more consecutive es
e{n}...........n (a decimal number) consecutive es
e{m,n}.........m through n consecutive es

Patterns may be of the form:

e1e2...........Matches the sequence e1 e2
e1|e2..........Matches either e1 or e2

lex recognizes the standard C escapes: \n, \t, \r, \b, \f, and \ooo (octal representation). The special characters

        \ ( ) < > { } % * + ? [ - ] ^ / $ . |

must be prefixed with \ or enclosed within quotation marks (excepting " and \) to be normal. Within classes, only the characters . ^ - \ and ] are special.

Files
/usr/lib/libl.a
/usr/src/libl/* -- library source code

See Also
commands, yacc
Introduction to lex, the Lexical Analyzer
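
Example
The precedence rule stated above (the longest match wins, and on a tie the rule specified first wins) can be illustrated with a small hand-written sketch in C. This is not the code that lex generates; it only mimics the selection policy for two assumed rules, the keyword "if" followed by an identifier rule of lower-case letters. The names matcher_fn, match_if, match_ident, rules, and pick_rule are hypothetical and exist only for this illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical matcher type: returns the length of the prefix of s
 * that the rule accepts, or 0 if the rule does not match at all. */
typedef size_t (*matcher_fn)(const char *s);

/* Rule 0: the literal keyword "if". */
static size_t match_if(const char *s)
{
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}

/* Rule 1: an identifier, i.e. one or more lower-case letters. */
static size_t match_ident(const char *s)
{
    size_t n = 0;

    while (islower((unsigned char)s[n]))
        n++;
    return n;
}

/* Rules in specification order; earlier entries take precedence. */
static const matcher_fn rules[] = { match_if, match_ident };

/* Return the index of the winning rule and store the token length
 * in *len. Returns -1, with *len set to 0, if no rule matches. */
int pick_rule(const char *s, size_t *len)
{
    int best = -1;
    size_t i;

    *len = 0;
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i](s);

        if (n > *len) { /* strictly longer only: a tie keeps the earlier rule */
            *len = n;
            best = (int)i;
        }
    }
    return best;
}
```

Under these assumptions, pick_rule("if", &n) selects rule 0 with n set to 2, because both rules match two characters and the keyword rule comes first. pick_rule("iffy", &n) selects rule 1 with n set to 4, because the identifier rule matches the longer string.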