Syntax and lexical analysis¶
Global structure¶
LSP is a strongly typed, block-structured programming language.
An LSP program is a sequence of functions located in one or more LSP
files called "modules". Each function contains a block of code consisting of
a list of expressions or statements surrounded by braces `{}`.
The modeler is case-sensitive. Instructions are separated from one another by
the character `;`.
Encoding¶
The modeler can handle many encodings and charsets (UTF-8, all ISO-8859 standards and many Windows code pages). For the complete list, please consult the Charset module.
The default encoding for a file is ISO-8859-1 (latin1), unless the file starts with a byte-order mark (BOM). In that case, LocalSolver assumes that the file is encoded in UTF-8 or UTF-16, according to the BOM. If the parser encounters an unmappable character or an invalid byte sequence, it throws an error.
You can declare a different encoding with a special comment placed at
the beginning of your file. This comment must start with `#` and contain the
word `coding` followed by a colon or an equals sign, followed by the name
of the encoding you want. This special comment must appear on the first or the
second line of the file. If it is on the second line, the first line must be a
shebang comment. The following pattern is an example of a valid encoding
declaration:
# coding: <encoding-name>
where `<encoding-name>` must be a valid and recognized encoding name.
For the complete list of supported encodings and their aliases, please
consult the Charset module.
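The placement rules for the encoding declaration can be sketched in a few lines of Python. This is only an illustration, not the LocalSolver parser: the function name `declared_encoding` is hypothetical, the regular expression is the `coding[=:]\s*[-\w]+` pattern from the Comments section, and the sketch assumes that a declaration line starts with `#` in its first column. BOM detection and validation of the encoding name are deliberately left out.

```python
import re

# Pattern taken from the documentation; the capture group isolates the name.
CODING_RE = re.compile(r"coding[=:]\s*([-\w]+)")


def declared_encoding(source, default="ISO-8859-1"):
    """Return the encoding named in the first two lines, or the default."""
    lines = source.splitlines()[:2]
    for index, line in enumerate(lines):
        # Assumption: a declaration comment starts with '#' in column one.
        if not line.startswith("#"):
            break
        # A declaration on line 2 only counts if line 1 is a shebang.
        if index == 1 and not lines[0].startswith("#!"):
            break
        match = CODING_RE.search(line)
        if match:
            return match.group(1)
    return default
```

For example, a file whose first line is a shebang and whose second line is `# coding=latin1` yields `latin1`, while the same declaration placed after an ordinary statement is ignored and the default is returned.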
Comments¶
Comments have no impact on the execution of the program. Three kinds of comments are allowed:
- Mono-line comments. They are prefixed with the characters `//`. Anything between these two characters and the end of the line is ignored.
- Multi-line comments. They start with the characters `/*` and end when the characters `*/` are encountered. A multi-line comment must be closed. Nesting multi-line comments is forbidden, thus `/* Comment 1 /* Comment 2 */ */` is forbidden.
- Special declaration comments. These declarations must start with `#` and are only allowed at the beginning of LSP files. For now, two kinds of declarations are supported:
  - The 'shebang' declaration starting with `#!`. This declaration is only allowed on the first line of the LSP program. On Unix systems, it specifies the interpreter that executes the program (here LocalSolver).
  - The encoding declaration, which must follow the regex pattern `coding[=:]\s*[-\w]+`. As detailed above, this declaration switches the encoding of the file.
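To make the mono-line and multi-line rules concrete, here is a minimal Python sketch of a comment stripper. It is an illustration under assumptions, not the behavior of the actual parser: the function name `strip_comments` is hypothetical, `#` declaration comments are not handled, and string literals are copied verbatim so that `//` inside a string is not mistaken for a comment.

```python
def strip_comments(source):
    """Remove // and /* */ comments from LSP-like source text."""
    out = []
    i, n = 0, len(source)
    while i < n:
        two = source[i:i + 2]
        if source[i] == '"':
            # Copy a string literal verbatim, skipping escaped characters.
            j = i + 1
            while j < n and source[j] != '"':
                j += 2 if source[j] == "\\" else 1
            out.append(source[i:j + 1])
            i = j + 1
        elif two == "//":
            # Mono-line comment: drop everything up to the end of the line.
            end = source.find("\n", i)
            i = n if end == -1 else end
        elif two == "/*":
            # Multi-line comment: it ends at the FIRST "*/", so it cannot nest.
            end = source.find("*/", i + 2)
            if end == -1:
                raise ValueError("unclosed multi-line comment")
            i = end + 2
        else:
            out.append(source[i])
            i += 1
    return "".join(out)
```

Because the first `*/` closes the comment, the nested example leaves a dangling ` */` behind in this sketch, which hints at why such nesting cannot be accepted.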
Identifiers¶
Identifiers are used as variable or function names in LSP files. An identifier can only be composed of Latin letters, digits or underscores. It cannot start with a digit. Identifiers, like the rest of the modeler, are case-sensitive. They are described by the following lexical definition:
identifier : ("_" | letter) ("_" | letter | digit)*
letter : lowercase | uppercase
lowercase : "a".."z"
uppercase : "A".."Z"
digit : "0".."9"
Thus,
- `identifier` is different from `IdeNtiFier`
- `_ident` is a valid identifier
- `0ident` is not a valid identifier (it starts with a digit)
- `àÀéÉùÛ` is not a valid identifier (accented characters are not supported for identifiers)
- `안녕하세요` is not a valid identifier (non-Latin letters are not supported for identifiers)
- `for` is not a valid identifier (reserved keyword, see below)
Keywords¶
Keywords are reserved words that have a specific meaning for the modeler. You cannot use these keywords as variable or function names. Their use is subject to syntactic rules described later in this document. Some keywords are reserved for future use.
Keywords having a specific significance:
true false nil nan inf
function local return this use
while do break continue
for in if else
minimize maximize constraint
try throw catch
is typeof
Changed in version 3.5: Keywords `try`, `throw` and `catch` added to implement exceptions.
Changed in version 5.5:
- `use` keyword added to implement modules.
- `this` keyword used to refer to the current object.
- `is` and `typeof` keywords added to implement type introspection.
Keywords reserved for future use:
const var self import final
goto switch case class object
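The identifier grammar and the two keyword lists can be combined into a small validity check. The Python sketch below is illustrative only: the helper name `is_valid_identifier` is hypothetical, and it assumes that keywords reserved for future use are rejected in the same way as active keywords.

```python
import re

# identifier : ("_" | letter) ("_" | letter | digit)*  with ASCII letters only.
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

ACTIVE_KEYWORDS = {
    "true", "false", "nil", "nan", "inf",
    "function", "local", "return", "this", "use",
    "while", "do", "break", "continue",
    "for", "in", "if", "else",
    "minimize", "maximize", "constraint",
    "try", "throw", "catch",
    "is", "typeof",
}

# Assumption: reserved-for-future-use words are rejected as identifiers too.
RESERVED_KEYWORDS = {
    "const", "var", "self", "import", "final",
    "goto", "switch", "case", "class", "object",
}


def is_valid_identifier(name):
    """Check a candidate name against the grammar and the keyword lists."""
    if name in ACTIVE_KEYWORDS or name in RESERVED_KEYWORDS:
        return False
    return IDENTIFIER_RE.fullmatch(name) is not None
```

The keyword test is case-sensitive, like the modeler itself, so `For` passes the check while `for` does not.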
Literals¶
Literals represent constant values of some built-in types.
String literals¶
A string starts with the character `"` and ends with the character `"`.
A string can span several lines. No limit is set on the length of the string.
Unlike identifiers, any Unicode character is allowed between the two quotes of
the string, except backslashes and quotes, which must be introduced through
escape sequences (see below).
Thus,
- `"Simple literal"` is valid.
- `"こんにちは (hello)"` is valid.
- `"안녕하세요 (hello)"` is valid.
- `"string literal \ invalid"` is not valid: a bare backslash is forbidden in a string literal.
Escape sequences¶
Some characters can be introduced through escape sequences. Escape sequences
are also the only way to write backslashes or quotes in a string.
An escape sequence starts with the character `\` (backslash) followed by a
letter or ASCII character.
The recognized escape sequences are:
Escape sequence | Associated character |
---|---|
`\\` | Backslash (`\`) |
`\'` | Single quote |
`\"` | Double quotes |
`\b` | ASCII Backspace (U+0008) |
`\t` | ASCII Horizontal Tabulation (U+0009) |
`\n` | ASCII Linefeed (U+000A) |
`\f` | ASCII Formfeed (U+000C) |
`\r` | ASCII Carriage return (U+000D) |
`\uxxxx` | Unicode character with 16-bit hexadecimal value |
`\Uxxxxxxxx` | Unicode character with 32-bit hexadecimal value |
If the parser encounters an unrecognized escape sequence or an invalid
Unicode character, it throws an error. Thus `"foo \c"` throws an error,
since `\c` is not recognized as a valid escape sequence. The same applies to
`"\uDBFF"`, which throws an error since U+DBFF is a lone surrogate and
therefore not a valid Unicode character.
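The escape rules above can be illustrated with a short Python decoder for the text between the two quotes of a string literal. This is a sketch under assumptions, not the modeler's implementation: the name `decode_escapes` is hypothetical, errors are reported as `ValueError`, and every surrogate code point is rejected (whether the real parser accepts surrogate pairs written as two `\u` escapes is not stated in this document).

```python
import string

# One-character escapes listed in the table above.
SIMPLE_ESCAPES = {
    "\\": "\\",
    "'": "'",
    '"': '"',
    "b": "\b",
    "t": "\t",
    "n": "\n",
    "f": "\f",
    "r": "\r",
}


def decode_escapes(body):
    """Decode the escape sequences found in the body of a string literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        kind = body[i + 1]
        if kind in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[kind])
            i += 2
        elif kind in ("u", "U"):
            # \uxxxx takes 4 hexadecimal digits, \Uxxxxxxxx takes 8.
            width = 4 if kind == "u" else 8
            digits = body[i + 2:i + 2 + width]
            if len(digits) != width or any(c not in string.hexdigits for c in digits):
                raise ValueError("malformed unicode escape")
            code = int(digits, 16)
            # Assumption: surrogates and out-of-range values are invalid.
            if code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
                raise ValueError("invalid unicode character")
            out.append(chr(code))
            i += 2 + width
        else:
            raise ValueError("unrecognized escape sequence: \\" + kind)
    return "".join(out)
```

With this sketch, the body `foo \c` and the body `\uDBFF` both raise an error, mirroring the two failing examples described above.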
Integer literals¶
An integer is a sequence of digits 0-9 that does not start with 0, except for the literal 0 itself. Only the decimal form is allowed. Integers are described by the following lexical definition:
integer : nonzerodigit digit* | "0"
digit : "0".."9"
nonzerodigit : "1".."9"
If a number written in the LSP file exceeds the allowed capacity, an error is
thrown when parsing this number. Note that integer literals do not include
a sign. Thus, `-42` is actually an expression composed of the unary operator
`-` and the integer literal `42`.
Thus,
- `1234` is a valid integer literal
- `01234` is not a valid integer literal (it starts with 0)
- `100000000000000000000000` is not a valid integer literal (exceeds allowed capacity)
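The integer grammar translates directly into a regular expression. The Python sketch below is an illustration only: the name `parse_integer_literal` is hypothetical, and the capacity limit is an assumption (a signed 64-bit maximum), since this section does not state the actual bound.

```python
import re

# integer : nonzerodigit digit* | "0"
INTEGER_RE = re.compile(r"0|[1-9][0-9]*")

# Assumption: signed 64-bit capacity; the real limit is not given here.
ASSUMED_MAX = 2**63 - 1


def parse_integer_literal(text, max_value=ASSUMED_MAX):
    """Return the value of an integer literal, or raise if it is invalid."""
    if INTEGER_RE.fullmatch(text) is None:
        raise ValueError("not an integer literal: " + repr(text))
    value = int(text)
    if value > max_value:
        raise OverflowError("integer literal exceeds the assumed capacity")
    return value
```

Note that `-42` is rejected by this check, consistent with the sign being a unary operator rather than part of the literal.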
Floating point literals¶
LocalSolver handles double-precision floating point numbers with point notation (e.g. 3.467) or exponential notation (e.g. 8.75e-11). They are described by the following lexical definition:
float : pointfloat | expfloat | "inf" | "nan"
pointfloat : digit* "." digit+
expfloat : (digit+ | pointfloat) "e" ["+" | "-"] digit+
digit : "0".."9"
The literal `inf` denotes infinity. The literal `nan` denotes the
special floating point value "not a number" (NaN), representing an undefined or
unrepresentable value (see the IEEE 754 floating-point standard for more
explanations on this). Note that floating point literals do not include a sign:
`-42.45` is actually an expression composed of the unary operator `-` and
the floating point literal `42.45`.
Thus,
- `12.45` is a valid floating point literal
- `.4522` is a valid floating point literal
- `4566e-12` is a valid floating point literal
- `.e-45` is not a valid floating point literal
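The floating point grammar can likewise be written as a regular expression. The following Python sketch follows the lexical definition literally, so it only accepts a lowercase `e` as the exponent marker and requires at least one digit after the point. The helper name `is_float_literal` is hypothetical.

```python
import re

# pointfloat : digit* "." digit+
POINTFLOAT = r"[0-9]*\.[0-9]+"
# expfloat : (digit+ | pointfloat) "e" ["+" | "-"] digit+
EXPFLOAT = rf"(?:{POINTFLOAT}|[0-9]+)e[+-]?[0-9]+"
# float : pointfloat | expfloat | "inf" | "nan"
FLOAT_RE = re.compile(rf"(?:{EXPFLOAT}|{POINTFLOAT}|inf|nan)")


def is_float_literal(text):
    """Check whether the whole text is a floating point literal."""
    return FLOAT_RE.fullmatch(text) is not None
```

Under this reading of the grammar, `12.` and `42` are not floating point literals: the first lacks a digit after the point, and the second is an integer literal.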
White spaces¶
White spaces and line breaks have no particular meaning in the modeler. They are simply ignored. Nonetheless, a whitespace character is necessary to separate two keywords, two identifiers or two literals.