Syntax and lexical analysis

Global structure

LSP is a strongly typed, block-structured programming language. An LSP program is a sequence of functions located in one or more LSP files called “modules”. Each function contains a block of code: a list of expressions or statements enclosed in braces {}. The modeler is case-sensitive. Instructions are separated from one another by the character ;.

Encoding

The modeler handles 8-bit characters only: ASCII and extended ASCII. UTF-8, UTF-16 and other encodings are not supported. Using an unsupported encoding can produce errors or unexpected behavior.

Warning

Most of the time, the modeler will not report any error on UTF-8 files, but string-related functions will not work properly.

Comments

Comments are ignored by the parser of the modeler. They have no impact on the execution of the program. Three kinds of comments are allowed:

Mono-line comments. They start with the characters //. Everything from these two characters to the end of the line is ignored.
Multi-line comments. They start with the characters /* and end when the characters */ are encountered. A multi-line comment must be closed. Nesting multi-line comments is forbidden, so /* Comment 1 /* Comment 2 */ */ is invalid.
The ‘shebang’ comment. It is a special comment starting with #!, allowed only on the first line of the LSP program. On Unix systems it specifies the interpreter that executes the program (here, LocalSolver). Any occurrence of the characters #! elsewhere in the file generates an error.
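
The three comment rules above can be sketched as a small pre-processing pass. This is an illustrative Python sketch, not the modeler's actual parser: the function name strip_comments is hypothetical, and it assumes that comment markers inside string literals are left alone and that a comment is replaced by a single space.

```python
import re

# One alternation, tried left to right at each position. String literals come
# first so that comment markers inside them are not treated as comments.
_TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"'   # string literal, kept unchanged
    r'|//[^\n]*'           # mono-line comment, up to the end of the line
    r'|/\*.*?\*/'          # closed multi-line comment (ends at the first */)
    r'|/\*'                # multi-line comment that is never closed
    r'|#!',                # shebang marker found after the first line
    re.DOTALL,
)


def strip_comments(source):
    """Return source with every comment replaced by one space (hypothetical helper)."""
    prefix, body = "", source
    if source.startswith("#!"):
        # Shebang comment: drop the first line but keep its line break.
        _, prefix, body = source.partition("\n")

    def replace(match):
        text = match.group(0)
        if text.startswith('"'):
            return text
        if text == "/*":
            raise SyntaxError("multi-line comment is never closed")
        if text == "#!":
            raise SyntaxError("#! is only allowed on the first line")
        return " "

    return prefix + _TOKEN.sub(replace, body)
```

Because a multi-line comment ends at the first */, the nested example from the list leaves a stray */ behind, which a real parser would then reject as a syntax error.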

Identifiers

Identifiers are used as variable or function names in LSP files. An identifier can contain only letters, digits and underscores, and it cannot start with a digit. Identifiers, like the rest of the modeler, are case-sensitive. They are described by the following lexical definition:

identifier :  ("_" | `letter`) ("_" | `letter` | `digit`)*
letter     :  `lowercase` | `uppercase`
lowercase  :  "a".."z"
uppercase  :  "A".."Z"
digit      :  "0".."9"

Thus,

identifier is different from IdeNtiFier
_ident is a valid identifier
0ident is not a valid identifier (it starts with a digit)
for is not a valid identifier (reserved keyword, see below)
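
The lexical definition and the examples above can be checked with a short Python sketch. The names IDENTIFIER, KEYWORDS and is_valid_identifier are hypothetical, and the keyword set is copied from the lists in the Keywords section of this page (this is left out, following the version 5.5 note).

```python
import re

# Direct transcription of: ("_" | letter) ("_" | letter | digit)*
IDENTIFIER = re.compile(r"[_a-zA-Z][_a-zA-Z0-9]*")

# Keywords listed in the "Keywords" section, including those reserved for
# future use. "this" is omitted because the version 5.5 note says it is no
# longer reserved (assumption based on that note).
KEYWORDS = frozenset({
    "true", "false", "nil", "nan", "inf",
    "function", "local", "return", "use",
    "while", "do", "break", "continue",
    "for", "in", "if", "else",
    "minimize", "maximize", "constraint",
    "try", "throw", "catch",
    "is", "typeof",
    "const", "var", "self", "import", "final",
    "goto", "switch", "case", "class", "object",
})


def is_valid_identifier(name):
    """True if name matches the identifier grammar and is not a keyword."""
    return IDENTIFIER.fullmatch(name) is not None and name not in KEYWORDS


print(is_valid_identifier("_ident"))   # True
print(is_valid_identifier("0ident"))   # False, starts with a digit
print(is_valid_identifier("for"))      # False, reserved keyword
```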

Keywords

Keywords are reserved words that have a specific meaning for the modeler. You cannot use them as variable or function names. Their use is subject to syntactic rules described later in this document. Some keywords are reserved for future use.

Keywords having a specific significance:

true        false       nil         nan         inf
function    local       return      this        use
while       do          break       continue
for         in          if          else
minimize    maximize    constraint
try         throw       catch
is          typeof

Changed in version 3.5: Keywords try, throw and catch added to implement exceptions.

Changed in version 5.5:

use keyword added to implement modules.
this is no longer a reserved keyword.
is and typeof keywords added to implement type introspection.

Keywords reserved for future use:

const       var         self        import      final
goto        switch      case        class       object

Literals

Literals represent constant values of some built-in types.

String literals

A string literal starts with the character " and ends with the character ". A string can span several lines, and its length is not limited. Any character is allowed between the two quotes except backslashes and quotes, which must be introduced through escape sequences (see below).

Thus,

"Simple literal" is valid
"string literal \ invalid" is not valid: a backslash is forbidden in a string literal (except in escape sequences, see below).

Escape sequences

Some characters need to be introduced through escape sequences. An escape sequence starts with the character \ (backslash) followed by a letter or another ASCII character.

The recognized escape sequences are:

Escape sequence	Associated character
\\	Backslash (\)
\'	Single quote
\"	Double quote
\t	ASCII Horizontal Tabulation
\r	ASCII Carriage return
\n	ASCII Linefeed
\b	ASCII Backspace
\f	ASCII Formfeed

If the parser encounters an unrecognized escape sequence, it throws an error. Thus "foo \c" throws an error, since "\c" is not recognized as a valid escape sequence.
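
As an illustration, the decoding rule can be sketched in Python. This is a minimal sketch, not the modeler's implementation: the names ESCAPES and decode_string_body are hypothetical, the recognized sequences are assumed to be a backslash followed by one of \ ' " t r n b f, and the function receives only the text between the two delimiting quotes.

```python
# Character following the backslash -> character it stands for.
ESCAPES = {
    "\\": "\\",
    "'": "'",
    '"': '"',
    "t": "\t",
    "r": "\r",
    "n": "\n",
    "b": "\b",
    "f": "\f",
}


def decode_string_body(body):
    """Decode the text found between the quotes of a string literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            # Ordinary character: copied unchanged.
            out.append(ch)
            i += 1
            continue
        # A backslash must be followed by a recognized escape character.
        if i + 1 >= len(body) or body[i + 1] not in ESCAPES:
            raise ValueError(f"unrecognized escape sequence at offset {i}")
        out.append(ESCAPES[body[i + 1]])
        i += 2
    return "".join(out)
```

With this sketch, decode_string_body(r"foo \c") raises a ValueError, mirroring the "foo \c" example above.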

Integer literals

An integer literal is a sequence of digits 0-9 that does not start with 0, except for the literal 0 itself. Only the decimal form is allowed. Integer literals are described by the following lexical definition:

integer        :  `nonzerodigit` `digit`* | "0"
digit          :  "0".."9"
nonzerodigit   :  "1".."9"

If a number written in the LSP file exceeds the allowed capacity, an error will be thrown when parsing this number. Note that integer literals do not include a sign. Thus, -42 is actually an expression composed of the unary operator - and the integer literal 42.

Thus,

1234 is a valid integer literal
01234 is not a valid integer literal (it starts with 0)
100000000000000000000000 is not a valid integer literal (exceeds allowed capacity)
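
The grammar and the capacity check can be sketched in Python as follows. The names INTEGER, MAX_INTEGER and parse_integer_literal are hypothetical. The page does not state the allowed capacity, so the signed 64-bit bound used here is an assumption made only for the sake of the example.

```python
import re

# Direct transcription of: nonzerodigit digit* | "0"
INTEGER = re.compile(r"[1-9][0-9]*|0")

# ASSUMPTION: capacity of a signed 64-bit integer. The documentation only
# says "allowed capacity" without giving a value.
MAX_INTEGER = 2**63 - 1


def parse_integer_literal(text):
    """Return the value of an integer literal, or raise if it is invalid."""
    if INTEGER.fullmatch(text) is None:
        raise ValueError(f"not an integer literal: {text!r}")
    value = int(text)
    if value > MAX_INTEGER:
        raise OverflowError(f"integer literal exceeds capacity: {text}")
    return value
```

A signed input such as -42 is rejected by this function on purpose: the sign is a unary operator applied to the literal 42, not part of the literal.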

Floating point literals

LocalSolver handles double-precision floating point numbers written in point notation (e.g. 3.467) or exponential notation (e.g. 8.75e-11). They are described by the following lexical definition:

float          :  `pointfloat` | `expfloat` | "inf" | "nan"
pointfloat     :  [`digit`+] "." `digit`+
expfloat       :  (`digit`+ | `pointfloat`) "e" ["+" | "-"] `digit`+
digit          :  "0".."9"

The literal inf denotes infinity. The literal nan denotes the special floating point value “not a number” (NaN), which represents an undefined or unrepresentable value (see the IEEE 754 floating-point standard for details). Note that floating point literals do not include a sign: -42.45 is actually an expression composed of the unary operator - and the floating point literal 42.45.

Thus,

12.45 is a valid floating point literal
.4522 is a valid floating point literal
4566e-12 is a valid floating point literal
.e-45 is not a valid floating point literal
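
The lexical definition above translates almost directly into a regular expression. The Python sketch below is illustrative only; the names FLOAT and parse_float_literal are hypothetical. It follows the grammar as written, so only a lowercase e is accepted as exponent marker and at least one digit is required after the point.

```python
import re

# pointfloat : [digit+] "." digit+
_POINTFLOAT = r"[0-9]*\.[0-9]+"
# expfloat   : (digit+ | pointfloat) "e" ["+" | "-"] digit+
_EXPFLOAT = rf"(?:{_POINTFLOAT}|[0-9]+)e[+-]?[0-9]+"
# float      : pointfloat | expfloat | "inf" | "nan"
FLOAT = re.compile(rf"{_EXPFLOAT}|{_POINTFLOAT}|inf|nan")


def parse_float_literal(text):
    """Return the value of a floating point literal, or raise ValueError."""
    if FLOAT.fullmatch(text) is None:
        raise ValueError(f"not a floating point literal: {text!r}")
    # Python's float() accepts every form allowed by the grammar above.
    return float(text)


print(parse_float_literal("12.45"))     # 12.45
print(parse_float_literal(".4522"))     # 0.4522
```

As with integers, -42.45 is rejected by this function because the sign is a unary operator and not part of the literal.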

White spaces

White spaces and line breaks have no particular meaning in the modeler; they are simply ignored. Nonetheless, a whitespace character is required to separate two keywords, two identifiers or two literals.