CSV module
This module implements the CSV (Comma-Separated Values) format, partly specified by RFC 4180.
Note
To use the features of this module, put the following import statement at the beginning of your LSP file: use csv;
Functions
csv.parse(reader)
csv.parse(reader, options)
csv.parse(filename)
csv.parse(filename, options)
csv.parse(filename, charset)
csv.parse(filename, charset, options)
Reads the CSV file and returns a csvcontent. You can provide the filename to parse, or directly provide the stream to read, previously opened with io.openRead(). When using a filename, you can also specify the encoding. If no encoding is provided, ISO-8859-1 is assumed.
Several options can be used to customize the behavior of the parser. These options must be specified in a map. The supported options are detailed at the end of this page.
Parameters:
- filename (string) – Path to the CSV file to convert.
- reader (streamreader) – Stream previously opened with io.openRead().
- charset (charset) – Encoding used to convert bytes to characters.
- options (map) – Optional parameters to customize the behavior of the parser.
Return type: csvcontent
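The charset fallback described above can be illustrated with a minimal Python sketch. It uses Python's standard csv module rather than this module's LSP API, and the helper name parse_file is hypothetical. The only behavior taken from the text is that ISO-8859-1 is assumed when no encoding is given.

```python
import csv
import os
import tempfile

def parse_file(filename, charset="iso-8859-1"):
    """Hypothetical analogue of csv.parse(filename, charset): read every row at once."""
    with open(filename, newline="", encoding=charset) as handle:
        return list(csv.reader(handle))

# Write a small ISO-8859-1 encoded file, then parse it without naming a charset.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write("name,city\nZoé,Orléans\n".encode("iso-8859-1"))
    path = tmp.name

rows = parse_file(path)
os.remove(path)
print(rows)
```

Decoding the same bytes as UTF-8 would fail on the accented characters, which is why the default charset matters for files produced by older tools.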
csv.deserialize(content)
csv.deserialize(content, options)
Identical to parse(), but the CSV content is taken from a string instead of a file.
Parameters:
- content (string) – CSV content to parse.
- options (map) – Optional parameters to customize the behavior of the parser.
Return type: csvcontent
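As a rough illustration of parsing from a string, here is a Python sketch built on the standard csv module. It is an analogue, not this module's implementation; the Python function name deserialize simply mirrors the documented one. The sample also shows the default quote and escape character (") at work.

```python
import csv
import io

def deserialize(content, delimiter=","):
    """Hypothetical analogue of csv.deserialize(content): parse CSV text held in memory."""
    return list(csv.reader(io.StringIO(content, newline=""), delimiter=delimiter))

# A quoted field may contain the delimiter; a doubled quote stands for one quote.
table = deserialize('id,label\n1,"a, quoted"\n2,"say ""hi"""\n')
print(table)
```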
csv.read(reader)
csv.read(reader, options)
csv.read(filename)
csv.read(filename, options)
csv.read(filename, charset)
csv.read(filename, charset, options)
Opens the CSV file and returns a csvreader, which reads the file line by line.
As with parse(), you can provide the filename to parse, or directly provide the stream to read, previously opened with io.openRead(). When using a filename, you can also specify the encoding. If no encoding is provided, ISO-8859-1 is assumed.
Several options can be used to customize the behavior of the parser. These options must be specified in a map. The supported options are detailed at the end of this page.
Parameters:
- filename (string) – Path to the CSV file to convert.
- reader (streamreader) – Stream previously opened with io.openRead().
- charset (charset) – Encoding used to convert bytes to characters.
- options (map) – Optional parameters to customize the behavior of the parser.
Return type: csvreader
Types
type csvcontent
- nbRows
  Returns the number of rows.
  Return type: int
- nbCols
  Returns the number of columns.
  Return type: int
- colNames
  Returns the column names. If no column names were found or provided, nil is returned.
  Return type: map (array of strings) or nil
- cols
  Returns all columns as a map, indexed by column number. For each column, rows are indexed by number.
  Return type: map
- colsByName
  Returns all columns as a map, indexed by column name. For each column, rows are indexed by number.
  If no column names were found or provided, this method will throw an error.
  Return type: map
- rows
  Returns all rows as a map, indexed by row number. For each row, columns are indexed by column number.
  Return type: map
- rowsByColName
  Returns all rows as a map, indexed by row number. For each row, columns are indexed by column name.
  If no column names were found or provided, this method will throw an error.
  Return type: map
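The four views offered by csvcontent differ only in how the same cells are indexed. The Python sketch below builds equivalent structures from a small table. It is an analogy using Python lists and dicts, and it assumes that the row count covers data rows only (the source does not say whether the header row is counted).

```python
import csv
import io

text = "id,name\n1,Ada\n2,Grace\n"
header, *data = list(csv.reader(io.StringIO(text)))

rows = data                                                   # rows[r][c]
cols = [list(column) for column in zip(*data)]                # cols[c][r]
cols_by_name = dict(zip(header, cols))                        # cols_by_name[name][r]
rows_by_col_name = [dict(zip(header, row)) for row in data]   # rows_by_col_name[r][name]
nb_rows, nb_cols = len(rows), len(header)

print(cols_by_name["name"], rows_by_col_name[1])
```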
type csvreader
- rowNumber
  Returns the number of rows read so far.
  Return type: int
- colNames
  Returns the column names. If no column names were found or provided, nil is returned.
  Return type: map (array of strings) or nil
- nextRow()
  Reads the next line of the CSV file. It returns the row indexed by column number, or nil if the end of the file is reached.
  Return type: map or nil
- nextRowByColName()
  Reads the next line of the CSV file. It returns the row indexed by column name, or nil if the end of the file is reached.
  If no column names were found or provided, this method will throw an error.
  Return type: map or nil
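The reading loop implied by csvreader (fetch a row, stop on nil) can be sketched in Python as follows. The class RowReader and its method names are hypothetical stand-ins written on top of Python's csv.reader. Whether rowNumber counts the header row is not stated in the source, so this sketch assumes it counts data rows only.

```python
import csv
import io

class RowReader:
    """Hypothetical analogue of csvreader, built on Python's csv.reader."""

    def __init__(self, stream):
        self._rows = csv.reader(stream)
        self.col_names = next(self._rows, None)  # header row, or None for an empty stream
        self.row_number = 0                      # data rows read so far (assumption)

    def next_row(self):
        """Return the next row as a list, or None at end of input."""
        row = next(self._rows, None)
        if row is not None:
            self.row_number += 1
        return row

    def next_row_by_col_name(self):
        """Return the next row as a dict keyed by column name, or None at end of input."""
        if self.col_names is None:
            raise ValueError("no column names available")
        row = self.next_row()
        return None if row is None else dict(zip(self.col_names, row))

reader = RowReader(io.StringIO("id,name\n1,Ada\n2,Grace\n"))
first = reader.next_row()
second = reader.next_row_by_col_name()
end = reader.next_row()
```

Reading row by row keeps only one record in memory at a time, which is the usual reason to prefer a reader over loading the whole content.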
Options summary
Global options
The following options apply to the entire CSV file.
Option name | Type | Default value | Description |
---|---|---|---|
delimiter | string (length 1) | nil | Character used to delimit columns. When the value is nil, the parser will automatically guess the most likely delimiter among comma (,), semicolon (;), tab (\t) and vertical bar (\|). |
decimal | string (length 1) | . | Character to recognize as the decimal point. If the column delimiter is ; (guessed or defined) and this parameter is not overridden, the character used will be a comma (,). |
quote | string (length 1) | " | Character used to denote the start and end of a quoted item. If a quoted item includes column and/or row delimiters, they are kept as part of the string and lose their delimiter meaning. |
escape | string (length 1) | " | Character used to escape other characters. |
headerRow | int | 0 | Number of the row that holds the column names and marks the start of the data. If column names are specified in the columnOptions, the names found in this row will be overridden. If a negative number is set, no column headers will be parsed from the file and only the names specified in the columnOptions will be used (if present). |
skipLines | map (array of ints) | nil | Line numbers to skip (0-indexed). |
skipEmptyLines | bool | true | True to ignore the empty lines between the records, false to translate empty lines to empty records. An empty line is a blank line with no fields or a line with only empty fields. Note that if the useDefaultEmpty policy is activated on at least one column, the fields will not be considered empty and thus the line will not be ignored. |
trimWhitespace | bool | false | Trim leading and trailing spaces for each string field that is not between quotes. |
longLinePolicy | string | ignoreCols | Specifies what to do upon encountering a line with too many fields. Allowed values are :
|
shortLinePolicy | string | fillMissingCols | Specifies what to do upon encountering a line with too few fields. Allowed values are :
|
nanValues | map (array of strings) | {"#N/A", "#N/A N/A", "#NA", "-NaN", "-nan", "<NA>", "N/A", "NA", "NaN", "n/a", "nan"} | Values to consider as nan. |
infValues | map (array of strings) | {"inf", "Inf"} | Values to consider as inf. |
trueValues | map (array of strings) | {"true", "True", "1"} | Values to consider as true. |
falseValues | map (array of strings) | {"false", "False", "0"} | Values to consider as false. |
nilValues | map (array of strings) | {"nil", "null", "NULL"} | Values to consider as nil. |
columnOptions | map | nil | Options for each column, indexed by column number (see below). |
internStrings | bool | false | Tells the parser to reuse the same strings rather than creating new duplicated ones. This option reduces the memory consumption of large CSVs but slightly decreases the parsing speed. |
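The interaction between the delimiter and decimal options can be sketched in Python. The source does not document how the parser guesses the delimiter, so guess_delimiter below is a deliberately naive assumption (the most frequent candidate on the first line); only the candidate set and the decimal fallback rule come from the table above. Both function names are hypothetical.

```python
CANDIDATES = [",", ";", "\t", "|"]

def guess_delimiter(sample):
    """Naive heuristic (assumption): the candidate that occurs most often on the first line."""
    lines = sample.splitlines()
    first_line = lines[0] if lines else ""
    return max(CANDIDATES, key=first_line.count)

def resolve_decimal(delimiter, decimal=None):
    """Documented fallback: ',' when the delimiter is ';' and decimal is not set, else '.'."""
    if decimal is not None:
        return decimal
    return "," if delimiter == ";" else "."

delimiter = guess_delimiter("price;qty\n1,5;3\n")
decimal = resolve_decimal(delimiter)
value = float("1,5".replace(decimal, "."))
print(repr(delimiter), repr(decimal), value)
```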
For the parser to work properly, the options must meet the following constraints:
- The characters used in delimiter, decimal and quote must all be different. In addition, they must not represent a line break. The supported line breaks are LF (\n) and CRLF (\r\n).
- If a string is present in one of the arrays representing the values true, false, nan, inf or nil, then it cannot be present in another of these arrays.
- The column names must all be different and of type string. If nil is found in the header row, the column name will be created as Unnamed: {column_index}.
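The constraints above lend themselves to a small pre-flight check. The Python sketch below is one possible way to express them and is not taken from the module. The function names are hypothetical, and the column index used for unnamed columns is assumed to start at 0, which the source does not confirm.

```python
def check_options(delimiter, decimal, quote, special_values):
    """Return a list of constraint violations (empty when the options are consistent)."""
    problems = []
    chars = {"delimiter": delimiter, "decimal": decimal, "quote": quote}
    if len(set(chars.values())) != len(chars):
        problems.append("delimiter, decimal and quote must all be different")
    for name, char in chars.items():
        if char in ("\n", "\r"):
            problems.append(f"{name} must not be a line break character")
    seen = {}  # value -> name of the first list that contains it
    for list_name, values in special_values.items():
        for value in values:
            if value in seen and seen[value] != list_name:
                problems.append(f"{value!r} appears in both {seen[value]} and {list_name}")
            seen.setdefault(value, list_name)
    return problems

def fill_column_names(header):
    """Replace missing (None) names with 'Unnamed: {column_index}' (0-based, an assumption)."""
    return [f"Unnamed: {i}" if name is None else name for i, name in enumerate(header)]

ok = check_options(",", ".", '"', {"trueValues": ["true", "1"], "falseValues": ["false", "0"]})
bad = check_options(";", ";", '"', {"trueValues": ["1"], "nilValues": ["1"]})
names = fill_column_names(["id", None, "name"])
```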
Column options
The following options are applied per column, and must be specified with the global option columnOptions.
Option name | Type | Default value | Description |
---|---|---|---|
name | string | nil | Column name to use. When the value is nil, the name is automatically parsed from the header row. |
type | string | nil | Type of values expected in the column. Allowed values are "bool", "int", "float" and "string". If the type is nil, the parser will automatically guess the type according to the parsed value. |
errorValuePolicy | string | setNil | Specifies what to do upon encountering a value that cannot be parsed in the specified type. Allowed values are :
Note that this parameter will have no effect if the type of the column is not specified, or if the type is “string”. |
defaultErrorValue | Value to be used when the policy
Note that this parameter will have no effect if the type of the column is not specified, or if the type is “string”. |
||
emptyValuePolicy | string | setNil | Specifies what to do upon encountering an empty value without quotes. Allowed values are :
|
defaultEmptyValue | Value to be used when the policy
|
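To show how a typed column with the default setNil policies might behave, here is a Python sketch. It covers only what the tables state: the four column types, the default true and false value lists, and setNil as the default for both errorValuePolicy and emptyValuePolicy. The function name convert is hypothetical, and the other policy values are not reproduced because they are not listed in this page.

```python
TRUE_VALUES = {"true", "True", "1"}
FALSE_VALUES = {"false", "False", "0"}

def convert(field, col_type):
    """Convert one unquoted field to col_type, using None where setNil would apply."""
    if field == "":
        return None                      # emptyValuePolicy "setNil"
    if col_type == "string":
        return field                     # error policy has no effect on strings
    try:
        if col_type == "int":
            return int(field)
        if col_type == "float":
            return float(field)
        if col_type == "bool":
            if field in TRUE_VALUES:
                return True
            if field in FALSE_VALUES:
                return False
            raise ValueError(field)
    except ValueError:
        return None                      # errorValuePolicy "setNil"
    raise ValueError(f"unknown column type: {col_type}")

print(convert("42", "int"), convert("4x", "int"), convert("True", "bool"))
```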