save

The save operator saves array data to a file.

In Enterprise Edition, file I/O restrictions may apply.

Synopsis

save(src_array, file_path [, instance_id [, format]] )

save(src_array, file_path [, instance: instance_id ] [, format: format ] )

Summary

The AFL save operator saves the data from the cells of a SciDB array into a file. By default, it saves the data in SciDB text format. To specify a different output format, use the format parameter.

Use the second form to specify parameters with named keywords. For example, you can write save(A, '/tmp/mydata', format: 'tsv') to save TSV data without having to remember that -2 is the default value of instance_id.
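As a rough illustration of the named-keyword form, the sketch below assembles the query text in Python. The helper name save_query and its arguments are hypothetical and not part of SciDB. It only builds a string and does not escape quote characters inside the path.

```python
def save_query(array, path, instance=None, fmt=None):
    """Build a save() call in the named-keyword form (hypothetical helper)."""
    args = [array, "'%s'" % path]
    if instance is not None:
        args.append("instance: %d" % instance)
    if fmt is not None:
        args.append("format: '%s'" % fmt)
    return "save(%s)" % ", ".join(args)


# Only the format is given; instance_id keeps its default.
print(save_query("A", "/tmp/mydata", fmt="tsv"))
# save(A, '/tmp/mydata', format: 'tsv')
```

The resulting string could then be passed to iquery -naq, as in the examples later on this page.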

The save operator saves the latest version of the array by default. To save a particular version, use arrayName@versionNumber for the source array, for example MYARRAY@5. To save all versions of an array, refer to the -a/--all-versions option of scidb_backup.py.

Inputs

  • src_array:  The source array containing the data you want to save to a file.

  • file_path:  The complete path to the file to receive the data.

  • instance_id:   Optional. Specifies the instance for performing the save. The default saves all data using the query coordinator instance, that is, the instance to which the client program is connected. The value must be one of the following:

  • format:  Optional. The format string lets you specify how to save the data. The default format is SciDB-formatted text. Note that in the positional form of save, you must include the instance_id parameter in order to specify an output format. The format string has two parts. The first part indicates the type of file to create. These may be:

Even though text is the default format, use a CSV or TSV format to save large amounts of data in non-binary form.  The text family of formats has subtle dependencies on array chunk sizes, and is not suitable for data interchange.

To preserve the exact binary value of floating point numbers, save your output in opaque or binary format. Saving floating point values in any other format can be lossy. Files saved in text, CSV, and TSV formats by prior releases kept only six significant decimal digits. SciDB 16.9 saves double values with 15 significant decimal digits in text, CSV, and TSV formats.
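The effect of the digit limits can be sketched in Python. This is an illustration only: the "%.6g" and "%.15g" conversions are assumed stand-ins for text output at 6 and 15 significant digits, not SciDB's own formatting code, and the struct round trip stands in for a binary save.

```python
import math
import struct

x = math.pi  # 3.141592653589793 as a double

# Round trip through decimal text with 6 and with 15 significant digits.
six = float("%.6g" % x)
fifteen = float("%.15g" % x)

# Round trip through the raw 8-byte representation of the double.
binary = struct.unpack("<d", struct.pack("<d", x))[0]

print(six == x, fifteen == x, binary == x)
# False False True
```

Fifteen digits is much closer than six, but only the binary round trip recovers the identical value here.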


For CSV and TSV formats, the second (optional) part of the format string consists of a colon followed by one or two option specifier characters.  Most of these options control how null values appear in the saved file. Note that these options affect only null values whose missing reason code is zero (?0). For null values with other missing reason codes (?1 - ?127), the output is unaffected. For example, a null value of ?42 will always appear in the TSV or CSV file as ?42.  See Special Values for Attributes for more about missing reason codes.

The specifier characters are:

Examples

For these examples, create a sparse two-dimensional array from a data file.  Some of the attributes are null (?0, ?42).

$ cat /tmp/example.tsv
0|0|?0|5.0
1|1|Marie Curie|3.14159265358979
2|2|Carl Sagan|?42
$ iquery -a
AFL% create array saveMe<who:string,val:double>[i=0:2; j=0:2];
Query was executed successfully
AFL% store(
       redimension(
         input(<i:int64,j:int64,who:string,val:double>[row=0:*],
               '/tmp/example.tsv', -2, 'tsv:p'),  -- :p means pipe '|' is field separator
         saveMe),
       saveMe);
Query was executed successfully
AFL% scan(saveMe);
{i,j} who,val
{0,0} null,5
{1,1} 'Marie Curie',3.14159
{2,2} 'Carl Sagan',?42
AFL% exit;

Save in SciDB 'text' Format

The default text format:

$ iquery -naq "save(saveMe, '/tmp/text.out')"
Query was executed successfully
$ cat /tmp/text.out
{0,0}[[(null,5),(),()],[(),('Marie Curie',3.14159265358979),()],[(),(),('Carl Sagan',?42)]]

Save in SciDB 'sparse' Format

The sparse format.  No placeholders; each non-contiguous cell is marked with {i,j} coordinate pairs.

$ iquery -naq "save(saveMe, '/tmp/sparse.out', -2, 'sparse')"
Query was executed successfully
$ cat /tmp/sparse.out
{0,0}[[{0,0}(null,5),{1,1}('Marie Curie',3.14159265358979),{2,2}('Carl Sagan',?42)]]

Save in CSV Format

The csv format. Notice that cell coordinates are not saved.

$ iquery -naq "save(saveMe, '/tmp/csv.out', -2, 'csv')"
Query was executed successfully
$ cat /tmp/csv.out
null,5
'Marie Curie',3.14159265358979
'Carl Sagan',?42

Save in TSV+ Format

The tsv+ format records the cell coordinates.  The :l ("colon ell") option specifier produces a nameline.  Notice that the ordinary null appears as \N, but the missing reason 42 null appears as ?42 (you can control this behavior with option specifiers).

$ iquery -naq "save(saveMe, '/tmp/tsvplus.out', -2, 'tsv+:l')"
Query was executed successfully
$ cat /tmp/tsvplus.out
i	j	who	val
0	0	\N	5
1	1	Marie Curie	3.14159265358979
2	2	Carl Sagan	?42
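A consumer of this file has to tell ordinary nulls (\N) and missing reason codes (?42) apart from real values. The following is a minimal sketch of one way to do that in Python. The SAVED text is transcribed from the output above with tab separators, and the decode_field helper and its ("missing", code) tuple are hypothetical conventions. A genuine string value that looks like ?42 would be misread by this sketch.

```python
# Contents of /tmp/tsvplus.out as shown above, with tab separators.
SAVED = (
    "i\tj\twho\tval\n"
    "0\t0\t\\N\t5\n"
    "1\t1\tMarie Curie\t3.14159265358979\n"
    "2\t2\tCarl Sagan\t?42\n"
)


def decode_field(text):
    """Map \\N to None and ?NN to a ("missing", NN) tuple; keep other text."""
    if text == "\\N":
        return None
    if text.startswith("?") and text[1:].isdigit():
        return ("missing", int(text[1:]))
    return text


# The first line is the nameline produced by the :l option.
header, *records = [line.split("\t") for line in SAVED.splitlines()]
cells = [dict(zip(header, map(decode_field, rec))) for rec in records]

for cell in cells:
    print(cell)
```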

Save in Binary Format

Schema attributes are nullable by default, but types in binary format strings are not.  See Binary Files for how to interpret the od program output.

$ iquery -naq "save(saveMe, '/tmp/binary.out', -2, '(string null, double null)')"
Query was executed successfully
$ od -c /tmp/binary.out
0000000  \0  \0  \0  \0  \0 377  \0  \0  \0  \0  \0  \0 024   @ 377  \f
0000020  \0  \0  \0   M   a   r   i   e       C   u   r   i   e  \0 377
0000040 021   -   D   T 373   !  \t   @ 377  \v  \0  \0  \0   C   a   r
0000060   l       S   a   g   a   n  \0   *  \0  \0  \0  \0  \0  \0  \0
0000100  \0
0000101
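The dump above can be decoded with a short Python sketch. The layout used here is inferred from the dump itself and should be treated as an assumption; see Binary Files for the authoritative description. The assumed layout is: each nullable field starts with one indicator byte (0xFF for a real value, otherwise the missing reason code), a string is a 4-byte little-endian length followed by NUL-terminated text, and a double is 8 little-endian bytes. The name decode_cells is hypothetical.

```python
import struct

# The 65 bytes shown by od -c above, transcribed by hand.
DUMP = (
    b"\x00\x00\x00\x00\x00"                    # who: null ?0, length 0
    b"\xff\x00\x00\x00\x00\x00\x00\x14\x40"    # val: 5.0
    b"\xff\x0c\x00\x00\x00Marie Curie\x00"     # who: 12 bytes including NUL
    b"\xff\x11\x2d\x44\x54\xfb\x21\x09\x40"    # val: 3.14159265358979
    b"\xff\x0b\x00\x00\x00Carl Sagan\x00"      # who: 11 bytes including NUL
    b"\x2a\x00\x00\x00\x00\x00\x00\x00\x00"    # val: null ?42, zero payload
)


def decode_cells(buf):
    """Decode (string null, double null) records under the assumed layout."""
    cells, pos = [], 0
    while pos < len(buf):
        # string field: indicator byte, 4-byte length, then the bytes
        flag = buf[pos]
        (length,) = struct.unpack_from("<I", buf, pos + 1)
        raw = buf[pos + 5:pos + 5 + length]
        who = raw[:-1].decode() if flag == 0xFF else "?%d" % flag
        pos += 5 + length
        # double field: indicator byte, then 8 bytes
        flag = buf[pos]
        (number,) = struct.unpack_from("<d", buf, pos + 1)
        val = number if flag == 0xFF else "?%d" % flag
        pos += 9
        cells.append((who, val))
    return cells


print(decode_cells(DUMP))
```

Under these assumptions the three cells come back as a ?0 null with 5.0, Marie Curie with 3.14159265358979, and Carl Sagan with a ?42 null, matching the text-format output earlier on this page.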

Remove the array:

$ iquery -naq "remove(saveMe)"