File I/O Restrictions

Non-privileged user accounts may not read or write arbitrary files on SciDB server hosts.  Their file I/O operations are limited to per-user directories and to directories listed in the io-paths-list configuration option.

  • Relative path reads and writes occur in per-user subdirectories of the instance data directory.  Parallel file I/O occurs in these directories on each instance in the cluster.
  • Absolute path reads and writes are permitted only in directories listed in the io-paths-list, or in their subdirectories.  Parallel file I/O cannot be used with absolute paths.
  • The special file /dev/null is always readable and writeable by any user.


File I/O restrictions do not apply to the scidbadmin user account or to user accounts granted the admin role.  They do, however, apply to user accounts granted the operator role.  See Roles.

Affected Operators

AFL operators that perform file I/O are:

Per-user Directories

Whenever a non-administrative user reads or writes a file with a relative path name, that file will be read from or written to the appropriate per-user directory on the instance or instances involved in the query.  Users cannot escape their per-user directory using ".." path components.
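The containment rule above can be illustrated with a minimal Python sketch.  This is not SciDB's implementation: the function name resolve_relative is hypothetical, the check is purely lexical (it ignores symbolic links), and rejecting an escaping path with an error is an assumption, since the text does not say how SciDB reports such an attempt.

```python
import posixpath


def resolve_relative(user_dir, relative_path):
    """Map a relative path into user_dir, rejecting paths that would escape it.

    Hypothetical helper for illustration only; SciDB's real check may differ.
    """
    if posixpath.isabs(relative_path):
        raise ValueError("expected a relative path")
    root = posixpath.normpath(user_dir)
    # normpath collapses "." and ".." components lexically.
    candidate = posixpath.normpath(posixpath.join(root, relative_path))
    # Assumption: an escaping path is rejected rather than silently clamped.
    if candidate != root and not candidate.startswith(root + "/"):
        raise PermissionError(f"{relative_path!r} escapes the per-user directory")
    return candidate
```

Under these assumptions, 'hello.txt' and 'sub/../hello.txt' both resolve to the same file inside the per-user directory, while '../joe/data.csv' is refused.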

Locating Per-user Directories

The location of per-user directories is dictated by the base-path configuration option and any applicable data-dir-prefix-*-* configuration options.  See Configuring SciDB.

For example, suppose the SciDB cluster is configured with a base-path of /usr/datafiles/scidb .  Then the instance data directory for server 0, instance 1 would be /usr/datafiles/scidb/0/1/ .  For a user account joe, the per-user directory for file I/O would then be /usr/datafiles/scidb/0/1/users/joe/ .

Suppose the SciDB cluster configuration includes a data-dir-prefix-1-2 directive with value /vdisk1/more_data for instance 2 running on server 1.   On that instance only, user joe will have a per-user directory of /vdisk1/more_data/users/joe .
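The two examples above follow one lookup rule, sketched here in Python.  The function per_user_dir and the dictionary layout used for the data-dir-prefix-*-* values are hypothetical conveniences for illustration; they are not part of SciDB or of the config.ini format.

```python
import posixpath


def per_user_dir(base_path, server_id, instance_id, user, data_dir_prefixes=None):
    """Return the per-user file I/O directory for one instance.

    data_dir_prefixes maps (server_id, instance_id) to the value of the
    matching data-dir-prefix-<server>-<instance> option, if one is set.
    """
    prefixes = data_dir_prefixes or {}
    # A data-dir-prefix entry replaces the default
    # <base-path>/<server>/<instance> location for that instance only.
    instance_dir = prefixes.get(
        (server_id, instance_id),
        posixpath.join(base_path, str(server_id), str(instance_id)),
    )
    return posixpath.join(instance_dir, "users", user)


prefixes = {(1, 2): "/vdisk1/more_data"}
print(per_user_dir("/usr/datafiles/scidb", 0, 1, "joe", prefixes))
print(per_user_dir("/usr/datafiles/scidb", 1, 2, "joe", prefixes))
```

The first call prints /usr/datafiles/scidb/0/1/users/joe and the second prints /vdisk1/more_data/users/joe, matching the two cases described above.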

Managing Per-user Directories

Automatic creation

Per-user directories are created automatically the first time they are referenced.  For example, here newly-created user account betty does not yet have a per-user directory:

$ ls /usr/datafiles/scidb/0/0/users/betty
ls: cannot access /usr/datafiles/scidb/0/0/users/betty: No such file or directory
$ 

Betty then attempts to read a non-existent file.  The query fails, but merely referencing the relative path causes her per-user directory to be created:

$ iquery -A /tmp/betty.auth -aq "input(<s:string>[i=0:0], 'hello.txt')"
UserQueryException in file: src/query/ops/input/LogicalInput.cpp function: inferSchema line: 194
Error id: scidb::SCIDB_SE_INFER_SCHEMA::SCIDB_LE_FILE_NOT_FOUND
Error description: Error during schema inferring. File '"/usr/datafiles/scidb/0/0/users/betty/hello.txt"' not found.
input(<s:string>[i=0:0], 'hello.txt')
                        ^^^^^^^^^^^^
$ ls -ld /usr/datafiles/scidb/0/0/users/betty/
drwx------. 2 scidb scidb 4096 Jul 10 17:20 /usr/datafiles/scidb/0/0/users/betty/
$

Automatically created per-user directories are owned by the Linux account that runs the SciDB executable (here, scidb) and have file permissions that allow read, write, and search access (rwx) by that user and no one else.  To allow scp(1) access to the directory, the system administrator can manually alter the directory ownership and permissions as desired, but must allow at least read+search (r-x) access for the SciDB executable account.

Manual creation

Per-user directories can also be created manually by a system administrator using the base-path and data-dir-prefix-*-* values in the cluster's config.ini file, as described in the previous section.

The users directory, which is the parent of all per-user directories on an instance, may be a symbolic link.  System administrators must configure this symbolic link manually.

If SciDB detects any problem with the users symbolic link or the directory it refers to, SciDB will log an error and refuse to start.  For example, the users symbolic link must not refer to any directory covered by the io-paths-list (see below), because that would give users unrestricted access to each other's data files.

Disk space management

Files and subdirectories in per-user directories are not automatically removed or compressed, nor can they be removed by the owning user via an AFL query.  System administrators may want to periodically garbage collect or monitor space usage in these directories.
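As one way to monitor space usage, an administrator could run a small script against the users directory of each instance.  The following Python sketch is only an illustration: the function per_user_usage is hypothetical, it reports apparent file sizes rather than allocated disk blocks, and it must run under an account that can read the per-user directories.

```python
import os


def per_user_usage(users_dir):
    """Return {user_name: total_bytes} for each per-user directory in users_dir."""
    usage = {}
    for entry in os.scandir(users_dir):
        # Only real subdirectories are treated as per-user directories.
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        for root, _dirs, files in os.walk(entry.path):
            for name in files:
                try:
                    # lstat avoids following symbolic links out of the tree.
                    total += os.lstat(os.path.join(root, name)).st_size
                except OSError:
                    # The file vanished or is unreadable; skip it.
                    pass
        usage[entry.name] = total
    return usage
```

For the configuration used in the earlier examples, calling per_user_usage('/usr/datafiles/scidb/0/0/users') would report one total per user on server 0, instance 0.  The same call has to be repeated for every instance, because each instance has its own users directory.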

The io-paths-list Configuration Option

By default, non-administrative users may not read or write files using absolute path names.  System administrators can change this behavior by setting the io-paths-list configuration option in the config.ini file and restarting the SciDB cluster.

The io-paths-list is a colon-separated list of absolute directory names.  A typical setting for io-paths-list is:

io-paths-list = /tmp:/dev/shm

This allows all cluster users to read and write files in /tmp and /dev/shm, and in any of their subdirectories.
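The rules for absolute paths (a listed directory, any of its subdirectories, or the special file /dev/null) can be sketched in Python as follows.  The function is_absolute_path_allowed is hypothetical and does not reproduce SciDB's actual check; in particular, it compares paths lexically and ignores symbolic links.

```python
import posixpath


def is_absolute_path_allowed(path, io_paths_list):
    """Return True if a non-administrative user may access the absolute path.

    io_paths_list is the colon-separated option value, e.g. "/tmp:/dev/shm".
    """
    if not posixpath.isabs(path):
        raise ValueError("expected an absolute path")
    # Collapse "." and ".." so that /tmp/../etc is not mistaken for /tmp.
    target = posixpath.normpath(path)
    if target == "/dev/null":
        return True  # always readable and writable by any user
    for entry in io_paths_list.split(":"):
        if not entry:
            continue  # an empty list permits no absolute paths
        allowed = posixpath.normpath(entry)
        # Match the directory itself or anything beneath it.
        if target == allowed or target.startswith(allowed.rstrip("/") + "/"):
            return True
    return False


print(is_absolute_path_allowed("/tmp/load/data.csv", "/tmp:/dev/shm"))
print(is_absolute_path_allowed("/etc/passwd", "/tmp:/dev/shm"))
```

With the typical setting shown above, the first call prints True and the second prints False.  Comparing against the directory name plus a trailing slash keeps a sibling such as /tmpfoo from matching /tmp.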