Configuring The secure_scan Operator
This section describes how to configure secured arrays and permissions arrays for installations that use the secure_scan operator.
This section describes the secure_scan operator that is integrated with the core SciDB database engine, not the deprecated external plugin version. If you are upgrading a cluster that used the plugin version, YOU MUST run unload_library('secure_scan') BEFORE PERFORMING THE UPGRADE. See the Release Notes.
Secured Arrays
Secured arrays are ordinary stored arrays that are divided into datasets along a dataset dimension. By default the dataset dimension is called dataset_id, but other names can be configured (see below).
Best practice is to make the dataset dimension the first dimension of the secured array's schema and to give it a chunk size of 1. For example:
<red:int16, green:int16, blue:int16>[dataset_id=0:*:0:1; x=0:1919:0:1920; y=0:1079:0:1080]
This way, each physical chunk of the secured array represents data from exactly one dataset, and SciDB does not need to filter individual chunks to separate cells from different datasets.
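The effect of the chunk size on dataset separation can be illustrated with a small Python model. This is only a sketch of regular-grid chunking arithmetic, not SciDB code; the helper names chunk_index and datasets_in_chunk are hypothetical, and the interval of 4 is an invented counterexample.

```python
def chunk_index(coord: int, low: int, interval: int) -> int:
    """Index of the chunk that holds `coord` along one dimension."""
    return (coord - low) // interval


def datasets_in_chunk(chunk: int, low: int, interval: int) -> range:
    """Dataset ids that fall into chunk number `chunk` along the dataset dimension."""
    start = low + chunk * interval
    return range(start, start + interval)


# dataset_id=0:*:0:1 has chunk interval 1, so chunk 5 holds only dataset 5.
print(list(datasets_in_chunk(5, low=0, interval=1)))

# With a hypothetical interval of 4, chunk 1 would mix datasets 4 through 7,
# and cells would have to be filtered individually.
print(list(datasets_in_chunk(1, low=0, interval=4)))
```

With an interval of 1 the chunk number and the dataset id coincide, which is why whole chunks can be kept or skipped per dataset.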
Since the public namespace is world-readable, secured arrays must reside in a non-public namespace. Ordinary cluster users should be granted list access (but not read access) to the secured array namespace. See Namespaces and Permissions.
There are three circumstances in which a secure_scan user can access all datasets in a secured array:
User scidbadmin can read all of a secured array.
Any user with the admin role can read all of a secured array. See Roles.
Any user with read access to the secured array's namespace can read all arrays in that namespace.
Granting namespace read access to a user allows the user to read the entire array. This may be desirable for certain privileged users who are permitted full access to all datasets in the secured namespace. However, it is contrary to the motivation for using secure_scan in the first place.
Permissions Arrays
A permissions array is a two-dimensional array with user and dataset dimensions. The first bool attribute in a permissions array cell determines access. Here is an example permissions array schema:
AFL% create array perms.permarray <allow:bool> [ user_id=-1:*; dataset_id=0:* ] ;
If allow is true in the cell at location {12,5}, then user 12 will be able to read dataset 5 when she calls secure_scan on any secured array bound to permissions array perms.permarray. (You can discover the user ids of particular users with the list('users') operator.)
The user id -1 is special. If the permissions array cell at {-1,7} is true, then dataset 7 is a public dataset accessible to all secure_scan users.
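The lookup rule described above can be sketched in Python. This is a simplified model rather than the operator's implementation: the permissions array is represented as a dictionary keyed by (user_id, dataset_id), the function name can_read is hypothetical, and treating an empty cell as "no access" is an assumption.

```python
from typing import Dict, Tuple

PUBLIC_USER_ID = -1  # the special user id described above


def can_read(perms: Dict[Tuple[int, int], bool], user_id: int, dataset_id: int) -> bool:
    """True if the user's own cell or the public (-1) cell allows the dataset."""
    own = perms.get((user_id, dataset_id), False)          # assumption: empty cell denies
    public = perms.get((PUBLIC_USER_ID, dataset_id), False)
    return own or public


# Cells mirroring the examples in the text: {12,5} and {-1,7} are true.
perms = {(12, 5): True, (-1, 7): True}

print(can_read(perms, 12, 5))   # user 12 has her own grant on dataset 5
print(can_read(perms, 30, 7))   # dataset 7 is public
print(can_read(perms, 30, 5))   # user 30 has no grant on dataset 5
```

The model covers only the permissions array itself; it does not cover scidbadmin, the admin role, or namespace read access, which bypass the permissions array entirely.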
In SciDB release 22.5 and earlier, create permissions arrays with distribution replicated to save overhead for secure_scan calls that reference them:
AFL% create array perms.permarray <allow:bool> [ user_id=-1:*; dataset_id=0:* ]
CON> distribution replicated;
Since permissions arrays are typically small, the storage cost of keeping them replicated is also small. If the permissions array is not stored replicated, secure_scan will replicate (but not store) it during each invocation. (In later releases, permissions arrays are cached in memory, so the performance penalty for not replicating is eliminated for most queries.)
Permissions arrays should reside in a namespace where only administrators have access.
Neither secured arrays nor permissions arrays may reside in the public namespace.
Permissions arrays may not be temp arrays. They should not be empty.
secure-scan-config Configuration Parameter
The secure_scan operator can work only on secured arrays that are bound to permissions arrays.
Secured arrays are bound to permissions arrays using the secure-scan-config SciDB configuration parameter in the cluster config.ini file. The value of secure-scan-config is a JSON object describing all permissions arrays and secured array bindings.
You must restart the SciDB cluster for any change to the secure-scan-config parameter to take effect. Changes made with _setopt will have no effect.
After restarting the cluster, run a few secure_scan test queries and examine the coordinator instance scidb.log file for errors. Some configuration errors can't be detected until secure_scan tries to use the permissions array binding.
Here is an example config.ini secure-scan-config setting:
The entire multi-line JSON string is enclosed in single quotes.
There are two sections, permissions and secured. Each section contains a JSON array of descriptor objects.
Descriptor objects contain an array entry with the fully qualified name of the referent secured array or permissions array.
The perms.permissions array on line 3 is the default permissions array. If a secure_scan secured array is not mentioned in the secured section, this permissions array will be used. There are no dataset-dim or user-dim qualifiers on this entry, so perms.permissions uses the default names for these dimensions, dataset_id and user_id.
The perms.sis_id permissions array on line 4 has a custom dataset dimension name, sis_id. Secured arrays that use this permissions array must have a matching dataset dimension name.
The perms.uid array on line 5 has a custom user dimension name, uid. The secure_scan operator will use this dimension rather than user_id to match against the calling user's actual id. You can customize either or both dimension names for each permissions array.
The secured section starts on line 6 and contains bindings from secured data arrays to permissions arrays. When secure_scan is called on an array that doesn't have an entry in the secured section, secure_scan will use the default permissions array if one is defined; otherwise it will abort the query with an error.
Wildcarding the array name portion of a secured entry's fully qualified name using * (lines 7 and 9) gives all arrays in that namespace access using the permissions array named in the perm element.
Even if a namespace-wide binding is in effect, individual arrays within a namespace can always independently specify their own permissions array binding (lines 8 and 10).
No permissions array or secured array can be in the public namespace. Those secure-scan-config entries will be ignored.
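The binding rules listed above can be summarized with a short Python sketch. It is an illustrative model under stated assumptions, not SciDB's implementation: the function resolve_permissions_array is hypothetical, the secured array names ns1.* and ns1.special are invented for the example, and the default permissions array is passed as a separate argument because this sketch does not assume how the default is marked in the JSON. Only the array and perm keys are taken from the description above.

```python
from typing import Dict, List, Optional


def resolve_permissions_array(array_name: str,
                              secured: List[Dict[str, str]],
                              default_perm: Optional[str] = None) -> str:
    """Pick the permissions array for a fully qualified secured array name."""
    namespace, _, _ = array_name.rpartition(".")
    if namespace in ("", "public"):
        # Assumption: an unqualified name is treated as living in public.
        raise ValueError("secured arrays may not reside in the public namespace")

    exact = None
    wildcard = None
    for entry in secured:
        name = entry["array"]
        if name.startswith("public."):
            continue  # entries in the public namespace are ignored
        if name == array_name:
            exact = entry["perm"]          # per-array binding
        elif name == namespace + ".*":
            wildcard = entry["perm"]       # namespace-wide binding

    # A per-array binding wins over a wildcard, which wins over the default.
    chosen = exact or wildcard or default_perm
    if chosen is None:
        raise LookupError("no permissions array bound to " + array_name)
    return chosen


# Hypothetical bindings; only the permissions array names come from the text.
SECURED = [
    {"array": "ns1.*", "perm": "perms.sis_id"},
    {"array": "ns1.special", "perm": "perms.uid"},
    {"array": "public.*", "perm": "perms.uid"},
]

print(resolve_permissions_array("ns1.special", SECURED, "perms.permissions"))
print(resolve_permissions_array("ns1.other", SECURED, "perms.permissions"))
print(resolve_permissions_array("ns2.data", SECURED, "perms.permissions"))
```

In this model an array with no binding and no default raises an error, mirroring the query abort described above.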