filter
The filter operator produces a result array, filtering out elements based on a supplied boolean expression.
Synopsis
filter(array,expression);
Summary
The filter operator filters out data from an array based on an expression over the attribute and dimension values. The result array has the same schema as the supplied array. Each cell of the result array is either empty or identical to the corresponding cell of the supplied array according to the following rules. If the expression at the cell evaluates to:
- TRUE, the result cell is identical to the supplied cell.
- FALSE, the result cell is empty.
- NULL, the result cell is empty.
Examples
Filter an Array to Remove Outlying Values
To filter an array to remove outlying values, do the following:
Create a 4×4 array:
AFL% CREATE ARRAY m4x4<val:double>[i=0:3; j=0:3];
Put values between 0 and 15 into the non-diagonal elements of m4x4 and values greater than 100 into the diagonal elements:
AFL% store(build(m4x4,iif(i=j,100+i,i*4+j)),m4x4);
The output is:[ [(100),(1),(2),(3)], [(4),(101),(6),(7)], [(8),(9),(102),(11)], [(12),(13),(14),(103)] ]
Filter all values of 100 or greater out of m4x4:
AFL% filter(m4x4,val<100);
The output is:[ [(),(1),(2),(3)], [(4),(),(6),(7)], [(8),(9),(),(11)], [(12),(13),(14),()] ]
Use a Regular Expression to Filter the List of Available SciDB Operators
To use a regular expression to filter the list of available SciDB operators, and return the operators containing
the letter 'q', do the following:
AFL% filter(list('operators'),regex(name,'(.*)q(.*)'));
The output is:
name,library 'quantile','scidb' 'uniq','scidb'
Hint on Filtering by Dimension
Since 2017, SciDB's filter() operator is significantly improved over previous releases, when the expression involves dimension ranges. Query processing will focus on data in the dimension ranges. So the speedup over previous releases is more if the specified dimension ranges cover a smaller portion of the space.
In order to receive full benefit of the optimization, structure your expression in the form of ((dimension_conjunctions or ...) and other_conditions). For instance, let your array have two dimensions x and y; in order to filter for records in the first or the third quadrants, and with some other conditions, write your expression as: (((x>0 and y>0) or (x<0 and y<0)) and other_conditions).