quantile
 The quantile operator returns the quantiles of the specified array.
Synopsis
quantile( srcArray, q_num [ , attribute [ , dimension_n ... ] ] )
Summary
A q-quantile is a point taken at a specified interval on a sorted data set; together, the q-quantiles divide the data set into q subsets. The quantiles are the data values marking the boundaries between consecutive subsets.
You specify the source array (srcArray) and the number of quantiles (q_num). Optionally, you can specify an attribute and one or more dimensions for grouping. To group by one or more dimensions, you must also specify the attribute.
Note the following:
- The quantile operator returns q_num+1 values, which correspond to the lower and upper bounds for each subset.
- The quantile operator returns the same datatype as the attribute.
- The q_num argument must be a positive integer. Otherwise SciDB returns an error.
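The notes above can be illustrated with a minimal Python sketch. It is not SciDB code: the function name quantile_points is hypothetical, and the nearest-rank selection rule is an assumption that may differ from SciDB's own rule when a boundary falls between two values. It shows only that q_num quantiles yield q_num+1 boundary values, each taken from the data itself, and that a non-positive q_num is rejected.

```python
def quantile_points(values, q_num):
    """Return q_num + 1 (percentage, value) pairs for the given values.

    Assumption: a nearest-rank rule on the sorted data. SciDB's exact
    selection rule is not stated in this document and may differ.
    """
    # Mirror the documented constraint: q_num must be a positive integer.
    if isinstance(q_num, bool) or not isinstance(q_num, int) or q_num < 1:
        raise ValueError("q_num must be a positive integer")
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        return []
    points = []
    for k in range(q_num + 1):
        # Nearest rank: ceil(k * n / q_num), then converted to a 0-based index.
        rank = -(-k * n // q_num)
        points.append((k / q_num, ordered[max(rank - 1, 0)]))
    return points


# Four quantiles of nine values give five boundary values.
print(quantile_points([1, 2, 3, 4, 5, 6, 7, 8, 9], 4))
# [(0.0, 1), (0.25, 3), (0.5, 5), (0.75, 7), (1.0, 9)]
```

Each returned value is an element of the input, so the result keeps the datatype of the source values.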
Inputs
The quantile operator takes the following arguments:
- srcArray: A source array with one or more attributes and one or more dimensions.
- q_num: The number of quantiles.
- attribute: An optional attribute to use for the quantiles. If you don't specify an attribute, SciDB uses the first one.
- dimension_n: An optional list of dimensions to group by.
Examples
Calculate the 2-Quantile for a 1-Dimensional Array
To calculate the 2-quantile for a 1-dimensional array, do the following:
Create a 1-dimensional array called quantile_array:
AFL% create array quantile_array <val:int64>[i=0:10];
The output is:
Query was executed successfully.
Put eleven numerical values between 0 and 11 into quantile_array:
AFL% store(build(quantile_array, '[10,3,0,3,4,5,9,11,7,3,3]', true), quantile_array);
The output is:
{i} val
{0} 10
{1} 3
{2} 0
{3} 3
{4} 4
{5} 5
{6} 9
{7} 11
{8} 7
{9} 3
{10} 3
Find the 2-quantile of quantile_array:
AFL% quantile(quantile_array,2);
The output is:
{quantile} percentage,val_quantile
{0} 0,0
{1} 0.5,4
{2} 1,11
Remove quantile_array:
AFL% remove(quantile_array);
The output is:
Query was executed successfully.
The Group-by-Dimension Parameter
To use the group-by-dimension parameter, do the following:
Start with a 5x5 array, with a single, integer attribute:
AFL% create array m5x5<val:int32>[i=0:4; j=0:4];
The output is:
Query was executed successfully.
Initialize the data in the array:
AFL% store(build(m5x5, '[[16,13,22,7,13],[11,19,23,21,24],[16,21,15,7,16],[10,19,0,23,23],[12,7,18,7,8]]', true), m5x5);
The output is:
{i,j} val
{0,0} 16
{0,1} 13
{0,2} 22
{0,3} 7
{0,4} 13
{1,0} 11
{1,1} 19
{1,2} 23
{1,3} 21
{1,4} 24
{2,0} 16
{2,1} 21
{2,2} 15
{2,3} 7
{2,4} 16
{3,0} 10
{3,1} 19
{3,2} 0
{3,3} 23
{3,4} 23
{4,0} 12
{4,1} 7
{4,2} 18
{4,3} 7
{4,4} 8
Find the 2-quantile of the whole array, then grouped by the first dimension (i), and then grouped by the second dimension (j):
AFL% quantile(m5x5,2);
The output is:
{quantile} percentage,val_quantile
{0} 0,0
{1} 0.5,16
{2} 1,24
AFL% quantile(m5x5,2,val,i);
The output is:
{i,quantile} percentage,val_quantile
{0,0} 0,7
{0,1} 0.5,13
{0,2} 1,22
{1,0} 0,11
{1,1} 0.5,21
{1,2} 1,24
{2,0} 0,7
{2,1} 0.5,16
{2,2} 1,21
{3,0} 0,0
{3,1} 0.5,19
{3,2} 1,23
{4,0} 0,7
{4,1} 0.5,8
{4,2} 1,18
AFL% quantile(m5x5,2,val,j);
The output is:
{j,quantile} percentage,val_quantile
{0,0} 0,10
{0,1} 0.5,12
{0,2} 1,16
{1,0} 0,7
{1,1} 0.5,19
{1,2} 1,21
{2,0} 0,0
{2,1} 0.5,18
{2,2} 1,23
{3,0} 0,7
{3,1} 0.5,7
{3,2} 1,23
{4,0} 0,8
{4,1} 0.5,16
{4,2} 1,24
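To make the grouping behavior concrete, the following Python sketch mimics the grouped queries on the same m5x5 data. It is an illustration only, not SciDB code: the function name grouped_quantiles is hypothetical, and the nearest-rank rule is an assumption, checked here only against the 2-quantile outputs shown in this example. Cells that share a coordinate along the grouping dimension are collected together, and the quantiles are computed separately within each group.

```python
from collections import defaultdict


def grouped_quantiles(cells, q_num, group_dim):
    """Compute quantiles per group.

    cells maps (i, j) coordinates to values; group_dim is 0 to group
    by i or 1 to group by j. Returns {(group, k): (percentage, value)}.
    Assumption: nearest-rank selection on each group's sorted values.
    """
    groups = defaultdict(list)
    for coords, value in cells.items():
        groups[coords[group_dim]].append(value)

    result = {}
    for key in sorted(groups):
        ordered = sorted(groups[key])
        n = len(ordered)
        for k in range(q_num + 1):
            # ceil(k * n / q_num), then converted to a 0-based index.
            rank = -(-k * n // q_num)
            result[(key, k)] = (k / q_num, ordered[max(rank - 1, 0)])
    return result


rows = [
    [16, 13, 22, 7, 13],
    [11, 19, 23, 21, 24],
    [16, 21, 15, 7, 16],
    [10, 19, 0, 23, 23],
    [12, 7, 18, 7, 8],
]
cells = {(i, j): v for i, row in enumerate(rows) for j, v in enumerate(row)}

by_i = grouped_quantiles(cells, 2, 0)  # analogous to quantile(m5x5,2,val,i)
by_j = grouped_quantiles(cells, 2, 1)  # analogous to quantile(m5x5,2,val,j)
print(by_i[(0, 1)])  # (0.5, 13)
print(by_j[(3, 1)])  # (0.5, 7)
```

The result keys pair the group coordinate with the quantile index, which parallels the {i,quantile} and {j,quantile} dimensions in the outputs above.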
Remove the array:
AFL% remove(m5x5);
The output is:
Query was executed successfully.