sort
The sort operator produces a 1-dimensional result array by sorting non-empty cells of a source array.
Synopsis
sort(array [, attribute [asc|desc]]... [, chunk_size] [, dimension: name ] );
Summary
The sort operator produces a one-dimensional result array, even if the source array has multiple dimensions. The result array contains each non-empty cell of the source array.
- The result array does not show values of the original dimensions in the source array.
- If the
dimension: name
keyword parameter is given, the output dimension will have the supplied name. Otherwise the result array's dimension name is $n
. - The output dimension is unbounded.
- You can use the sort operator to sort by one or more attributes. The operator first sorts by the first attribute, then by the second, and so on.
- Use the optional keyword asc or desc to control the sort order for each attribute, ascending or descending. The default is ascending.
- You can control the chunk size of the resulting array with the optional chunk_size parameter.
- To sort by dimensions, use the apply operator to return a dimension as an attribute, and then sort.
- null values are sorted as having a rank lower than any non-null value. See Semantics of Special Values.
Inputs
The sort operator takes the following arguments:
- array: A source array with one or more attributes and one or more dimensions.
- attribute: Zero or more attributes. If you do not provide an attribute, SciDB uses the first attribute in the source array.
- chunk_size: An optional chunk size for the result array.
Examples
The examples use an array 'champions'. To create the array do the following:
Create a file named '/tmp/winners.csv' with the following contents:
1996,0,Bailey,Canada,9.84 1996,1,Thugwane,USA,7956 1996,2,Keter,Kenya,487.12 2000,0,Greene,USA,9.87 2000,1,Abera,Ethiopia,7811 2000,2,Kosgei,Kenya,503.17 2004,0,Gatlin,USA,9.85 2004,1,Baldini,Italy,7855 2004,2,Kemboi,Kenya,485.81 2008,0,Bolt,Jamaica,9.69 2008,1,Wanjiru,Kenya,7596 2008,2,Kipruto,Kenya,490.34
Create an a flatwinners array and load the file:
create array flatwinners <year:int64, event_id:int64, person:string, country:string, time:double>[i=0:99]; load(flatwinners, '/tmp/winners.csv', -2 , 'csv');
The output is:Query was executed successfully {i} year,event_id,person,country,time {0} 1996,0,'Bailey','Canada',9.84 {1} 1996,1,'Thugwane','USA',7956 {2} 1996,2,'Keter','Kenya',487.12 {3} 2000,0,'Greene','USA',9.87 {4} 2000,1,'Abera','Ethiopia',7811 {5} 2000,2,'Kosgei','Kenya',503.17 {6} 2004,0,'Gatlin','USA',9.85 {7} 2004,1,'Baldini','Italy',7855 {8} 2004,2,'Kemboi','Kenya',485.81 {9} 2008,0,'Bolt','Jamaica',9.69 {10} 2008,1,'Wanjiru','Kenya',7596 {11} 2008,2,'Kipruto','Kenya',490.34
Redimension the flatwinners and store the result in a new array, champions:
store(redimension(flatwinners,<person:string,country:string,time:double>[year=1900:*; event_id=0:*]),champions);
The output is:{year,event_id} person,country,time {1996,0} 'Bailey','Canada',9.84 {1996,1} 'Thugwane','USA',7956 {1996,2} 'Keter','Kenya',487.12 {2000,0} 'Greene','USA',9.87 {2000,1} 'Abera','Ethiopia',7811 {2000,2} 'Kosgei','Kenya',503.17 {2004,0} 'Gatlin','USA',9.85 {2004,1} 'Baldini','Italy',7855 {2004,2} 'Kemboi','Kenya',485.81 {2008,0} 'Bolt','Jamaica',9.69 {2008,1} 'Wanjiru','Kenya',7596 {2008,2} 'Kipruto','Kenya',490.34
Scan a 2-Dimension array, Sort it Several Ways
To sort it by ascending country, then sort it by descending country and ascending person , do the following:
Sort by country (ascending):
AFL% sort(champions,country);
The output is:{$n} person,country,time {0} 'Bailey','Canada',9.84 {1} 'Abera','Ethiopia',7811 {2} 'Baldini','Italy',7855 {3} 'Bolt','Jamaica',9.69 {4} 'Keter','Kenya',487.12 {5} 'Kosgei','Kenya',503.17 {6} 'Kemboi','Kenya',485.81 {7} 'Wanjiru','Kenya',7596 {8} 'Kipruto','Kenya',490.34 {9} 'Thugwane','USA',7956 {10} 'Greene','USA',9.87 {11} 'Gatlin','USA',9.85
To sort by country descending, and then by person ascending:
AFL% sort(champions,country desc, person asc);
The output is:{$n} person,country,time {0} 'Gatlin','USA',9.85 {1} 'Greene','USA',9.87 {2} 'Thugwane','USA',7956 {3} 'Kemboi','Kenya',485.81 {4} 'Keter','Kenya',487.12 {5} 'Kipruto','Kenya',490.34 {6} 'Kosgei','Kenya',503.17 {7} 'Wanjiru','Kenya',7596 {8} 'Bolt','Jamaica',9.69 {9} 'Baldini','Italy',7855 {10} 'Abera','Ethiopia',7811 {11} 'Bailey','Canada',9.84
Sorting by Dimension
To sort by a dimension, use the apply operator to create a new attribute with the value of the dimension and sort by that new attribute.
Sort by the dimension, event_id:
AFL% sort(apply(champions,event,event_id),event);
The output is:{$n} person,country,time,event {0} 'Bailey','Canada',9.84,0 {1} 'Greene','USA',9.87,0 {2} 'Gatlin','USA',9.85,0 {3} 'Bolt','Jamaica',9.69,0 {4} 'Thugwane','USA',7956,1 {5} 'Abera','Ethiopia',7811,1 {6} 'Baldini','Italy',7855,1 {7} 'Wanjiru','Kenya',7596,1 {8} 'Keter','Kenya',487.12,2 {9} 'Kosgei','Kenya',503.17,2 {10} 'Kemboi','Kenya',485.81,2 {11} 'Kipruto','Kenya',490.34,2
To sort by the dimension, event_id, and then by the time ascending, run:
AFL% sort(apply(champions,event,event_id),event , time asc);
The output is:{$n} person,country,time,event {0} 'Bolt','Jamaica',9.69,0 {1} 'Bailey','Canada',9.84,0 {2} 'Gatlin','USA',9.85,0 {3} 'Greene','USA',9.87,0 {4} 'Wanjiru','Kenya',7596,1 {5} 'Abera','Ethiopia',7811,1 {6} 'Baldini','Italy',7855,1 {7} 'Thugwane','USA',7956,1 {8} 'Kemboi','Kenya',485.81,2 {9} 'Keter','Kenya',487.12,2 {10} 'Kipruto','Kenya',490.34,2 {11} 'Kosgei','Kenya',503.17,2
Handling Null Values
This example uses the the array null_sort_example. To create the array do the following:
AFL% store(build(<number:int64>[i=0:4],'[-2,-1,0,(null),2]',true),null_sort_example);
The output is:
{i} number {0} -2 {1} -1 {2} 0 {3} null {4} 2
To see how the sort operator handles null values by scanning an array that includes a null value, sorting the cells in ascending order, then sorting them in descending order, do the following:
Sort by number (ascending):
AFL% sort(null_sort_example, number asc);
The output is:{$n} number {0} null {1} -2 {2} -1 {3} 0 {4} 2
Sort by number (descending):
AFL% sort(null_sort_example, number desc);
The output is:{$n} number {0} 2 {1} 0 {2} -1 {3} -2 {4} null
Remove the examples by running:
AFL% remove(flatwinners); remove(champions); remove(null_sort_example);