sort

The sort operator produces a 1-dimensional result array by sorting non-empty cells of a source array.

Synopsis

sort(array [, attribute [asc|desc]]... [, chunk_size] [, dimension: name ] );

Summary

The sort operator produces a one-dimensional result array, even if the source array has multiple dimensions. The result array contains each non-empty cell of the source array.

  • The result array does not show values of the original dimensions in the source array.
  • If the dimension: name keyword parameter is given, the output dimension will have the supplied name.   Otherwise the result array's dimension name is $n .
  • The output dimension is unbounded.
  • You can use the sort operator to sort by one or more attributes. The operator first sorts by the first attribute, then by the second, and so on.  If no attribute parameters are given, the operator sorts by the first attribute in ascending order.
  • Use the optional keyword asc or desc to control the sort order for each attribute, ascending or descending. The default is ascending. 
  • You can control the chunk size of the resulting array with the optional chunk_size parameter.
  • To sort by dimensions, use the apply operator to return a dimension as an attribute, and then sort.
  • null values are sorted as having a rank lower than any non-null value. See Semantics of Special Values.

Inputs

The sort operator takes the following arguments:

  • array: A source array with one or more attributes and one or more dimensions.
  • attribute: Zero or more attributes. If you do not provide an attribute, SciDB uses the first attribute in the source array.
  • chunk_size: An optional chunk size for the result array.

Examples

The examples use an array 'champions'. To create the array do the following:

  1. Create a file named '/tmp/winners.csv' with the following contents:

    1996,0,Bailey,Canada,9.84
    1996,1,Thugwane,USA,7956
    1996,2,Keter,Kenya,487.12
    2000,0,Greene,USA,9.87
    2000,1,Abera,Ethiopia,7811
    2000,2,Kosgei,Kenya,503.17
    2004,0,Gatlin,USA,9.85
    2004,1,Baldini,Italy,7855
    2004,2,Kemboi,Kenya,485.81
    2008,0,Bolt,Jamaica,9.69
    2008,1,Wanjiru,Kenya,7596
    2008,2,Kipruto,Kenya,490.34
  2. Create an a flatwinners array and  load the file:

    create array flatwinners <year:int64, event_id:int64, person:string, country:string, time:double>[i=0:99];
    load(flatwinners, '/tmp/winners.csv', -2 , 'csv');


    The output is:

    Query was executed successfully
    {i} year,event_id,person,country,time
    {0} 1996,0,'Bailey','Canada',9.84
    {1} 1996,1,'Thugwane','USA',7956
    {2} 1996,2,'Keter','Kenya',487.12
    {3} 2000,0,'Greene','USA',9.87
    {4} 2000,1,'Abera','Ethiopia',7811
    {5} 2000,2,'Kosgei','Kenya',503.17
    {6} 2004,0,'Gatlin','USA',9.85
    {7} 2004,1,'Baldini','Italy',7855
    {8} 2004,2,'Kemboi','Kenya',485.81
    {9} 2008,0,'Bolt','Jamaica',9.69
    {10} 2008,1,'Wanjiru','Kenya',7596
    {11} 2008,2,'Kipruto','Kenya',490.34
  3. Redimension the flatwinners and store the result in a new array, champions:

    store(redimension(flatwinners,<person:string,country:string,time:double>[year=1900:*; event_id=0:*]),champions);


    The output is:

    {year,event_id} person,country,time
    {1996,0} 'Bailey','Canada',9.84
    {1996,1} 'Thugwane','USA',7956
    {1996,2} 'Keter','Kenya',487.12
    {2000,0} 'Greene','USA',9.87
    {2000,1} 'Abera','Ethiopia',7811
    {2000,2} 'Kosgei','Kenya',503.17
    {2004,0} 'Gatlin','USA',9.85
    {2004,1} 'Baldini','Italy',7855
    {2004,2} 'Kemboi','Kenya',485.81
    {2008,0} 'Bolt','Jamaica',9.69
    {2008,1} 'Wanjiru','Kenya',7596
    {2008,2} 'Kipruto','Kenya',490.34

Scan a 2-Dimension array, Sort it Several Ways

To sort it by ascending country, then sort it by descending country and ascending person , do the following:

  1. Sort by country (ascending):

    AFL% sort(champions,country); 


    The output is:

    {$n} person,country,time
    {0} 'Bailey','Canada',9.84
    {1} 'Abera','Ethiopia',7811
    {2} 'Baldini','Italy',7855
    {3} 'Bolt','Jamaica',9.69
    {4} 'Keter','Kenya',487.12
    {5} 'Kosgei','Kenya',503.17
    {6} 'Kemboi','Kenya',485.81
    {7} 'Wanjiru','Kenya',7596
    {8} 'Kipruto','Kenya',490.34
    {9} 'Thugwane','USA',7956
    {10} 'Greene','USA',9.87
    {11} 'Gatlin','USA',9.85 
  2. To sort by country descending, and then by person ascending:

    AFL% sort(champions,country desc, person asc);


    The output is:

    {$n} person,country,time
    {0} 'Gatlin','USA',9.85
    {1} 'Greene','USA',9.87
    {2} 'Thugwane','USA',7956
    {3} 'Kemboi','Kenya',485.81
    {4} 'Keter','Kenya',487.12
    {5} 'Kipruto','Kenya',490.34
    {6} 'Kosgei','Kenya',503.17
    {7} 'Wanjiru','Kenya',7596
    {8} 'Bolt','Jamaica',9.69
    {9} 'Baldini','Italy',7855
    {10} 'Abera','Ethiopia',7811
    {11} 'Bailey','Canada',9.84

Sorting by Dimension

To sort by a dimension, use the apply operator to create a new attribute with the value of the dimension and sort by that new attribute.

  1. Sort by the dimension, event_id:

    AFL% sort(apply(champions,event,event_id),event);  


    The output is:

    {$n} person,country,time,event
    {0} 'Bailey','Canada',9.84,0
    {1} 'Greene','USA',9.87,0
    {2} 'Gatlin','USA',9.85,0
    {3} 'Bolt','Jamaica',9.69,0
    {4} 'Thugwane','USA',7956,1
    {5} 'Abera','Ethiopia',7811,1
    {6} 'Baldini','Italy',7855,1
    {7} 'Wanjiru','Kenya',7596,1
    {8} 'Keter','Kenya',487.12,2
    {9} 'Kosgei','Kenya',503.17,2
    {10} 'Kemboi','Kenya',485.81,2
    {11} 'Kipruto','Kenya',490.34,2
  2. To sort by the dimension, event_id, and then by the time ascending, run:

    AFL% sort(apply(champions,event,event_id),event , time asc);  


    The output is:

    {$n} person,country,time,event
    {0} 'Bolt','Jamaica',9.69,0
    {1} 'Bailey','Canada',9.84,0
    {2} 'Gatlin','USA',9.85,0
    {3} 'Greene','USA',9.87,0
    {4} 'Wanjiru','Kenya',7596,1
    {5} 'Abera','Ethiopia',7811,1
    {6} 'Baldini','Italy',7855,1
    {7} 'Thugwane','USA',7956,1
    {8} 'Kemboi','Kenya',485.81,2
    {9} 'Keter','Kenya',487.12,2
    {10} 'Kipruto','Kenya',490.34,2
    {11} 'Kosgei','Kenya',503.17,2

Handling Null Values

This example uses the the array null_sort_example. To create the array do the following:

AFL%  store(build(<number:int64>[i=0:4],'[-2,-1,0,(null),2]',true),null_sort_example);


The output is:

{i} number
{0} -2
{1} -1
{2} 0
{3} null
{4} 2

To see how the sort operator handles null values by scanning an array that includes a null value, sorting the cells in ascending order, then sorting them in descending order, do the following:

  1. Sort by number (ascending):

    AFL% sort(null_sort_example, number asc);


    The output is:

    {$n} number
    {0} null
    {1} -2
    {2} -1
    {3} 0
    {4} 2
  2. Sort by number (descending):

    AFL% sort(null_sort_example, number desc);


    The output is:

    {$n} number
    {0} 2
    {1} 0
    {2} -1
    {3} -2
    {4} null
  3. Remove the examples by running:

    AFL% remove(flatwinners); remove(champions); remove(null_sort_example);