uniq

The uniq operator returns a result array with consecutive duplicate values removed.

Synopsis

uniq(array[, chunk_size: chunk_size]);

Summary

The uniq operator takes a one-dimensional array as input and returns an array with all consecutive duplicate values removed. It is analogous to the Unix uniq command.

Note the following:

  • The input array requires a single attribute of any type and a single dimension.
  • To remove all duplicates and not just consecutive ones, sort the input data first.
  • The result array has the same attribute name as the input array. Its sole dimension is named i and starts at 0. The default chunk size is one million (1,000,000); for a different chunk size, use the optional named parameter chunk_size.
  • The result array has no null values.
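
The difference between removing consecutive duplicates and removing all duplicates can be sketched outside SciDB. The following Python snippet is only a model of the behavior described above, not the operator's implementation: the helper name uniq is hypothetical, and the sketch ignores chunking and null handling.

```python
from itertools import groupby


def uniq(values):
    """Keep one value from each run of consecutive equal values.

    Hypothetical helper that mimics the described behavior on a plain list.
    """
    return [key for key, _run in groupby(values)]


data = [8, 8, 42, 8, 8, 17]

# Unsorted input: each run of 8s collapses separately, so 8 appears twice.
consecutive_only = uniq(data)

# Sorted input: equal values become adjacent, so every duplicate is removed.
all_removed = uniq(sorted(data))

print(consecutive_only)
print(all_removed)
```

Sorting first makes equal values adjacent, which is why the sorted case leaves exactly one copy of each value.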

Examples

Using the Operator

To demonstrate the uniq operator, do the following:

  1. Create an array with duplicates:

    AFL% store(build(<v:int64>[i=0:5], '[(8),(8),(42),(8),(8),(17)]', true), A);


    The output is:

    {i} v
    {0} 8
    {1} 8
    {2} 42
    {3} 8
    {4} 8
    {5} 17
  2. Eliminate the consecutive duplicates.  Because A was not sorted, two values of 8 remain, one for each run of consecutive 8s.

    AFL% uniq(A);

    The output is:

    {i} v
    {0} 8
    {1} 42
    {2} 8
    {3} 17
  3. Eliminate all duplicates by sorting the input first.

    AFL% uniq(sort(A));


    The output is:

    {i} v
    {0} 8
    {1} 17
    {2} 42
  4. Remove the array:

    AFL% remove(A);