repart

The repart operator produces a result array similar to a source array via re-partitioning, but with different chunk sizes, chunk overlaps, or both.

Synopsis

AFL% repart(array, template_array|schema_definition)

Summary

The repart operator produces a result array similar to a source array, but with different chunk sizes, chunk overlaps, or both. The new array must conform to the schema of an existing template array or to the schema definition supplied with the operator, and it must have the same attributes and dimensions as the source array. The repart operator does not alter the source array.

If you supply a schema definition, you can omit the chunk sizes in that definition or specify them with an asterisk (*). In either case, SciDB estimates the chunk sizes based on the cell distribution of the input array. The best-performing chunk lengths depend on the query mix and the nature of the data, so treat the values that repart chooses as reasonable starting points for further optimization. The repart operator uses the same chunk length estimation algorithm as redimension, and supports the cells_per_chunk: N and phys_chunk_size: N keyword arguments.

Inputs

  • array - An existing array in the SciDB system that is to be repartitioned.
  • template_array - An existing array in the SciDB system from which the new schema will be retrieved and used to generate the resulting array.
  • schema_definition - The schema to be used for the resulting array.
  • cells_per_chunk: N - Optional.  Use N as the goal when estimating chunk lengths by cell count.
  • phys_chunk_size: N - Optional.  Use N mebibytes as the goal when estimating chunk lengths by physical chunk size.

Example

To repartition a 4×4 array with sixteen 1×1 chunks into a 4×4 array with four 2×2 chunks, do the following:

  1. Create a 2-dimensional array called source where each dimension uses a chunk size of 1:

    AFL% CREATE ARRAY source <val:double>[x=0:3:0:1; y=0:3:0:1];


    The output is:

    Query was executed successfully
  2. Fill source with the values x*3+y, which range from 0 to 12:

    AFL% store(build(source,x*3+y),source);  


    The output is:

    {x,y} val
    {0,0} 0
    {0,1} 1
    {0,2} 2
    {0,3} 3
    {1,0} 3
    {1,1} 4
    {1,2} 5
    {1,3} 6
    {2,0} 6
    {2,1} 7
    {2,2} 8
    {2,3} 9
    {3,0} 9
    {3,1} 10
    {3,2} 11
    {3,3} 12 

     

  3. Repartition the source array into 2-by-2 chunks, using the schema_definition method, and store the result in an array called target: 

    The schema_definition is: <val:double>[x=0:3:0:2; y=0:3:0:2]

    AFL% store(repart(source, <val:double>[x=0:3:0:2; y=0:3:0:2]),target);  


    The output is:

    {x,y} val
    {0,0} 0
    {0,1} 1
    {1,0} 3
    {1,1} 4
    {0,2} 2
    {0,3} 3
    {1,2} 5
    {1,3} 6
    {2,0} 6
    {2,1} 7
    {3,0} 9
    {3,1} 10
    {2,2} 8
    {2,3} 9
    {3,2} 11
    {3,3} 12  
  4. Remove the target array:

    AFL% remove(target);


    The output is:

    Query was executed successfully 
  5. Create another array to act as the template array:

    AFL% create array template <val:double>[x=0:3:0:2; y=0:3:0:2];


    The output is:

    Query was executed successfully 
  6. Repartition the source array into 2-by-2 chunks, using the template_array method, and store the result in an array called target:

    AFL% store(repart(source, template),target);


    The output is:

    {x,y} val
    {0,0} 0
    {0,1} 1
    {1,0} 3
    {1,1} 4
    {0,2} 2
    {0,3} 3
    {1,2} 5
    {1,3} 6
    {2,0} 6
    {2,1} 7
    {3,0} 9
    {3,1} 10
    {2,2} 8
    {2,3} 9
    {3,2} 11
    {3,3} 12  

    Note: The outputs of the template_array and schema_definition methods are identical.


  7. Remove the example arrays:

    AFL% remove(target); remove(source); remove(template);
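
The order of the cells in the listings above changes after repartitioning because the cells are grouped by chunk. The following Python sketch is not SciDB code and does not reflect how SciDB is implemented. It is a minimal model that assumes chunks are visited in row-major order, with cells in row-major order inside each chunk, which is inferred from the listings in this example. It ignores chunk overlap, and the helper names (chunk_count, total_chunks, cells_in_chunk_order) are hypothetical.

```python
from itertools import product
from math import ceil, prod


def chunk_count(low, high, chunk_len):
    """Number of chunks needed to cover the inclusive range low..high."""
    return ceil((high - low + 1) / chunk_len)


def total_chunks(dims):
    """Total number of chunks for a list of (low, high, chunk_len) dimensions."""
    return prod(chunk_count(lo, hi, c) for lo, hi, c in dims)


def cells_in_chunk_order(dims):
    """Yield cell coordinates chunk by chunk (assumed row-major traversal)."""
    counts = [chunk_count(lo, hi, c) for lo, hi, c in dims]
    for chunk_idx in product(*(range(n) for n in counts)):
        ranges = []
        for (lo, hi, c), k in zip(dims, chunk_idx):
            start = lo + k * c
            stop = min(start + c - 1, hi)  # the last chunk may be partial
            ranges.append(range(start, stop + 1))
        yield from product(*ranges)


# Dimensions from the example: x=0:3:0:1; y=0:3:0:1 and x=0:3:0:2; y=0:3:0:2
source_dims = [(0, 3, 1), (0, 3, 1)]
target_dims = [(0, 3, 2), (0, 3, 2)]

print(total_chunks(source_dims), "chunks in source")
print(total_chunks(target_dims), "chunks in target")

# Print the target cells in the modeled chunk order, using val = x*3+y
for x, y in cells_in_chunk_order(target_dims):
    print(f"{{{x},{y}}} {x * 3 + y}")
```

Under these assumptions, the model reports sixteen chunks for source and four for target, and the printed cell order matches the listing shown in steps 3 and 6.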


