cast

The cast operator produces a result array with the same number of dimensions, attributes, and cells as the source array, but with differences that can include name changes, attribute type conversions, and certain dimension bound changes.

Synopsis

cast( array, [ schema_definition | template_array | ( old_name, new_name ) ]+ )

Summary

The cast operator takes an input array and at least one additional argument.  Each additional argument may be one of

  • a schema definition,
  • a template array (in which case the array's schema definition is used as if it had been supplied directly), or
  • a pair of names enclosed in parentheses.

Initially, the schema of the input array is taken as the working schema of the cast.  Each successive argument alters the working schema in some way.  When all the arguments have been processed, the final working schema is validated to ensure a legal cast (see Limitations below).  If it is valid, it becomes the result schema of the cast operation.

If a schema definition argument (or the schema of a template array argument) has exactly the same number of attributes and the same number of dimensions as the input array, then the working schema is overwritten with the schema argument.

If a schema argument differs from the input array in its number of attributes or its number of dimensions, then each attribute and dimension of the schema argument is processed in turn.  If an argument attribute name matches an attribute in the working schema, the working attribute's type is replaced by the type specified by the argument attribute.  If an argument dimension name matches a dimension in the working schema, the working dimension's specification is replaced by the specification of the argument dimension.  Any attribute or dimension of the schema argument that does not match one in the working schema is ignored.
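The two schema-argument rules above can be modeled with a short Python sketch.  This is not SciDB code: the Attr, Dim, and Schema classes and the apply_schema_arg function are hypothetical names invented for illustration, dimension specifications are kept as opaque strings, and nullability is assumed to travel with the attribute type.  Positional names are not handled here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Attr:
    name: str
    type: str
    nullable: bool = True   # assumption: no NOT NULL suffix means nullable

@dataclass(frozen=True)
class Dim:
    name: str
    spec: str               # e.g. "0:365:0:30", treated as an opaque string

@dataclass(frozen=True)
class Schema:
    attrs: tuple
    dims: tuple

def apply_schema_arg(input_schema, working, arg):
    """Model how one schema argument alters the working schema."""
    same_counts = (len(arg.attrs) == len(input_schema.attrs)
                   and len(arg.dims) == len(input_schema.dims))
    if same_counts:
        # Counts match the input array: the argument replaces the working schema.
        return arg
    # Otherwise merge by name; unmatched argument entries are ignored.
    arg_attrs = {a.name: a for a in arg.attrs}
    arg_dims = {d.name: d for d in arg.dims}
    return Schema(
        attrs=tuple(arg_attrs.get(a.name, a) for a in working.attrs),
        dims=tuple(arg_dims.get(d.name, d) for d in working.dims),
    )

GEO = Schema(
    attrs=(Attr("temp", "double", nullable=False), Attr("pressure", "int32"),
           Attr("device_name", "string"), Attr("altitude", "int32")),
    dims=(Dim("lattitude", "-90:90:0:10"), Dim("longitude", "-180:180:0:20"),
          Dim("time", "0:365:0:30")),
)

# Models <_dummy:int32>[time=0:731:0:30]: _dummy matches nothing and is ignored.
partial = Schema(attrs=(Attr("_dummy", "int32"),),
                 dims=(Dim("time", "0:731:0:30"),))
result = apply_schema_arg(GEO, GEO, partial)
print(result.dims[2].spec)  # prints 0:731:0:30
```

In this model a partial schema argument only ever touches entries it names, which is why a placeholder attribute can accompany a dimension change without affecting the attributes.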

Individual working schema attributes or dimensions can be renamed using pairs of names enclosed in parentheses.  The first name of the pair must exist in the working schema, and the second name must not already exist in it.

Sometimes altering an attribute or dimension is more easily done by position than by name.  You can refer to the attributes of the working schema using positional names, for example $0, $1, and so on.  Similarly, you can refer to the dimensions of the working schema using $_0, $_1, and so on.  This positional notation works for both renaming and type casting of attributes and dimensions.
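As a rough illustration of how plain and positional names might resolve during a rename, here is a minimal Python sketch.  The resolve and rename functions are hypothetical, not part of SciDB, and the sketch assumes that attribute and dimension names share one namespace when checking that the new name is unused.

```python
import re

def resolve(attr_names, dim_names, name):
    """Return ("attr" | "dim", index) for a plain or positional name."""
    m = re.fullmatch(r"\$(_?)(\d+)", name)
    if m:
        # $N refers to attribute N, $_N to dimension N.
        kind = "dim" if m.group(1) else "attr"
        names = dim_names if kind == "dim" else attr_names
        idx = int(m.group(2))
        if idx >= len(names):
            raise ValueError(f"{name}: position out of range")
        return kind, idx
    if name in attr_names:
        return "attr", attr_names.index(name)
    if name in dim_names:
        return "dim", dim_names.index(name)
    raise ValueError(f"{name}: not in the working schema")

def rename(attr_names, dim_names, old, new):
    """Apply one (old, new) pair and return the updated name lists."""
    if new in attr_names or new in dim_names:
        raise ValueError(f"{new}: already exists in the working schema")
    kind, idx = resolve(attr_names, dim_names, old)
    attrs, dims = list(attr_names), list(dim_names)
    (attrs if kind == "attr" else dims)[idx] = new
    return attrs, dims

attrs = ["temp", "pressure", "device_name", "altitude"]
dims = ["lattitude", "longitude", "time"]
pairs = [("temp", "celsius"), ("$_0", "lat"), ("$2", "sensor"), ("longitude", "lon")]
for old, new in pairs:
    attrs, dims = rename(attrs, dims, old, new)
print(attrs)  # prints ['celsius', 'pressure', 'sensor', 'altitude']
print(dims)   # prints ['lat', 'lon', 'time']
```

Because each pair is applied to the working schema left by the previous one, a later argument must use the new name (or a position) to refer to something that has already been renamed.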

Limitations

After all arguments are processed, the working schema is checked for validity.  A valid working schema must satisfy the following restrictions.

  • You may not cast a nullable attribute to NOT NULL.  (You can safely cast a NOT NULL attribute to nullable, though.)
  • You may only cast attributes to types that have valid conversions from the input array's corresponding original attribute type.
  • Only dimension upper bounds can be changed using the cast operator.  You cannot change any of the other dimension parameters.
  • You may increase the upper bound of a dimension, but you cannot decrease it.
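
The restrictions above can be expressed as a small checker.  The Python sketch below is a model only: check_cast is a hypothetical name, the ALLOWED_CONVERSIONS table is an illustrative placeholder rather than SciDB's actual list of type conversions, and the sketch assumes integer bounds with the upper bound as the second colon-separated field of a dimension specification.  It also assumes the two schemas have equal attribute and dimension counts.

```python
# Illustrative placeholder only: NOT SciDB's actual conversion table.
ALLOWED_CONVERSIONS = {
    ("int32", "int64"), ("int32", "float"), ("int32", "double"),
    ("float", "double"),
}

def check_cast(src_attrs, dst_attrs, src_dims, dst_dims):
    """Return a list of violations; an empty list means the cast is legal.

    Attributes are (name, type, nullable) tuples and dimensions are
    (name, spec) tuples, compared position by position.
    """
    problems = []
    for (_, s_type, s_null), (d_name, d_type, d_null) in zip(src_attrs, dst_attrs):
        if s_null and not d_null:
            problems.append(f"{d_name}: nullable cannot become NOT NULL")
        if s_type != d_type and (s_type, d_type) not in ALLOWED_CONVERSIONS:
            problems.append(f"{d_name}: no conversion from {s_type} to {d_type}")
    for (_, s_spec), (d_name, d_spec) in zip(src_dims, dst_dims):
        s, d = s_spec.split(":"), d_spec.split(":")
        if s[:1] + s[2:] != d[:1] + d[2:]:
            # Something other than the upper bound (field 1) differs.
            problems.append(f"{d_name}: only the upper bound may change")
        elif int(d[1]) < int(s[1]):
            problems.append(f"{d_name}: upper bound cannot decrease")
    return problems

geo_attrs = [("temp", "double", False), ("pressure", "int32", True),
             ("device_name", "string", True), ("altitude", "int32", True)]
geo_dims = [("lattitude", "-90:90:0:10"), ("longitude", "-180:180:0:20"),
            ("time", "0:365:0:30")]
g1_attrs = [("celsius", "double", True), ("pressure", "int32", True),
            ("sensor", "string", True), ("altitude", "float", True)]
g1_dims = [("lat", "-90:90:0:10"), ("lon", "-180:180:0:20"),
           ("time", "0:731:0:30")]

print(check_cast(geo_attrs, g1_attrs, geo_dims, g1_dims))  # prints []
```

The comparison is made against the input array's original schema rather than any intermediate working schema, matching the rule that conversions are judged from the original attribute type.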

Example

Consider an array containing geophysical data:

AFL% show(GEO);
{i} schema
{0} 'GEO<temp:double NOT NULL,pressure:int32,device_name:string,altitude:int32> [lattitude=-90:90:0:10; longitude=-180:180:0:20; time=0:365:0:30]'
AFL% scan(GEO);
{lattitude,longitude,time} temp,pressure,device_name,altitude
{41,70,0} 32.5,30,'HAL 9000',7
{42,70,0} 98.6,29,'Bob\'s Drone',6
{42,71,0} 40.2,33,'six iron',8
{43,72,0} 63.6,31,'pitching wedge',9
AFL% 

We'd like to make several changes:

  • The names lattitude, longitude, and device_name are too long; replace them with lat, lon, and sensor, respectively.
  • The temp attribute should be renamed celsius, and should allow null values (in case the temperature sensor fails).
  • We need to load another year's worth of data, so the time dimension upper bound should be increased to 731.
  • The altitude ought to be a floating point number.

The cast operator gives us a lot of flexibility in how we modify the array schema.  Each of the following example cast queries achieves the same result.

All changes at once: specify complete schema.
AFL% store(cast(GEO, <celsius:double, pressure:int32, sensor:string, altitude:float>[lat=-90:90:0:10; lon=-180:180:0:20; time=0:731:0:30]), G1);
Query was executed successfully
AFL% show(G1);
{i} schema
{0} 'G1<celsius:double,pressure:int32,sensor:string,altitude:float> [lat=-90:90:0:10; lon=-180:180:0:20; time=0:731:0:30]'
AFL% scan(G1);
{lat,lon,time} celsius,pressure,sensor,altitude
{41,70,0} 32.5,30,'HAL 9000',7
{42,70,0} 98.6,29,'Bob\'s Drone',6
{42,71,0} 40.2,33,'six iron',8
{43,72,0} 63.6,31,'pitching wedge',9
AFL% 

Note that cast changes neither the cell attribute values nor the cell positions of the array data (except to convert the int32 values of altitude to their floating point representation).

Incremental changes to the working schema.
AFL% store(
CON>   cast(GEO,
CON>      (temp, celsius),        -- rename attribute
CON>      <celsius: double>,      -- allow nulls (working schema name!)
CON>      ($_0, lat),             -- rename dimension by position
CON>      ($2, sensor),           -- rename attribute by position
CON>      <$3 : float>,           -- type cast by position
CON>      (longitude, lon),       -- yet another rename
CON>      <_dummy : int32>[time=0:731:0:30]  -- alter dimension by name
CON> ), G2);
Query was executed successfully
AFL% show(G2);
{i} schema
{0} 'G2<celsius:double,pressure:int32,sensor:string,altitude:float> [lat=-90:90:0:10; lon=-180:180:0:20; time=0:731:0:30]'
AFL% scan(G2);
{lat,lon,time} celsius,pressure,sensor,altitude
{41,70,0} 32.5,30,'HAL 9000',7
{42,70,0} 98.6,29,'Bob\'s Drone',6
{42,71,0} 40.2,33,'six iron',8
{43,72,0} 63.6,31,'pitching wedge',9
AFL%