tsvd

The tsvd operator estimates the truncated singular value decomposition. Available only in the Enterprise Edition.

Synopsis

tsvd(A, AT, n [, tolerance] [, iterations] [, initialVector] [, left] [, right])

Library

The tsvd operator resides in the Linear Algebra library. Run the following query to load this library:

AFL% load_library('linear_algebra');
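
To confirm that the library is loaded, one option is to list the loaded libraries and the available operators. This is a minimal sketch that assumes the standard list() operator; the exact output columns depend on the SciDB version.

    AFL% list('libraries');
    AFL% list('operators');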

Summary

Let A be an r x c 2-D SciDB array. The tsvd operator estimates the truncated singular value decomposition:

    A \approx U S V^{T}

where U is an r x n matrix with orthonormal columns, S is an n x n diagonal matrix of singular values, and V is a c x n matrix with orthonormal columns.

  • The matrices U, S, and VT (the transpose of V) are returned in a sparse 3-D output array in index positions 0, 1, and 2, respectively.
  • Use the optional initialVector argument to supply a starting vector for the algorithm's subspace iterations. The default is a vector whose values are sampled from a uniform random distribution on the interval [-0.5, 0.5].
  • Use the optional left and right vectors (the deflation subspace factors) to compute the truncated SVD of
    A - left * rightT
    where rightT is the transpose of right (for example, to compute a principal components analysis, as illustrated in the example).
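
The PCA use of left and right can be made explicit with a short derivation. This is a sketch, assuming left is a length-r vector of ones (written \mathbf{1}_r) and right is the length-c vector of column means of A (written \mu); both symbols are introduced here for illustration only.

    \mu_j = \frac{1}{r} \sum_{i=0}^{r-1} A_{ij}, \qquad
    A_c = A - \mathbf{1}_r \mu^{T}, \qquad
    A_c \approx U S V^{T}

Under these assumptions A_c is the column-centered matrix, the columns of V are the principal axes, the columns of U S are the principal component scores, and the sample variance along component k is S_{kk}^2 / (r - 1).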

Inputs

A: <double>[r=..., c=...]

The input r x c matrix of double-precision numbers.

AT: <double>[c=...,r=...]

The transpose of A.

n (scalar integer)

The desired number of singular values.

tolerance (scalar double)

An optional convergence tolerance (default 0.0001).

iterations (scalar integer)

An optional cap on the number of iterations (default 100).

initialVector <double>[r...]

An optional length-r 1-D array used as the starting vector for the subspace iterations.

left <double>[r...]

An optional length r 1-D array deflation subspace factor.

right <double>[c...]

An optional length c 1-D array deflation subspace factor.
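
The optional arguments follow the order given in the synopsis. As a hedged illustration, assuming A and AT are stored arrays that satisfy the requirements on this page, the following calls request two singular values, first with the defaults and then with an explicit tolerance and iteration cap (the values 0.00001 and 200 are arbitrary choices for illustration):

    AFL% tsvd(A, AT, 2);
    AFL% tsvd(A, AT, 2, 0.00001, 200);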

Outputs

The tsvd operator returns a sparse 3-D array; the matrix U is in index position 0, S is in position 1, and VT (the transpose of V) is in position 2.
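
One way to pull the individual factors out of the 3-D result is the slice() operator, which fixes one dimension at a given coordinate. The sketch below assumes that the result was stored in an array named S and that its first dimension is named matrix, as in the example output on this page; check the schema with show(S) if the names differ.

    AFL% slice(
           -- U, the left singular vectors (index position 0)
           S, matrix, 0
         );
    AFL% slice(
           -- S, the singular values (index position 1)
           S, matrix, 1
         );
    AFL% slice(
           -- VT, the transposed right singular vectors (index position 2)
           S, matrix, 2
         );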

Limitations

The tsvd algorithm presently requires the following:

  • All columns of A and AT must be in a single chunk.
  • The matrices A and AT must be zero-indexed, must have exactly one non-nullable, double-precision attribute, and must have zero chunk overlap.
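
A quick way to check these requirements is to inspect the schema, and to rechunk if needed. This is a minimal sketch, assuming the 4 x 4 array X with attribute v from the example on this page, and assuming the low:high:overlap:chunk_length form of dimension specification (the schema syntax varies between SciDB versions). The target array name X_single_chunk is hypothetical. Each dimension gets zero overlap and a chunk length of 4, so the whole matrix sits in one chunk.

    AFL% show(X);
    AFL% store(
           -- assumed dimension syntax: low:high:overlap:chunk_length
           repart(X, <v:double NOT NULL>[i=0:3:0:4; j=0:3:0:4]),
           X_single_chunk
         );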

Example

To compute a truncated SVD and the first principal component of a randomly generated matrix, do the following:

  1. Create and store a matrix X:

    AFL% store(
           redimension(
             project(
               apply(
                 rng_uniform(<v:double>[x=0:15],0,1,'drand48',12345),
                 v, rng_uniform,
                 i, x/4,
                 j, x%4
               ),
               v, i, j
             ),
             <v:double NOT NULL>[i=0:3; j=0:3]
           ),
           X
         ); 


    The output is:

    {i,j} v
    {0,0} 0.105882
    {0,1} 0.79826
    {0,2} 0.016059
    {0,3} 0.664037
    {1,0} 0.0429341
    {1,1} 0.99479
    {1,2} 0.845222
    {1,3} 0.217724
    {2,0} 0.276606
    {2,1} 0.418719
    {2,2} 0.297666
    {2,3} 0.824748
    {3,0} 0.156347
    {3,1} 0.503388
    {3,2} 0.78659
    {3,3} 0.935067
  2. Transpose the matrix X and store the result as XT:

    AFL% store(transpose(X), XT); 


    The output is:

    {j,i} v
    {0,0} 0.105882
    {0,1} 0.0429341
    {0,2} 0.276606
    {0,3} 0.156347
    {1,0} 0.79826
    {1,1} 0.99479
    {1,2} 0.418719
    {1,3} 0.503388
    {2,0} 0.016059
    {2,1} 0.845222
    {2,2} 0.297666
    {2,3} 0.78659
    {3,0} 0.664037
    {3,1} 0.217724
    {3,2} 0.824748
    {3,3} 0.935067
  3. Compute a truncated SVD with n = 1 (one singular value) and store it in output array S:

    AFL% store(tsvd(X, XT, 1), S);  


    The output is:

    {matrix,i,j} value
    {0,0,0} -0.422756
    {0,1,0} -0.537098
    {0,2,0} -0.431605
    {0,3,0} -0.58866
    {1,0,0} 2.16913
    {2,0,0} -0.128723
    {2,0,1} -0.62183
    {2,0,2} -0.48511
    {2,0,3} -0.601187
    {3,0,0} 0
    {3,0,1} 3
    {3,0,2} 8
  4. Show the computed singular values:

    AFL% between(S, 1, 0, 0, 1, 4, 4);  


    The output is:

    {matrix,i,j} value
    {1,0,0} 2.16913

    You can view the left singular vectors with subarray(S, 0,0,0, 0,19,19), and the right singular vectors with subarray(S, 2,0,0, 2,19,19).

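  5. Use the following queries to compute the first principal component of X. Here left is a vector of ones and right holds the column means of X, so tsvd decomposes the column-centered matrix X - left * rightT:

    Because the initial array is built with the random() function, the output of these queries varies from run to run.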

    AFL% store(build(<v:double NOT NULL>[i=0:3],random()),initial); 
    AFL% store(build(<v:double NOT NULL>[i=0:3],1),left); 
    AFL% store(
           project(
             -- using substitute() to turn the nullable attribute to non-nullable.
             substitute(
               apply(
                 aggregate(
                   X,
                   sum(v) as colsum,
                   count(v) as colcount,
                   j
                 ),
                 colmean, colsum/colcount
               ),
               -- a single cell array that contains the value to substitute null with.
               build(<v:double NOT NULL>[i=0:0], 0.0),
               colmean
             ),
             colmean
           ),
           right
         ); 
    AFL% tsvd(X,XT,1,0.001,10,initial,left,right);
  6. Remove example arrays:

    AFL% remove(initial);
    AFL% remove(left);
    AFL% remove(right);
    AFL% remove(S);
    AFL% remove(X);
    AFL% remove(XT);
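
The output shown in step 3 can be checked by hand, because a singular triplet must satisfy X v = sigma u with u of unit length. The worked arithmetic below is a sanity check, not part of the tsvd interface. It takes the first row of X from step 1, v from the VT entries {2,0,j}, u from the U entries {0,i,0}, and sigma from {1,0,0}; the symbols u, v, and \sigma are introduced here for illustration.

    (X v)_0 = 0.105882(-0.128723) + 0.79826(-0.62183) + 0.016059(-0.48511) + 0.664037(-0.601187) \approx -0.91701
    \sigma u_0 = 2.16913 \times (-0.422756) \approx -0.91701
    \|u\|^2 = 0.422756^2 + 0.537098^2 + 0.431605^2 + 0.58866^2 \approx 1.00000

The two sides agree to the printed precision, and u has unit norm, which is consistent with the orthonormal columns described in the summary.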