spgemm
The spgemm operator computes the matrix product of two sparse matrices and produces a sparse matrix as the result. It also includes options to use specialized arithmetic that permits matrix multiplication to be used to solve shortest-path network problems and other minimization or maximization problems.
Synopsis
spgemm(A, B [, 'min.+' | 'max.+'] [, right_replicate: true|false]);
Library
The spgemm operator resides in the Linear Algebra library. Run the following query to load this library:
AFL% load_library('linear_algebra');
Summary
The spgemm operator calculates the matrix product and returns the result as a sparse matrix .
- Here, a sparse matrix is an array of two dimensions where empty cells are treated as implied zeros.
- The algorithm is optimized for input arrays where such implicit zeros predominate.
- The product does not contain actual zero values: any zeros generated become empty cells in the output.
Requirements:
- and must have a single attribute with the same type, which may be float or double.
- The dimension start, end, and chunk interval of the columns of A must equal those of the rows of B.
- and may not be nullable.
Result:
If A has dimensions and B has dimensions , then the product C is and is defined as:
If B has replicated distribution, spgemm can be faster (so long as right_replicate: is not false)
If B has col_cyclic distribution, spgemm can be slightly faster (so long as right_replicate: is not true)
Limitations
Matrices with nullable attributes cannot be multiplied. Use cast to change the nullability of attributes.
Examples
Multiply a 2×3 Matrix and a 3×2 Matrix.
To multiply a sparse 2×3 matrix and a sparse 3×2 matrix, do the following:
Create a 2×3 non-nullable sparse array:
AFL% store(build(<val:double NOT NULL>[row=0:1; col=0:2],'[[(),1, 2],[(),4,()]]',true),A);
The output is:{0,1} 1 {0,2} 2 {1,1} 4
Create a 3×2 non-nullable sparse array (including an explicit zero to demonstrate it is tolerated):
AFL% store(build(<val:double NOT NULL>[row=0:2; col=0:1],'[[(),10],[(),0],[(),50]]',true),B);
The output is:{row,col} val {0,1} 10 {1,1} 0 {2,1} 50
Multiply and .
AFL% spgemm(A, B)
The output is:{row,col} multiply {0,1} 100
Notice that there are no explicit zeros in the output.
Alternative Arithmetic for Shortest-Path and Dynamic Programming Algorithms
Spgemm has an optional third parameter which can be used to find solutions to combinatorial minimization (or maximization) problems such as shortest-path problems. Transforming such problems into sparse-matrix form allows them to be solved in parallel on computer clusters. For more on transforming shortest-path problems in to sparse matrix problems, the following references are recommended:
M. Mohri, Semiring Frameworks and Algorithms for Shortest-Distance Problems, Journal of Automata, Languages and Combinatorics, January 2002, pp 321-350
J. Baras and G. Theodorakopoulos, Path Problems in Networks, 2010
J. Kepner and J. Gilbert, eds, Graph Algorithms in the Language of Linear Algebra, SIAM, 2014
Result with 'min.+':
Setting the third parameter to 'min.+' changes the matrix product from being computed with the usual '+,*' definitions, and substitutes an arithmetic where '+' is replaced by 'min' and '*' is replaced with '+'. This 'min.+' arithmetic is also known as the tropical semiring in modern algebra. The product C is then defined as:
The additive identity or "zero" in this system of arithmetic is . Empty cells are treated as implied , and is replaced by empty cells in the output.
Result with 'max.+':
The third parameter can also be 'max,+' (also known as the arctic semiring). The product C is then defined as:
The additive identity or "zero" in this system of arithmetic is . Empty cells are treated as implied , and is replaced by empty cells in the output.
Examples
SciDB provides an example implementation of the Bellman-Ford Single-Source Shortest Path algorithm using the spgemm operator. See /opt/scidb/<version>/bin/bellman_ford_example.sh
Compute the Shortest Path from Vertex 0 in a Graph using bellman_ford_example.sh.
Consider a directed graph with the following edges and associated weights:
- 0→1 : 1
- 0→2 : 2
- 1→3 : 3
2→3 : 1
Store the graph as a (sparse) adjacency matrix, with dimensions v0 and v1, with the weights stored as attribute 'w' (these names are expected by the script) :
bash$ iquery -aq "store(build(<w:float NOT NULL>[v0=0:3; v1=0:3],'[[(),1, 2,()],[(),(),(),3],[(),(),(),1],[]]',true),A);"
The output is:{v0,v1} w {0,1} 1 {0,2} 2 {1,3} 3 {2,3} 1
Make sure the vector you're using as the result, MYRESULT, does not exist:
bash$ iquery -aq "remove(MYRESULT)"
The output is:... Error description: Query processor error. Array 'public.MYRESULT' does not exist. ...
Run the script, specifying the matrix name A, the result vector MYRESULT, and the starting vertex 0, as the inputs:
bash$ /opt/scidb/15.12/bin/bellman_ford_example.sh A MYRESULT 0
The output is:Query was executed successfully NUM_V0_USED is 3 Query was executed successfully ... Query was executed successfully
Print MYRESULT:
iquery -aq"scan(MYRESULT)"
The output is:{v1,dummy} multiply {0,0} 0 {1,0} 1 {2,0} 2 {3,0} 3
Note that there were two paths from vertex 0 to vertex 3: the one through vertex 1 has length 4, while the one through vertex 2 has length 3, and the shorter one is listed.