Reputation: 3136
This time I have a matrix --IN A FILE-- called "matrix.csv" and I want to read it in. I can do it in two flavors, dense and sparse.
Dense
matrix.csv
3.0, 0.8, 1.1, 0.0, 2.0
0.8, 3.0, 1.3, 1.0, 0.0
1.1, 1.3, 4.0, 0.5, 1.7
0.0, 1.0, 0.5, 3.0, 1.5
2.0, 0.0, 1.7, 1.5, 3.0
Sparse
matrix.csv
1,1,3.0
1,2,0.8
1,3,1.1
// 1,4 is missing
1,5,2.0
...
5,5,3.0
Assume the file is pretty large. In both cases, I want to read these into a Matrix with the appropriate dimensions. In the dense case I probably don't need to provide meta-data. In the second, I was thinking I should provide the "frame" of the matrix, like
matrix.csv
nrows:5
ncols:5
But I don't know the standard patterns.
== UPDATE ==
It's a bit difficult to find, but the mmreadsp function (from the MatrixMarket package module mentioned below) can change your day from "Crashing the server" to "done in 11 seconds". Thanks to Brad Cray (not his real name) for pointing it out!
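For anyone else hunting for it, the call looks roughly like the sketch below. This is a minimal sketch, assuming mmreadsp mirrors the signature of its dense counterpart mmread (element type plus file name, returning a sparse array) and that the data is stored in Matrix Market format (the "matrix.mtx" file name here is hypothetical):
use MatrixMarket;
// NOTE: the signature here is an assumption based on the dense mmread;
// check the MatrixMarket module source for the exact interface.
var A = mmreadsp(real, "matrix.mtx"); // sparse array read from a Matrix Market file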
Upvotes: 2
Views: 348
Reputation: 1865
Since Chapel matrices are represented as arrays, this question is equivalent to:
"How to read an array from a file in Chapel".
Ideally, a csv module or a specialized IO-formatter (similar to the JSON formatter) would handle csv I/O more elegantly, but this answer reflects the array I/O options available as of the Chapel 1.16 pre-release.
Dense arrays are the easy case, since DefaultRectangular arrays (the default type of a Chapel array) come with a .readWriteThis(f) method. This method allows one to read and write an array with the built-in write() and read() methods, as shown below:
var A: [1..5, 1..5] real;
// Give this array some values
[(i,j) in A.domain] A[i,j] = i + 10*j;
// Write the array out to a file
var writer = open('dense.txt', iomode.cw).writer();
writer.write(A);
writer.close();
// Read it back into an array of the same shape
var B: [1..5, 1..5] real;
var reader = open('dense.txt', iomode.r).reader();
reader.read(B);
reader.close();
assert(A == B);
The resulting dense.txt looks like this:
11.0 21.0 31.0 41.0 51.0
12.0 22.0 32.0 42.0 52.0
13.0 23.0 33.0 43.0 53.0
14.0 24.0 34.0 44.0 54.0
15.0 25.0 35.0 45.0 55.0
However, this assumes you know the array shape in advance. We can remove this constraint by writing the array shape at the top of the file, as shown below:
var A: [1..5, 1..5] real;
[(i,j) in A.domain] A[i,j] = i + 10*j;
var writer = open('dense.txt', iomode.cw).writer();
// Write the shape as a tuple on the first line, then the array itself
writer.writeln(A.shape);
writer.write(A);
writer.close();
var reader = open('dense.txt', iomode.r).reader();
// Read the shape first, so B can be declared with the right dimensions
var shape: 2*int;
reader.read(shape);
var B: [1..shape[1], 1..shape[2]] real;
reader.read(B);
reader.close();
assert(A == B);
Now, dense.txt looks like this:
(5, 5)
11.0 21.0 31.0 41.0 51.0
12.0 22.0 32.0 42.0 52.0
13.0 23.0 33.0 43.0 53.0
14.0 24.0 34.0 44.0 54.0
15.0 25.0 35.0 45.0 55.0
Sparse arrays require a little more work, because DefaultSparse arrays (the default type of a sparse Chapel array) only provide a .writeThis(f) method and not a .readThis(f) method as of the Chapel 1.16 pre-release. This means we have built-in support for writing sparse arrays, but not for reading them.
Since you specifically requested csv format, we'll do sparse arrays in csv:
// Create parent domain, sparse subdomain, and sparse array
const D = {1..10, 1..10};
var spD: sparse subdomain(D);
var A: [spD] real;
// Add some non-zeros:
spD += [(1,1), (1,5), (2,7), (5, 4), (6, 6), (9,3), (10,10)];
// Set non-zeros to 1.0 (to make things interesting?)
A = 1.0;
var writer = open('sparse.csv', iomode.cw).writer();
// Write shape
writer.writef('%n,%n\n', A.shape[1], A.shape[2]);
// Iterate over non-zero indices, writing: i,j,value
for (i,j) in spD {
writer.writef('%n,%n,%n\n', i, j, A[i,j]);
}
writer.close();
var reader = open('sparse.csv', iomode.r).reader();
// Read shape
var shape: 2*int;
reader.readf('%n,%n', shape[1], shape[2]);
// Create parent domain, sparse subdomain, and sparse array
const Bdom = {1..shape[1], 1..shape[2]};
var spBdom: sparse subdomain(Bdom);
var B: [spBdom] real;
// This is an optimization that bulk-adds the indices. We could instead add
// the indices directly to spBdom and the values to B[i,j] each iteration
// (see the element-wise sketch after the csv listing below)
var indices: [1..0] 2*int,
values: [1..0] real;
// Variables to be read into
var i, j: int,
val: real;
while reader.readf('%n,%n,%n', i, j, val) {
indices.push_back((i,j));
values.push_back(val);
}
// bulk add the indices to spBdom and add values to B element-wise
spBdom += indices;
for (ij, v) in zip(indices, values) {
B[ij] = v;
}
reader.close();
// Sparse arrays can't be zippered with anything other than their domains and
// sibling arrays, so we need to do an element-wise assertion:
assert(A.domain == B.domain);
for (i,j) in A.domain {
assert(A[i,j] == B[i,j]);
}
And sparse.csv looks like this:
10,10
1,1,1
1,5,1
2,7,1
5,4,1
6,6,1
9,3,1
10,10,1
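For completeness, here is what the element-wise alternative mentioned in the reading code above would look like. It replaces the temporary indices/values arrays and the bulk-add with direct updates; this is simpler, but typically slower for large files, since the sparse domain grows one index at a time:
var i, j: int,
    val: real;
while reader.readf('%n,%n,%n', i, j, val) {
  spBdom += (i, j); // grow the sparse domain by a single index
  B[i, j] = val;    // then store the corresponding value
}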
Lastly, I'll mention that there is a MatrixMarket package module that supports dense & sparse array I/O using the Matrix Market format. It is currently not shown in the public documentation, because it is intended to be moved out as a standalone package once the package manager is reliable enough, but you can use it in your Chapel programs with use MatrixMarket; today.
Here is the source code, which includes documentation for the interface as comments.
Here are the tests, if you prefer to learn from example, rather than documentation & source code.
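As a teaser, a dense round-trip looks roughly like the sketch below. Treat it as a minimal sketch based on my reading of the module source rather than documented API: I'm assuming mmwrite takes a file name and an array, and mmread takes an element type and a file name (there is also the sparse mmreadsp mentioned in the question's update):
use MatrixMarket;
var A: [1..5, 1..5] real;
[(i,j) in A.domain] A[i,j] = i + 10*j;
// Write A in Matrix Market format, then read it back
// (names and signatures assumed from the module source; verify before use)
mmwrite("dense.mtx", A);
var B = mmread(real, "dense.mtx");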
Upvotes: 1
Reputation: 1
( If one happens to remember the PC Tools utility: the Matrix Tools, pioneered and authored by prof. Zitny, were similarly indispensable for smart abstract representations of large-scale F77 FEM matrices, using COMMON-block and similar tricks for storage- and operations-efficient handling of large, sparse matrices in numerical-processing projects... )
I cannot disagree more with the last remark, that one needs the "frame" in order to build a sparse matrix.
A matrix is always just an interpretation of some formalism.
While all sparse-matrix modules share the same view of a matrix as an interpretation, the implementation of each such module is always strictly based on some concrete representation.
Different kinds of sparsity are always handled with different cell-layout strategies ( the trick is to use the minimum [SPACE] needed for the cell elements, while still keeping an acceptable processing [TIME] overhead when performing classical matrix/vector operations on such a matrix, typically without the user knowing, or "manually" bothering with, the underlying sparse-matrix representation used for storing the cell values, or how it gets optimally decoded / translated into a target sparse-matrix representation ).
Put visually, the Matrix Tools would show each representation as compactly as possible in its best-possible memory layout ( much like PC Tools compressed your hard disk, laying out sector data so that no unnecessary non-contiguous HDD capacity was wasted ), and the ( type-by-type specific ) representation-aware handler would then provide any external observer with the complete illusion needed for the assumed matrix interpretation ( during the computing phase ).
So let's first realise that, without knowing all the details of the platform-specific rules used for a sparse-matrix representation, both on the source side ( python-?, JSON-meta-payload-?, etc. ) and on the Chapel target side ( the LinearAlgebra module in ver-1.16 being confirmed as not yet public ( W.I.P. ) ), there is not much to start implementing.
The actual materialisation of a ( yet unknown ) sparse-matrix representation ( be it a file://, a DMA access, a CSP channel, or any other means of non-InRAM storage or an InRAM memory map ) does not change the solution of the cross-representation translator a single bit.
As a mathematician, you may enjoy viewing representations less as Cantor-set-driven objects ( running into (almost) infinite, dense enumerations ) and more through Vopenka's Alternative Set Theory ( so lovingly introduced, with in-depth historical and mathematical context, in Vopenka's "Meditations About The Bases of Science" ). That theory brought and polished a much closer view of exactly these situations, with an ever-changing Horizon-of-Definition ( caused not only by the actual sharpness of the observer's view, but by that principle in a much broader and more general sense ), leaving pi-class and sigma-class semi-sets ready to continuously handle new details as they emerge into the recognised part of our view of the observed ( and mathematicised ) phenomenon ( once they appear "in front of" the Horizon-of-Definition ).
Sparse matrices ( as a representation ) help us build the interpretation we need, so as to use the data cells acquired so far in further processing "as a matrix". Doing so requires knowing:
a) the constraints and rules used in the source system's sparse-matrix representation
b) the additional constraints that the mediation channel imposes ( expressivity, format, self-healing / error-proneness ), irrespective of it being a file, a CSP channel, or a ZeroMQ / nanomsg smart-socket signalling- / messaging-plane distributed agent infrastructure
c) the constraints and rules imposed by the target system's representation, setting the rules for defining / loading / storing / further handling & computing that a sparse-matrix type of one's choice has to meet / follow in the target computing eco-system
Not knowing a) would introduce unnecessarily large overheads in preparing a strategy for a successful and efficient cross-representation pipeline, i.e. for translating the common interpretation from the source-side representation before entering b). Ignoring c) would always incur a penalty: paying additional overheads in the target eco-system during b)'s mediated reconstruction of the communicated interpretation onto the target representation.
Upvotes: 0