Raj

Reputation: 3001

Parse data from csv file into given format using Prolog

I have a CSV file that contains the following data:

  A  B  C
A -  4  5
B 8  -  6
C 2  3  -

I want to have facts in the following form:

num(a,b,4).
num(a,c,5).
num(b,a,8).
num(b,c,6).
num(c,a,2).
num(c,b,3).

There should be no facts for pairs of identical letters, such as num(a,a,-).

I am using SWI-Prolog's csv_read_file like this:

csv_read_file(Path, Rows, [functor(num), arity(4)]), maplist(assert, Rows).

and it gives me this output:

Rows = [num('', 'A', 'B', 'C'), num('A', -, 4, 5), num('B', 8, -, 6), num('C', 2, 3, -)]

This seems like a basic question, but I can't work out the condition needed to do this. Any help would be highly appreciated.

Following Isabelle Newbie's answer, I now have:

open :- csv_read_file('Path', Rows, [functor(num), arity(4)]), table_entry(Rows, Row).


header_row_entry(Header,Row,Entry):-
    arg(1, Row, RowName),
    functor(Header, _, Arity),
    between(2,Arity,ArgIndex),
    arg(ArgIndex, Header, ColumnName),
    arg(ArgIndex, Row, Value),
    Entry = num(RowName, ColumnName, Value),
    writeln(Entry).

table_entry(Entries, Entry):-
    Entries = [Header | Rows],
    member(Row, Rows),
    header_row_entry(Header, Row, Entry).

Now, can anyone explain how and where I should use maplist to turn the rows into facts (ignoring the filtering of '-' and the lowercasing for now), so that when I query:

?-num(A,B,X).

I should get:

X=4

My next task is to implement a depth-first search algorithm on these facts. Any details on this would also be highly appreciated.

Upvotes: 2

Views: 662

Answers (1)

Isabelle Newbie

Reputation: 9378

Consider a table header num('', 'A', 'B', 'C') and a row in the table num('B', 8, -, 6). From this you want to compute a table entry identified by the row's name, which here is 'B', and by a column name: the column name being 'A' for the first value (8), 'B' for the second (-), 'C' for the third (6).

Here's a simple way to do this, involving some typing and the obligatory copy-and-paste errors:

header_row_entry(Header, Row, Entry) :-
    Header = num('', ColumnName, _, _),
    Row = num(RowName, Value, _, _),
    Entry = num(RowName, ColumnName, Value).
header_row_entry(Header, Row, Entry) :-
    Header = num('', _, ColumnName, _),
    Row = num(RowName, _, Value, _),
    Entry = num(RowName, ColumnName, Value).
header_row_entry(Header, Row, Entry) :-
    Header = num('', _, _, ColumnName),
    Row = num(RowName, _, _, Value),
    Entry = num(RowName, ColumnName, Value).

This enumerates all the entries in a row on backtracking:

?- Header = num('', 'A', 'B', 'C'), Row = num('B', 8, -, 6),
      header_row_entry(Header, Row, Entry).
Header = num('', 'A', 'B', 'C'),
Row = num('B', 8, -, 6),
Entry = num('B', 'A', 8) ;
Header = num('', 'A', 'B', 'C'),
Row = num('B', 8, -, 6),
Entry = num('B', 'B', -) ;
Header = num('', 'A', 'B', 'C'),
Row = num('B', 8, -, 6),
Entry = num('B', 'C', 6).

To enumerate all the entries in an entire table, it remains to enumerate all rows and then enumerate each row's entries as above:

table_entry(Entries, Entry) :-
    Entries = [Header | Rows],
    member(Row, Rows),
    header_row_entry(Header, Row, Entry).

And now, given your table:

?- Table = [num('', 'A', 'B', 'C'), num('A', -, 4, 5), num('B', 8, -, 6), num('C', 2, 3, -)], table_entry(Table, Entry).
Table = [num('', 'A', 'B', 'C'), num('A', -, 4, 5), num('B', 8, -, 6), num('C', 2, 3, -)],
Entry = num('A', 'A', -) ;
Table = [num('', 'A', 'B', 'C'), num('A', -, 4, 5), num('B', 8, -, 6), num('C', 2, 3, -)],
Entry = num('A', 'B', 4) ;
Table = [num('', 'A', 'B', 'C'), num('A', -, 4, 5), num('B', 8, -, 6), num('C', 2, 3, -)],
Entry = num('A', 'C', 5) ;
Table = [num('', 'A', 'B', 'C'), num('A', -, 4, 5), num('B', 8, -, 6), num('C', 2, 3, -)],
Entry = num('B', 'A', 8) ;
Table = [num('', 'A', 'B', 'C'), num('A', -, 4, 5), num('B', 8, -, 6), num('C', 2, 3, -)],
Entry = num('B', 'B', -) .  % etc.

Depending on what you want exactly, it remains to lowercase the row and column names (the irritatingly named downcase_atom in SWI-Prolog, for example) and filter out the - entries. You can then assert the entries using a failure-driven loop or by collecting all of them using findall and asserting using maplist.
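As a rough sketch of that last step, assuming SWI-Prolog: the names clean_entry/2 and assert_table/1 are made up for illustration, and the three header_row_entry/3 clauses are restated in compact form only so the block loads on its own. The filter assumes the placeholder is read as the atom -, as in the Rows output shown in the question.

```prolog
:- use_module(library(lists)).   % member/2
:- use_module(library(apply)).   % maplist/2

:- dynamic num/3.                % the facts we are going to assert

% Compact restatement of the three clauses above.
header_row_entry(num('', C, _, _), num(R, V, _, _), num(R, C, V)).
header_row_entry(num('', _, C, _), num(R, _, V, _), num(R, C, V)).
header_row_entry(num('', _, _, C), num(R, _, _, V), num(R, C, V)).

table_entry([Header | Rows], Entry) :-
    member(Row, Rows),
    header_row_entry(Header, Row, Entry).

% clean_entry/2 (hypothetical name): a table entry with the (-)
% placeholders dropped and both names lowercased.
clean_entry(Table, num(RowName, ColumnName, Value)) :-
    table_entry(Table, num(RowName0, ColumnName0, Value)),
    Value \== (-),
    downcase_atom(RowName0, RowName),
    downcase_atom(ColumnName0, ColumnName).

% assert_table/1 (hypothetical name): collect all cleaned entries
% with findall/3, then assert each one with maplist/2.
assert_table(Table) :-
    findall(Fact, clean_entry(Table, Fact), Facts),
    maplist(assertz, Facts).
```

With the Rows list from csv_read_file passed to assert_table/1, the query num(a, b, X) should then give X = 4. Calling assert_table/1 twice would assert every fact twice, so you may want retractall(num(_, _, _)) first.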

Now that we have a working solution, we might want header_row_entry to be a bit nicer. We can use arg/3 to capture more explicitly that we are trying to pair a column name and a value that are at the same argument position in their respective header and row terms:

header_row_entry(Header, Row, Entry) :-
    arg(1, Row, RowName),
    functor(Header, _, Arity),
    between(2, Arity, ArgIndex),
    arg(ArgIndex, Header, ColumnName),
    arg(ArgIndex, Row,    Value),
    Entry = num(RowName, ColumnName, Value).

This is shorter than the above and applicable to any number of columns in the table.
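To illustrate the "any number of columns" point, here is a small sketch with a four-column table. The column 'D' and its value 7 are invented for the example, and row_entries/3 is a hypothetical helper that just collects the solutions into a list.

```prolog
% header_row_entry/3 as defined above, repeated so the block loads on its own.
header_row_entry(Header, Row, Entry) :-
    arg(1, Row, RowName),
    functor(Header, _, Arity),
    between(2, Arity, ArgIndex),
    arg(ArgIndex, Header, ColumnName),
    arg(ArgIndex, Row,    Value),
    Entry = num(RowName, ColumnName, Value).

% row_entries/3 (hypothetical helper): all entries of one row, in column order.
row_entries(Header, Row, Entries) :-
    findall(Entry, header_row_entry(Header, Row, Entry), Entries).

% ?- row_entries(num('', 'A', 'B', 'C', 'D'), num('B', 8, -, 6, 7), Es).
% Es = [num('B', 'A', 8), num('B', 'B', -), num('B', 'C', 6), num('B', 'D', 7)].
```

The three-clause version would simply fail on these terms, since its clauses only unify with terms of arity 4.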

Upvotes: 3
