CleanSock
CleanSock

Reputation: 383

How to build a dictionary from contents of a csv file in kdb?

I have a csv file with contents like below

source,address,table,tableName,sym,symSet
source_one,addr1:port1:id1:pass1,table_one,tableName1,syms_one,SYM1 SYM2 SYM3
source_two,addr2:port2:id2:pass2,table_two,tableName2,syms_two,SYM21 SYM22 SYM23

My code to load a csv into a table is as below

table:("******";enlist ",") 0: `sourceFileName.csv

I want to create a dictionary out of contents of 'table' in the below format

source_one|addr1:port1:id1:pass1
table_one|tableName1
syms_one|SYM1 SYM2 SYM3
source_two|addr2:port2:id2:pass2
table_two|tableName2
syms_two|SYM21 SYM22 SYM23

How do I achieve this?

Thanks!

Upvotes: 3

Views: 738

Answers (4)

Alexander Belopolsky
Alexander Belopolsky

Reputation: 2268

I would forego the table creation and do something like this:

q)(!). flip 2 cut raze ","vs/:1_read0`source.csv
"source_one"| "addr1:port1:id1:pass1"
"table_one" | "tableName1"
"syms_one"  | "SYM1 SYM2 SYM3"
"source_two"| "addr2:port2:id2:pass2"
"table_two" | "tableName2"
"syms_two"  | "SYM21 SYM22 SYM23"

Explanation. From right to left, first, 1_read0 reads the source file as a list of lines and discards the first line. Second, ","vs/: cuts each line on "," separators. Third, 2 cut raze flattens the list of lists and cuts it in pairs. Fourth, flip transposes the list of pairs turning it into a pair of lists. Last, (!). constructs a dictionary from a pair of lists containing keys and values. Note that (!).(x;y) translates into x!y.

Upvotes: 1

Thomas Smyth
Thomas Smyth

Reputation: 5644

Given the table defined above you could make use of value to extract the data from the table without column headers:

q)value each table
"source_one" "addr1:port1:id1:pass1" "table_one" "tableName1" "syms_one" "SYM1 SYM2 SYM3"
"source_two" "addr2:port2:id2:pass2" "table_two" "tableName2" "syms_two" "SYM21 SYM22 SYM23"

From here you can raze the ouptut to give a single list which can then be cut into pairs (2):

q)2 cut raze value each table
"source_one" "addr1:port1:id1:pass1"
"table_one"  "tableName1"
"syms_one"   "SYM1 SYM2 SYM3"
...

Finally using flip puts it into a format that can be used to make a dictionary using !:

(!). flip 2 cut raze value each table
"source_one"| "addr1:port1:id1:pass1"
"table_one" | "tableName1"
"syms_one"  | "SYM1 SYM2 SYM3"
"source_two"| "addr2:port2:id2:pass2"
"table_two" | "tableName2"
"syms_two"  | "SYM21 SYM22 SYM23"

If the keys need to be symbols then you can make use of @ apply to convert them before creating the dictionary:

(!). @[;0;`$]flip 2 cut raze value each table

A better approach may be to create the table without the use of enlist and dropping the column headers with 1_, before making use of the same method to create the dictionary:

(!). flip raze cut[2]each 1_flip("******";",") 0: `:source.csv

Upvotes: 1

emc211
emc211

Reputation: 1379

You can also use 0: to directly Parse Key-Value Pairs however it would require a change to the way your text file is stored.

Need to drop the first line and add comma on the end of each line:

$ cat test.txt
source_one=addr1:port1:id1:pass1,table_one=tableName1,syms_one=SYM1 SYM2 SYM3,
source_two=addr2:port2:id2:pass2,table_two=tableName2,syms_two=SYM21 SYM22 SYM23,

If its easy to change the load then becomes one line:

q)(!). "S=,"0: raze  read0 `:test.txt
source_one| "addr1:port1:id1:pass1"
table_one | "tableName1"
syms_one  | "SYM1 SYM2 SYM3"
source_two| "addr2:port2:id2:pass2"
table_two | "tableName2"
syms_two  | "SYM21 SYM22 SYM23"

This has the advantage over loading to a table if the data is irregular, .e.g not ever line has source and table and syms. If they did why not just have those as column names in a table?

Upvotes: 3

Jonathon McMurray
Jonathon McMurray

Reputation: 2981

One way would be something like this:

q)(!) . flip raze 2 cut'1_flip("******";",")0:`:test.csv
"source_one"| "addr1:port1:id1:pass1"
"table_one" | "tableName1"
"syms_one"  | "SYM1 SYM2 SYM3"
"source_two"| "addr2:port2:id2:pass2"
"table_two" | "tableName2"
"syms_two"  | "SYM21 SYM22 SYM23"

(If you want the keys & values as symbols, replace the * in the 0: params with S)

This works by reading the file in as lists of strings, flipping into the original rows, dropping the first (i.e. the headers), performing 2 cut on each row to split into pairs, using raze to remove a level of nesting and then finally uses dot apply to apply the ! function (i.e. make a dictionary) to this flipped, so that the left arg to ! is the keys and the right arg is the values.

Upvotes: 2

Related Questions