Reputation: 1
currently presented with data can ingest a csv file..
Currently data is all in one column but need to split the data below into sperate columns currently a space indicates its a new column and char count for each col may be a good way to distinguish separation? tried using sv/ssr but getting errors. Will also need to account for sometimes columns being empty.
(`$( ssr[;" ";""] each x . (::;0)))
Current 1 row per ""
"EEEE 00000Z AUTO 0000KT 1234 SA 0VC000 0504 Q111 COL RE"
"EAAA 00000Z AUTO 0000KT RA 0VC000 Q111 COL RE"
any pointers would be helpful
Unsure if ssr is the right operator or if sv would be better in this case?
(`$( ssr[;" ";""] each x . (::;0)))
Desired output is
Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8 Col9 Col10 Col11
EEEE 00000Z AUTO 0000KT 1234 SA OVC000 0504 Q111 COL RE
EAAA 00000Z AUTO 0000KT RA 0VC000 Q111 COL RE
Upvotes: 0
Views: 94
Reputation: 5654
It looks like you are dealing with a file that contains fixed widths. The operator 0:
can interpret these fixed-format files.
The syntax is:
(types; widths) 0: text
For your use case, parsing all columns as strings:
("***********";4 7 6 7 5 3 8 8 7 4 4)0:txt
"EEEE" "EAAA"
" 00000Z" " 00000Z"
" AUTO" " AUTO"
" 0000KT" " 0000KT"
" 1234" " "
" SA" " RA"
" 0VC000" " 0VC000"
" 0504" " "
" Q111" " Q111"
" COL" " COL"
" RE" " RE"
If you need to remove any leading or trailing whitespace from the columns you can use trim
. In this case I have assumed some of the column types.
/ add column names
q)tab:{flip(`$"col",/:string 1+til count[x])!x}("S*S*JSS*SSS";4 7 6 7 5 3 8 8 7 4 4)0:txt
/ trim string columns
q)@[tab;exec c from meta[tab]where t="C";trim]
col1 col2 col3 col4 col5 col6 col7 col8 col9 col10 col11
--------------------------------------------------------------------
EEEE "00000Z" AUTO "0000KT" 1234 SA 0VC000 "0504" Q111 COL RE
EAAA "00000Z" AUTO "0000KT" RA 0VC000 "" Q111 COL RE
It may also be useful to know you could drop single (or multiple) spaces in this manner:
("* * * * * * * * * * *";4 1 6 1 5 1 6 1 4 1 2 1 7 1 7 1 6 1 3 1 3)0:txt
"EEEE" "EAAA"
"00000Z" "00000Z"
" AUTO" " AUTO"
"0000KT" "0000KT"
"1234" " "
"SA" "RA"
" 0VC000" " 0VC000"
" 0504" " "
" Q111" " Q111"
"COL" "COL"
" RE" " RE"
Upvotes: 3