Reputation: 77
I'm totally new to Julia and I got stuck on storing the results of a loop. I have several CSV files that are encoded as UTF-16, but I need them in UTF-8. My idea was to loop over the files, convert each one, and afterwards put them all together in one DataFrame.
This is my approach so far:
filelist = readdir("C:\\Users\\cd\\Documents\\Data\\Generation")
for i in filelist
    encoded_csv = open("C:\\Users\\cd\\Documents\\Data\\Generation\\" * i, enc"UTF-16")
end
I would appreciate any help I can get :) Thank you very much!
Upvotes: 2
Views: 378
Reputation: 1158
I suggest using the CSV.jl package for reading CSV files. The general syntax should be:
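using CSV, DataFrames, StringEncodings
# readdir returns bare file names by default; join=true (Julia 1.4+) prepends the directory
filelist = readdir("C:\\Users\\cd\\Documents\\Data\\Generation"; join=true)
df = DataFrame()
for i in filelist
    # decode each UTF-16 file and append its rows to df
    append!(df, CSV.File(open(read, i, enc"UTF-16")))
end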
Regarding UTF-16 encoding, this is explained here: https://csv.juliadata.org/stable/#Non-UTF-8-character-encodings
Edit: syntax for directly reading UTF-16 encoded files added.
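To see what the open(read, i, enc"UTF-16") call does on its own, here is a minimal sketch that round-trips a small temporary file. The file content and the temporary path are made up for illustration; encode is the StringEncodings function for converting a string to bytes in a given encoding.

```julia
using StringEncodings

# Build a small UTF-16 file to experiment with (illustrative content only).
path = tempname() * ".csv"
text = "name,value\ncafé,1\n"
write(path, encode(text, "UTF-16"))

# Opening with an encoding argument decodes while reading, so `read`
# hands back the content as UTF-8 bytes that String (or CSV.File) can use.
decoded = String(open(read, path, enc"UTF-16"))
```

The decoded bytes are what CSV.File receives in the loop above, which is why no intermediate UTF-8 file has to be written.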
Upvotes: 2
Reputation: 77
Thanks for your help, but I think my question was not clear enough. I already use the CSV package to read the CSV files, and that works for files with UTF-8 encoding. To change the encoding I simply use StringEncodings with the function mentioned above.
If I just want to convert one single CSV file, this works fine! However, I would like to use that approach for all the CSV files in one folder, so I thought about looping over those files. Unfortunately this went wrong, because every iteration stores its result in the same variable, "encoded_csv", overwriting the previous one. I would like to convert every single CSV file and store each result in an individual variable so that I can load them via CSV afterwards.
Thanks again and sorry for the inconvenience.
Upvotes: 0
Reputation: 42214
The encoding of a file and whether or not it is a CSV are two separate issues.
For converting between encodings, the best way is to use the StringEncodings package. Here I just do it line by line:
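using StringEncodings
# open the source for reading as UTF-16 and the target for writing as UTF-8
f = open("u16.txt", enc"UTF-16", "r")
fout = open("u8.txt", enc"UTF-8", "w")
for l in eachline(f)
    println(fout, l)   # eachline strips the line ending; println writes "\n"
end
close(fout)
close(f)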
Note that such a file stream can be passed directly to CSV.File if you need. With a freshly opened stream f (not yet read to the end or closed), simply do:
CSV.File(f) |> DataFrame
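For the follow-up about keeping each file's result separately instead of overwriting one variable in the loop, one option is to collect the results in a Dict keyed by file name. This is only a sketch under some assumptions: read_utf16_csvs is a hypothetical helper name, every .csv file in the folder is assumed to be UTF-16, and the decoding call is the same open(read, path, enc"UTF-16") form used in the other answer.

```julia
using CSV, DataFrames, StringEncodings

# Hypothetical helper: read every UTF-16 CSV in `dir` into its own DataFrame,
# keyed by file name, so no result is overwritten by the next iteration.
function read_utf16_csvs(dir::AbstractString)
    tables = Dict{String,DataFrame}()
    for name in readdir(dir)
        endswith(lowercase(name), ".csv") || continue   # skip non-CSV files
        path = joinpath(dir, name)
        # decode the UTF-16 file while reading, then parse the bytes as CSV
        tables[name] = DataFrame(CSV.File(open(read, path, enc"UTF-16")))
    end
    return tables
end
```

Usage would look like tables = read_utf16_csvs("C:\\Users\\cd\\Documents\\Data\\Generation"), after which tables["some_file.csv"] is the DataFrame for that file. If all files share the same columns and the folder is not empty, vcat(values(tables)...) combines them into one DataFrame.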
Upvotes: 0