Coco

Reputation: 77

Store Results of a For Loop in Julia

I'm totally new to Julia and got stuck on storing the results of a loop. I have several CSV files that are encoded as UTF-16, but I need them in UTF-8. My idea was to loop over them and afterwards put them all together in one DataFrame.

This is my approach so far...

filelist = readdir("C:\\Users\\cd\\Documents\\Data\\Generation")

for i in filelist
    encoded_csv = open("C:\\Users\\cd\\Documents\\Data\\Generation\\"*i,enc"UTF-16")  
end

I would appreciate any help I could get :) Thank you very much!

Upvotes: 2

Views: 378

Answers (3)

lungben

Reputation: 1158

I suggest using the CSV.jl package for reading CSV files. The general syntax should be:

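using CSV, DataFrames, StringEncodings

# readdir returns bare file names, so each one is joined with the directory below.
dir = "C:\\Users\\cd\\Documents\\Data\\Generation"
df = DataFrame()
for i in readdir(dir)
    # Decode the UTF-16 file to UTF-8 bytes, parse it, and append its rows to df.
    append!(df, CSV.File(open(read, joinpath(dir, i), enc"UTF-16")))
end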

Regarding UTF-16 encoding, this is explained here: https://csv.juliadata.org/stable/#Non-UTF-8-character-encodings
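
If the files should stay separate instead of being appended into one DataFrame, the loop results can be collected in a Dict keyed by file name. The following is only a minimal sketch: the function name read_utf16_folder is hypothetical, and it assumes every .csv file in the folder is UTF-16 encoded.

```julia
using CSV, DataFrames, StringEncodings

# Hypothetical helper (not part of any package): read every .csv file in `dir`
# as UTF-16 and return a Dict mapping file name => DataFrame.
function read_utf16_folder(dir::AbstractString)
    tables = Dict{String,DataFrame}()
    for name in readdir(dir)
        # Skip anything that is not a CSV file.
        endswith(lowercase(name), ".csv") || continue
        # StringEncodings decodes on the fly; `read` returns UTF-8 bytes.
        bytes = open(read, joinpath(dir, name), enc"UTF-16")
        tables[name] = DataFrame(CSV.File(bytes))
    end
    return tables
end
```

Each table can then be looked up by its file name, for example tables["some_file.csv"], where the file name is a placeholder.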

Edit: syntax for directly reading UTF-16 encoded files added.

Upvotes: 2

Coco

Reputation: 77

Thanks for your help, but I think my question was not clear. I do use the CSV package to read the CSV files, and this works for files with UTF-8 encoding. To change the encoding I simply use StringEncodings with the function shown in my question.

Encoding a single CSV file this way works fine. However, I would like to apply the approach to all the CSV files in one folder, so I thought about looping over them. Unfortunately this went wrong, because every iteration stores its result in the same variable, encoded_csv. I would like to encode every CSV file and store each result separately so that I can load them via CSV afterwards.

Thanks again and sorry for the inconvenience

Upvotes: 0

Przemyslaw Szufel

Reputation: 42214

The encoding of a file and whether or not it is a CSV are two separate issues.

For converting between encodings, the best approach is to use the StringEncodings package. Here I just do it line by line:

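using StringEncodings
# Open the UTF-16 source for reading and the UTF-8 target for writing.
f = open("u16.txt", enc"UTF-16", "r")
fout = open("u8.txt", enc"UTF-8", "w")
# Each line is decoded on read and re-encoded on write.
for l in eachline(f)
    println(fout, l)
end
close(fout)
close(f)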

Note that such a file stream can be passed directly to CSV.File if needed. With a freshly opened stream, simply do:

CSV.File(f) |> DataFrame
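
The same conversion can be applied to every file in a folder. The following is a rough sketch, not a definitive implementation: the function name transcode_folder and both directory arguments are hypothetical, and it assumes each file is UTF-16 encoded and small enough to read into memory at once.

```julia
using StringEncodings

# Hypothetical helper: write a UTF-8 copy of every UTF-16 file in `src` into `dst`,
# keeping the original file names.
function transcode_folder(src::AbstractString, dst::AbstractString)
    mkpath(dst)  # create the target folder if it does not exist yet
    for name in readdir(src)
        inpath = joinpath(src, name)
        isfile(inpath) || continue  # skip subdirectories
        # decode turns the raw UTF-16 bytes into a regular Julia String (UTF-8).
        text = decode(read(inpath), "UTF-16")
        write(joinpath(dst, name), text)
    end
end
```

The UTF-8 copies in the target folder can then be loaded with CSV.File without any encoding argument.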

Upvotes: 0
