Luiza Ribeiro

Reputation: 33

Cannot read a file from HTTP.get

I needed to get some xlsm files from a web page that requires a login and password. To get there, I found the auth cookie in my web browser and used the HTTP.get function as follows:

r = HTTP.get(url_file; cookies=Dict("FedAuth" => auth_cookie))

Once I got the result, I wanted to write it to disk. I tried the write function with all sorts of file extensions (.xlsx, .xls, .xlsm, .csv), as in this example:

write("dados_bdo.xlsm",r.body)

The file was written, and I was able to open it with LibreOffice. However, when I try to read it with Julia, using CSV.read or XLSX.readxlsx, I get errors.

> XLSX.readxlsx("dados_bdo.xlsx") 
ERROR: AssertionError: isempty(XML_GLOBAL_ERROR_STACK)

> CSV.read("dados_bdo.csv") 
ERROR: ArgumentError: Symbol name may not contain \0

The typeof(r.body) is Array{UInt8,1}. I really think the problem is in the writing part, but I don't know how to do it any other way.

Upvotes: 3

Views: 248

Answers (3)

Luiza Ribeiro

Reputation: 33

I guess my problem really is the data. There are some xlsm files that I am able to open with XLSX.readxlsx("data.xlsx"), and others that produce the same error (ERROR: AssertionError: isempty(XML_GLOBAL_ERROR_STACK)).

Also, after I hit that error with one file, I'm not able to open any other file in the same session, even the ones that did not produce any error before.

I'll post here links to two files from the same source: one that gives me the error and one that doesn't. Google Drive: Error File

Upvotes: 0

jerlich

Reputation: 362

A Google search for "ERROR: ArgumentError: Symbol name may not contain \0" turns up this GitHub issue, which suggests trying the following option when reading the CSV:

CSV.read(file; normalizenames=true)

Without knowing the URL (and the content) you are fetching, that's the best advice I can give. If you provide the URL, then I (or someone else) could try to reproduce the error and explain exactly why you are getting it.

Upvotes: 0

Przemyslaw Szufel

Reputation: 42244

You are writing the bytes correctly.

I think that in order to debug you should look into your data.

For the CSV file you can always do String(r.body) and see what is inside the String. One possible problem is the character encoding; in that case you should use StringEncodings.jl.
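As a rough illustration of "looking into the data", here is a minimal sketch that summarizes a byte vector before handing it to a parser. The helper name describe_body and the stand-in byte vectors are hypothetical; in the question the input would be r.body. Note that String(v) takes ownership of a Vector{UInt8} and leaves it empty, so the sketch only converts a copied slice. A body that contains NUL bytes is very likely binary (for example an Excel file saved under a .csv name) rather than CSV text, which would be consistent with the "Symbol name may not contain \0" error, though that is an assumption about this particular download.

```julia
# Hypothetical helper: summarize a downloaded body without consuming it.
function describe_body(bytes::Vector{UInt8}; preview_len::Int = 60)
    n = min(preview_len, length(bytes))
    # bytes[1:n] is a fresh copy, so turning it into a String leaves `bytes` intact.
    head = String(bytes[1:n])
    return (
        nbytes = length(bytes),
        has_nul = any(==(0x00), bytes),        # NUL bytes suggest binary data, not CSV text
        valid_utf8 = isvalid(String, bytes),   # false hints at another encoding or binary content
        preview = head,
    )
end

# Stand-in data for illustration; replace with r.body in practice.
text_body = Vector{UInt8}("a,b\n1,2\n")
binary_body = UInt8[0x50, 0x4b, 0x03, 0x04, 0x00, 0x00, 0xff]

info_text = describe_body(text_body)
info_binary = describe_body(binary_body)
```

If has_nul is false but valid_utf8 is also false, the content may be text in a non-UTF-8 encoding, which is the case where StringEncodings.jl would help.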

Regarding the xlsm: an Excel file is in fact a zip file with lots of XML files inside. Zip files carry CRC checksums, so to check whether the file downloaded correctly, try unzipping it. If unzipping dados_bdo.xlsm succeeds, the file was downloaded and written correctly and the problem is somewhere else. You could also try reading this file via PyCall using one of Python's libraries and see if the problem persists.
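Before unzipping, a cheaper first check is whether the bytes even start like a zip archive. The sketch below is an assumption-laden illustration, not a replacement for actually unzipping: the helper names (looks_like_zip, looks_like_html, classify_download) are hypothetical, and the check only looks at the 4-byte local file header signature "PK\x03\x04", so it does not verify CRCs. One thing it can catch is a server that returned an HTML login page instead of the workbook, for instance if the cookie had expired; whether that happened here is unknown.

```julia
# ZIP local file header signature: "PK\x03\x04".
const ZIP_MAGIC = UInt8[0x50, 0x4b, 0x03, 0x04]

# Hypothetical helper: does the payload start with the ZIP signature?
looks_like_zip(bytes::Vector{UInt8}) =
    length(bytes) >= 4 && bytes[1:4] == ZIP_MAGIC

# Hypothetical helper: rough check for an HTML page (e.g. a login form).
function looks_like_html(bytes::Vector{UInt8})
    n = min(512, length(bytes))
    # Lowercase ASCII letters at the byte level so binary input cannot break string functions.
    lowered = UInt8[(0x41 <= b <= 0x5a) ? b + 0x20 : b for b in bytes[1:n]]
    head = String(lowered)
    return occursin("<html", head) || occursin("<!doctype", head)
end

# Hypothetical helper: coarse classification of a downloaded body.
function classify_download(bytes::Vector{UInt8})
    looks_like_zip(bytes) && return :zip
    looks_like_html(bytes) && return :html
    return :unknown
end

# Stand-in data for illustration; replace with r.body in practice.
zip_bytes = UInt8[0x50, 0x4b, 0x03, 0x04, 0x14, 0x00]
html_bytes = Vector{UInt8}("<!DOCTYPE html><html><body>Sign in</body></html>")

kind_zip = classify_download(zip_bytes)
kind_html = classify_download(html_bytes)
kind_empty = classify_download(UInt8[])
```

A result of :zip only says the file starts like an archive; a truncated or corrupted download would still pass, so the unzip test described above remains the real check.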

Upvotes: 0
