w--

Reputation: 6675

csvkit & Django, a.k.a. using csvkit as a module instead of from the command line

I need to do some CSV file processing in a Django app.
I heard about csvkit (see its GitHub page), and it looks pretty cool.

I want to try it out, but I don't know how to consume csvkit as a module. Specifically, I want to use the CSVJSON utility (described in the csvjson docs). I need to pass it a CSV file (and hopefully some other arguments), but I can't quite figure out how to do this.

I want to pass the utility an uploaded CSV file. The uploaded file could be in memory (if it is small enough) or in the temporary storage area. CSVJSON looks like it takes a file path or a stream. It would be a nice bonus if someone could tell me what I need to do to the uploaded file so that CSVJSON can consume it.

In Django 1.3, I'm planning to do the work in the form_valid method.

Hoping someone with some Python skills can show me what I need to do. Thanks.

Upvotes: 2

Views: 960

Answers (1)

Joe C.

Reputation: 1538

You can import csvkit's CSVJSON class using the following code:

from csvkit.utilities.csvjson import CSVJSON

The csvkit utility classes take two constructor arguments: the first is the list of command-line arguments, and the second is the output stream. If no output stream is provided, the utility prints to standard output.

The argparse module is used to parse the command-line arguments, so its documentation will be helpful. The short version is that the list holds the same pieces you would get by taking the raw argument string from the actual command line and splitting it on spaces. For example:

$ csvjson --key Date /path/to/input/file

would translate into:

from csvkit.utilities.csvjson import CSVJSON
args = ["--key", "Date", "/path/to/input/file"]
CSVJSON(args).main()
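
If the arguments start out as a single command-line string, the standard library's shlex.split can do the splitting. Unlike a plain split on spaces, it keeps quoted values together. A minimal sketch; the command string and the "Order Date" column name are made up for illustration:

```python
import shlex

# Everything that would follow the program name on the command line.
command_line = '--key "Order Date" /path/to/input/file'

# shlex.split follows shell quoting rules, so the quoted value stays one argument.
args = shlex.split(command_line)
print(args)
```

The resulting list can be passed to CSVJSON in place of a hand-written args list.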

If the input doesn't live in a file on disk, and you can't pipe it into stdin from the command line, you can replace the sys.stdin object with an in-memory version. The only stipulation is that the replacement object must behave like an input file. Presuming you have the contents of the CSV file as a string in a variable called input_string, you can use the StringIO library to create a string buffer:

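import sys
try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3
from csvkit.utilities.csvjson import CSVJSON

# With no input path in args, the utility reads from sys.stdin.
sys.stdin = StringIO(input_string)
args = ["--key", "Date"]
CSVJSON(args).main()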
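
Because sys.stdin is global to the whole process, it is probably worth restoring it once the conversion is done, especially inside a long-running Django process. Here is a rough sketch of a context manager that does this. The name replaced_stdin is hypothetical, and the sample CSV text is invented. It reads a line from stdin only to show the swap working; in real use the body of the with block would call CSVJSON(args).main() instead.

```python
import contextlib
import io
import sys


@contextlib.contextmanager
def replaced_stdin(text):
    """Temporarily make sys.stdin read from `text`, then restore the original."""
    original = sys.stdin
    sys.stdin = io.StringIO(text)
    try:
        yield sys.stdin
    finally:
        # Runs even if the body raises, so stdin is always put back.
        sys.stdin = original


with replaced_stdin("Date,Value\n2011-01-01,3\n"):
    first_line = sys.stdin.readline()

print(first_line.strip())
```

Note that io.StringIO expects text (unicode on Python 2), and that swapping sys.stdin is not thread-safe: two requests handled concurrently in the same process would see each other's replacement.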

Lastly, if you want to print to a file instead of stdout, pass an open file object as the second parameter:

with open("/path/to/output.txt", "w") as output_file:
    CSVJSON(args, output_file).main()

CSVJSON won't close the file for you, and buffered output isn't guaranteed to reach disk until the file is flushed or closed. The with block closes the file as soon as main() finishes, even if it raises.
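
In a Django view you will often want the JSON as a string rather than in a file. Since the second constructor argument only has to be a writable file-like object, an in-memory buffer should work as well. The sketch below assumes the utility_class(args, output_stream).main() calling convention described above and that the utility writes text to the stream. The helper name run_to_string is hypothetical, and csvkit itself is only referenced in the commented-out usage.

```python
import io


def run_to_string(utility_class, args):
    """Run a csvkit-style utility and return everything it wrote as a string.

    Assumes utility_class(args, output_stream).main() is the calling
    convention, as described for CSVJSON.
    """
    buffer = io.StringIO()
    utility_class(args, buffer).main()
    return buffer.getvalue()


# Hypothetical usage, assuming csvkit is installed:
# from csvkit.utilities.csvjson import CSVJSON
# json_text = run_to_string(CSVJSON, ["--key", "Date", "/path/to/input/file"])
```

On Python 2, where a utility may write byte strings, StringIO.StringIO may be the safer buffer type.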

Upvotes: 5
