Reputation: 21
I'm having trouble converting several .dta files to .csv all at once using pandas in Python. Could you help me with how to go about this? The .dta files are spread across about four different folders.
Upvotes: 1
Views: 2376
Reputation: 68146
The pandas.io module has a read_stata function: http://pandas.pydata.org/pandas-docs/dev/generated/pandas.io.stata.read_stata.html. That will read an individual Stata file into a DataFrame. From there you can use the DataFrame's .to_csv method to save a new file in your desired format.
When it comes to getting all of the data in your directories, I think your quickest path forward will look something like this (untested):
    import glob
    import os
    import pandas

    # placeholder paths: list every folder that holds .dta files (..., up to the nth)
    my_directories = ['/path/to/first', '/path/to/second', '/path/to/nth']

    for my_dir in my_directories:
        # collect all the Stata files directly inside this folder
        stata_files = glob.glob(os.path.join(my_dir, '*.dta'))
        for stata_file in stata_files:
            # get the file path/name without the ".dta" extension
            file_name, file_extension = os.path.splitext(stata_file)
            # read your data (pass any extra read_stata options here)
            df = pandas.read_stata(stata_file)
            # save the data and never think about stata again :)
            df.to_csv(file_name + '.csv')
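If some of your .dta files sit in subfolders of those four folders, the loop above will miss them, because `glob.glob` with `'*.dta'` only looks at the top level of each directory. Here is a rough sketch of a recursive variant using `pathlib` (also untested against real Stata data). The function names `plan_conversions` and `convert_all` are just names I made up for illustration, and `index=False` is my own choice to leave the row index out of the CSV, which differs from the snippet above.

```python
from pathlib import Path


def plan_conversions(root_dirs):
    """Return (source, target) path pairs for every .dta file under root_dirs."""
    pairs = []
    for root in root_dirs:
        # rglob descends into subfolders; sorted() keeps the order predictable
        for dta_path in sorted(Path(root).rglob('*.dta')):
            # the target sits next to the source, with .csv instead of .dta
            pairs.append((dta_path, dta_path.with_suffix('.csv')))
    return pairs


def convert_all(root_dirs):
    """Convert every planned .dta file to a .csv file in the same folder."""
    # imported here so plan_conversions can be tried without pandas installed
    import pandas

    for dta_path, csv_path in plan_conversions(root_dirs):
        df = pandas.read_stata(dta_path)
        # index=False leaves the DataFrame's row index out of the CSV
        df.to_csv(csv_path, index=False)
```

You would call it as `convert_all(['/path/to/first', '/path/to/second'])` with your own folders. Calling `plan_conversions` first lets you print the list of files and check it before anything is written.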
Upvotes: 2