Conversion of files to .csv from .dta (Stata)

Question

I'm having trouble converting several .dta files to .csv all at once using pandas in Python. The .dta files are spread across about four folders. Could you help me with how to go about this?

Paul H · Accepted Answer

pandas has a read_stata function (implemented in pandas.io.stata and also available as pandas.read_stata): http://pandas.pydata.org/pandas-docs/dev/generated/pandas.io.stata.read_stata.html

That will read an individual Stata file into a DataFrame. From there you can use the DataFrame's .to_csv method to save a new file in your desired format.
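As a minimal sketch of that single-file step: dta_to_csv is a hypothetical helper name (not part of pandas), and passing index=False to leave the row index out of the CSV is an assumption about the output you want.

```python
import os

import pandas


def dta_to_csv(dta_path):
    """Convert one Stata file to a CSV written next to it; return the CSV path.

    dta_to_csv is a hypothetical helper name, not a pandas function.
    """
    # split "/some/dir/data.dta" into "/some/dir/data" and ".dta"
    base, _ext = os.path.splitext(dta_path)
    csv_path = base + '.csv'

    df = pandas.read_stata(dta_path)
    # index=False is an assumption: it keeps the row index out of the CSV
    df.to_csv(csv_path, index=False)
    return csv_path
```

Calling dta_to_csv('/path/to/first/data.dta') would then write /path/to/first/data.csv; the path here is only a placeholder.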

When it comes to getting all of the data in your directories, I think your quickest path forward will look something like this (untested):

import glob
import os
import pandas

my_directories = [
    '/path/to/first',
    '/path/to/second',
    # ... add the rest of your directories here
    '/path/to/nth',
]
for my_dir in my_directories:
    # collect all the Stata files directly inside this directory
    stata_files = glob.glob(os.path.join(my_dir, '*.dta'))
    for stata_file in stata_files:
        # get the file path/name without the ".dta" extension
        file_name, file_extension = os.path.splitext(stata_file)

        # read your data (add read_stata options here if you need them)
        df = pandas.read_stata(stata_file)

        # save the data and never think about stata again :)
        df.to_csv(file_name + '.csv')
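
If the .dta files sit in nested subfolders rather than directly inside the listed directories, a variant built on pathlib can walk the whole tree. This is only a sketch and has not been run against your data; convert_tree and its root argument are hypothetical names, and index=False (leaving the row index out of the CSV) is an assumption.

```python
from pathlib import Path

import pandas


def convert_tree(root):
    """Convert every .dta file under root, subfolders included, to .csv.

    convert_tree is a hypothetical helper name. Returns the list of CSV
    paths that were written.
    """
    written = []
    # rglob searches root and all of its subdirectories
    for dta_path in sorted(Path(root).rglob('*.dta')):
        # same location and name, with the extension swapped to .csv
        csv_path = dta_path.with_suffix('.csv')
        # index=False is an assumption: it keeps the row index out of the CSV
        pandas.read_stata(dta_path).to_csv(csv_path, index=False)
        written.append(csv_path)
    return written
```

You would call convert_tree once per top-level folder, for example inside a loop over my_directories.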
