Reputation: 263
I have several different data files that I need to import using genfromtxt. Each data file has different content. For example, file 1 may have all floats, file 2 may have all strings, and file 3 may have a combination of floats and strings. The number of columns also varies from file to file, and since there are hundreds of files, I don't know which columns hold floats and which hold strings in each file. However, all the entries in a given column are the same data type.
Is there a way to set up a converter for genfromtxt that will detect the type of data in each column and convert it to the right data type?
Thanks!
Upvotes: 0
Views: 778
Reputation: 86300
If you're able to use the Pandas library, pandas.read_csv is much more generally useful than np.genfromtxt, and will automatically handle the kind of type inference mentioned in your question. The result will be a dataframe, but you can get out a numpy array in one of several ways, e.g.
import pandas as pd
data = pd.read_csv(filename)
# get a numpy array; this will be an object array if data has mixed/incompatible types
arr = data.values
# get a record array; this is how numpy handles mixed types in a single array
# (index=False leaves the dataframe's row index out of the record fields)
arr = data.to_records(index=False)
pd.read_csv has dozens of options for various forms of text inputs; see more in the pandas.read_csv documentation.
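To see the per-column inference in action, here is a minimal sketch. It assumes a small comma-separated sample with a header row; the sample text and its column names are made up for illustration and stand in for one of your files. It also shows that, if you would rather stay with NumPy, passing dtype=None to np.genfromtxt asks it to infer each column's type from its contents, giving a structured array.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample standing in for one data file (mixed strings and numbers).
sample = "name,height,count\nalice,1.62,3\nbob,1.80,5\n"

# pandas infers one dtype per column while parsing.
df = pd.read_csv(io.StringIO(sample))
print(df.dtypes)

# NumPy-only alternative: dtype=None makes genfromtxt infer each column's
# type individually; names=True takes the field names from the header row.
arr = np.genfromtxt(
    io.StringIO(sample),
    delimiter=",",
    names=True,
    dtype=None,
    encoding="utf-8",
)
print(arr.dtype)
```

Either way, looping over your files and inspecting df.dtypes or arr.dtype tells you which columns came out as floats, integers, or strings, without specifying converters by hand.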
Upvotes: 1