Reputation: 561
I have three txt files with data, each with 4 columns of numbers. I need to load them into one data frame (dimension [3,n], where n is the length of a column). Because I need only one column from each file, I decided to use the Series.from_csv() function, but I cannot understand the output. I have written this code:
import glob
import pandas as pd

names = glob.glob("*.txt")
for i in names:
    rank = pd.Series.from_csv(i, sep=" ", index_col=3)
    print(rank)
This prints one column of my data (that's good), but also one column filled entirely with zeros, like this:
0.039157 0
0.039001 0
0.038524 0
0.038579 0
0.038385 0
What I find more bizarre is that when I use
rank = pd.Series.from_csv(i,sep=" ",index_col = 3).values
I got this:
[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]
So does this mean that these zeros were the values read from the files? Then what is the first column from before? I have tried many methods, but I have failed to understand this.
Upvotes: 2
Views: 182
Reputation: 862511
I think you can use the more common read_csv with delim_whitespace=True and usecols for filtering the column. First append all the DataFrames to a list dfs and then use concat:
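import glob
import pandas as pd

dfs = []
names = glob.glob("*.txt")
for i in names:
    # usecols=[3] keeps only the fourth column of each file
    rank = pd.read_csv(i, delim_whitespace=True, usecols=[3])
    print(rank)
    dfs.append(rank)
# place the per-file columns side by side
df = pd.concat(dfs, axis=1)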
Or with sep='\s+', where the separator is arbitrary whitespace:
dfs = []
names = glob.glob("*.txt")
for i in names:
    # raw string so that \s is not treated as a string escape
    rank = pd.read_csv(i, sep=r'\s+', usecols=[3])
    print(rank)
    dfs.append(rank)
df = pd.concat(dfs, axis=1)
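To try the approach without real files, here is a minimal sketch that uses in-memory text in place of the txt files. The file names and numbers are made up for illustration. It also assumes the files have no header row, so it passes header=None (by default read_csv treats the first line as column names). The final transpose is only there because the question asks for a [3,n] layout.

```python
import io
import pandas as pd

# Hypothetical stand-ins for three whitespace-separated files without a header row.
fake_files = {
    "a.txt": "1 2 3 0.039157\n4 5 6 0.039001\n",
    "b.txt": "1 2 3 0.138524\n4 5 6 0.138579\n",
    "c.txt": "1 2 3 0.238385\n4 5 6 0.238000\n",
}

columns = []
for name, text in fake_files.items():
    # header=None: the first line is data; usecols=[3]: keep only the 4th column.
    col = pd.read_csv(io.StringIO(text), sep=r"\s+", header=None, usecols=[3])
    # The kept column is labelled 3; take it as a Series named after its source.
    columns.append(col[3].rename(name))

df = pd.concat(columns, axis=1)  # shape (n, 3): one column per file
df_t = df.T                      # shape (3, n): one row per file
```

Naming each Series after its file keeps the resulting column labels distinct, which is not the case if three frames that all carry the label 3 are concatenated.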
You can also use a list comprehension:
files = glob.glob("*.txt")
dfs = [pd.read_csv(fp, delim_whitespace=True,usecols=[3]) for fp in files]
df = pd.concat(dfs, axis=1)
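As for the output in the question: index_col=3 makes the fourth column the index of the Series, and Series.from_csv then kept the first remaining column as the values. So the left column in the printout is the index (the data you wanted), and the zeros on the right are presumably the first column of your files, which is why .values returns only zeros. Series.from_csv was removed in pandas 1.0, so the sketch below reproduces the same behaviour with read_csv; the sample text is made up.

```python
import io
import pandas as pd

# Hypothetical file content: the first column is all zeros, the fourth holds the data.
text = "0 1.5 2.5 0.039157\n0 1.6 2.6 0.039001\n"

frame = pd.read_csv(io.StringIO(text), sep=" ", header=None, index_col=3)
series = frame.iloc[:, 0]  # first non-index column, as Series.from_csv returned

index_part = series.index.to_numpy()  # fourth column of the file
values_part = series.to_numpy()       # first column of the file (zeros here)
```

If you want the fourth column as the values rather than the index, select it with usecols as shown above instead of index_col.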
Upvotes: 2