Reputation: 111
In my code I get a result like this:
A B C
1 1 1
A B C
2 2 2
A B C
3 3 3
I need to merge those separate DataFrames into one big DataFrame like this:
A B C
1 1 1
2 2 2
3 3 3
Merging DataFrames that come from different files is easy, e.g. pd.merge(df1, df2), but how do I do it when the DataFrames all come from one file?
Thanks in advance!
EDIT: To get my data I convert each line of my dataset into a DataFrame, so the output contains a separate DataFrame for every line. My code:
import pandas as pd
from io import StringIO

def coordinates():
    with open('file.txt') as file:
        for line in file:
            # I need only these fields from each line
            fields = StringIO(line[35:61])
            abc = pd.read_csv(fields, sep=' ', header=None)
            abc.columns = ['A', 'B', 'C', 'D', 'E', 'F']
            print(abc)

coordinates()
EDIT2: The suggestion from s_vishnu only works for a prepared file in which the same header is repeated several times. In my case, multiple DataFrames are generated from the file, and the row after each header has the value 0 (its index). There are many DataFrames, and each one has only a single row.
EDIT3:
My file.txt contains a large number of lines, each about 80 characters long, like this:
AAA S S SSDAS ASDJAI A 234 33 43 234 2342999 2.31 22 33
SSS S D W2UUQ Q231WQ A 222 11 23 123 1231299 2.31 22 11
From each line I need only part of the information, which is why I used lines = StringIO(lines[35:61]) to extract it. In this example I would need the characters [30:55], and I want to create a DataFrame from them with columns=['A', 'B', 'C', 'D', 'E', 'F'] and sep=' '.
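For reference, DataFrames that share the same columns are usually stacked with pd.concat rather than joined with pd.merge. A minimal sketch of the loop from the first EDIT that collects the per-line frames and concatenates them once at the end. The function arguments, the sample lines and the slice bounds here are made up for illustration and are not the real file layout:

```python
from io import StringIO

import pandas as pd


def coordinates(lines, start, stop, columns):
    """Parse one slice of every line and stack the results into one DataFrame."""
    frames = []
    for line in lines:
        # Each slice becomes a one-row DataFrame, as in the question's loop.
        frame = pd.read_csv(StringIO(line[start:stop]), sep=' ', header=None)
        frame.columns = columns
        frames.append(frame)
    # ignore_index=True renumbers the rows 0..n-1 instead of repeating index 0.
    return pd.concat(frames, ignore_index=True)


# Hypothetical input: the wanted values sit at characters 3..7 of each line.
sample = ["xx 1 2 3 yy\n", "xx 4 5 6 yy\n"]
df = coordinates(sample, 3, 8, ['A', 'B', 'C'])
print(df)
```

With a real file, the same function could be called as coordinates(open('file.txt'), 35, 61, [...]), assuming every slice contains the same number of space-separated values.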
Upvotes: 1
Views: 114
Reputation: 111
I have found a solution. I changed the code at the beginning, and that helped:
import pandas as pd

def coordinates():
    with open('file.txt') as abc:
        for line in abc:
            # Cut the line from the beginning and from the end,
            # so no data has to be taken from the middle.
            abc2 = line[20:-7]
            abc3 = abc2.split()
            pd.DataFrame(abc3)
            print(abc3)

coordinates()
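Note that this loop prints a plain list for every line, and the result of pd.DataFrame(abc3) is not stored anywhere. If a single DataFrame is the goal, one option is to collect the split fields and build the frame once after the loop. A minimal sketch: the function signature and the in-memory sample (the two example lines from the question, used in place of file.txt) are assumptions for illustration:

```python
import pandas as pd


def coordinates(lines, start=20, stop=-7):
    """Cut each line, split it on whitespace and return one DataFrame."""
    rows = []
    for line in lines:
        fields = line[start:stop].split()
        if fields:  # skip lines that are empty after cutting
            rows.append(fields)
    # One row per input line; all values are still strings at this point.
    return pd.DataFrame(rows)


# The two example lines from the question stand in for file.txt here.
sample = [
    "AAA S S SSDAS ASDJAI A 234 33 43 234 2342999 2.31 22 33\n",
    "SSS S D W2UUQ Q231WQ A 222 11 23 123 1231299 2.31 22 11\n",
]
df = coordinates(sample)
print(df)
```

Column names can be set afterwards with df.columns = [...], and numeric columns converted with pd.to_numeric if needed.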
Upvotes: 0
Reputation: 2642
my_test.csv:
A, B, C
1, 1 ,1
A, B, C
2, 2, 2
A, B, C
3, 3, 3
Use list slicing.
import pandas as pd
df = pd.read_csv("my_test.csv")
df=df[::2]
print(df)
output:
A B C
0 1 1 1
2 2 2 2
4 3 3 3
df = df[::2] uses extended slicing. The 2 in df[::2] is the step: starting from row 0, every second row is kept, which skips the repeated header rows.
Note the index values, though. They keep their original positions, so they also go in steps of 2, i.e. 0, 2, 4, ...
To change the index, just do this:
import pandas as pd
df = pd.read_csv("my_test.csv")
df=df[::2]
df.index = range(len(df['A']))
print(df)
output:
A B C
0 1 1 1
1 2 2 2
2 3 3 3
So you get the values you desire.
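If the repeated headers are not guaranteed to sit on every second row, an alternative is to drop the rows that are equal to the header and then renumber with reset_index. A minimal sketch, assuming the CSV text below (written without the extra spaces of my_test.csv and read from memory instead of from a file):

```python
from io import StringIO

import pandas as pd

# In-memory stand-in for the file, with the header repeated before every row.
csv_text = "A,B,C\n1,1,1\nA,B,C\n2,2,2\nA,B,C\n3,3,3\n"

df = pd.read_csv(StringIO(csv_text))
# Keep only the rows whose 'A' value is not the header text itself.
df = df[df['A'] != 'A']
# drop=True discards the old 0, 2, 4 index instead of keeping it as a column.
df = df.reset_index(drop=True)
# The repeated headers forced the columns to be read as text; convert them back.
df = df.astype(int)
print(df)
```

Unlike df[::2], this does not depend on the header appearing at a fixed interval.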
Upvotes: 0