Reputation: 111

How to merge multiple DataFrames inside one file (Python)

My code produces output like this:

A B C
1 1 1
A B C
2 2 2
A B C
3 3 3

I need to merge these single-row DataFrames into one big DataFrame with a single header.

Merging DataFrames from different files is easy, e.g. pd.merge(df1, df2), but how do I do it when the DataFrames are all in one file? Thanks in advance!

EDIT: To get my data, I convert each line of my dataset into a DataFrame, so the output contains a separate DataFrame for every line. My code:

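from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO
import pandas as pd

def coordinates():
    with open('file.txt') as file:
        for lines in file:
            lines = StringIO(lines[35:61])  # I need only these fields from each line
            abc = pd.read_csv(lines, sep=' ', header=None)
            abc.columns = ['A', 'B', 'C', 'D', 'E', 'F']
            print(abc)  # prints a separate one-row DataFrame for every line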

coordinates()

EDIT2: The suggestion from s_vishnu only works for a prepared file in which the same header is repeated. In my case, many DataFrames are generated into the file, and each line after a header has a 0 value. There are many DataFrames, and each has only one row.

EDIT3: My file.txt contains a large number of lines, each about 80 characters long, like this:

AAA S S SSDAS ASDJAI A 234 33 43 234 2342999 2.31 22 33 SSS S D W2UUQ Q231WQ A 222 11 23 123 1231299 2.31 22 11

From each line I need only part of the information, which is why I used lines = StringIO(lines[35:61]). In this example I would need characters [30:55], and I want to create a DataFrame from them with columns=['A', 'B', 'C', 'D', 'E', 'F'] and sep=' '.

Upvotes: 1

Answers (2)

Pawe

Reputation: 111

I have found a solution. I changed the code at the beginning, and that helped:

import pandas as pd

def coordinates():
    with open('file.txt') as abc:
        lines = abc.readlines()
    for line in lines:
        # Cut a fixed number of characters from the beginning and the end of
        # each line, so there is no need to take data from the middle.
        abc2 = line[20:-7]
        abc3 = abc2.split()
        pd.DataFrame(abc3)
        print(abc3)

coordinates()
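The loop above builds a DataFrame for every line but does not keep it. A minimal sketch of how the split fields could instead be collected and turned into one DataFrame at the end is shown below. The function name coordinates_frame, its parameters, and the sample lines are made up for illustration; the slice bounds 20 and -7 are the ones from the code above.

```python
import pandas as pd

def coordinates_frame(lines, start=20, stop=-7, columns=None):
    """Build one DataFrame from an iterable of fixed-layout text lines."""
    rows = []
    for line in lines:
        # Same trimming and splitting as above, applied to every line.
        fields = line[start:stop].split()
        if fields:  # skip blank or too-short lines
            rows.append(fields)
    # One row per input line, one column per whitespace-separated field.
    return pd.DataFrame(rows, columns=columns)

# Hypothetical sample: 20 leading and 7 trailing characters to discard.
sample = [
    "x" * 20 + "1 2 3" + "y" * 7,
    "x" * 20 + "4 5 6" + "y" * 7,
]
df = coordinates_frame(sample, columns=["A", "B", "C"])
print(df)
```

With a real file, the open file object can be passed directly as lines. The values come out as strings, so a conversion such as pd.to_numeric may still be needed afterwards.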

Upvotes: 0

void

Reputation: 2642

my_test.csv:

A, B, C
1, 1 ,1
A, B, C
2, 2, 2
A, B, C
3, 3, 3

Use slicing with a step to keep every second row.

import pandas as pd
df = pd.read_csv("my_test.csv")
df=df[::2]
print(df)

output:

   A  B  C
0  1  1  1
2  2  2  2
4  3  3  3

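df = df[::2] uses extended slicing: the 2 is the step, so starting from row 0 every second row is kept. The first header line is consumed by read_csv, so the repeated header lines land on the odd rows and are dropped.

Note the index values, though: they are also in steps of 2, i.e. 0, 2, 4, ... To renumber the index, do this: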

import pandas as pd
df = pd.read_csv("my_test.csv")
df=df[::2]

df.index = range(len(df['A']))
print(df)

output:

   A  B  C
0  1  1  1
1  2  2  2
2  3  3  3

So you get the values you desire.
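The slicing above relies on the headers sitting on every second row. As a rough alternative sketch, the repeated header rows can be dropped by value instead of by position, which should also work when a header is followed by more than one data row. The inline sample data is an assumption (no spaces around the commas), not the exact my_test.csv.

```python
import io
import pandas as pd

# Assumed sample: the same header repeated before every data row.
raw = "A,B,C\n1,1,1\nA,B,C\n2,2,2\nA,B,C\n3,3,3\n"

df = pd.read_csv(io.StringIO(raw))
# Drop the rows where column A contains the header text itself.
df = df[df["A"] != "A"]
# Renumber the index 0, 1, 2, ... and discard the old one.
df = df.reset_index(drop=True)
# The columns were read as text because of the header rows; convert them.
df = df.apply(pd.to_numeric)
print(df)
```

reset_index(drop=True) does the same job as assigning range(len(df)) to df.index.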

Upvotes: 0
