lasse3434
lasse3434

Reputation: 59

combining Data from multiple excel files

I am trying to combine data from 2 Excel files with each other, but its just not working how i'd like it to.. My Code:

import pandas as pd
import numpy as np
import xlsxwriter
import warnings
    
open_tradein_xlsx = "Z_results.xlsx"
open_keepa_xlsx =  "keepa_data.xlsx"

with warnings.catch_warnings(record=True):
    warnings.simplefilter("always")
    keepa_data = pd.read_excel(open_keepa_xlsx, usecols=['Used: Lowest'])
    tradein_data = pd.read_excel(open_tradein_xlsx, index_col=0,)

dataframe = pd.DataFrame =(tradein_data,keepa_data)
data = pd.concat(dataframe, ignore_index=True)

print(data)

#if dataframe['Used: Lowest'] < dataframe['Rebuy'] or tradein_data['Momox']:
    #print(x)

and the output:

             ISBN       Rebuy       Momox  Used: Lowest
0      Unnamed: 0  Unnamed: 1  Unnamed: 1           NaN
1             NaN         NaN         NaN           NaN
2   9783630876672       12.19         2.6           NaN
3   9783423282789       11.48         2.8           NaN
4   9783833879500       16.92       10.15           NaN
5   9783898798822        7.07        2.28           NaN
6   9783453281417       13.06        7.41           NaN
7             NaN         NaN         NaN          13.5
8             NaN         NaN         NaN          14.0
9             NaN         NaN         NaN          19.9
10            NaN         NaN         NaN           2.0
11            NaN         NaN         NaN          16.4

Process finished with exit code 0

I guess you can see what i am trying to do, the 'Used: Lowest' Data should be in lines 2-6.

I already tried to do data = pd.concat(dataframe, ignore_index=True, axis=1) but then i get following error:pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

the code that creates 'Z_results.xlsx':

import pandas as pd
import numpy as np
import xlsxwriter
import pandas as pd
from pathlib import Path

open_momox_xlsx = ("momox_ergebnisse.xlsx")
momox_data = pd.read_excel(open_momox_xlsx,usecols='B')

open_rebuy_xlsx = ("rebuy_ergebnisse.xlsx")
rebuy_data = pd.read_excel(open_rebuy_xlsx,usecols='B')

open_isbn_xlsx = ("momox_ergebnisse.xlsx")
isbn_data = pd.read_excel(open_rebuy_xlsx,usecols='A')

dataframe = pd.DataFrame =({'ISBN': isbn_data, 'Rebuy': rebuy_data, 'Momox': momox_data})
data = pd.concat(dataframe,axis=1)

data[['Rebuy','Momox']] = data[['Rebuy','Momox']].replace({"///": np.nan, ",": "."}, regex=True).astype(float)
data = data.loc[data[['Rebuy','Momox']].ge(1.).all(axis="columns")]

isbn_output = data['ISBN']

datatoexcel = pd.ExcelWriter("Z_results.xlsx", engine='xlsxwriter')

data.to_excel(datatoexcel)

datatoexcel.save()

np.savetxt("ISBN_output.txt",isbn_output,fmt = "%s")

I thought xlsx would be the best thing to use as storagetype but now i feel like its kind of complicated..

Upvotes: 0

Views: 89

Answers (1)

williamr21
williamr21

Reputation: 147

you don't have to create a new dataframe since you are concatenating it in the next line of code. The below code should work

import pandas as pd
import numpy as np
import xlsxwriter
import warnings

open_tradein_xlsx = "Z_results.xlsx"
open_keepa_xlsx =  "keepa_data.xlsx"

with warnings.catch_warnings(record=True):
    warnings.simplefilter("always")
    keepa_data = pd.read_excel(open_keepa_xlsx, usecols=['Used: Lowest'])
    tradein_data = pd.read_excel(open_tradein_xlsx, index_col=0,)

data = pd.concat([tradein_data, keepa_data], axis=1, ignore_index=True)
print(data)

Upvotes: 1

Related Questions