Reputation: 21
I have not worked with Pandas before and I am seeking guidance on the best course of action.
Currently, I have an excel(.xlsx)spreadsheet that I am reading into a data Pandas DataFrame. Within that excel spread sheet, it contains account data, document control number, contract id, manufacturer contract id, series number, include exclude, start date, end date and vendors customer id.
From that data, all of the account numbers need to be copied back to every row of data from document key co, document control number, contract id, manufacturer contract id, series number, include exclude, start date, end date and vendors customer id.
Here is a sample of the data:
I've read in the DataFrame and iterated over the DataFrame with the following code:
#reads in template data. Keeps leading zeros in column B and prevents "NaN" from appearing in blank cells
df = pd.read_excel('Contracts.xlsx', converters = {'document_key_co' : lambda x: str(x)}, na_filter = False)
#iterates over rows
for row in df.itertuples():
print(row)
After doing those things, that is where I am stuck. The desired outcome is this:
As you can see there are three accounts copied to the each of the contract id's.
Reading through the Pandas documentation, I considered separating each account into a separate DataFrame and using concat/merging it into another DataFrame that included document key co - vendors customer id, but felt like that was a lot of extra code when there's a likely a better solution.
Upvotes: 0
Views: 128
Reputation: 21
I was able to accomplish the task utilizing this snippet of code:
concats = []
for x in df.account.values:
concats.append(df.copy())
concats[-1].account = x
pd.concat(concats)
Upvotes: 1