Reputation: 97
I have a very wide dataset which consists of hundreds of date-value column pairs - however the heading of the values column contains the reference of the site from which the data is taken. I'd like to take this header as a new "site_name" column before pivoting this data to a long format.
The data for each site is the same 2-column format, so I'd like to be able to apply a solution across the whole dataset at once.
My code below illustrates the problem on a single date-value pair
Note: I've used asterisks to mean I'm describing the column names, rather than quoting them
import pandas as pd
current = pd.DataFrame({"*unnamed_date_column*" : ["2021-10-21", "2021-10-22", "2021-10-23"],
"*unique_site_name*" : [1.1, 1.2, 1.3]})
desired = pd.DataFrame({"date" : ["2021-10-21", "2021-10-22", "2021-10-23"],
"values" : [1.1, 1.2, 1.3],
"site" : ["unique_site_name", "unique_site_name", "unique_site_name"]})
Upvotes: 2
Views: 70
Reputation: 97
Solved:
Almost certainly not the optimal way to do this but worked for my 200-ish column data frame:
colnames = np.array_split(list(current.columns), len(list(current.columns)) / 2)
reorg = {}
for i in range(0, length):
df = current[colnames[i]]
rename_ind = [0, 1]
old_names = df.columns[rename_ind]
new_names = ['date', 'value']
new_df = df.assign(site = df.columns[1])
new_df.rename(columns = dict(zip(old_names, new_names)), inplace = True)
reorg[i] = new_df
pd.concat(reorg.values())
Upvotes: 0
Reputation: 24613
you can use melt :
desired = current.melt(id_vars=["*unnamed_date_column*"],var_name=['site']).rename(columns ={"*unnamed_date_column*": "date"})
output:
date site value
0 2021-10-21 *unique_site_name* 1.1
1 2021-10-22 *unique_site_name* 1.2
2 2021-10-23 *unique_site_name* 1.3
Upvotes: 1
Reputation: 262359
Difficult to know how this will generalize without knowing more examples, but you could try:
desired = (current
.assign(site=current.columns[-1]) # arbitrarily chose to index from end
.rename(columns=dict(zip(current.columns, ['date', 'values'])))
)
output:
date values site
0 2021-10-21 1.1 *unique_site_name*
1 2021-10-22 1.2 *unique_site_name*
2 2021-10-23 1.3 *unique_site_name*
Upvotes: 1