Convert yearly wide table to weekly long table

Question

I'm working with climate data in a pandas DataFrame that is currently in an wide table format where each row represents a year of data for a specific region and the weekly variables are in the column.

Is there a way that I can transform the table such that there is an additional 'week' column and the variable values for each week are listed in the columns?

For instance, my table looks like this currently, where the variable suffix indicates a week number:

ID	Year	precip1	precip2	precip3	max_temp1	max_temp2	max_temp3
1100	2000	5.3	3.0	3.1	13.3	15.3	3.1
1100	2001	6.6	3.2	1.1	11.3	12.3	6.1
5903	2000	3.4	0.5	2.1	10.3	18.3	8.1
5903	2001	1.7	3.8	8.1	12.3	16.3	5.1

But I want the resulting table to look like this:

ID	Year	Week	precip	max_temp
1100	2000	1	5.3	13.3
1100	2000	2	3.0	15.3
1100	2000	3	3.1	3.1
1100	2001	1	6.6	11.3
1100	2001	2	3.2	12.3
1100	2001	3	1.1	6.1
5903	2000	1	3.4	10.3
5903	2000	2	0.5	18.3
5903	2000	3	2.1	8.1
5903	2001	1	1.7	12.3
5903	2001	2	3.8	16.3
5903	2001	3	8.1	5.1

I've tried using pd.melt() on the entire DataFrame but the resulting table isn't what I am looking for.

bruno-uy · Accepted Answer

Let's name your initial DataFrame as dfso. The following code will do exactly what you want using melt:

# Unpivot precipitation columns
dfp = dfso[["ID", "Year", "precip1", "precip2", "precip3"]].melt(["ID", "Year"], var_name="Week", value_name="precip")
# Clean the Week column
dfp["Week"] = dfp["Week"].str.replace("precip", "")

# Unpivot max temperature columns
dft = dfso[["ID", "Year", "max_temp1", "max_temp2", "max_temp3"]].melt(["ID", "Year"], var_name="Week", value_name="max_temp")
# Clean the Week column
dft["Week"] = dft["Week"].str.replace("max_temp", "")

# Merge both for desired result
result = dfp.merge(dft, on=["ID", "Year", "Week"], how="inner")

Updated:

Use dft["Week"] = dft["Week"].str.replace("max_temp", "") instead of dft["Week"].apply(lambda x: x.replace("max_temp", "")) which is faster (this is the max_temp case, but apply for both transformations). Thanks for the comment in the answer @tdy.

Another option (thanks to the comment on the question) could be:

result = pd.wide_to_long(
    dfso,
    i=["ID", "Year"],
    stubnames=["precip", "max_temp"],
    j="Week",
    suffix="."
).reset_index()

Convert yearly wide table to weekly long table

Answers (1)

Related Questions