Reputation: 19
I have a pandas dataframe with column range and strings similar to this:
STREET LOWADD HIGHADD POSTAL SECTOR
0 ABBERLY CIR 1900 2000 23112 A6
1 ABBEY VILLAGE CIR 500 600 23114 B6
I need to expand/transform it to the below, between the LOWADD and HIGHADD columns and forward filling the data in STREET, POSTAL and SECTOR:
New_Street POSTAL SECTOR
1901 ABBERLY CIR 23112 A6
1902 ABBERLY CIR 23112 A6
1903 ABBERLY CIR 23112 A6
1904 ABBERLY CIR 23112 A6
1905 ABBERLY CIR 23112 A6
Whats the best way to do this with pandas?
Upvotes: 1
Views: 221
Reputation: 863166
Idea is subtract columns for number of repeated rows by Series.sub
, then repeat by Index.repeat
and DataFrame.loc
and last add counter Series by GroupBy.cumcount
to Street
column:
df = df.reset_index(drop=True)
diff = df['HIGHADD'].sub(df['LOWADD'])
df = df.loc[df.index.repeat(diff)]
s = df.groupby(level=0).cumcount().add(1).add(df['LOWADD']).astype(str)
df['STREET'] = s + ' ' + df['STREET']
df = df.drop(['LOWADD','HIGHADD'], axis=1).reset_index(drop=True)
print (df)
STREET POSTAL SECTOR
0 1901 ABBERLY CIR 23112 A6
1 1902 ABBERLY CIR 23112 A6
2 1903 ABBERLY CIR 23112 A6
3 1904 ABBERLY CIR 23112 A6
4 1905 ABBERLY CIR 23112 A6
.. ... ... ...
195 596 ABBEY VILLAGE CIR 23114 B6
196 597 ABBEY VILLAGE CIR 23114 B6
197 598 ABBEY VILLAGE CIR 23114 B6
198 599 ABBEY VILLAGE CIR 23114 B6
199 600 ABBEY VILLAGE CIR 23114 B6
[200 rows x 3 columns]
Upvotes: 2