Reputation: 840
I have this DataFrame with these columns:
dd = pd.DataFrame({'a':[1],'1':[1],'2':[1],'4':[1],'6':[1],'b':[1]})
   a  1  2  4  6  b
0  1  1  1  1  1  1
I want to add the missing numbered columns in sequence; here, columns 3 and 5 are missing. I can do this manually, which gives the expected output:
dd['3'] = 0
dd['5'] = 0
dd=dd.reindex(columns= ['a', '1','2','3','4','5','6','b'])
   a  1  2  3  4  5  6  b
0  1  1  1  0  1  0  1  1
I have thousands of columns, so I can't do this manually. Is there a way to add them with a loop or something similar?
Upvotes: 2
Views: 97
Reputation: 71689
Let's filter the numeric columns, then use get_loc to obtain the positions in the dataframe of the first and last numeric columns, and finally use reindex with fill_value=0 to add the missing columns:
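import numpy as np
# Columns whose names consist only of digits
c = dd.filter(regex=r'^\d+$').columns
# Positions of the first and last numeric columns
l1, l2 = dd.columns.get_loc(c[0]), dd.columns.get_loc(c[-1])
# Columns before the numeric block + full number range as strings + columns after it
idx = np.hstack([dd.columns[:l1], np.r_[c.astype(int).min():c.astype(int).max() + 1].astype(str), dd.columns[l2 + 1:]])
dd = dd.reindex(idx, axis=1, fill_value=0)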
   a  1  2  3  4  5  6  b
0  1  1  1  0  1  0  1  1
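For reuse, the same steps (find the digit-only columns, locate the first and last of them, rebuild the column list with the full number range, then reindex) can be wrapped in a small function. This is only a sketch: the name fill_numeric_columns is hypothetical, it uses plain lists instead of np.hstack, and it assumes the column names are strings and that the numbered columns form one contiguous block in ascending order.

```python
import pandas as pd


def fill_numeric_columns(df, fill_value=0):
    """Return a copy of df with the gaps in its numbered columns filled.

    Hypothetical helper: assumes string column names and that the
    digit-only columns sit together in ascending order.
    """
    # Columns whose names consist only of digits, in their current order.
    numeric = df.filter(regex=r'^\d+$').columns
    if len(numeric) == 0:
        return df.copy()

    # Positions of the first and last numbered columns.
    start = df.columns.get_loc(numeric[0])
    end = df.columns.get_loc(numeric[-1])

    # Every number from the smallest to the largest, as strings.
    nums = [int(name) for name in numeric]
    full_range = [str(n) for n in range(min(nums), max(nums) + 1)]

    # Leading columns + complete numbered block + trailing columns.
    new_cols = list(df.columns[:start]) + full_range + list(df.columns[end + 1:])
    return df.reindex(columns=new_cols, fill_value=fill_value)


dd = pd.DataFrame({'a': [1], '1': [1], '2': [1], '4': [1], '6': [1], 'b': [1]})
out = fill_numeric_columns(dd)
print(out)
```

Because reindex builds the new frame in a single step, this should scale to thousands of columns without a Python-level loop over the data.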
Upvotes: 3
Reputation: 1614
Please try this:
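# dd is the dataframe from the question
for i in range(1, int(dd.columns[-2])):
    if str(i) not in dd.columns:
        # Column number i belongs at position i, since the numbering starts at the second column
        dd.insert(i, str(i), 0)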
Result:
   a  1  2  3  4  5  6  b
0  1  1  1  0  1  0  1  1
This assumes, as per the comments, that the numbered columns start at 1 and run from the second column to the second-to-last column. The code also works if there is only one numbered column between the first and last columns.
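If the numbered block does not start at the second column or at 1, a variation on the loop can look up the position of the previous number instead of relying on the column index. This is a sketch under assumptions not stated in the thread: column names are strings, the numbered columns are in ascending order, and the example frame below is made up for illustration.

```python
import pandas as pd

# Made-up frame: two leading non-numeric columns, numbering starts at 2.
dd = pd.DataFrame({'x': [1], 'a': [1], '2': [1], '3': [1], '6': [1], 'b': [1]})

# All numbers that already appear as column names.
numbers = [int(c) for c in dd.columns if c.isdigit()]

for n in range(min(numbers) + 1, max(numbers)):
    name = str(n)
    if name not in dd.columns:
        # The previous number exists by now (original or just inserted),
        # so the new column goes immediately after it.
        prev_pos = dd.columns.get_loc(str(n - 1))
        dd.insert(prev_pos + 1, name, 0)

print(dd)
```

Each insert call modifies the frame in place, so on very wide frames with many gaps this is likely to be slower than a single reindex.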
Upvotes: 1