Reputation: 3800
I have the following string variable:
data='Industry \t& Company \t\t\t \t\t & Variable Name \n Oil \t& Mobil \t\t\t \t\t & MOBIL \n \t& Texaco \t\t\t \t\t & TEXACO \n Computers \t& IBM \t\t\t \t\t & IBM \n \t \t& Digital Equipment Co. \t\t & DEC \t\t \n \t \t& Data General \t\t\t\t\t & DATGEN \n Electricity & Consolidated Edison \t\t & CONED \n \t & Public Service of New Hampshire & PSNH \n \t & General Public Utilities \t\t & GPU \n Forestry & Weyerhauser \t\t\t\t\t & WEYER \n \t & Boise \t\t\t\t\t\t & BOISE \n Electronics & Motorola \t\t\t\t\t\t & MOTOR \n \t & Tandy \t\t\t\t\t\t & TANDY \n Airlines & Pan American \t\t\t\t\t & PANAM \n \t & Delta \t\t\t\t\t\t & DELTA \n Banks & Continental Illinois \t\t\t & CONTIL \n \t & Citicorp\t\t\t\t\t\t & CITCRP \n Food & Gerber \t\t\t\t\t\t & GERBER \n \t & General Mills \t\t\t\t & GENMIL \n Chemicals & Dow \t\t\t\t\t\t & DOW \n \t & Dupont \t\t\t\t\t\t & DUPONT \n \t & Conoco \t\t\t\t\t\t & CONOCO '
I was able to convert it to a pandas DataFrame using the following code (an easier way of doing this would be welcome):
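import numpy as np
import pandas as pd

lines = data.split("\n")
array = np.zeros(shape=(len(lines), 3))
array = array.astype('str')
for i1 in range(len(lines)):
    set1 = lines[i1].split('&')
    for i, v in enumerate(set1):
        # remove tabs and spaces (this also removes spaces inside names)
        set1[i] = v.replace('\t', '').replace(' ', '')
    for i2 in range(3):
        array[i1, i2] = set1[i2]
# the first row holds the column names
df = pd.DataFrame(array[1:], columns=array[0])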
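Now the Industry column of df is filled only on the first row of each group; the rows below it contain empty strings.
Is there a way to fill the empty cells so that each one copies the value above it? For example, the empty cell after Oil should become Oil, the empty cells after Computers should become Computers, and the empty cells after Electricity should become Electricity.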
Thank you so much in advance
Upvotes: 1
Views: 41
Reputation: 51
Use str.get_dummies()
Refer to this, https://www.geeksforgeeks.org/python-pandas-series-str-get_dummies/
Upvotes: 0
Reputation: 9081
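Replace the empty strings with NaN, then forward-fill so each NaN takes the last non-missing value above it: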
df['Industry'] = df['Industry'].replace('', np.nan).ffill()
Output
0 Oil
1 Oil
2 Computers
3 Computers
4 Computers
5 Electricity
6 Electricity
7 Electricity
8 Forestry
9 Forestry
10 Electronics
11 Electronics
12 Airlines
13 Airlines
14 Banks
15 Banks
16 Food
17 Food
18 Chemicals
19 Chemicals
20 Chemicals
Name: Industry, dtype: object
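For anyone who wants to try this end to end, here is a minimal sketch that combines a shorter way of parsing the string with the forward fill. It assumes a shortened sample of the question's data (six lines instead of the full string) and uses str.strip() on each cell instead of removing every space, so names such as "Digital Equipment Co." and the header "Variable Name" keep their internal spaces, which differs from the asker's code.

```python
import numpy as np
import pandas as pd

# Shortened sample of the question's string (assumption: same '&' layout).
data = ('Industry \t& Company \t\t & Variable Name \n'
        ' Oil \t& Mobil \t\t & MOBIL \n'
        ' \t& Texaco \t\t & TEXACO \n'
        ' Computers \t& IBM \t\t & IBM \n'
        ' \t \t& Digital Equipment Co. \t\t & DEC \t\t \n'
        ' \t \t& Data General \t\t & DATGEN ')

# Split each line on '&' and strip surrounding tabs/spaces from every cell.
rows = [[cell.strip() for cell in line.split('&')]
        for line in data.split('\n')]

# The first row holds the column names, the rest are data rows.
df = pd.DataFrame(rows[1:], columns=rows[0])

# Empty strings become NaN, then each NaN copies the last value above it.
df['Industry'] = df['Industry'].replace('', np.nan).ffill()

print(df)
```

The list comprehension avoids preallocating a NumPy array, and ffill() only works here because the empty strings are first turned into missing values.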
Upvotes: 1