Reputation: 161
I have a long list (sample below):
df_list = ['Joe',
'UK',
'Buyout',
'10083',
'4323',
'http://info2.com',
'Linda',
'US',
'Liquidate',
'97656',
'1223',
'http://global.com',
'[email protected]'
]
As you can see, the list contains information about individuals (Joe and Linda). The problem is that some observations (Joe in this example) are missing the 7th element, which is the entity's email address. Linda's record does include an email, so her 7th element is populated.
I want to turn this list into a dataframe with 7 columns (below). For observations that do not have a valid email address (the value does not contain "@"), I want the EMAIL column to hold a null/empty value rather than the next element in the list, which would be the next observation's NAME.
cols = ['NAME'
,'COUNTRY'
,'STRATEGIES'
,'TOTAL FUNDS'
,'ESTIMATED PAYOFF'
,'WEBSITE'
,'EMAIL']
So far, this is where I am:
big_list = [] #intention is to append N (number of unique entity) small_lists into a big_list and call pd.DataFrame(big_list)
small_list = [] #intention is to create a small_list for each observation/entity, containing 7 values, including email or null if empty
for element in df_list:
    small_list.append(element)
    if ("@" not in small_list):
        small_list[-1] = None
Any help would be highly appreciated! Thanks
Upvotes: 2
Views: 83
Reputation: 17322
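You could use a generator:
import numpy as np
import pandas as pd

def gen_batch(df_list):
    # i always points at the slot where the current record's email would be
    i = 6
    while i <= len(df_list):
        if i < len(df_list) and '@' in df_list[i]:
            # email present: the record spans 7 elements
            yield df_list[i-6: i+1]
            i += 7
        else:
            # email missing: pad the 6 fields with NaN
            yield df_list[i-6: i] + [np.nan]
            i += 6

pd.DataFrame(gen_batch(df_list), columns=cols)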
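As for why the attempt in the question never finds an email: `"@" not in small_list` asks whether some element of the list is exactly the string "@", whereas `'@' in df_list[i]` looks for "@" inside a single string. A minimal sketch of the difference, where the email address is a made-up placeholder rather than the one from the question:

```python
# Minimal illustration of list membership vs. substring search.
# The email below is a hypothetical placeholder value.
record = ['Linda', 'US', 'Liquidate', '97656', '1223',
          'http://global.com', 'linda@example.com']

# List membership: is any element exactly equal to "@"?
in_list = "@" in record

# Substring search: does the last element contain "@"?
in_last_element = "@" in record[-1]

print(in_list)          # False
print(in_last_element)  # True
```

So the test has to be applied to the candidate element itself, not to the list that holds it.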
Upvotes: 1
Reputation: 13401
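If I understand correctly, you need:
import pandas as pd

new_list = []
counter = 0
while counter < len(df_list):
    # the 7th element belongs to this record only if it looks like an email
    has_email = counter + 6 < len(df_list) and "@" in df_list[counter + 6]
    if has_email:
        new_list.append(df_list[counter:counter + 7])
        counter += 7
    else:
        # pad with None so every row has 7 values
        new_list.append(df_list[counter:counter + 6] + [None])
        counter += 6
df = pd.DataFrame(new_list, columns=cols)
print(df)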
Output:
    NAME  COUNTRY  STRATEGIES  TOTAL FUNDS  ESTIMATED PAYOFF            WEBSITE  \
0    Joe       UK      Buyout        10083              4323   http://info2.com
1  Linda       US   Liquidate        97656              1223  http://global.com

               EMAIL
0               None
1  [email protected]
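One case the sample data does not exercise is a final record with no email, where looking one element past the end of the list has to be guarded. A rough sketch of the grouping step as a plain function that can be checked without pandas; `split_records`, the placeholder email, and the extra "Ann" record are all hypothetical and only there for illustration:

```python
def split_records(flat, n_fields=6, marker="@"):
    """Group a flat list into rows of n_fields values plus an optional email."""
    rows = []
    i = 0
    while i < len(flat):
        row = flat[i:i + n_fields]
        i += n_fields
        # The next element is this record's email only if it contains the marker.
        if i < len(flat) and marker in flat[i]:
            row.append(flat[i])
            i += 1
        else:
            row.append(None)
        rows.append(row)
    return rows

# Sample from the question with a placeholder email, plus a made-up
# trailing record ("Ann") that has no email.
sample = ['Joe', 'UK', 'Buyout', '10083', '4323', 'http://info2.com',
          'Linda', 'US', 'Liquidate', '97656', '1223', 'http://global.com',
          'linda@example.com',
          'Ann', 'FR', 'Buyout', '500', '20', 'http://example.org']

rows = split_records(sample)
for row in rows:
    print(row)
```

The resulting `rows` can be passed to `pd.DataFrame(rows, columns=cols)`. This assumes no other field ever contains "@" and that every record has all six leading fields.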
Upvotes: 1