Reputation: 645
I have a pandas dataframe with a column structured as follows:
sequences
-------------
[(1838, 2038)]
[]
[]
[(809, 1090)]
I need to loop row by row, so I structured the loop as follows:
for index, row in df.iterrows():
    true_anom_seq = json.loads(row['sequences'])
What I want to do is create a nested list like [[1838, 2038], [], [], [809, 1090]]
so I can iterate through it. The problem is that the code I wrote gives me the error:
JSONDecodeError: Expecting value: line 1 column 2 (char 1)
I also tried printing row['sequences'][0], and it gives me [, so the value is being read as a string.
How can I convert this string to a list?
Upvotes: 2
Views: 1411
Reputation: 13106
There is no need to iterate through the dataframe itself or to use regex. Just apply the literal_eval function to each row of the sequence
column and wrap the result in a list:
from ast import literal_eval
import pandas as pd

col = {'index': [1,2,3,4], 'sequence':['[(1838, 2038)]', '[]', '[]', '[(809, 1090)]']}
new_df = pd.DataFrame(col)
# Parse each string into a Python list, then collect the column into a plain list
print(list(new_df.sequence.apply(literal_eval)))
# Output:
# [[(1838, 2038)], [], [], [(809, 1090)]]
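The literal_eval result keeps each pair as a tuple, while the question asks for plain inner lists such as [1838, 2038]. A minimal sketch of one way to get that shape, assuming the column is named sequences as in the question; the names parsed and nested are made up for illustration. Note that a row holding several tuples would be flattened into a single inner list.

```python
from ast import literal_eval
import pandas as pd

# Sample data mirroring the question; the column name 'sequences' is assumed
df = pd.DataFrame({'sequences': ['[(1838, 2038)]', '[]', '[]', '[(809, 1090)]']})

# Parse each string into a list of tuples
parsed = df['sequences'].apply(literal_eval)

# Flatten the tuples in each row into one list of ints per row
nested = [[value for pair in row for value in pair] for row in parsed]
print(nested)  # [[1838, 2038], [], [], [809, 1090]]
```

If the tuple structure should be preserved instead, [list(pair) for pair in row] would give a list of two-element lists for each row.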
Upvotes: 1
Reputation: 347
import pandas as pd
import re

col = {'index': [1,2,3,4], 'sequence':['[(1838, 2038)]', '[]', '[]', '[(809, 1090)]']}
new_sequence = []
new_df = pd.DataFrame(col)
for index, row in new_df.iterrows():
    one_item = []
    # Pull every run of digits out of the string
    true_anom_seq = re.findall(r'\d+', row['sequence'])
    for match in true_anom_seq:
        # Convert to int so the result matches [[1838, 2038], [], ...]
        one_item.append(int(match))
    new_sequence.append(one_item)
print(new_sequence)
Upvotes: 1
Reputation: 11
Use ast.literal_eval to convert strings to lists, dicts, and other Python literals:
>>> from ast import literal_eval
>>> literal_eval('[1,2,3]')
[1, 2, 3]
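As for why json.loads fails on this data: parentheses are not valid JSON, so the parser stops at the ( right after the opening bracket, which matches the "line 1 column 2 (char 1)" position in the question's error. A small sketch contrasting the two parsers on one of the question's values; the variable names are illustrative only.

```python
import json
from ast import literal_eval

text = '[(1838, 2038)]'

# JSON has no tuple syntax, so the '(' makes the string invalid JSON
try:
    json.loads(text)
    json_ok = True
except json.JSONDecodeError as exc:
    json_ok = False
    print(f"json.loads failed: {exc}")

# literal_eval parses Python literals, including tuples
result = literal_eval(text)
print(result)  # [(1838, 2038)]
```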
Upvotes: 1