How to convert string of list of list to list?

Question

I have this file, which is the output of a MapReduce job, so each line is in key-value format:

'null\t[0, [[0, 21], [1, 4], [2, 5]]]\n'
'null\t[1, [[0, 3], [1, 1], [2, 2]]]\n'

I want to remove every character except the second element of the value list:

[[0, 21], [1, 4], [2, 5]]
[[0, 3], [1, 1], [2, 2]]

And finally, add each one to a single list:

[[[0, 21], [1, 4], [2, 5]], [[0, 3], [1, 1], [2, 2]]]

This is my attempt so far:

import re

corpus = []

with open(FILENAME) as f:
    content = f.readlines()

for line in content:
    # Match everything up to the first "[[" and replace it with "["
    clean_line = re.sub(r'^.*?\[\[', '[', line)
    # Then remove "\n" and the last two "]]" of the string
    clean_line = re.sub('[\n]', '', clean_line)[:-2]
    corpus.append(clean_line)

Output:

['[0, 21], [1, 4], [2, 5]', '[0, 3], [1, 1], [2, 2]']

As you can see, each element is still a str. How can I convert them to lists?

Sayse · Accepted Answer

Treat each line as JSON: replace the non-JSON parts of the line so that it becomes a valid JSON document, then parse it.

import json
corpus = [json.loads(line.replace('null\t', '{"a":').replace("\n", "}"))["a"][1] for line in content]

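As an alternative sketch, assuming the key and the value are always separated by a single tab as in the sample lines, you could split off the key and parse only the value with json.loads, which tolerates the trailing newline. The helper name parse_line is hypothetical, and the content list below stands in for the lines read from the file.

```python
import json

# Sample lines copied from the question; in practice these come from f.readlines().
content = [
    'null\t[0, [[0, 21], [1, 4], [2, 5]]]\n',
    'null\t[1, [[0, 3], [1, 1], [2, 2]]]\n',
]


def parse_line(line):
    """Hypothetical helper: parse the JSON value after the tab and return its second element."""
    # Split only on the first tab so the key ("null") is dropped and the value is kept whole.
    _key, value = line.split('\t', 1)
    # The value is a JSON array of the form [index, [[...], ...]]; keep the nested list.
    return json.loads(value)[1]


corpus = [parse_line(line) for line in content]
print(corpus)
```

This avoids depending on the key being exactly "null", but it does assume every line contains a tab; a line without one would raise a ValueError on unpacking.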