Reputation: 23

Nested list of strings, split by whitespace

Hi there, this is my first post here, so sorry if I'm formatting it weirdly. I'm working in Python on a challenge where I'm supposed to read a table of text into my program. What I want is to split each line into a list, with each word at its own index. This is the code I'm currently using:

with open(database, "r") as data:
    datatext = data.read()

datatext = datatext.replace(",", " ")

datarr = datatext.split("\n")

reader = csv.reader(datarr)

print([word for word in [row for row in reader]])

where database is the path to the text file. This turns each line of text in database into its own nested list, but all of the whitespace-separated words on a line end up in one single string, so each nested list only has an index 0. Like this:

[['name AGATC TTTTTTCT AATG TCTAG GATA TATC GAAA TCTG'], ['Albus 15 49 38 5 14 44 14 12'], ['Cedric 31 21 41 28 30 9 36 44'], ['Draco 9 13 8 26 15 25 41 39']... etc]

but what I actually want is:

[['name', 'AGATC', 'TTTTTTCT', 'AATG', 'TCTAG', 'GATA', 'TATC', 'GAAA', 'TCTG'], ['Albus', '15', '49', '38', '5', '14', '44', '14', '12']... etc]

Basically, I want each word/string to be at its own index inside the nested list. Could anyone please help me with this? I have been googling around but haven't been able to find the correct solution. Hopefully this was not too vaguely written. Grateful for any answers :)

edit: How the text file is written:

name,AGATC,TTTTTTCT,AATG,TCTAG,GATA,TATC,GAAA,TCTG
Albus,15,49,38,5,14,44,14,12
Cedric,31,21,41,28,30,9,36,44
Draco,9,13,8,26,15,25,41,39

last edit: removing the datatext.replace(...) solved it :=)
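For anyone landing here later, a minimal sketch of why removing the replace helps: csv.reader splits on commas by default, so once the commas are swapped for spaces there is nothing left for it to split on. The shortened sample lines and the variable names below are made up for illustration.

```python
import csv

# Shortened sample lines, assumed to mirror the file shown above.
lines = [
    "name,AGATC,TTTTTTCT",
    "Albus,15,49",
]

# With commas replaced by spaces, each line contains no default delimiter,
# so csv.reader returns every line as a single field.
spaced = [line.replace(",", " ") for line in lines]
rows_spaced = list(csv.reader(spaced))

# With the commas left in place, csv.reader splits each line into fields.
rows_comma = list(csv.reader(lines))

print(rows_spaced)  # [['name AGATC TTTTTTCT'], ['Albus 15 49']]
print(rows_comma)   # [['name', 'AGATC', 'TTTTTTCT'], ['Albus', '15', '49']]
```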

Upvotes: 0

Answers (2)

user2390182

Reputation: 73450

You're overcomplicating things. The following should suffice:

import csv

with open(database, "r") as data:
    reader = csv.reader(data)
    print(list(reader))
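If you want to try this without the original file, here is a self-contained sketch. It assumes a shortened copy of the sample data and writes it to a temporary file that stands in for database; passing newline="" to open() is what the csv module documentation recommends when handing a file to csv.reader.

```python
import csv
import os
import tempfile

# Shortened sample, assumed to match the comma-separated file in the question.
sample = "name,AGATC,TTTTTTCT\nAlbus,15,49\n"

# Write the sample to a temporary file that plays the role of `database`.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline=""
) as tmp:
    tmp.write(sample)
    database = tmp.name

try:
    # Pass the file object straight to csv.reader; no manual splitting needed.
    with open(database, newline="") as data:
        rows = list(csv.reader(data))
finally:
    os.remove(database)  # clean up the temporary file

print(rows)  # [['name', 'AGATC', 'TTTTTTCT'], ['Albus', '15', '49']]
```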

Upvotes: 1

Gripp

Reputation: 166

Hard to help without a sample of the file that you're opening. But I would imagine the problem is in how csv.reader splits each line: by default it splits on commas, not on whitespace. Try playing with the delimiter parameter of that function. I imagine something like csv.reader(datarr, delimiter=' ') would work. But, again, I would need sample data to be sure.
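A quick sketch of the delimiter idea, assuming datarr holds lines whose fields are separated by single spaces (which is what the question's code produces after the replace). The two sample lines are shortened and made up for illustration.

```python
import csv

# Assumed stand-in for datarr: lines with single spaces between fields.
datarr = ["name AGATC TTTTTTCT", "Albus 15 49"]

# Tell csv.reader to split on a space instead of the default comma.
rows = list(csv.reader(datarr, delimiter=" "))

print(rows)  # [['name', 'AGATC', 'TTTTTTCT'], ['Albus', '15', '49']]
```

This only behaves well when exactly one space separates the fields; runs of several spaces would produce empty strings in the rows.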

Upvotes: 0
