Reputation: 3

How to find phrases in a text file

My text file is this:

123 Numbers 4.5
456 Words 6.7
789 Sentences 8.9

And my code is this:

s = open('test.txt', 'r')
file = s.read()
numbers, words, decimals = [], [], []

I've gotten this far, and I'm trying to work out how to create a list for each of the numbers, words and decimals in the file. I've heard you can use the split method, so I tried this:

with open('test.txt', 'r') as f:
    for line in f:
        numbers, words, decimals = f.split(","), f.split(","), f.split(",")

I did this assuming it would split every time it encountered a space, but that didn't happen. I just got this error:

AttributeError: '_io.TextIOWrapper' object has no attribute 'split'

Any help would be appreciated. If any elaboration is necessary on what I want to do, please tell me. I'm aware this may have been worded poorly.

Upvotes: 0

Answers (6)

Louise Davies

Reputation: 15941

It should be line.split and not f.split, since you're splitting the line and not the file. Also, you're splitting on commas, but the example file is separated by spaces. If it is separated by spaces, you need to use line.split(" "). When you use with open() as f, you don't need to open your file beforehand or close it afterwards, as it handles that for you. Finally, you were assigning the entire split list to each variable and overwriting it on every iteration. Overall code:

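numbers, words, decimals = [], [], []
with open('test.txt', 'r') as f:
    for line in f:
        # Strip the newline first so the last field does not end in "\n",
        # and split only once per line.
        fields = line.rstrip("\n").split(" ")
        numbers.append(fields[0])
        words.append(fields[1])
        decimals.append(fields[2])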
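One thing to watch for when splitting on a literal space: each line read from a file still carries its newline. A minimal sketch of the difference, using one line from the question's file as an in-memory string rather than reading from disk:

```python
# A line as it comes out of iterating over the file, newline included.
line = "123 Numbers 4.5\n"

# Splitting on a single space leaves the newline attached to the last field.
by_space = line.split(" ")

# Splitting with no argument splits on any run of whitespace,
# so the trailing newline is dropped.
by_whitespace = line.split()

print(by_space)       # ['123', 'Numbers', '4.5\n']
print(by_whitespace)  # ['123', 'Numbers', '4.5']
```

If the columns might be separated by tabs or several spaces, the no-argument form is probably the safer choice.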

Upvotes: 0

Chris Mueller

Reputation: 6680

First of all, the text file you've posted does not have commas separating the columns, so splitting the string at commas won't work. If you can trust that every line of the file will be identical in structure, then you can simply change your code to be

numbers, words, decimals = [], [], []
with open('test.txt', 'r') as f:
    for line in f:
        number, word, decimal = line.split() 
        numbers.append(number)
        words.append(word)
        decimals.append(decimal)
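If you cannot trust the structure, the three-way unpacking raises ValueError on any line that does not have exactly three fields. A rough sketch of one way to guard against that, assuming you simply want to skip such lines. The in-memory sample and the skipped list are illustrative additions, not part of the question:

```python
import io

# Stand-in for the file; lines 2 and 4 are deliberately malformed.
sample = io.StringIO("123 Numbers 4.5\nbroken line\n456 Words 6.7\n\n")

numbers, words, decimals = [], [], []
skipped = []  # hypothetical: line numbers that did not have three fields

for line_no, line in enumerate(sample, start=1):
    fields = line.split()
    if len(fields) != 3:
        # Unpacking would fail here, so record the line number and move on.
        skipped.append(line_no)
        continue
    number, word, decimal = fields
    numbers.append(number)
    words.append(word)
    decimals.append(decimal)

print(numbers, words, decimals)  # ['123', '456'] ['Numbers', 'Words'] ['4.5', '6.7']
print(skipped)                   # [2, 4]
```

Whether to skip, log, or raise on a bad line depends on how much you trust the input.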

Upvotes: 2

Dayananda

Reputation: 305

a, b, c = [], [], []
with open('new.txt', 'r') as f:
    for line in f:
        m = line.split()  # split the line on whitespace
        a.append(m[0])
        b.append(m[1])
        c.append(m[2])
print(a, b, c)

Check if this is what you wanted to achieve.

Upvotes: 0

chaos

Reputation: 490

If I understand your question correctly, what you should be looking at is actually nltk. That will give you an insight into how to tokenize your text into either words or sentences. The rest should be easy.

Upvotes: 0

Steven Rumbalski

Reputation: 45542

with open('test.txt', 'r') as f:
    numbers, words, decimals = zip(*(line.split() for line in f))
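To see what zip(*...) is doing here, it may help to look at the intermediate rows. The following is a small sketch that uses an in-memory copy of the question's file (via io.StringIO) instead of test.txt:

```python
import io

# Stand-in for the file from the question.
sample = io.StringIO("123 Numbers 4.5\n456 Words 6.7\n789 Sentences 8.9\n")

# One list of three fields per line.
rows = [line.split() for line in sample]

# zip(*rows) passes each row as a separate argument to zip, which then
# groups the first items together, the second items together, and so on.
numbers, words, decimals = zip(*rows)

print(rows)      # [['123', 'Numbers', '4.5'], ['456', 'Words', '6.7'], ['789', 'Sentences', '8.9']]
print(numbers)   # ('123', '456', '789')
print(words)     # ('Numbers', 'Words', 'Sentences')
print(decimals)  # ('4.5', '6.7', '8.9')
```

Note that the results are tuples of strings, not lists of numbers; wrap them in list() or convert with int and float if you need that.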

Upvotes: 1

Ondřej Grover

Reputation: 739

You want to split each line into fields

numbers, words, decimals = [], [], []
with open('test.txt', 'r') as f:
    for line in f:
        # Split on whitespace; the example file does not use commas.
        number, word, decimal = line.split()
        numbers.append(int(number))
        words.append(word)
        decimals.append(float(decimal))

If you really intend to work with real decimals, then you should use decimal.Decimal instead of float.
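As a quick illustration of why that distinction can matter, here is a minimal sketch comparing the two types. The values are the classic 0.1 and 0.2 example rather than anything from the question's file:

```python
from decimal import Decimal

# float stores binary approximations, so some sums pick up rounding error.
as_float = float("0.1") + float("0.2")

# Decimal built from a string keeps the exact decimal digits.
as_decimal = Decimal("0.1") + Decimal("0.2")

print(as_float)    # 0.30000000000000004
print(as_decimal)  # 0.3
```

When parsing the file, that would mean calling Decimal(decimal) on the string field instead of float(decimal).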

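Unless you are constrained in some way, I'd recommend using a library designed for working with tabular data, e.g. pandas, where all this would be just

import pandas as pd
# The file has no header row, so say so explicitly; the column names are illustrative.
df = pd.read_table('test.txt', sep=r'\s+', header=None, names=['number', 'word', 'decimal'])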

Upvotes: 0
