Alexander Marcussen

Reputation: 155

Split a string into a list in Python

I have a text file that I want to read into lists.

The textfile looks like this:

New  Distribution  Votes  Rank  Title
     0000000125  1196672  9.2  The Shawshank Redemption (1994)
     0000000125  829707   9.2  The Godfather (1972)
     0000000124  547511   9.0  The Godfather: Part II (1974)
     0000000124  1160800  8.9   The Dark Knight (2008)

I have tried splitting the list with this code:

x = open("ratings.list.txt","r")
movread = x.readlines()
x.close()


s = raw_input('Search: ')
for ns in movread:
    if s in ns:
        print(ns.split()[0:100])

Output:

      Search: #1 Single
     ['1000000103', '56', '6.3', '"#1', 'Single"', '(2006)']

But it does not give me the output I want: it also splits on the spaces inside the title.

How can I split it into a list without breaking up the title?

Expected output:

 Search: #1 Single

  Distribution  Votes  Rank           Title
 ['1000000103', '56', '6.3', '"#1 Single" (2006)']

Upvotes: 3

Views: 322

Answers (5)

Vinay Bhargav

Reputation: 365

The syntax for splitting is: str.split([sep[, maxsplit]])

'sep' is the separator used to split the string (by default, any run of whitespace).
'maxsplit' limits the number of splits, as mentioned by Tim.

If your columns are separated by '\t', you can simply use '\t' as the separator.

Tab-separated columns are a common convention because splitting on '\t' does not interfere with the spaces inside a field such as the title. str.split('\t') also behaves the same in Python 2 and Python 3.
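
A minimal sketch of the tab-separated case. The line below is a made-up, tab-delimited version of one row; the sample in the question appears to use spaces, so this only applies if the real file contains tabs.

```python
# Hypothetical tab-delimited row (an assumption; the question's sample looks space-separated)
line = "0000000125\t1196672\t9.2\tThe Shawshank Redemption (1994)\n"

# Drop the trailing newline, then split on tabs only;
# the spaces inside the title are left alone.
fields = line.rstrip("\n").split("\t")
print(fields)
```

Because only '\t' acts as a separator here, no maxsplit is needed to keep the title in one piece.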

Hope this helps : )

Upvotes: 0

Jonas Byström

Reputation: 26189

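s = input('Search: ').lower()
with open("ratings.list.txt", "rt") as f:
    for ns in f:
        # case-insensitive match against the whole line
        if s in ns.lower():
            # at most 3 splits, so the title stays in one piece (Python 3)
            print(ns.rstrip().split(maxsplit=3))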

Upvotes: 1

Tim Pietzcker

Reputation: 336478

split() takes an optional maxsplit argument:

In Python 3:

>>> s = "     0000000125  1196672  9.2  The Shawshank Redemption (1994)"
>>> s.split()
['0000000125', '1196672', '9.2', 'The', 'Shawshank', 'Redemption', '(1994)']
>>> s.split(maxsplit=3)
['0000000125', '1196672', '9.2', 'The Shawshank Redemption (1994)']

In Python 2, you need to specify the maxsplit argument as a positional argument:

>>> s = "     0000000125  1196672  9.2  The Shawshank Redemption (1994)"
>>> s.split(None, 3)
['0000000125', '1196672', '9.2', 'The Shawshank Redemption (1994)']
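
Applied to the search loop from the question, this could look roughly like the sketch below. The function name search_ratings and the in-memory sample list are placeholders for illustration; in practice the lines would come from the file. The positional form split(None, 3) is used because it runs on both Python 2 and Python 3.

```python
def search_ratings(lines, query):
    """Return [distribution, votes, rank, title] for each line containing query."""
    results = []
    for line in lines:
        if query in line:
            # strip() removes the trailing newline, which would otherwise
            # stay attached to the title once maxsplit is reached
            results.append(line.strip().split(None, 3))
    return results


# Stand-in for the lines read from the ratings file
sample = [
    "     0000000125  1196672  9.2  The Shawshank Redemption (1994)\n",
    "     0000000125  829707   9.2  The Godfather (1972)\n",
]

for row in search_ratings(sample, "Godfather"):
    print(row)
```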

Upvotes: 10

sundar nataraj

Reputation: 8702

Read the docs for str.split; the second positional argument is maxsplit:

    s = "     0000000125  1196672  9.2  The Shawshank Redemption (1994)"
    print(s.split(None, 3))

    # output: ['0000000125', '1196672', '9.2', 'The Shawshank Redemption (1994)']

Upvotes: 1

Sam

Reputation: 747

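Maybe you can try re.split(pattern, string, maxsplit), which splits the string wherever the regex matches:

import re
s = "     0000000125  1196672  9.2  The Shawshank Redemption (1994)"
# strip first: re.split would otherwise return an empty first field
d = re.split(r'\s+', s.strip(), maxsplit=3)
print(d)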
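
One difference from str.split() is worth checking before relying on this: re.split does not discard leading whitespace, so a line that starts with spaces yields an empty first field and the split budget runs out one column early. A small sketch comparing the two, using the sample row from the question:

```python
import re

line = "     0000000125  1196672  9.2  The Shawshank Redemption (1994)"

# Leading whitespace counts as the first match, producing an empty first field
raw = re.split(r"\s+", line, maxsplit=3)

# Stripping first gives the same four fields as line.split(None, 3)
clean = re.split(r"\s+", line.strip(), maxsplit=3)

print(raw)
print(clean)
```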

Upvotes: 1
