BoinQ

Reputation: 27

Python Regex split string into 5 pieces

I'm playing around with Python, and I have run into a problem. I have a large data file where each line is structured like this:

"id";"userid";"userstat";"message";"2013-10-19 06:33:20 (date)"

I need to split each line into 5 pieces, with the semicolon as the delimiter, while keeping the text inside each pair of quotation marks together.

It's hard to explain, so I hope you understand what I mean.

Upvotes: 0

Views: 139

Answers (3)

DSM

Reputation: 353449

That format looks a lot like ssv: semicolon-separated values (like "csv", but with semicolons instead of commas). We can use the csv module to handle this:

import csv

# Python 3: open in text mode with newline="" so the csv module handles line endings itself
with open("yourfile.txt", newline="") as infile:
    reader = csv.reader(infile, delimiter=";")
    for row in reader:
        print(row)

produces

['id', 'userid', 'userstat', 'message', '2013-10-19 06:33:20 (date)']

One advantage of this method is that it will correctly handle the case of semicolons within the quoted data automatically.
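To illustrate that point, here is a minimal sketch written for Python 3. The sample line is made up for the example (it is not from the asker's file) and has a semicolon inside the quoted fourth field; it is read through `io.StringIO` so no file is needed.

```python
import csv
import io

# Hypothetical sample line: the fourth field contains a semicolon inside the quotes.
sample = '"1";"42";"active";"hello; world";"2013-10-19 06:33:20"\n'

# csv.reader treats the double quote as the quote character by default,
# so the semicolon inside "hello; world" is not seen as a delimiter.
rows = list(csv.reader(io.StringIO(sample), delimiter=";"))

# A plain split on ";" cuts the quoted field in two.
naive = sample.strip().split(";")

print(rows[0])
print(len(rows[0]), len(naive))
```

With this input the csv reader should return five fields, while the plain split returns six pieces.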

Upvotes: 4

Iłya Bursov

Reputation: 24209

You can split on ";" in your case. Also consider using a regex, such as ^("[^"]+");("[^"]+");("[^"]+");("[^"]+");("[^"]+")$
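A rough sketch of how that pattern could be applied with the standard `re` module follows. The variable names are illustrative only, and stripping the quotes afterwards is an extra step not mentioned in the answer. Note that `[^"]+` requires at least one character, so a line with an empty field such as `""` will not match.

```python
import re

# The pattern from the answer: five quoted, non-empty fields separated by semicolons.
pattern = re.compile(r'^("[^"]+");("[^"]+");("[^"]+");("[^"]+");("[^"]+")$')

line = '"id";"userid";"userstat";"message";"2013-10-19 06:33:20 (date)"'

match = pattern.match(line)
# Each group still includes its surrounding quotes, so strip them (optional step).
fields = [group.strip('"') for group in match.groups()] if match else None

print(fields)

# An empty quoted field does not satisfy [^"]+, so this line is rejected.
print(pattern.match('"a";"";"c";"d";"e"'))
```

Because the pattern is anchored with `^` and `$`, it also acts as a check that a line has exactly five quoted fields, which a plain split does not verify.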

Upvotes: 0

Ashwini Chaudhary

Reputation: 251116

Use str.split; there is no need for a regex:

>>> strs = '"id";"userid";"userstat";"message";"2013-10-19 06:33:20 (date)"'
>>> strs.split(';')
['"id"', '"userid"', '"userstat"', '"message"', '"2013-10-19 06:33:20 (date)"']

If you don't want the double quotes as well, then:

>>> [x.strip('"') for x in strs.split(';')]
['id', 'userid', 'userstat', 'message', '2013-10-19 06:33:20 (date)']

Upvotes: 3
