ed1t

Reputation: 8709

Python strings / match case

I have a CSV file which has the following format:

id,case1,case2,case3

Here is a sample:

123,null,X,Y

342,X,X,Y

456,null,null,null

789,null,null,X

For each line I need to know which of the cases is not null. Is there an easy way to find out which case(s) are not null without splitting the string and going through each element?

This is what the result should look like:

123,case2:case3

342,case1:case2:case3

456:None

789:case3

Upvotes: 0

Views: 367

Answers (4)

Christopher

Reputation: 9094

Any way you slice it, you still have to go through the list. There are more and less elegant ways to do it. If your Python version supports list comprehensions, you can use one:

# strip the trailing newline first so the last field compares cleanly against "null"
ids = line.strip().split(",")
print("%s:%s" % (ids[0], ":".join(["case%d" % x for x in range(1, len(ids)) if ids[x] != "null"])))
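As a rough sketch, the same split-and-filter idea can be wrapped in a small function. The name non_null_cases and the null_marker parameter are made up for illustration, and printing "id:None" when every case is null is an assumption based on the sample output in the question.

```python
def non_null_cases(line, null_marker="null"):
    """Return 'id:caseN:...' for the non-null columns, or 'id:None' if all are null."""
    fields = line.strip().split(",")
    record_id, values = fields[0], fields[1:]
    # Column i (1-based) is named case<i>, mirroring the header id,case1,case2,case3.
    cases = ["case%d" % i for i, value in enumerate(values, start=1)
             if value != null_marker]
    return "%s:%s" % (record_id, ":".join(cases) if cases else "None")


for sample in ["123,null,X,Y", "342,X,X,Y", "456,null,null,null", "789,null,null,X"]:
    print(non_null_cases(sample))
```

This still visits every field once, which is hard to avoid for this problem.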

Upvotes: 1

cdleary

Reputation: 71464

You probably want to take a look at the CSV module, which has readers and writers that will enable you to create transforms.

>>> from StringIO import StringIO
>>> from csv import DictReader
>>> fh = StringIO("""
... id,case1,case2,case3
... 
... 123,null,X,Y
... 
... 342,X,X,Y
... 
... 456,null,null,null
... 
... 789,null,null,X
... """.strip())
>>> dr = DictReader(fh)
>>> dr.next()
{'case1': 'null', 'case3': 'Y', 'case2': 'X', 'id': '123'}

At which point you can do something like:

>>> from csv import DictWriter
>>> out_fh = StringIO()
>>> writer = DictWriter(out_fh, fieldnames=dr.fieldnames)
>>> for mapping in dr:
...     writer.writerow(dict((k, v) for k, v in mapping.items() if v != 'null'))
...

The last bit is only a sketch: dr.fieldnames is filled in once the header row has been read. Replace out_fh with the file handle that you'd like to write to.
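On Python 3, the same DictReader idea might look like the sketch below, which produces the "id:caseN" summary the question asks for instead of a filtered CSV. The names SAMPLE and summarize are hypothetical. It also assumes that the first column is always id and that every remaining column is a case.

```python
import csv
import io

SAMPLE = """id,case1,case2,case3
123,null,X,Y
342,X,X,Y
456,null,null,null
789,null,null,X
"""


def summarize(csv_file, null_marker="null"):
    """Yield (id, [names of non-null columns]) for each data row."""
    reader = csv.DictReader(csv_file)
    # fieldnames comes from the header row; everything after 'id' is a case column.
    case_columns = reader.fieldnames[1:]
    for row in reader:
        yield row["id"], [name for name in case_columns if row[name] != null_marker]


results = list(summarize(io.StringIO(SAMPLE)))
for record_id, cases in results:
    # An empty list joins to "", so fall back to "None" as in the question's sample.
    print("%s:%s" % (record_id, ":".join(cases) or "None"))
```

Because the column names come from the header, this keeps working if more case columns are added later.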

Upvotes: 2

Paweł Polewicz

Reputation: 3852

You could use the csv module, which ships with the standard Python installation. It will not make this much easier, though.

Upvotes: 0

Grzegorz Oledzki

Reputation: 24271

Why do you treat splitting as a problem? For performance reasons?

Literally, you could avoid splitting with smart regexps like:

\d+,null,\w+,\w+
\d+,\w+,null,\w+
...

but I find it a worse solution than reparsing the data into lists.
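For comparison, here is a minimal sketch of the regexp route using a single pattern with named groups rather than one pattern per null combination. LINE_RE and non_null_by_regex are hypothetical names, and the pattern assumes exactly three case columns made of word characters.

```python
import re

# One pattern for the whole line: a numeric id followed by three word fields.
LINE_RE = re.compile(r"^(?P<id>\d+),(?P<case1>\w+),(?P<case2>\w+),(?P<case3>\w+)$")


def non_null_by_regex(line):
    """Return (id, [non-null case names]), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    groups = match.groupdict()
    record_id = groups.pop("id")
    # Sort the remaining group names so the order is case1, case2, case3.
    cases = [name for name in sorted(groups) if groups[name] != "null"]
    return record_id, cases
```

The captured groups still have to be walked one by one, and the pattern must change whenever a column is added, so this does not really beat splitting.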

Upvotes: 0
