Paul Ledesma

Reputation: 100

Reading values in a csv file while deleting unwanted ones

I want to read a CSV file that contains both floats and arrays. I only want to keep the float values and discard the arrays.

I have tried the following code:

with open('resultsMC_100_var.csv', "r") as input:
    with open('new.csv', "w") as output:
        for line in input:
            if not line.count(('[') or (']')):
                output.write(line)

But the problem is that the array values are spread over several lines, so the code does not work as intended.

Here is the first line of my CSV file, to give an idea of how it is built:

51.3402815384;28.1789716134;76.7144759149;28.5590830355;50.719035557;4.83225361254;[  23.35145494   23.6919634    21.1406396    77.35953884  121.68508966   23.02126533   24.64623985   22.30757623   59.53286234   86.01880338   22.34363071   29.75759786   30.94420056   27.24198645   21.62989704
   22.57036406   23.09155954   26.32781992   22.82521813   99.12230864
   22.04329951   22.50081984  104.84634521   59.48921929   34.47985424

What I would like is code that reads all the values, stops when it meets the symbol [, and resumes reading as soon as it meets ]. I do not know how to do this properly and I have not found a similar topic on this website, so I would be thankful to anyone who can help me.

Upvotes: 1

Views: 55

Answers (2)

Krishna Kulkarni

Reputation: 1

You could try using a regex. Here's what I think will work.

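import re

# Read the whole file into a single string
with open("results.csv", "r") as inp:
    inp_data = inp.read()

# Remove every [...] block, including ones that span several lines
out_data = re.sub(r"\[[^\[\]]*\]", "", inp_data)

# Write the cleaned text to a new file
with open("xyz.csv", "w") as out:
    out.write(out_data)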

This first reads the input file into a string. It then replaces every array (everything from [ to the next ], even across line breaks) with an empty string. Finally it writes the updated string to a new file. Hope this helps!
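As a quick check of the pattern, here is a small sketch run on an in-memory string shaped like the row in the question. The sample values are shortened and the helper name strip_arrays is made up for illustration; it assumes arrays are never nested.

```python
import re

# Matches "[", then any run of characters that are not brackets
# (newlines included), then the closing "]".
ARRAY_PATTERN = re.compile(r"\[[^\[\]]*\]")

def strip_arrays(text):
    """Return text with every bracketed array removed (hypothetical helper)."""
    return ARRAY_PATTERN.sub("", text)

# Shortened stand-in for a row whose array spills onto a second line
sample = "51.34;28.17;[  23.35   23.69\n   22.57   23.09];4.83\n"
print(repr(strip_arrays(sample)))
```

Note that the separators around the removed array stay in place, so the row keeps an empty field where the array used to be.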

Upvotes: 0

olinox14

Reputation: 6643

The problem with your statement is that line.count(('[') or (']')) is the same as writing line.count('['): a non-empty string is truthy, so ('[') or (']') evaluates to '[' and ']' is never checked.
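A short sketch of that behaviour, using a made-up continuation line similar to the ones in the sample row. It also shows why fixing the test alone would probably not be enough: the middle lines of an array contain no bracket at all, so a line-by-line filter would still copy them.

```python
# `or` returns its first truthy operand, and any non-empty string is truthy.
needle = ('[') or (']')
print(needle)

# Illustrative continuation line: it holds only the closing bracket
line = "22.04329951   22.50081984]"

# The original test only ever looks for "[", so this line slips through.
original_test = line.count(('[') or (']'))

# Checking each bracket separately catches it.
has_bracket = '[' in line or ']' in line

# A middle line of the array has no bracket, so even the fixed test misses it.
middle_line = "22.57036406   23.09155954   26.32781992"
middle_has_bracket = '[' in middle_line or ']' in middle_line
```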

A simple solution here would be to use a regex:

import re

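with open('test.txt', "r") as f:
    content = f.read()

# [^\[\]] also matches newlines, so arrays spanning several lines are removed;
# no re.MULTILINE flag is needed (it only changes how ^ and $ behave)
new_content = re.sub(r"\[[^\[\]]*\]", "", content)

with open('new.txt', "w") as output:
    output.write(new_content)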
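If you would rather avoid a regex, the approach described in the question (stop copying at [, resume after ]) can be sketched as a small character scanner. The name drop_bracketed is made up here, and the sketch assumes arrays are never nested.

```python
def drop_bracketed(text):
    """Copy text, skipping everything between "[" and "]" (hypothetical helper)."""
    kept = []
    inside = False  # True while we are between "[" and "]"
    for ch in text:
        if ch == '[':
            inside = True
        elif ch == ']':
            inside = False
        elif not inside:
            kept.append(ch)
    return "".join(kept)

# Shortened stand-in for a row whose array spills onto a second line
row = "1.5;[ 2 3\n 4 5];6.5\n"
cleaned = drop_bracketed(row)
```

Unlike the regex, this version also drops an array whose closing ] is missing, since it keeps skipping until the end of the text.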

Upvotes: 1
