Reputation: 100
I want to read a CSV file that contains floats and arrays. I only want to keep the float values and discard the arrays.
I have tried the following code:
with open('resultsMC_100_var.csv', "r") as input:
    with open('new.csv', "w") as output :
        for line in input :
            if not line.count(('[') or (']')) :
                output.write(line)
The problem is that the array values span several lines, so the code does not work as intended.
Here is the first line of my CSV file, so you can see how it is structured:
51.3402815384;28.1789716134;76.7144759149;28.5590830355;50.719035557;4.83225361254;[ 23.35145494 23.6919634 21.1406396 77.35953884 121.68508966 23.02126533 24.64623985 22.30757623 59.53286234 86.01880338 22.34363071 29.75759786 30.94420056 27.24198645 21.62989704
22.57036406 23.09155954 26.32781992 22.82521813 99.12230864
22.04329951 22.50081984 104.84634521 59.48921929 34.47985424
What I would like is code that reads all the values, stops when it meets the symbol [ and resumes reading as soon as it meets ]. I do not know how to do this properly and I have not found a similar topic on this website, so I would be thankful to anyone who can help me.
Upvotes: 1
Views: 55
Reputation: 1
You could try using regex. Here's what I think will work.
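import re

# Read the whole file at once so arrays spanning several lines stay intact.
with open("results.csv", "r") as inp:
    inp_data = inp.read()

# Remove every [...] block; the character class also matches newlines.
out_data = re.sub(r"\[[^\[\]]*\]", "", inp_data)

with open("xyz.csv", "w") as out:
    out.write(out_data)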
This first reads your input data into a string. It then replaces all arrays with "". You can then write this updated string to a new file. Hope this helps!
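To see what the substitution does without touching any files, here is a small sketch on an in-memory string. The sample text, `ARRAY_PATTERN` and `strip_arrays` are made up for illustration; only the pattern itself comes from the answer above.

```python
import re

# Hypothetical sample mimicking the question's layout: floats separated by ';',
# followed by a bracketed array that wraps across several lines.
sample = "51.34;28.17;[ 23.35 23.69\n 22.57 23.09\n 22.04 ];4.83\n"

# [^\[\]]* matches any character except brackets, newlines included,
# so the array is removed even though it spans several lines.
ARRAY_PATTERN = re.compile(r"\[[^\[\]]*\]")


def strip_arrays(text):
    """Return text with every non-nested [...] block removed."""
    return ARRAY_PATTERN.sub("", text)


cleaned = strip_arrays(sample)
print(repr(cleaned))
```

Note that the separators around the array are left in place, so the result has an empty field where the array used to be. Whether that matters depends on how you parse the file afterwards.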
Upvotes: 0
Reputation: 6643
The problem with your statement is that line.count(('[') or (']')) is the same as writing line.count('['), since or returns its first truthy operand and a non-empty string is truthy.
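A quick way to convince yourself of this is the sketch below. The two sample lines and the `has_bracket` helper are hypothetical, written only to illustrate the point.

```python
line = "4.83;[ 23.35 23.69\n"    # hypothetical line that opens an array
closing = " 22.04 22.50 ]\n"      # hypothetical line that only closes one

# `or` returns its first truthy operand, so the argument is just '['.
needle = ('[') or (']')
print(repr(needle))

# The original test therefore lets through lines that contain only ']'.
original_test = not closing.count(('[') or (']'))
print(original_test)


def has_bracket(text):
    """Check each bracket separately instead of combining them with `or`."""
    return '[' in text or ']' in text


print(has_bracket(line), has_bracket(closing))
```

Even with the check fixed, a line-by-line filter still would not be enough for your file: the lines in the middle of an array contain no bracket at all, so they would be written out anyway.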
A simple solution here would be to use a regex:
import re
with open('test.txt', "r") as f:
    content = f.read()
with open('new.txt', "w") as output :
    new_line = re.sub(r"\[[^\[\]]*\]", "", content, flags=re.MULTILINE)
    output.write(new_line)
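If you would rather avoid regex, the "stop at [ and resume at ]" idea from the question can also be written as a plain character scan. This is only a rough sketch: `drop_bracketed` is a name I made up, and it assumes the whole file content fits in memory as one string.

```python
def drop_bracketed(text):
    """Copy text, skipping everything from '[' through its matching ']'."""
    kept = []
    depth = 0  # number of '[' currently open
    for ch in text:
        if ch == '[':
            depth += 1           # start (or nest deeper into) a skipped region
        elif ch == ']':
            if depth > 0:
                depth -= 1       # leave one level of the skipped region
            # a stray ']' outside any array is simply dropped
        elif depth == 0:
            kept.append(ch)      # outside every array: keep the character
    return "".join(kept)


# Hypothetical sample shaped like the data in the question.
sample = "1.5;2.5;[ 3 4\n 5 6 ];7.5\n"
result = drop_bracketed(sample)
print(repr(result))
```

Unlike the regex above, this version also copes with nested brackets, and an array that is never closed is dropped up to the end of the text.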
Upvotes: 1