user5095342

Python find text in file between quotation marks

I'm trying to capture the text inside quotation marks in a file and store each match in its own variable so I can change it later. I know how to do this in Bash, but I'm at a loss as to how to do it in Python.

I've started with this, but I was hoping someone could point out where my mistakes are.

import re
input = open(filename, 'r')
quotes = re.findall(r'"[^"]*"', input.read(), re.U)
print quotes

Sadly though, this outputs:

['"test1"', '"test2"']

while I'm looking for:

value1 = test1
value2 = test2

In Bash I used this (but I obviously can't use it this way in Python):

i=0
regex='"([^"]*)"'
while read line
do
    if [[ $line =~ $regex ]]; then
        printf -v "text$i" '%s' "${BASH_REMATCH[1]}"
        i=$((i + 1))
    fi
done < filename

echo "value1: $text0"
echo "value2: $text1"

Upvotes: 3

Views: 2200

Answers (3)

Dima Tisnek

Reputation: 11779

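Put the text you want in a capturing group (...). When the pattern contains exactly one capturing group, re.findall returns only that group's content. The quotes themselves can be wrapped in non-capturing groups (?:...), like this: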

In [18]: re.findall('(?:")([^"]*)(?:")', '''hello "foo" "bar" haha''')
Out[18]: ['foo', 'bar']

Or use lookaround assertions, (?<=...) and (?=...), which check for the quotes without consuming them:

In [14]: re.findall('(?<=")[^"]*(?=")', '''hello "foo" "bar" haha''')
Out[14]: ['foo', ' ', 'bar']

The latter has the side effect of also matching the " " between "foo" and "bar", because the quotes are not consumed and can be reused by the next match.
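
To make the difference concrete, here is a small sketch that runs the three variants side by side on the same sample string. The variable names are only for illustration; the patterns are the ones discussed in this thread.

```python
import re

text = 'hello "foo" "bar" haha'

# No capturing group: findall returns each whole match, quotes included.
without_group = re.findall(r'"[^"]*"', text)

# Exactly one capturing group: findall returns only the group's content.
with_group = re.findall(r'"([^"]*)"', text)

# Lookarounds leave the quotes unconsumed, so the gap between the
# closing quote of "foo" and the opening quote of "bar" matches too.
lookaround = re.findall(r'(?<=")[^"]*(?=")', text)

print(without_group)
print(with_group)
print(lookaround)
```

For this use case the plain capturing group is probably the simplest choice, since it avoids the extra match between the two quoted words.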

Upvotes: 2

Kenly

Reputation: 26748

The task here is to match the text between two delimiters (the quotation marks) and return only that text.
A lazy capturing group does this:

vars = re.findall('"(.*?)"', text)
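
One caveat worth knowing, shown here as a hedged sketch on a made-up sample string: by default . does not match a newline, so the lazy pattern behaves differently from [^"]* when a quoted value spans more than one line. Passing re.DOTALL brings the two back in line.

```python
import re

# Made-up sample: the first quoted value spans two lines.
text = 'a "one\ntwo" b "three"'

# '.' stops at the newline, so the first pair of quotes cannot match
# and the pattern pairs up the wrong quotes instead.
lazy = re.findall(r'"(.*?)"', text)

# With re.DOTALL, '.' also matches newlines.
lazy_dotall = re.findall(r'"(.*?)"', text, re.DOTALL)

# A negated character class matches newlines without any flag.
negated = re.findall(r'"([^"]*)"', text)

print(lazy)
print(lazy_dotall)
print(negated)
```

If the file is known to contain only single-line quoted values, both patterns should give the same result.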

Upvotes: 0

cyrilc

Reputation: 43

The regex you're using in Python isn't the same as the one in your Bash script: the Bash version has a capturing group and the Python version doesn't. It should work with "([^"]*)". I tried this:

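import re

# filename is the path to the file being searched
with open(filename, 'r') as f:
    # The parentheses capture only the text between the quotes
    quotes = re.findall(r'"([^"]*)"', f.read(), re.U)
for value in quotes:
    print(value)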
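
To get closer to the value1 / value2 layout asked for in the question, one option is to store the matches in a dictionary rather than creating separate variables, which is roughly what printf -v does in the Bash version. This is only a sketch: the helper name extract_quoted and the sample string are made up for illustration, and in practice the text would come from reading the file.

```python
import re

def extract_quoted(text):
    """Return a dict mapping 'value1', 'value2', ... to each quoted string."""
    quoted = re.findall(r'"([^"]*)"', text)
    # enumerate(..., start=1) numbers the matches from 1, as in the question
    return {'value%d' % i: s for i, s in enumerate(quoted, start=1)}

# Stand-in for the file contents
sample = 'first "test1" then "test2"'
values = extract_quoted(sample)

for name, value in values.items():
    print('%s = %s' % (name, value))

# Entries can be changed later by key
values['value1'] = 'changed'
```

A dictionary keeps the values easy to update and loop over, and avoids generating variable names at run time.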

Upvotes: 0
