Dinesh Reddy

Reputation: 825

Python regex for diffstat output

I would like to match the following strings using a Python regex and extract the numbers.

1 file changed, 1 insertion(+), 1 deletion(-)
2 files changed, 10 insertions(+), 10 deletions(-)
1 file changed, 1 insertion(+)
1 file changed, 2 deletions(-)

So I thought I would use named groups and lookahead patterns in a Python regex, but that is not working as expected.

#!/usr/bin/python
import re
pat='\s*(\d+).*changed,\s+(\d*)(?P<in>=\s+insertion).*(\d+)(?P<del>=\s+deletion.*')
diff_stats = re.compile(pat)
obj = diff_stats.match(line)

Upvotes: 3

Views: 232

Answers (2)

karthik manchala

Reputation: 13640

Remove the = from the named capture groups: the syntax is (?P<name>...), not (?P<name>=...). Also, your last group is not closed.

\s*(\d+).*changed,\s+(\d*)(?P<in>\s+insertion).*(\d+)(?P<del>\s+deletion).*
                                 ↑                           ↑          ↑
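
As a quick check, the corrected pattern can be tried against the four sample lines from the question. This is only a minimal sketch; the names lines and results are illustrative and not from the original post. It also shows two limits of this version: both an insertion and a deletion part are still required, and the greedy .* in front of the third number leaves only the last digit of a multi-digit deletion count for that group.

```python
import re

# Corrected pattern from above: "=" removed, last group closed.
pat = r'\s*(\d+).*changed,\s+(\d*)(?P<in>\s+insertion).*(\d+)(?P<del>\s+deletion).*'
diff_stats = re.compile(pat)

# Sample lines taken from the question.
lines = [
    "1 file changed, 1 insertion(+), 1 deletion(-)",
    "2 files changed, 10 insertions(+), 10 deletions(-)",
    "1 file changed, 1 insertion(+)",
    "1 file changed, 2 deletions(-)",
]

results = []
for line in lines:
    obj = diff_stats.match(line)
    # groups() is None-safe only if the match succeeded, so guard it.
    results.append(obj.groups() if obj else None)
    print(line, '->', results[-1])
```

The last two lines do not match at all, and the second line yields '0' rather than '10' for the deletion count.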

Edit: An improved regex that also matches the (+) and (-) markers, makes the insertion and deletion parts optional, and uses named groups to capture the digits:

\s*(\d+)\s+files?\s+changed,\s*((?P<in>\d+)\s*(insertions?)\([+-]\))?,?\s*((?P<del>\d+)\s*(deletions?)\([+-]\))?
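
A rough sketch of how this pattern might be used from Python to pull out the numbers. The helper name parse_diffstat and the choice to report a missing part as 0 are assumptions made for illustration, not part of the original answer.

```python
import re

# Improved pattern from above; "in" and "del" capture the digits.
pat = (r'\s*(\d+)\s+files?\s+changed,'
       r'\s*((?P<in>\d+)\s*(insertions?)\([+-]\))?,?'
       r'\s*((?P<del>\d+)\s*(deletions?)\([+-]\))?')
diff_stats = re.compile(pat)

def parse_diffstat(line):
    """Return (files, insertions, deletions) as ints, or None if no match."""
    obj = diff_stats.match(line)
    if obj is None:
        return None
    # An optional group that did not take part in the match is None,
    # so fall back to 0 for a missing insertion or deletion count.
    return (int(obj.group(1)),
            int(obj.group('in') or 0),
            int(obj.group('del') or 0))

for line in ["1 file changed, 1 insertion(+), 1 deletion(-)",
             "2 files changed, 10 insertions(+), 10 deletions(-)",
             "1 file changed, 1 insertion(+)",
             "1 file changed, 2 deletions(-)"]:
    print(parse_diffstat(line))
```

Since the named groups hold the digits here, obj.group('in') and obj.group('del') can be read directly without counting group positions.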

Upvotes: 1

Avinash Raj

Reputation: 174786

You need to add an end-of-line anchor so that you get a complete match, and you also need to make some parts optional.

^\s*(\d+).*\bchanged,\s+(?:(\d*)(?P<in>\s+insertion).*?)?(?:(\d+)(?P<del>\s+deletion.*))?$
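
For illustration, here is one way this pattern could be applied in Python. Note that in this version the named groups in and del capture the words, while the digits sit in the unnamed groups before them, so the sketch unpacks groups() by position. The helper name counts and the 0 default for a missing part are assumptions, not from the original answer.

```python
import re

# Anchored pattern from above with optional insertion/deletion parts.
pat = (r'^\s*(\d+).*\bchanged,\s+'
       r'(?:(\d*)(?P<in>\s+insertion).*?)?'
       r'(?:(\d+)(?P<del>\s+deletion.*))?$')
diff_stats = re.compile(pat)

def counts(line):
    """Return (files, insertions, deletions) as ints, or None if no match."""
    obj = diff_stats.match(line)
    if obj is None:
        return None
    # Groups in order: files, insertion digits, "in" text,
    # deletion digits, "del" text. Only the digit groups are needed.
    files, ins, _, dels, _ = obj.groups()
    # Unmatched optional groups are None; treat them as 0.
    return int(files), int(ins or 0), int(dels or 0)

for line in ["1 file changed, 1 insertion(+), 1 deletion(-)",
             "2 files changed, 10 insertions(+), 10 deletions(-)",
             "1 file changed, 1 insertion(+)",
             "1 file changed, 2 deletions(-)"]:
    print(counts(line))
```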

Upvotes: 1