brenda89

Reputation: 119

make regex non greedy

I have a column of strings:

'19.8983.00', '19.8984.00', '19.8985.00', '19.8986.00', '19.8989.00', '19.8990.00', '19.8991.00', '19.8992.00', '19.8993.00', '19.8994.00', '21.0515.00', '21.0520.00', '21.0521.00', '21.0523.00', '21.0530.00', '21.0531.00', '21.0532.00', '21.0533.00', '21.0534.00', '21.0535.00'

I want to remove the “19.” from the start of the string, the “21.” from the start of the string, and the “.00” from the end of the string.

I have tried this with regex

The problem is that the following strings:

'19.1400.00', '19.1702.00', '19.2113.00', '19.2123.00', '19.2130.00', '19.2141.00', '19.2152.00', '19.2154.00', '19.2301.00', '19.2302.00',

Are converted to:

'1400', '1702', '3', '0', '1', '2', '4', '2301', '2302',

My regex is close, but not quite correct (e.g. 19.2154.00 is somehow converted to 4). How do I make my regex correct and non-greedy so that it only works on the first match (and the last match in case of the .00)?

Upvotes: 0

Views: 60

Answers (1)

Tim Biegeleisen

Reputation: 521103

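Use str.replace with a single regex alternation covering all three replacement conditions. The ^ and $ anchors restrict the matches to the start and end of each string, so a "21" in the middle is left alone. Pass regex=True explicitly, since pandas 2.0 and later treat the pattern as a literal string by default:

composition['compound_code'] = composition['compound_code'].str.replace(r'^(?:19|21)\.|\.00$', '', regex=True)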
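
To check the pattern without a DataFrame, here is a minimal sketch using only Python's standard re module on a few of the values from the question. The helper name strip_code is made up for illustration; the pattern is the one from the line above.

```python
import re

# Same alternation as above: a leading "19." or "21.", or a trailing ".00".
# Without re.MULTILINE, ^ only matches at position 0 and $ only at the end.
PATTERN = re.compile(r'^(?:19|21)\.|\.00$')


def strip_code(code: str) -> str:
    """Hypothetical helper: drop the anchored prefix and suffix from one code."""
    return PATTERN.sub('', code)


samples = ['19.1400.00', '19.2113.00', '19.2154.00', '21.0515.00']
cleaned = [strip_code(s) for s in samples]
print(cleaned)  # ['1400', '2113', '2154', '0515']
```

Because the prefix alternative is anchored with ^, the "21" inside '19.2113.00' is not touched once the leading "19." has been removed.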

Upvotes: 2
