Shaun

Reputation: 531

python - reading list with strings and convert it to int() but keep specific format

I have a file full of strings which I read into a list. Now I'd like to find a specific line (for example the first line below) by looking for .../002/..., then add 5 to that 002 to get /007/, so that I can find the next line containing /007/.

The file looks like this

https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/6/MYD021KM/2018/002/MYD021KM.A2018002.1345.006.2018003152137.hdf
https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/6/MYD021KM/2018/004/MYD021KM.A2018004.1345.006.2018005220045.hdf

With this I can identify, for example, the first line:

match = re.findall(r"/(\d{3})/", data_time_filtered[i])

The problem now is: how do I convert the strings to integers while keeping the format 00X? Is this approach correct?

match_conv = ["{WHAT's in HERE?}".format(int(i)) for i in match]

EDIT according to suggested answers below:

So apparently there's no way to directly read the numbers in the string and keep them as they are?

Adding 0s to the number with zfill and the other suggested functions makes it more complicated, as /00x/ should remain at most 3 digits (they represent days of the year). So I was looking for an efficient way to keep the numbers from the string as they are and make them "math-able".

Upvotes: 1

Views: 356

Answers (4)

Thierry Lathuille

Reputation: 24232

We can first define a function that adds an integer to a string and returns a string, padded with zeros to keep the same length:

def add_to_string(s, n):
    total = int(s)+n
    return '{:0{}}'.format(total, len(s))

add_to_string('003', 2)
#'005'
add_to_string('00030', 12 )
#'00042'

We can then use re.sub with a replacement function. We use the regex r"(?<=/)\d{3}(?=/)" that matches a group of 3 digits, preceded and followed by /, without including them in the match.

The replacement function takes a match as its parameter and returns a string. You could hardcode it, like this:

import re

def add_5_and_replace(match):
    return add_to_string(match.group(0), 5)

url = 'https://nasa.gov/archive/allData/6/MYD021KM/2018/002/MYD021KM.hdf'

new = re.sub(r"(?<=/)\d{3}(?=/)", add_5_and_replace, url)
print(new)
# https://nasa.gov/archive/allData/6/MYD021KM/2018/007/MYD021KM.hdf

But it could be better to pass the value to add. Either use a lambda:

def add_and_replace(match, n=1):
    return add_to_string(match.group(0), n)

url = 'https://nasa.gov/archive/allData/6/MYD021KM/2018/002/MYD021KM.hdf'

new = re.sub(r"(?<=/)\d{3}(?=/)", lambda m: add_and_replace(m, n=5), url)

Or a partial function. A complete solution could then be:

import re
from functools import partial

def add_to_string(s, n):
    total = int(s)+n
    return '{:0{}}'.format(total, len(s))

def add_and_replace(match, n=1):
    return add_to_string(match.group(0), n)

url = 'https://nasa.gov/archive/allData/6/MYD021KM/2018/002/MYD021KM.hdf'

new = re.sub(r"(?<=/)\d{3}(?=/)", partial(add_and_replace, n=3), url)
print(new)

# https://nasa.gov/archive/allData/6/MYD021KM/2018/005/MYD021KM.hdf

If you only want to add the default value 1 to your number, you can simply write

new = re.sub(r"(?<=/)\d{3}(?=/)", add_and_replace, url)
print(new)

# https://nasa.gov/archive/allData/6/MYD021KM/2018/003/MYD021KM.hdf
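To tie this back to the original question, here is a minimal sketch of finding the "next" line in a list once the day number is known. The list `lines`, the helper `find_day`, and the shortened example URLs are assumptions made for illustration; they are not part of the original data.

```python
import re

# Three digits enclosed by slashes, without including the slashes in the match
DAY_PATTERN = re.compile(r"(?<=/)\d{3}(?=/)")

# Hypothetical, shortened sample data standing in for the real file contents
lines = [
    "https://example.invalid/MYD021KM/2018/002/a.hdf",
    "https://example.invalid/MYD021KM/2018/004/b.hdf",
    "https://example.invalid/MYD021KM/2018/007/c.hdf",
]

def find_day(lines, day):
    """Return all lines whose /DDD/ segment equals the given day number."""
    token = "/{:03}/".format(day)  # e.g. 7 -> "/007/"
    return [line for line in lines if token in line]

# Read the day from the first line as an int, then look five days ahead
start = int(DAY_PATTERN.search(lines[0]).group(0))
next_lines = find_day(lines, start + 5)
print(next_lines)
# ['https://example.invalid/MYD021KM/2018/007/c.hdf']
```

Doing the arithmetic on plain integers and formatting only when building the search token keeps the padding logic in one place.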

Upvotes: 1

U13-Forward

Reputation: 71570

Or you can use rjust and ljust:

>>> '2'.ljust(3,'0')
'200'
>>> '2'.rjust(3,'0')
'002'

Or:

>>> '{0:03d}'.format(2)
'002'

Or:

>>> format(2, '03')
'002'

Or:

>>> "%03d" % 2
'002'
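Applied to the question's case, any of these can be combined with int() to add an offset and pad the result again. A minimal sketch, with illustrative variable names:

```python
day = "002"                     # zero-padded day of year as it appears in the URL
shifted = int(day) + 5          # int() ignores the leading zeros: 2 + 5 == 7
padded = format(shifted, "03")  # pad back to a width of three digits
print(padded)
# 007

# A result that already has three digits is not padded any further
print(format(int("360") + 5, "03"))
# 365
```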

Upvotes: 1

user10356004

Reputation:

You can't get an int to be 001 or 002. It can only be 1 or 2.

You can do something similar by using strings.

>>> "3".zfill(3)
'003'
>>> "33".zfill(3)
'033'
>>> "33".rjust(3, '0')
'033'
>>> int('033')
33

>>> a = 3
>>> a.zfill(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute 'zfill'
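Since zfill is a string method, one way around the error is to convert the int to a string first. A minimal sketch; note that zfill only pads and never truncates, so values that already have three or more digits come back unchanged:

```python
a = 3
padded = str(a).zfill(3)   # convert to str first, then pad with zeros
print(padded)
# 003

print(str(365).zfill(3))   # already three digits, nothing is added
# 365
print(str(1234).zfill(3))  # longer than the width, returned as is
# 1234
```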

Upvotes: 1

Patrick Artner

Reputation: 51643

Read about the format specification mini-language:

c = "{:03}".format(25) # format a number to 3 digits, fill with 0
print(c)

Output:

025
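The same format spec can fill the placeholder in the list comprehension from the question. A minimal sketch, assuming the offset of 5 from the question and using `url` as a stand-in for one entry of the list:

```python
import re

url = ("https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/6/MYD021KM/"
       "2018/002/MYD021KM.A2018002.1345.006.2018003152137.hdf")

# Capture the three digits that sit between two slashes
match = re.findall(r"/(\d{3})/", url)

# Convert to int, add the offset, and format back to three zero-padded digits
match_conv = ["{:03}".format(int(m) + 5) for m in match]
print(match_conv)
# ['007']
```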

Upvotes: 1
