BBG_GIS

Reputation: 305

How do I find and replace characters in a nested list using Python?

I have a nested list with time values. I want to check for and replace times that are not in the "HH:MM" format. As a first step, I want to append ":00" to numbers that have no ":". My list looks like the list below (mylist), and res is the result I want.

mylist = [['x', '6 - 9:30 AM - 10:30 AM - 2 PM - 5 PM - 9 PM], ['y',  7:30 AM - 2:30 PM, 7:30 AM - 2:30 PM, 7:30 AM - 1:30 PM']]

res = [['x', '6:00 - 9:30 AM - 10:30 AM - 2:00 PM - 5:00 PM - 9:00 PM], ['y',  7:30 AM - 2:30 PM, 7:30 AM - 2:30 PM, 7:30 AM - 1:30 PM]]

I have tried this code:

for idx, (id,name) in enumerate(mylist):

    for n2,j in  enumerate(name.split(' - ')) :
        if ':' not in j and id not in j:
            print(name)
            if ":" not in name.split('-')[0] and ":" not in name.split('-')[1]:
                list1[idx][n2] = name.split('-')[0].split(' ')[0] + ':00' + ' AM' + ' - ' + \
                                name.split('-')[1].split(' ')[1].strip() + ':00' + ' PM'
                # print(name)
            elif ":" not in name.split('-')[0]:
                list1[idx][n2] = name.split('-')[0].split(' ')[0] + ':00' + ' AM' + ' - ' + \
                                name.split('-')[1].split(' ')[1].strip() + ' PM'

            elif ":" not in name.split('-')[1]:
                list1[idx][n2] = name.split('-')[0].split(' ')[0] + ' AM' + ' - ' + name.split('-')[1].split(' ')[
                    1].strip() + ':00' + ' PM'
            else:
                list1[idx][n2] = name.split('-')[0].split(' ')[0] + ' AM' + ' - ' + name.split('-')[1].split(' ')[
                    1].strip() + ' PM'

but it raised the error below:

name.split('-')[1].split(' ')[1].strip() + ' PM' IndexError: list assignment index out of range

How can I solve this issue?

Upvotes: 1

Views: 579

Answers (2)

Leonardo Emili

Reputation: 401

Another way is to write a function that hides the complexity of the task by applying the time extraction to each component of your input list. Here is a solution:

Your input list, to which I have added the missing single quotes:

mylist = [['x', '6 - 9:30 AM - 10:30 AM - 2 PM - 5 PM - 9 PM'], ['y', '7:30 AM - 2:30 PM, 7:30 AM - 2:30 PM, 7:30 AM - 1:30 PM']]

Define a function f() that parses each input value into HH:MM (assuming the values are all separated by either a comma or a dash):

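import re

def f(time):
    # Collect every run of digits: ['9', '30'] for '9:30 AM', ['6'] for '6'
    t = re.findall(r'\d+', time)
    # Keep the AM/PM marker (with a leading space) when the value has one
    suffix = ""
    if "AM" in time:
        suffix = " AM"
    elif "PM" in time:
        suffix = " PM"
    # Hours and minutes are both present: join them with a colon
    if len(t) > 1:
        return ':'.join(t) + suffix
    # Only the hour is present: add the missing minutes
    return t[0] + ":00" + suffix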

It extracts the digits from each input value with a regular expression, joins them as hours and minutes, and finally appends the correct suffix (empty, AM, or PM, according to the requirements).

For example, this will print your values:

for ls in mylist:
    ls = re.split('-|,', ls[1])
    print([f(x) for x in ls])
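
The loop above only prints the converted values. If you need the nested list back in the shape of res from the question, a minimal sketch could look like the one below. The names to_hhmm() and normalize() are hypothetical; to_hhmm() is written in the spirit of f(), with a space kept before the suffix, and the separators are captured by re.split so that dashes and commas survive unchanged.

```python
import re

def to_hhmm(time):
    # Hypothetical helper in the spirit of f(): digits become HH:MM,
    # and an AM/PM marker is kept with a leading space.
    digits = re.findall(r'\d+', time)
    suffix = ''
    if 'AM' in time:
        suffix = ' AM'
    elif 'PM' in time:
        suffix = ' PM'
    if len(digits) > 1:
        return ':'.join(digits) + suffix
    return digits[0] + ':00' + suffix

def normalize(rows):
    # Hypothetical wrapper: rebuilds [key, times] pairs with converted times.
    result = []
    for key, times in rows:
        # The capturing group keeps each separator in the split result,
        # so values sit at even indices and separators at odd indices.
        parts = re.split(r'(\s*[-,]\s*)', times)
        for i in range(0, len(parts), 2):
            parts[i] = to_hhmm(parts[i])
        result.append([key, ''.join(parts)])
    return result

mylist = [['x', '6 - 9:30 AM - 10:30 AM - 2 PM - 5 PM - 9 PM'],
          ['y', '7:30 AM - 2:30 PM, 7:30 AM - 2:30 PM, 7:30 AM - 1:30 PM']]

res = normalize(mylist)
print(res)
```

This assumes every value between separators contains at least one digit; an empty or digit-free value would make to_hhmm() fail on digits[0].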

Upvotes: 1

sortas

Reputation: 1673

Your overall logic is correct, but you should replace the splits with a regex. For example, to make sure every bare hour in x gets :00 appended, you can use something like this:

import re

test_text = "6 - 9:30 AM - 10:30 AM - 2 PM - 5 PM - 9 PM"
print(re.sub(r'(\s|^)(\d+)(\s)', r'\1\2:00\3', test_text))
# 6:00 - 9:30 AM - 10:30 AM - 2:00 PM - 5:00 PM - 9:00 PM

The task here was to insert :00, so:

  • First, we anchor on where an hour can start, either at the beginning of the string or right after whitespace: (\s|^)
  • Then we match the number itself, one or more digits: (\d+)
  • Then we require whitespace right after it, which means there are no minutes: (\s)
  • Finally, we reference all three groups (\1, \2, \3) in the replacement so re.sub puts them back unchanged and only inserts :00 after the digits.

You can apply the same logic to all possible tasks you have here.
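
As a rough sketch of how the same substitution could be applied to the whole nested list from the question (with the missing quotes added), you can wrap it in a small function and rebuild each [key, times] pair. The names BARE_HOUR and add_minutes() are hypothetical and only used for illustration.

```python
import re

# Same pattern as above: digits preceded by whitespace or the start of
# the string, and followed by whitespace (so no minutes are present).
BARE_HOUR = re.compile(r'(\s|^)(\d+)(\s)')

def add_minutes(text):
    # Put the three groups back and insert ':00' right after the digits.
    return BARE_HOUR.sub(r'\1\2:00\3', text)

mylist = [['x', '6 - 9:30 AM - 10:30 AM - 2 PM - 5 PM - 9 PM'],
          ['y', '7:30 AM - 2:30 PM, 7:30 AM - 2:30 PM, 7:30 AM - 1:30 PM']]

# Build a new nested list instead of assigning by index into the old one.
res = [[key, add_minutes(times)] for key, times in mylist]
print(res)
```

Note that the pattern requires whitespace after the number, so a bare hour at the very end of a string with nothing after it would be left untouched. That does not occur in the sample data, but it is worth keeping in mind for other inputs.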

Upvotes: 2
