user12305207

University Computer Science: Parsing out integer values into a list from a string

I'm currently working on a problem that goes through any string and parses out specific numbers into a new list based on some given rules.

Example String:

'800!)176^b006$(46$*#63Z*16$*06$z5^'

Expected Output:

[800, 600, 64, 63, 61, 60]

The rules given to parse out the numbers are:

  1. The numbers are non-negative integers, like 123

  2. The first char of each number is always a digit.

  3. If a $ char appears immediately after a number, its digits are backwards. So 211$ is the number 112.

  4. If a ^ char appears immediately after a number, it's as if that number is not present in the data, and it is omitted from the output. So for example 176^ would be omitted.

  5. The numbers are separated from each other by random chars which are not ^ or $ or digits.

I've started on the problem and have some test cases working.

I currently have three core problems in my code:

Regarding Rule 3: My code recognizes the $ in the string and reverses the string, which passes my test case, but only if the string has nothing after the $. I am unsure how to reverse only the digits immediately before the $ and then move on.

Regarding Rule 4: I am unsure how to omit the whole number that comes before the ^.

The last one is getting the code to apply these rules and append the results to a list properly. I don't know how to process the given string as a series of substrings, one number at a time.

def parse_line(s):
    """
    Given a string s, parse the ints out of it and
    return them as a list of int values.
    >>> parse_line('12$35$')
    [21, 53]
    """
    search = 0
    lst_num = []

    if len(s) > 1:
        while True:
            start = search

            while start < len(s) and not s[start].isdigit():
                start += 1
            if start >= len(s):
                break

            end = start + 1
            while end < len(s) and s[end].isdigit():
                end += 1

            if end < len(s) and not s[end].isdigit():
                if s[end] == '$':
                    rev_case = reverse(s)
                    lst_num.append(rev_case)
                if s[end] == '^':
                    continue
                end += 1

            search = end + 1
        return lst_num
    lst_num.append(int(s))
    return lst_num

I expect the code to return a list of the numbers with their digits reversed:

parse_line('12$35$')
[21, 53]

Instead, the test fails with this error:

ValueError: invalid literal for int() with base 10: '53$21'

Upvotes: 1

Views: 83

Answers (2)

SyntaxVoid

Reputation: 2633

You can iterate through your string, one element at a time. If the character is a digit, you can save it to a temporary list. Once you hit a non-number, you can check it for the special characters (per your rules). If the character is a $ and you have a number in your temporary list, reverse the number and then save it to your results. If the character is a ^, then reset your temporary list. If the character is anything else and you have a number in your temporary list, add that number to your results.

The important thing is that you handle characters one at a time instead of potentially moving back and forth through your string.

def parse_lines(s):
    result = [] # Will be returned
    cur_num = "" # A string of the characters from the current number
    for char in s:
        if char.isdigit():
            cur_num += char
        elif char == "$" and cur_num:
            result.append(int(cur_num[::-1])) # Rule 3: reverse the digits
            cur_num = "" # and reset cur_num
        elif char == "^": # Rule 4: no need to check cur_num, we reset it anyway
            cur_num = "" # Drop the number
        elif cur_num: # Any other separator ends the current number
            result.append(int(cur_num))
            cur_num = ""
    if cur_num: # Handles if a number is at the end of the input string
        result.append(int(cur_num))
    return result

test_cases = ["800!)176^b006$(46$*#63Z*16$*06$z5^", "12$35$", "", "$", "^", "5"]
for test in test_cases:
    print(f"{test}: {parse_lines(test)}")

Output:

800!)176^b006$(46$*#63Z*16$*06$z5^: [800, 600, 64, 63, 61, 60]
12$35$: [21, 53]
: []
$: []
^: []
5: [5]

Note: In the real world, I would prefer to use regex as splash58's answer implements very well, but your comment indicates this is for an assignment and that regex is probably not allowed.
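If the index-based loop from the question has to stay, here is a minimal sketch of how it could be repaired. The idea is to slice out only the digits of the current number, look at the single character right after them, and let the outer loop continue from there. The names nums, digits and marker are illustrative and not from the original code.

```python
def parse_line(s):
    """
    Given a string s, parse the ints out of it and
    return them as a list of int values.
    >>> parse_line('12$35$')
    [21, 53]
    """
    nums = []
    i = 0
    while i < len(s):
        # Skip separators until the first digit of the next number.
        if not s[i].isdigit():
            i += 1
            continue

        # Advance i to the first character after the run of digits.
        start = i
        while i < len(s) and s[i].isdigit():
            i += 1
        digits = s[start:i]  # only this number, not the whole string

        # The character right after the number decides what happens to it.
        marker = s[i] if i < len(s) else ''
        if marker == '$':
            nums.append(int(digits[::-1]))  # rule 3: digits are backwards
        elif marker != '^':
            nums.append(int(digits))        # plain number
        # rule 4: a number followed by '^' is simply not appended
    return nums
```

Because only s[start:i] is reversed, anything after the $ no longer affects the result, and no special case is needed for short strings.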

Upvotes: 1

splash58

Reputation: 26153

Find each run of digits together with an optional trailing $ or ^. Then, for each (digits, marker) pair found, check the marker and fill the output accordingly:

import re
string = '800!)176^b006$(46$*#63Z*16$*06$z5^'
lst = re.findall(r'(\d+)([\$\^])?', string)

res = []
for x in lst:
    if x[1] == '$':
        res.append(x[0][::-1])
    elif x[1] == '':
        res.append(x[0])
print(res) # ['800', '600', '64', '63', '61', '60']
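The snippet above collects strings. Here is a minimal sketch that wraps the same idea in a function and converts each match to int, so the result matches the expected output in the question. The function name parse_line_regex is hypothetical.

```python
import re

def parse_line_regex(s):
    """Return the ints found in s, applying the $ (reverse) and ^ (omit) rules."""
    result = []
    # Each match is a (digits, marker) pair; marker is '$', '^' or ''.
    for digits, marker in re.findall(r'(\d+)([$^]?)', s):
        if marker == '$':
            result.append(int(digits[::-1]))  # digits are backwards
        elif marker == '':
            result.append(int(digits))        # plain number
        # marker == '^': the number is omitted
    return result

print(parse_line_regex('800!)176^b006$(46$*#63Z*16$*06$z5^'))
```

Converting with int() also drops leading zeros left over after a reversal.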

Upvotes: 1
