Colin1212

Reputation: 39

Split list of emails and numbers, construct dictionary with emails as keys and auto-incremented numeric values

I've been struggling with this assignment for a few days and can't figure out how to write idiomatic Python that replaces the email at the start of each string in the list with its value from the dictionary, given that the fields in each string are separated by pipes.

Sample input:

fr = [
'[email protected]|4|11|GDSPV',
'[email protected]|16|82|GDSPV',
'[email protected]|12|82|GDSPV',
'[email protected]|19|82|GDSPV'
]

d = {
'[email protected]': '199',
'[email protected]': '200',
'[email protected]': '205'
}

The assignment shows what the output should look like, but I can't get there because of the pipes:

Value of fr:
['199|4|11|GDSPV', '199|16|82|GDSPV', '205|12|82|GDSPV', '206|19|82|GDSPV']
Value of d:
{'[email protected]': '199', '[email protected]': '200', '[email protected]': '205', '[email protected]': '206'}

This is what the assignment gives you to start off:

line_list = []
for line in fr:

And this is what I have so far:

line_list = []
for line in fr:
    pipes = line.split('|')
    if pipes[0] == '[email protected]':
        pipes[0] = d['[email protected]']
    
    elif pipes[0] == '[email protected]':
        pipes[0] = d['[email protected]']

    elif pipes[0] == '[email protected]':
        pipes[0] = d['[email protected]']
    print(pipes)

    if len(d) < 4:
        d['[email protected]'] = '206'

print("Value of fr: ")
print(fr)
print("Value of d:")
print(d)

Which outputs:

['199', '4', '11', 'GDSPV']
['199', '16', '82', 'GDSPV']
['205', '12', '82', 'GDSPV']
['206', '19', '82', 'GDSPV']
Value of fr: 
['[email protected]|4|11|GDSPV', '[email protected]|16|82|GDSPV', '[email protected]|12|82|GDSPV', '[email protected]|19|82|GDSPV']
Value of d:
{'[email protected]': '199', '[email protected]': '200', '[email protected]': '205', '[email protected]': '206'}

Upvotes: 3

Views: 189

Answers (3)

CryptoFool

Reputation: 23119

Here's a complete solution:

fr = [
    '[email protected]|4|11|GDSPV',
    '[email protected]|16|82|GDSPV',
    '[email protected]|12|82|GDSPV',
    '[email protected]|19|82|GDSPV'
]

d = {
    '[email protected]': '199',
    '[email protected]': '200',
    '[email protected]': '205'
}

# Find the highest value in the `d` dictionary and set `next_id` to be one greater than that
next_id = -1
for value in d.values():
    if int(value) > next_id:
        next_id = int(value)
next_id += 1

# Create the start of the list we're going to build up
r = []

# For each input in `fr`...
for line in fr:

    # Split the input into elements
    elements = line.split('|')

    # Extract the email address
    email = elements[0]

    # Is this address in `d`?
    if email not in d:
        # No, so add it with the next id as its value
        d[email] = str(next_id)
        next_id += 1

    # Replace the email element with the value for that email from `d`
    elements[0] = d[email]

    # Concatenate the elements back together and put the resulting string in our results list `r`
    r.append('|'.join(elements))

# Print our three structures
print(f"Value of fr: {fr}")
print(f"Value of d: {d}")
print(f"Value of r: {r}")

Result:

Value of fr: ['[email protected]|4|11|GDSPV', '[email protected]|16|82|GDSPV', '[email protected]|12|82|GDSPV', '[email protected]|19|82|GDSPV']
Value of d: {'[email protected]': '199', '[email protected]': '200', '[email protected]': '205', '[email protected]': '206'}
Value of r: ['199|4|11|GDSPV', '199|16|82|GDSPV', '205|12|82|GDSPV', '206|19|82|GDSPV']

Notice that we don't have to know any of the email addresses in advance; we just process whatever we find. I wasn't sure what "the next highest value number in the dictionary" meant, so I took it to mean one more than the largest value already in `d`. If that interpretation is incorrect, the way `next_id` is computed will need to change.
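
As a side note, the loop that finds the next id can be condensed with the built-in max. This is only a minimal sketch under the same interpretation (next id = largest existing value + 1), assuming every value is a string of decimal digits. The addresses are placeholders, since the real ones in the question are obscured, and the `default=-1` fallback for an empty dictionary is my own choice to mirror the loop's starting value:

```python
# Placeholder addresses; the real ones in the question are obscured.
d = {
    'a@example.com': '199',
    'b@example.com': '200',
    'c@example.com': '205',
}

# One more than the largest existing id. default=-1 mirrors the loop's
# starting value, so an empty dictionary would yield 0.
next_id = max(map(int, d.values()), default=-1) + 1
print(next_id)  # 206
```

Either form works; the explicit loop is easier to step through, while the max version is shorter.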

Upvotes: 2

Paul M.

Reputation: 10809

I think str.partition is a bit cuter than str.split in this case, since there's no need to split on every pipe. Because partition returns the separator along with the rest of the line, you don't have to put the pipes back in afterwards, though you still have to join the pieces. replaced_fr ends up as the list containing the desired output.

replaced_fr = []

for line in fr:
    email, *partitions = line.partition("|")
    value = d.get(email, None)
    if value is None:
        value = str(max(map(int, d.values())) + 1)
        d[email] = value
    replaced_line = "".join([value] + partitions)
    replaced_fr.append(replaced_line)
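
To make the difference concrete, here is roughly what the two methods return for a single line. The address is a placeholder (the real ones in the question are obscured) and '199' simply stands in for whatever value the dictionary holds:

```python
line = 'a@example.com|4|11|GDSPV'  # placeholder address

# partition splits at the first separator only and keeps the separator
head, sep, tail = line.partition('|')
print((head, sep, tail))  # ('a@example.com', '|', '4|11|GDSPV')

# split breaks on every separator and discards them
parts = line.split('|')
print(parts)  # ['a@example.com', '4', '11', 'GDSPV']

# With partition, rebuilding the line is plain concatenation
rebuilt = '199' + sep + tail
print(rebuilt)  # 199|4|11|GDSPV
```

Passing a maxsplit of 1, as in line.split('|', 1), is a close alternative if you prefer split: it also stops at the first pipe, but drops the separator.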

Upvotes: 1

Pradhyum R

Reputation: 113

I think you're forgetting to join the elements of the list. "|".join(pipes) will give you the final string. From there, all you have to do is append it to line_list and print that out after the loop. That isn't the way I'd do it, though; I would abstract it into a function. In particular:

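def substitute(string):
    email, *rest = string.split('|')
    # Emails not yet in d get the next number (current maximum + 1)
    if email not in d:
        d[email] = str(max(map(int, d.values())) + 1)
    number = d[email]
    return '|'.join([number] + rest)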

line_list = []
for line in fr:
    line_list.append(substitute(line))
fr = line_list
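
A possible variant of the same idea takes the dictionary as an argument rather than reading the global d, which makes the function easier to test on its own. Treat it as a sketch: the `ids` parameter name and the placeholder addresses are my own (the real addresses in the question are obscured), and it assumes "next number" means the current maximum plus one, with values stored as digit strings:

```python
def substitute(line, ids):
    """Replace the leading email in `line` with its number from `ids`.

    Emails missing from `ids` are given the current maximum + 1 and
    added to `ids` (an assumed reading of the assignment).
    """
    email, sep, rest = line.partition('|')
    if email not in ids:
        ids[email] = str(max(map(int, ids.values()), default=-1) + 1)
    return ids[email] + sep + rest


# Placeholder data shaped like the question's input
d = {'a@example.com': '199', 'b@example.com': '200', 'c@example.com': '205'}
fr = [
    'a@example.com|4|11|GDSPV',
    'a@example.com|16|82|GDSPV',
    'c@example.com|12|82|GDSPV',
    'd@example.com|19|82|GDSPV',
]

fr = [substitute(line, d) for line in fr]
print(fr)  # ['199|4|11|GDSPV', '199|16|82|GDSPV', '205|12|82|GDSPV', '206|19|82|GDSPV']
print(d['d@example.com'])  # 206
```

Note that the function updates the dictionary it is given, so d gains an entry for every new address.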

Upvotes: 1
