Gokberk Yar

Reputation: 82

Creating a Python list with given indexes for each repeating element

First list: contains, for each category, the list of indexes at which that category's name should appear.

Second list: contains the category names as strings.

Intervals=[[Indexes_Cat1],[Indexes_Cat2],[Indexes_Cat3], ...]

Category_Names=["cat1","cat2","cat3",...]

Desired Output:

list=["cat1", "cat1","cat2","cat3","cat3"]

where the position of each element in the output list is given by the Intervals list.

Ex1:

Intervals=[[0,4], [2,3] , [1,5]]
Category_Names=["a","b","c"]

Ex: Output1

["a","c","b","b","a","c"]

Edit: More Run Cases

Ex2:

Intervals=[[0,1], [2,3] , [4,5]]
Category_Names=["a","b","c"]

Ex: Output2

["a","a","b","b","c","c"]

Ex3:

Intervals=[[3,4], [1,5] , [0,2]]
Category_Names=["a","b","c"]

Ex: Output3

["c","b","c","a","a","b"]

My solution:

Create an empty list of size n.

Run a for loop over the categories and write each category's name at its indexes.

output=[""]*n
for i in range(len(Category_Names)):
    for index in Intervals[i]:
        output[index]=Category_Names[i]

Is there a better solution, or a more pythonic way? Thanks

Upvotes: 1

Views: 85

Answers (3)

Paddy3118

Reputation: 4772

def categorise(Intervals=[[0,4], [2,3] , [1,5]],
               Category_Names=["a","b","c"]):
    flattened = sum(Intervals, [])
    answer = [None] * (max(flattened) + 1)
    for indices, name in zip(Intervals, Category_Names):
        for i in indices:
            answer[i] = name
    return answer

assert categorise() == ['a', 'c', 'b', 'b', 'a', 'c']
assert categorise([[3,4], [1,5] , [0,2]], 
                  ["a","b","c"]) == ['c', 'b', 'c', 'a', 'a', 'b']

Note that with this code you will get None values in the answer if the "intervals" don't cover every integer from zero to the largest index. It is assumed that the input is compatible.
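If silent None gaps are a concern, a check can be added after the list is filled. This is only a sketch: the name categorise_checked and the choice to raise ValueError are assumptions for illustration, not part of the answer above, and it would misreport a gap if None were itself used as a category name.

```python
def categorise_checked(intervals, category_names):
    """Fill positions like categorise, but fail loudly on uncovered positions.

    Hypothetical helper: the name and the ValueError policy are assumptions.
    """
    size = max(i for indices in intervals for i in indices) + 1
    answer = [None] * size
    for indices, name in zip(intervals, category_names):
        for i in indices:
            answer[i] = name
    # Any slot still holding None was never assigned a category.
    missing = [i for i, value in enumerate(answer) if value is None]
    if missing:
        raise ValueError(f"no category assigned to positions {missing}")
    return answer
```

For example, categorise_checked([[0, 3]], ["a"]) raises, because positions 1 and 2 are never assigned.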

Upvotes: 2

Patrick Artner

Reputation: 51643

You can reduce the number of strings created and use enumerate to avoid range(len(..)) for indexing.

Intervals=[[0,4], [2,3] , [1,5]]
Category_Names=["a","b","c"]

n = max(x for a in Intervals for x in a) + 1

# do not construct strings that get replaced anyhow    
output=[None] * n

for i,name in enumerate(Category_Names):
    for index in Intervals[i]:
        output[index]=name

print(output)

Output:

['a', 'c', 'b', 'b', 'a', 'c']

Upvotes: 1

jgmontoya

Reputation: 41

I am not sure if there is a way to avoid the nested loop (I can't think of any right now) so it seems your solution is good.
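The nested loop can at least be folded into a comprehension, although the iteration is still nested underneath. A sketch, assuming the indexes cover 0 to n-1 exactly once; the lowercase variable names are illustrative only:

```python
intervals = [[0, 4], [2, 3], [1, 5]]
category_names = ["a", "b", "c"]

# Map each position to the name of the category that owns it.
position_to_name = {
    index: name
    for indices, name in zip(intervals, category_names)
    for index in indices
}

# Read the names back in position order (assumes no gaps in the positions).
output = [position_to_name[i] for i in range(len(position_to_name))]
print(output)  # ['a', 'c', 'b', 'b', 'a', 'c']
```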

One small improvement is to initialize the output list with one of the categories:

output = [Category_Names[0]]*n

and then start the iteration skipping that category:

for i in range(1, len(Category_Names)):

If you know that one category appears more often than the others, use that one to initialize the list.
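Put together, the idea might look like the sketch below. The input data is made up for illustration, and it assumes every position appears exactly once, so that n can be computed as the total number of indexes:

```python
intervals = [[0, 4, 6], [2, 3], [1, 5]]  # illustrative data, not from the question
category_names = ["a", "b", "c"]

# Assumes each position 0..n-1 appears in exactly one interval.
n = sum(len(indices) for indices in intervals)

# Use the category with the most positions as the fill value.
fill = max(range(len(category_names)), key=lambda k: len(intervals[k]))
output = [category_names[fill]] * n

for k, name in enumerate(category_names):
    if k == fill:
        continue  # these positions already hold the right name
    for index in intervals[k]:
        output[index] = name

print(output)  # ['a', 'c', 'b', 'b', 'a', 'c', 'a']
```

The saving is at most the assignments for one category, so this is a micro-optimization rather than a change in complexity.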

I hope this helps!

Upvotes: 2
