xFL

Reputation: 615

How to reduce different lists to exactly the same length?

I have some very big lists of different lengths. I want to reduce them all to exactly the same size (for example, to 1000 elements). I know there are some similar questions, but I didn't find the correct answer to my problem. Here is an example of what I did. For simplicity, we only use three lists here.

a = list(range(10000))
b = list(range(9879))
c = list(range(10345))

# Now I want to reduce all the lists to exactly 1000 elements
# I tried this approach, as I read in some other questions:
aa = a[::len(a) // 1000] # len(aa) = 1000
bb = b[::len(b) // 1000] # len(bb) = 1098
cc = c[::len(c) // 1000] # len(cc) = 1035

But with this approach, the resulting lists do not have the same length. How can I randomly remove some elements from bb and cc so that they also have exactly 1000 elements? I don't want to just remove the last x elements or the first x elements. Or is there a better solution for reducing lists of different lengths to exactly the same length?

Edit: The order of the resulting lists (aa, bb, cc) should be the same as my original lists. I don't want to shuffle them randomly.

Upvotes: 1

Views: 1091

Answers (4)

coderoftheday

Reputation: 2075

import random
a = list(range(10000))
b = list(range(9879))
c = list(range(10345))


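def randomm(x, used):
    # Draw a random index in range(x) that has not been drawn before.
    while True:
        r = random.randrange(x)  # randrange excludes x, so r is always a valid index
        if r not in used:
            used.add(r)
            return r

# One container of already drawn indices per list
used_a, used_b, used_c = set(), set(), set()
aa = [a[randomm(len(a), used_a)] for i in range(1000)]
bb = [b[randomm(len(b), used_b)] for i in range(1000)]
cc = [c[randomm(len(c), used_c)] for i in range(1000)]

I created a container for the random generator so that no number is repeated. Note that the resulting lists are not in the original order.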

Upvotes: 0

Martin

Reputation: 71

According to the first comment, your code would look like this:

from random import sample

a = list(range(10000))
b = list(range(9879))
c = list(range(10345))

# Slice each list down to roughly 1000 elements,
# then randomly pick exactly 1000 of them:
aa = sample(a[::len(a) // 1000],1000) # len(aa) = 1000
bb = sample(b[::len(b) // 1000],1000) # len(bb) = 1000
cc = sample(c[::len(c) // 1000],1000) # len(cc) = 1000

Note that the elements of aa, bb, and cc are now shuffled.

A non-shuffled solution would be:

import numpy as np

a = np.array(range(10000))
b = np.array(range(9879))
c = np.array(range(10345))

# Reduce each array to exactly 1000 elements while keeping the original order
indices = np.arange(len(a))  # all indices of a
keep = np.random.permutation(len(a))[:1000]  # 1000 distinct random indices to keep
selected = np.isin(indices, keep, assume_unique=True)  # boolean mask of the kept indices; faster on unique inputs
aa = a[selected]  # len(aa) = 1000, original order preserved

indices = np.arange(len(b))
keep = np.random.permutation(len(b))[:1000]
selected = np.isin(indices, keep, assume_unique=True)
bb = b[selected]  # len(bb) = 1000

indices = np.arange(len(c))
keep = np.random.permutation(len(c))[:1000]
selected = np.isin(indices, keep, assume_unique=True)
cc = c[selected]  # len(cc) = 1000
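
The same order-preserving idea can probably be sketched without NumPy as well, using only the standard library: sample 1000 distinct positions, sort them, and read the elements in that order. The helper name sample_keep_order is made up for this illustration and is not part of any library; treat it as a minimal sketch rather than a definitive solution.

```python
import random


def sample_keep_order(items, k):
    """Pick k random elements of items without replacement, in original order.

    Hypothetical helper for illustration only.
    """
    # random.sample on a range returns k distinct positions;
    # sorting them keeps the picked elements in their original order.
    positions = sorted(random.sample(range(len(items)), k))
    return [items[i] for i in positions]


a = list(range(10000))
b = list(range(9879))
c = list(range(10345))

aa = sample_keep_order(a, 1000)
bb = sample_keep_order(b, 1000)
cc = sample_keep_order(c, 1000)
```

Because the positions are sampled without replacement, no element is picked twice, and random.sample raises ValueError if k is larger than the list.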

Upvotes: 2

adir abargil

Reputation: 5745

I would just do as you did and then cut off the extra elements at the end. With a constant step it is impossible to spread the elements evenly without some extra elements when the length is not divisible by 1000, so:

aa = a[::len(a) // 1000][:1000]
bb = b[::len(b) // 1000][:1000]
cc = c[::len(c) // 1000][:1000]

If you insist on not dropping the leftover elements at the end, you can apply the other answer after the code above and choose randomly.
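
If a non-constant step is acceptable, one possible alternative is to compute the 1000 positions directly with integer arithmetic instead of slicing, so that the picks cover the whole list and nothing has to be cut off. The helper name evenly_spaced is hypothetical, and this is only a sketch under the assumption that k is not larger than the list length.

```python
def evenly_spaced(items, k):
    """Return k elements of items at roughly evenly spaced positions.

    Hypothetical helper for illustration only.
    """
    n = len(items)
    if k > n:
        raise ValueError("k must not exceed len(items)")
    # i * n // k is strictly increasing for k <= n, so no position repeats
    # and the original order is kept.
    return [items[i * n // k] for i in range(k)]


a = list(range(10000))
b = list(range(9879))
c = list(range(10345))

aa = evenly_spaced(a, 1000)
bb = evenly_spaced(b, 1000)
cc = evenly_spaced(c, 1000)
```

The gap between neighbouring picks then varies by at most one position (for b it alternates between 9 and 10), which is the price for getting exactly 1000 elements.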

Upvotes: 2

Mustafa Fatih Şen

Reputation: 74

You can use the random.shuffle function and then take the first x elements of each list. Note that random.shuffle shuffles the list in place, so the original order is lost.

import random
a = list(range(10000))
b = list(range(9879))
c = list(range(10345))

random.shuffle(a)
random.shuffle(b)
random.shuffle(c)

# Take the first 1000 elements of each shuffled list
aa = a[:1000]
bb = b[:1000]
cc = c[:1000]

Upvotes: 2
