Reputation: 35
Hi, I am new to Python and I am trying to increase my knowledge by writing a usable function. I am trying to build a function that creates a list of 6 random numbers drawn from the range 1 to 59. I've cracked that part; it's the next part that is tricky. I now want to check a CSV file for the numbers in the random set and print a notification if two or more numbers from that set are found. I have tried print(df[df[0:].isin(luckyDip)])
with a little success: it checks the data frame for the numbers in the set and shows the numbers that match, BUT it also shows the rest of the data frame as NaN, which is not very tidy and not really what I want.
I'm just looking for some pointers on what to do next, or on what to search Google for. Below is the code I've been experimenting with.
import random
import pandas as pd
url ='https://www.national-lottery.co.uk/results/euromillions/draw-history/csv'
df = pd.read_csv(url, sep=',', na_values=".")
lottoNumbers = [1,2,3,4,5,6,7,8,9,10,
11,12,13,14,15,16,17,18,19,20,
21,22,23,24,25,26,27,28,29,30,
31,32,33,34,35,36,37,38,39,40,
41,42,43,44,45,46,47,48,49,50,
51,52,53,54,55,56,57,58,59]
luckyDip = random.sample(lottoNumbers, k=6) #Picks 6 numbers at random
print (sorted(luckyDip))
print (df[df[0:].isin(luckyDip)])
Upvotes: 1
Views: 474
Reputation: 2245
You can build on what you have by counting the non-null values in each row, then displaying the rows where the number of matches is greater than or equal to 2.
match_count = df[df[0:].isin(luckyDip)].notnull().sum(axis=1)
print(match_count[match_count >= 2])
This gives you the index value of the matching row and the number of matches.
Example output:
6 2
26 2
40 3
51 2
If you also want the matching values from these rows, you can add:
index = match_count[match_count >= 2].index
matches = [tuple(x[~pd.isnull(x)]) for x in df.loc[index][df[0:].isin(luckyDip)].values]
print(matches)
Example output:
[(19.0, 23.0), (19.0, 41.0), (19.0, 23.0, 34.0), (23.0, 28.0)]
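A self-contained sketch of the same idea, for anyone who wants to try it without downloading the CSV. The small `draws` frame, its column names and the fixed `lucky_dip` list are made up for illustration; the real file has more columns, so you would probably want to select only the ball columns first. Summing the boolean result of `isin()` directly avoids the intermediate NaN-filled frame.

```python
import pandas as pd

# Made-up draws for illustration only; not taken from the real CSV.
draws = pd.DataFrame(
    {
        "Ball 1": [3, 19, 7, 19],
        "Ball 2": [11, 23, 14, 22],
        "Ball 3": [25, 30, 28, 34],
        "Ball 4": [40, 41, 33, 45],
        "Ball 5": [48, 50, 49, 50],
    }
)
# Fixed instead of random so the result is repeatable.
lucky_dip = [19, 23, 34, 41, 7, 2]

# isin() returns a boolean frame; summing across columns counts matches per row.
hits = draws.isin(lucky_dip)
match_count = hits.sum(axis=1)
winners = match_count[match_count >= 2]

# where() blanks the non-matching cells, so dropna() leaves only matched numbers.
matched_numbers = {
    idx: draws.loc[idx].where(hits.loc[idx]).dropna().astype(int).tolist()
    for idx in winners.index
}

for idx, numbers in matched_numbers.items():
    print(f"Draw {idx}: {len(numbers)} matches {numbers}")
```

Keeping the row index as the dictionary key means each notification can be traced back to the draw it came from.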
Upvotes: 0
Reputation: 951
Not as elegant as @ayhan's solution, but this works:
import random
import pandas as pd
url ='https://www.national-lottery.co.uk/results/euromillions/draw-history/csv'
df = pd.read_csv(url, index_col=0, sep=',')
lottoNumbers = range(1, 60)
tries = 0
while True:
    tries += 1
    luckyDip = random.sample(lottoNumbers, k=6)  # Picks 6 numbers at random
    # subset of balls
    draws = df.iloc[:, 0:7]
    # True where there is a match
    matches = draws.isin(luckyDip)
    # Gives the sum of Trues
    sum_of_trues = matches.sum(1)
    # you are looking for matches where sum_of_trues is 6
    final = sum_of_trues[sum_of_trues == 6]
    if len(final) > 0:
        print("Took", tries)
        print(final)
        break
The result is something like this:
Took 15545
DrawDate
16-May-2017 6
dtype: int64
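The loop above keeps going until it finds a full match, which can take a long time or never finish. Below is a rough sketch of a bounded variant that also lets you ask for "two or more" as in the question. The function name `tries_until_match`, its parameters and the tiny `draws` frame are hypothetical, not part of the original answer.

```python
import random
import pandas as pd


def tries_until_match(draws, needed, max_tries=10_000, seed=None):
    """Draw random lucky dips until some row has at least `needed` matches.

    Returns (tries, lucky_dip, matching_rows), or None if max_tries is reached.
    """
    rng = random.Random(seed)  # a seed makes the run repeatable
    pool = range(1, 60)        # numbers 1..59, as in the question
    for tries in range(1, max_tries + 1):
        lucky_dip = rng.sample(pool, k=6)
        match_count = draws.isin(lucky_dip).sum(axis=1)
        found = match_count[match_count >= needed]
        if not found.empty:
            return tries, sorted(lucky_dip), found
    return None


# Made-up draws for illustration only.
draws = pd.DataFrame(
    {
        "Ball 1": [3, 19],
        "Ball 2": [11, 23],
        "Ball 3": [25, 30],
        "Ball 4": [40, 41],
        "Ball 5": [48, 50],
    }
)

result = tries_until_match(draws, needed=2, seed=0)
if result is not None:
    tries, lucky_dip, found = result
    print("Took", tries, "tries with", lucky_dip)
    print(found)
```

Returning None instead of looping forever makes the impossible case visible: with only five columns in this toy frame, asking for six matches can never succeed.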
Upvotes: 1
Reputation: 103
If you're just looking to flatten the array and remove the NaN values, you can add this to the end of your code:
import numpy as np
matches = df[df[0:].isin(luckyDip)].values.flatten().astype(np.float64)
print(matches[~np.isnan(matches)])
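If the frame holds only numeric ball columns, a similar result can be had without going through NaN at all. A minimal sketch, assuming a made-up `draws` frame and a fixed `lucky_dip`; note that flattening loses track of which draw each number came from.

```python
import numpy as np
import pandas as pd

# Made-up numeric draws for illustration only.
draws = pd.DataFrame({"Ball 1": [3, 19], "Ball 2": [11, 23], "Ball 3": [25, 30]})
lucky_dip = [19, 23, 25]

# Flatten row by row, then keep only the values that appear in lucky_dip.
values = draws.to_numpy().ravel()
matches = values[np.isin(values, lucky_dip)]
print(matches)
```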
Upvotes: 0