mad

Reputation: 2789

How to keep binary number values intact when reading file contents into a pandas dataframe

I have a text file that is simply a pandas dataframe saved as a csv. Here are the contents of the file:

combination_output,total_true,frequency,priori-probability
000,0,275,0.0
001,0,25,0.0
010,16,16,1.0
011,14,14,1.0
100,0,0,0
101,0,44,0.0
110,0,0,0
111,247,247,1.0

My problem is simple: given a combination of three digits, each 0 or 1, I look that combination up in the file above and return its priori-probability (the last column of the file). Here is how I do it, given a big matrix of combinations to look up:

import pandas as pd

# Open the file as a pandas dataframe
table = pd.read_csv("myfile.csv")

# `combination` is a big matrix; each row holds one combination of
# 3 binary digits that I need to look up in the dataframe
for index_combination in range(combination.shape[0]):

    # Get the probability from the row whose combination of 1s and 0s
    # matches the one I am looking for
    probability = table.loc[table['combination_output'] == combination[index_combination], 'priori-probability']

However, here is what I get when I print the combination and the result:

FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
result = method(y)
000
Series([], Name: priori-probability, dtype: float64)

It seems that values such as 000 cannot be found in that table. When I print the pandas dataframe, here is what I get:

       combination_output  total_true  frequency  priori-probability
0                   0           0        275                 0.0
1                   1           0         25                 0.0
2                  10          16         16                 1.0
3                  11          14         14                 1.0
4                 100           0          0                 0.0
5                 101           0         44                 0.0
6                 110           0          0                 0.0
7                 111         247        247                 1.0

As you can see, instead of 000 the pandas dataframe shows 0; instead of 001 it shows 1; instead of 010 it shows 10, and so on. If I search for 000 in that table, it should return 0.0, which is the probability of that combination.

How can I make pandas read the binary values exactly as they are saved in my text file (which was itself written from a pandas dataframe)?

Upvotes: 0

Views: 45

Answers (1)

Bruno Mello

Reputation: 4618

You can read them with a string datatype:

table=pd.read_csv("myfile.csv", dtype={'combination_output': str})

This will read the combinations as strings instead of numbers, so the leading zeros are kept.
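
As a quick check, here is a minimal sketch comparing the default read with the dtype version. It inlines the file contents from the question through io.StringIO, which is only an assumption to keep the snippet self-contained; with the real file you would pass "myfile.csv" instead. The variable names are made up for the illustration.

```python
import io

import pandas as pd

# File contents copied from the question, inlined so the snippet runs on its own
csv_text = """combination_output,total_true,frequency,priori-probability
000,0,275,0.0
001,0,25,0.0
010,16,16,1.0
011,14,14,1.0
100,0,0,0
101,0,44,0.0
110,0,0,0
111,247,247,1.0
"""

# Default read: pandas infers an integer column, so 000 becomes 0, 010 becomes 10
default_table = pd.read_csv(io.StringIO(csv_text))

# Read with an explicit string dtype for that one column
str_table = pd.read_csv(io.StringIO(csv_text), dtype={"combination_output": str})

default_values = default_table["combination_output"].tolist()
str_values = str_table["combination_output"].tolist()

# Looking up the string "000" now finds exactly one row
match = str_table.loc[str_table["combination_output"] == "000", "priori-probability"]
probability_000 = float(match.iloc[0])

print(default_values)
print(str_values)
print(probability_000)
```

Only the combination_output column is forced to str here; the other columns are still parsed as numbers.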

I'm assuming your combinations matrix has string values in it.
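
If that assumption does not hold and the matrix actually stores the digits as integers, one possible approach is to join each row into a string before the lookup. This is only a sketch: the shape of `combination` (one row of three 0/1 integers per combination) is my guess, and the small table below is built by hand from the values in the question instead of being read from the file.

```python
import numpy as np
import pandas as pd

# Table as it would look after reading combination_output as str
# (values copied from the question; only the two relevant columns are kept)
table = pd.DataFrame({
    "combination_output": ["000", "001", "010", "011", "100", "101", "110", "111"],
    "priori-probability": [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0],
})

# Hypothetical combinations matrix: each row holds three binary digits as integers
combination = np.array([
    [0, 0, 0],
    [0, 1, 0],
    [1, 1, 1],
    [1, 0, 1],
])

# Turn each row into a key such as "010" so it matches the string column
keys = ["".join(str(bit) for bit in row) for row in combination]

# Index the probabilities by combination, then fetch all keys in one step
lookup = table.set_index("combination_output")["priori-probability"]
probabilities = lookup.reindex(keys).tolist()

print(keys)
print(probabilities)
```

Using reindex means a combination that is missing from the table yields NaN instead of raising, which may or may not be what you want.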

Upvotes: 1
