Sascha
Sascha

Reputation: 687

How to check whether value is between two numbers which is in one cell

i have dataset with two columns:

import pandas as pd
dict = {'val':["3.2", "2.4", "-2.3", "-4.9"], 
        'conf_interval': ["[-0.83, -1.78]", "[0.71, 2.78]", "[-0.91, -2.28]", "[-0.69, -2.14]"]} 
df = pd.DataFrame(dict) 
df
    val     conf_interval
0   3.2     [-0.83, -1.78]
1   2.4     [0.71, 2.78]
2   -2.3    [-0.91, -2.28]
3   -4.9    [-0.69, -2.14]

I want to check which of the values in column val is between two values in column conf_interval. Is the only way is to splitconf_interval column to two columns? Or there are also other way without splitting this column? The desirede output is something like this:

    val     conf_interval    result
0   3.2     [-1.78, -0.83]   False
1   2.4     [0.71, 2.78]     True
2   -2.3    [-2.28, -0.91]   False
3   -4.9    [0.69, 2.14]     False

Upvotes: 0

Views: 552

Answers (2)

yatu
yatu

Reputation: 88236

I've used the intervals from the expected output's dataframe, where the left hand side is lower than the right hand. Here's one approach using pd.IntervalIndex:

from ast import literal_eval

df['conf_interval'] = df.conf_interval.map(literal_eval)
df['val'] = pd.to_numeric(df.val)

intervals = pd.IntervalIndex.from_tuples(list(map(tuple, df.conf_interval)))
df['result'] = intervals.contains(df.val)

print(df)

 val   conf_interval  result
0  3.2  [-1.78, -0.83]   False
1  2.4    [0.71, 2.78]    True
2 -2.3  [-2.28, -0.91]   False
3 -4.9    [0.69, 2.14]   False

Upvotes: 2

jezrael
jezrael

Reputation: 862771

Use Series.between with converted column conf_interval float series by Series.str.split:

df1 = df['conf_interval'].str.strip('[]').str.split(', ', expand=True).astype(float)
df['result'] = df['val'].astype(float).between(df1[0], df1[1])
print (df)

    val   conf_interval  result
0   3.2  [-0.83, -1.78]   False
1   2.4    [0.71, 2.78]    True
2  -2.3  [-0.91, -2.28]   False
3  -4.9  [-0.69, -2.14]   False

Upvotes: 2

Related Questions