Muhammed Eltabakh

Reputation: 497

replacing a range of values with one value

I have a list of decimal values that I'm adding to a pandas DataFrame. I want to divide the values into 3 ranges, with each range mapped to a single label.

sents = []
for sent in sentis:
    # only values above 0 are labelled
    if sent > 0:
        if sent < 0.40:
            sents.append('negative')
        if (sent >= 0.40 and sent <= 0.60):
            sents.append('neutral')
        if sent > 0.60:
            sents.append('positive')

My question is whether there is a more efficient way to do this in pandas, as I'm trying to apply it to a bigger list.

Thanks in advance.

Upvotes: 2

Views: 887

Answers (2)

piRSquared

Reputation: 294488

You can use pd.cut to produce results of categorical type with the appropriate labels.

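pd.cut closes each bin on the right by default, so with an edge at exactly .4 the value .4 would land in the negative bin. To include both .4 and .6 in the neutral category, I subtract the float machine epsilon from the lower edge and add it to the upper edge.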

import numpy as np
import pandas as pd

sentis = np.linspace(0, 1, 11)
eps = np.finfo(float).eps  # machine epsilon for float64

pd.DataFrame(dict(
    Value=sentis,
    Sentiment=pd.cut(
        sentis, [-np.inf, .4 - eps, .6 + eps, np.inf],
        labels=['negative', 'neutral', 'positive']
    ),
))

   Sentiment  Value
0   negative    0.0
1   negative    0.1
2   negative    0.2
3   negative    0.3
4    neutral    0.4
5    neutral    0.5
6    neutral    0.6
7   positive    0.7
8   positive    0.8
9   positive    0.9
10  positive    1.0
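
To see what the epsilon adjustment changes, here is a minimal sketch that runs pd.cut twice on the same values, once with plain edges and once with the nudged edges used above. The sample scores and the variable names (scores, plain, nudged) are made up for illustration and are not from the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample scores, including the two boundary values 0.4 and 0.6.
scores = np.array([0.1, 0.4, 0.5, 0.6, 0.9])
labels = ['negative', 'neutral', 'positive']
eps = np.finfo(float).eps  # machine epsilon for float64

# Plain edges: bins are right-closed, i.e. (-inf, 0.4], (0.4, 0.6], (0.6, inf),
# so 0.4 is labelled 'negative'.
plain = pd.cut(scores, [-np.inf, 0.4, 0.6, np.inf], labels=labels)

# Nudged edges: the lower edge sits just below 0.4, so 0.4 moves to 'neutral'.
nudged = pd.cut(scores, [-np.inf, 0.4 - eps, 0.6 + eps, np.inf], labels=labels)

plain_list = list(plain)
nudged_list = list(nudged)
print(plain_list)
print(nudged_list)
```

Only the label for 0.4 differs between the two runs. The nudge on the upper edge does not change how an exact 0.6 is labelled, but it leaves room for values that land slightly above 0.6 through floating-point rounding.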

Upvotes: 2

Scott Hunter

Reputation: 49893

Use a list comprehension:

['negative' if x < 0.4 else 'positive' if x > 0.6 else 'neutral' for x in sentis]
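
If the values already live in a DataFrame column, the same conditions could also be written in vectorized form. The sketch below uses np.select, which is a different technique from the list comprehension in this answer; the DataFrame, its column names, and the sample values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column name 'Value' is an assumption.
df = pd.DataFrame({'Value': [0.1, 0.4, 0.5, 0.6, 0.9]})

# Same thresholds as the list comprehension: below 0.4, above 0.6, else neutral.
conditions = [df['Value'] < 0.4, df['Value'] > 0.6]
choices = ['negative', 'positive']
df['Sentiment'] = np.select(conditions, choices, default='neutral')

# The list comprehension from the answer, applied to the same values.
from_comprehension = [
    'negative' if x < 0.4 else 'positive' if x > 0.6 else 'neutral'
    for x in df['Value']
]

result = df['Sentiment'].tolist()
print(result)
```

Both approaches give the same labels here; which one is faster on a large list would need to be measured on the actual data.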

Upvotes: 0
