user3480774

Reputation: 893

Pandas Dataframe Sample

Does anyone know how pandas.DataFrame.sample normalizes the weights? https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sample.html

For example, if I just pass the raw counts for each row as weights, does it do something like [count1/sum_counts, count2/sum_counts, ...]? Or does it apply something like softmax? https://en.wikipedia.org/wiki/Softmax_function

Upvotes: 1

Views: 1515

Answers (1)

James Dellinger

Reputation: 1261

Based on the pandas source code for DataFrame.sample, your first guess is correct: weights that do not sum to 1 are divided by their sum ([count1/sum_counts, count2/sum_counts, ...]). No softmax is applied, and weights that sum to zero raise a ValueError:

# Renormalize if don't sum to 1
if weights.sum() != 1:
    if weights.sum() != 0:
        weights = weights / weights.sum()
    else:
        raise ValueError("Invalid weights: weights sum to zero")
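If you want to check this yourself, here is a minimal sketch. The DataFrame, its column names, and the count values are made up for illustration. It computes the sum-normalized weights by hand, computes a softmax only for contrast, and then samples twice with the same seed: once with the raw counts and once with the pre-normalized weights. If the excerpt above describes the behavior, both calls should pick the same rows.

```python
import numpy as np
import pandas as pd

# Hypothetical data: four rows, with raw counts used as sampling weights.
df = pd.DataFrame({"item": ["a", "b", "c", "d"], "count": [1, 1, 2, 4]})
counts = df["count"].to_numpy(dtype=float)

# Sum normalization, mirroring the excerpt above: w_i / sum(w).
sum_normalized = counts / counts.sum()

# Softmax, shown only for contrast; the excerpt does not do this.
exp_counts = np.exp(counts - counts.max())
softmax = exp_counts / exp_counts.sum()

# Same seed, raw counts vs. weights we normalized ourselves.
sample_raw = df.sample(n=1000, replace=True, weights="count", random_state=0)
sample_norm = df.sample(n=1000, replace=True, weights=sum_normalized, random_state=0)

# True when both calls drew exactly the same rows in the same order.
same_rows = sample_raw.index.equals(sample_norm.index)

print("sum-normalized:", sum_normalized)
print("softmax:       ", softmax)
print("same rows drawn:", same_rows)
```

The counts here sum to a power of two so the division is exact in floating point, which keeps the comparison between the two calls free of rounding noise.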

Upvotes: 3
