Evelyn-M

Reputation: 3

Using Python with Pandas to output random rows from two columns

I have a spreadsheet with three columns. I want to output n random rows. This works for outputting any number of random values from one column:

    df = pandas.read_excel(filename, header=0, names=["Speaker","Time","Message"])
    random.choices(df["Message"], k=10)

From what I've read, you should be able to select multiple columns by doing this:

    df = pandas.read_excel(filename, header=0, names=["Speaker","Time","Message"])
    random.choices(df[["Speaker","Message"]], k=10)

But this gives me a KeyError. I'm not sure what I'm missing. Other examples make it look pretty straightforward, so I'm probably overlooking something extremely simple.

Thanks.

Upvotes: 0

Views: 764

Answers (1)

Emma

Reputation: 9308

random.choices is for list-like, one-dimensional data (i.e. a list, tuple, etc.). It won't work on a DataFrame, which holds two-dimensional data (rows x columns).
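To see where the KeyError likely comes from, here is a minimal sketch. It assumes a small in-memory DataFrame as a stand-in for the spreadsheet (the rows are made up for illustration). random.choices picks items with population[i] for an integer i. On a Series with the default 0..n-1 index that lookup finds a row, but on a DataFrame df[i] is a column lookup, and no column is named by an integer.

```python
import random

import pandas as pd

# Hypothetical stand-in for the spreadsheet read in the question.
df = pd.DataFrame(
    {
        "Speaker": ["A", "B", "A", "C", "B"],
        "Time": ["09:00", "09:01", "09:02", "09:03", "09:04"],
        "Message": ["hi", "hello", "how are you", "fine", "bye"],
    }
)

# Works: a Series with the default integer index maps 0..n-1 to rows.
picked = random.choices(df["Message"], k=3)

# Fails: indexing a DataFrame with an integer looks for a column
# with that label, and the columns here are all strings.
try:
    random.choices(df[["Speaker", "Message"]], k=3)
    raised = False
except KeyError:
    raised = True

print(picked, raised)
```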

If you want random picks from a DataFrame, you can use the pandas sample function:

    df.sample(10)

Or, to get specific columns:

    df[['Speaker', 'Message']].sample(10)
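As a rough, self-contained sketch of the same idea, assuming a small made-up DataFrame in place of the Excel file: sample draws without replacement by default, so n cannot exceed the number of rows unless replace=True is passed, which is closer to what random.choices does. The random_state argument is only there to make the draw repeatable.

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet read in the question.
df = pd.DataFrame(
    {
        "Speaker": ["A", "B", "A", "C", "B"],
        "Time": ["09:00", "09:01", "09:02", "09:03", "09:04"],
        "Message": ["hi", "hello", "how are you", "fine", "bye"],
    }
)

# Three distinct random rows, restricted to two columns.
rows = df[["Speaker", "Message"]].sample(n=3, random_state=0)

# Ten rows from a five-row frame: only possible with replacement.
with_repeats = df[["Speaker", "Message"]].sample(n=10, replace=True, random_state=0)

print(rows)
print(with_repeats)
```

Each sampled row keeps its original index label, so you can still trace it back to the spreadsheet row it came from.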

Upvotes: 1
