Exporting a random sample from CSV file to a new CSV file - output is messy

Question

I am trying to export a random subset of a CSV file to a new CSV file using the following code:

with open("DepressionEffexor.csv", "r") as effexor:
    lines = [line for line in effexor]
    random_choice = random.sample(lines, 229)

with open("effexorSample.csv", "w") as sample:
    sample.write("\n".join(random_choice))

But the problem is that the output CSV file is very messy. For example, part of the data in a field was printed on the next line. How can I solve this problem? In addition, I would like to know how I can use pandas for this rather than the csv module. Thanks!

binaryaaron · Accepted Answer

Assuming you had a CSV read into pandas:

import pandas

df = pandas.read_csv("csvfile.csv")
sample = df.sample(n)  # n is the number of rows to draw
sample.to_csv("sample.csv")

You could make it even shorter:

df.sample(n).to_csv("sample.csv")

The pandas IO docs have a great deal more information and options available, as does the documentation for the DataFrame.sample method.
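
As for why the original output was probably messy: each line read from the file already ends with a newline, so joining with "\n" adds blank lines, and a quoted field that contains a line break spans several physical lines, so sampling physical lines can cut a record in half. Sampling parsed rows avoids both. Here is a minimal sketch of that idea, assuming the file has a header row and at least n data rows; the helper name sample_csv and its seed parameter are made up for illustration and are not part of pandas:

```python
import pandas as pd


def sample_csv(src, dst, n, seed=None):
    """Write n randomly chosen rows of src to dst (hypothetical helper)."""
    # read_csv parses quoted fields, so embedded line breaks stay in one row.
    df = pd.read_csv(src)
    # random_state makes the draw repeatable; None gives a new sample each run.
    picked = df.sample(n=n, random_state=seed)
    # index=False keeps the DataFrame row labels out of the output file.
    picked.to_csv(dst, index=False)
    return picked
```

With the file names from the question, the call would be sample_csv("DepressionEffexor.csv", "effexorSample.csv", 229). The header row is written once at the top and is never drawn as a data row.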
