How to mass delete unwanted info in column in a CSV file?

Question

So I recently concatenated multiple csv files into one. Since the filenames were dates, I also included "filename" as a column for reference. However, the filename has info that I would not like to include such as the time and file extension. As a beginner, I'm only familiar with importing and printing the file to view. What code is recommended to mass remove the info after the date?

answer  filename
7       2018-04-12 21_01_01.csv
7       2018-04-18 18_36_30.csv
7       2018-04-18 21_01_32.csv
8       2018-04-20 15_21_02.csv
7       2018-04-20 21_00_44.csv
7       2018-04-22 21_01_05.csv

Jurgen Strydom · Accepted Answer

When working with tabular data in python I highly recommend using the pandas package.

import pandas as pd
df = pd.read_csv("../test_data.csv")
def rem_part(string):
    return string.split(' ')[0]  # could also split on '.' if you want to keep the time
df['date'] = df['filename'].apply(rem_part)
df.drop('filename', axis = 1, inplace=True)  # remove the filename column if you so please
df.to_csv("output.csv"). # save the file as a new CSV or overwrite the old

The test_data.csv file contains the following:

answer,filename
7,2018-04-12 21_01_01.csv
7,2018-04-18 18_36_30.csv
7,2018-04-18 21_01_32.csv
8,2018-04-20 15_21_02.csv
7,2018-04-20 21_00_44.csv
7,2018-04-22 21_01_05.csv

How to mass delete unwanted info in column in a CSV file?

Answers (2)

Related Questions