Reputation: 2876
I load some data from a CSV file with multiple columns. In my csv.reader I have an IF function. I am trying to get the earliest date from a specific column (Start Date).
I first load the data:
for row in csv.reader(open('myFile.csv')):
if row[4] == '56886':
key = row[4] #key = (row[4], row[33][:4], row[4])
startDate = row[19]
When I print the column (startDate) I am getting this:
01) I tried using the below:
content = min(content)
print(content)
and I got this in terminal:
02) Then I tried changing my code:
for row in csv.reader(open('myFile.csv',
parse_dates=['Start Date'],
usecols=['Start Date']))
if row[4] == '56886':
key = row[4] #key = (row[4], row[33][:4], row[4])
startDate = row[19]
and I got an invalid syntax error.
03) I tried changing the line to:
pandas.read_csv('myFile.csv', parse_dates=['Start Date'], usecols=['Start Date'])
and I got the same error.
What is the best workaround? I have not found a solution so far.
Upvotes: 7
Views: 18773
Reputation: 862681
I think you need boolean indexing
for filtering:
#dont filter all columns by usecols
df = pd.read_csv('file', parse_dates=['Start Date', 'End Date']) #columns to datetimes
#filter output first by column ID and then get min and max
a = df.loc[ df['ID'] == 56886, 'Start Date'].min()
b = df.loc[ df['ID'] == 56886, 'End Date'].max()
Upvotes: 11
Reputation: 1718
An example using pandas to convert one single item:
pd.to_datetime("08/27/2017")
An example using pandas to convert one list of strings:
times = []
for i in range(30):
times.append(str(i+1)+"/01/2016")
datetimes = pd.to_datetime(times)
min(datetimes )
Upvotes: 0