Sancta Ignis
Sancta Ignis

Reputation: 83

Python CSV: Grab all values in row with conditions for time values

example for csv data with condition that I'm trying to get:

c1,c2,v1,v2,p1,p2,r1,a1,f1,f2,f3,Time_Stamp 

0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:00
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:01
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:02
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:03
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:04
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:05
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:06
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:07
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:08
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:09
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:10
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:11
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:12
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:13
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:14
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:15
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:16 
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:17
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:18
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:19
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:20
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:21

Codes for reading the csv:

import plotly
import plotly.plotly as py
import plotly.graph_objs as go
import plotly.figure_factory as FF
import numpy as np
from datetime import date,time,datetime
import pandas as pd
%matplotlib inline
import matplotlib.pyplot as plt

def readcsv(x): #def function to read csv files based on code below*
    Data = pd.read_csv(x, parse_dates=['Time_Stamp'], infer_datetime_format=True)
    Data['Date'] = Data.Time_Stamp.dt.date #date column in DataFrame
    Data['Time'] = Data.Time_Stamp.dt.time #time column in DataFrame

    Data['Time_Stamp'] = pd.to_datetime(Data['Time_Stamp'])
    print(Data[1:6])
    return Data

Data = readcsv('datafile.csv')#*

def getMask(start,end,Data):
    mask = (Data['Time_Stamp'] > start) & (Data['Time_Stamp'] <= end)
    return mask;

start = '2017-06-13 16:00:00'
end = '2017-06-13 16:40:00'
timerange = Data.loc[getMask(start, end, Data)] #* <----  using this Dataframe
#timeR.plot(x='Time_Stamp', y='AC_Input_Current', style='-', color='black')

What I'm trying to get is:

[for e.g.] After executing pspike ( code is below ) I'll get the following output:

13/06/2017 16:38:00
13/06/2017 16:38:01
13/06/2017 16:38:02
13/06/2017 16:38:03
13/06/2017 16:38:04
13/06/2017 16:38:05
13/06/2017 16:38:06
13/06/2017 16:38:07
13/06/2017 16:38:08
13/06/2017 16:38:09
13/06/2017 16:38:10
13/06/2017 16:38:11
13/06/2017 16:38:12
13/06/2017 16:38:13
13/06/2017 16:38:14
13/06/2017 16:38:15
13/06/2017 16:38:17
13/06/2017 16:38:18
13/06/2017 16:38:19
13/06/2017 16:38:20
13/06/2017 16:38:21   

*Note that I'm using the dataframe timerange where it has every second of Time value from 16:00:00 to 16:40:00, to get pspike where it does the skipping of rows if c1 value is < 5.0

[ from output print(pspike) ] Condition: if the printed row where Time value is 16:38:15, and the following row's Time value is 16:38:17 ( where the next row's Time value skipped 1 second )... Print the row that was skipped ( in this case, it's where the Time value is 16:38:16 )

pspike = (timerange.loc[timerange['AC_Input_Current'] <= 5.0])    
print(pspike)

with open('welding_data_by_selRange.csv','a', newline='') as duraweld:
    a = csv.writer(duraweld)
    data = [countIC2 ,countIC, Datetime]
    a.writerow(data)

Upvotes: 1

Views: 1434

Answers (2)

Alz
Alz

Reputation: 805

Updated:

Following code will prints missing time stamps, no matter how many time stamps are missing, so it is more robust than previous solution.

for i in range(df.shape[0] - 1):
    row1 = df.iloc[i]
    row2 = df.iloc[i+1]
    skipped_ts = (row2[-1] - row1[-1]).seconds
    if  skipped_ts > 1:
    for ts in range(1,skipped_ts):
        print (row1[-1] + pd.Timedelta(ts * '1s'))

Upvotes: 1

wwii
wwii

Reputation: 23743

Non-Pandas solution

From each row, extract the timestamp and the other info; turn the timestamp into a datetime.datetime object using a format string; subtract the current timestamp from the previous timestamp; test for elapsed time and process if applicable.

import datetime, io

#setup
s = '''0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:12
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:13
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:15
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:16
'''
#data is a file-like object
data = io.StringIO(s)

fmt = '%d/%m/%Y %H:%M:%S'
previous = None

for row in data:
    *info, timestamp = row.strip().split(',')
    timestamp = datetime.datetime.strptime(timestamp, fmt)
    try:
        dt = timestamp-previous[0]
    except TypeError as e:
        previous = (timestamp, info)
        continue
    if dt.seconds > 1:
        print('!!!\tprevious:{}\n\tcurrent:{}'.format(previous,(timestamp, info)))
    previous = (timestamp, info)

It could be adapted for a csv.reader.


The timestamp is initially obtained from the last column in the row by splitting it. Then it is made into a datetime.datetime object so time differences can be easily calculated.

For a disk file open it and iterate over it...

with open(filepath) as data:
    for row in data:
        *info, timestamp = row.strip().split(',')
        timestamp = datetime.datetime.strptime(timestamp, fmt)
        ....

Using a csv reader:

import csv
with open(filepath) as data:
    rows = csv.reader(data)
    for row in rows:
        *info, timestamp = row
        timestamp = datetime.datetime.strptime(timestamp, fmt)
        ....

If you can read the whole file into a dataframe, you should be able to read it into a variable.

with open(filepath) as f:
    data = f.read()

for row in data:
    *info, timestamp = row.strip().split(',')
    timestamp = datetime.datetime.strptime(timestamp, fmt)
    ....

Upvotes: 0

Related Questions