Reputation: 1245
I have a CSV file with several entries, and each entry has 2 unix timestamp formatted dates.
I have a method called convert(), which takes in the timestamp and converts it to YYYYMMDD.
Now, since I have 2 timestamps in each line, how would I replace each one with the new value?
EDIT: Just to clarify, I would like to convert each occurrence of the timestamp into the YYYYMMDD format. This is what is bugging me, as re.findall() returns a list.
Upvotes: 2
Views: 1766
Reputation: 1151
I can't comment on your question, but have you looked at Python's csv module? http://docs.python.org/library/csv.html#module-csv
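As a rough illustration of the csv-module route, here is a minimal sketch. It assumes the timestamps sit in the second and fourth columns (as in the sample data elsewhere in this thread); the `convert_columns` helper and its `columns` parameter are hypothetical names, and `convert` is a stand-in for the asker's own function.

```python
import csv
import io
import time

def convert(unixtime):
    """Stand-in for the asker's convert(): seconds since the epoch -> YYYYMMDD (UTC)."""
    return time.strftime("%Y%m%d", time.gmtime(unixtime))

def convert_columns(infile, outfile, columns=(1, 3)):
    """Copy CSV rows from infile to outfile, converting the given column indexes."""
    writer = csv.writer(outfile, lineterminator="\n")
    for row in csv.reader(infile):
        for i in columns:
            # Leave short rows and non-numeric cells untouched.
            if i < len(row) and row[i].isdecimal():
                row[i] = convert(int(row[i]))
        writer.writerow(row)

# In-memory demo; real code would pass open file objects instead.
source = io.StringIO("foo,1234567890,bar,1243310263\n")
result = io.StringIO()
convert_columns(source, result)
print(result.getvalue(), end="")  # prints: foo,20090213,bar,20090526
```

Working on parsed columns rather than raw text means unrelated numeric fields are never touched, at the cost of having to know which columns hold the timestamps.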
Upvotes: 1
Reputation: 16918
I'd use something along these lines. It's a lot like Laurence's response, but it includes the timestamp conversion you requested and takes the filename as a parameter. This code assumes you are working with recent dates (after 9/9/2001). If you need earlier dates, lower the 10 in the regex to 9 or less.
import re, sys, time

regex = re.compile(r'(\d{10,})')

def convert(unixtime):
    # Format seconds since the epoch as YYYYMMDD (UTC).
    return time.strftime("%Y%m%d", time.gmtime(unixtime))

for line in open(sys.argv[1]):
    # Replace every run of 10 or more digits with its converted date.
    sys.stdout.write(regex.sub(lambda m: convert(int(m.group(0))), line))
EDIT: Cleaned up the code.
Sample Input
foo,1234567890,bar,1243310263
cat,1243310263,pants,1234567890
baz,987654321,raz,1
Output
foo,20090213,bar,20090526
cat,20090526,pants,20090213
baz,987654321,raz,1 # not converted (too short to be a recent timestamp)
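To see where the 10-digit cutoff comes from, and what lowering it does, here is a small sketch. The `regex9` name is made up for illustration; the `convert` function mirrors the one in this answer.

```python
import re
import time

def convert(unixtime):
    # Format seconds since the epoch as YYYYMMDD (UTC).
    return time.strftime("%Y%m%d", time.gmtime(unixtime))

# The smallest 10-digit timestamp falls on 9 September 2001 (UTC).
print(convert(10**9))  # prints: 20010909

# The smallest 9-digit timestamp reaches back to March 1973.
print(convert(10**8))  # prints: 19730303

# Lowering the minimum length to 9 digits also converts the third sample line.
regex9 = re.compile(r'(\d{9,})')
line = "baz,987654321,raz,1"
print(regex9.sub(lambda m: convert(int(m.group(0))), line))
# prints: baz,20010419,raz,1
```

The shorter the minimum length, the more likely it is that an ordinary number in the file gets mistaken for a timestamp, so the cutoff is a trade-off rather than a fixed rule.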
Upvotes: 0
Reputation: 9265
If you know the replacement:
import re
p = re.compile(r',\d{8},')
p.sub(',' + someval + ',', csvstring)
If it's a format change:
p = re.compile(r',(\d{4})(\d\d)(\d\d),')
p.sub(r',\3-\2-\1,', csvstring)
EDIT: Sorry, I just realised you said Python; I've modified the code above.
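One thing to watch for with comma-delimited patterns like these: each match consumes the commas on both sides, so when two date fields are adjacent only the first is replaced, and a field at the very start or end of a line never matches. A possible variation (a sketch, not part of the original answer) uses lookarounds so the delimiters are checked but not consumed. The `field` name is hypothetical.

```python
import re

# (?<![^,\n]) : preceded by a comma, a newline, or the start of the string
# (?![^,\n])  : followed by a comma, a newline, or the end of the string
field = re.compile(r'(?<![^,\n])(\d{4})(\d\d)(\d\d)(?![^,\n])')

csvstring = "20090213,20090526,foo,20010419\n"
print(field.sub(r'\3-\2-\1', csvstring), end="")
# prints: 13-02-2009,26-05-2009,foo,19-04-2001

# For comparison, the comma-consuming pattern skips the second adjacent field.
p = re.compile(r',(\d{4})(\d\d)(\d\d),')
print(p.sub(r',\3-\2-\1,', ",20090213,20090526,"))
# prints: ,13-02-2009,20090526,
```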
Upvotes: 3
Reputation: 143094
I assume that by "unix timestamp formatted date" you mean a number of seconds since the epoch. This assumes that every number in the file is a UNIX timestamp. If that isn't the case you'll need to adjust the regex:
import re, sys

# your convert function goes here

regex = re.compile(r'(\d+)')

for line in sys.stdin:
    sys.stdout.write(regex.sub(lambda m:
        convert(int(m.group(1))), line))
This reads from stdin and calls convert on each number found.
The "trick" here is that re.sub can take a function that transforms from a match object into a string. I'm assuming your convert function expects an int and returns a string, so I've used a lambda as an adapter function to grab the first group of the match, convert it to an int, and then pass that resulting int to convert.
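The same idea can be tried without stdin, using a named function in place of the lambda. This is a sketch only: `convert` here is a stand-in for the asker's function (assumed to format as YYYYMMDD in UTC), and `replace_timestamp` is a hypothetical name.

```python
import re
import time

def convert(unixtime):
    """Stand-in for the asker's convert(): int -> 'YYYYMMDD' string (UTC)."""
    return time.strftime("%Y%m%d", time.gmtime(unixtime))

def replace_timestamp(match):
    # re.sub calls this once per match and substitutes the returned string.
    # group(1) is the digit string captured by the parentheses in the pattern.
    return convert(int(match.group(1)))

regex = re.compile(r'(\d+)')
line = "foo,1234567890,bar,1243310263\n"
print(regex.sub(replace_timestamp, line), end="")
# prints: foo,20090213,bar,20090526
```

Because re.sub handles every match in the string, both timestamps on a line are replaced in one call, with no need to loop over the list that re.findall() returns.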
Upvotes: 1