shane bustin

Reputation: 21

Python CSV parsing and formatting

I am a newbie without much programming knowledge. I have been trying to accomplish this task for at least 8 hours now. My goal is to obtain the latest WTMP value from the CSV here: http://www.ndbc.noaa.gov/data/realtime2/MTKN6.txt. A small sample is:

#YY  MM DD hh mm WDIR WSPD GST  WVHT   DPD   APD MWD   PRES  ATMP  WTMP  DEWP  VIS PTDY  TIDE
#yr  mo dy hr mn degT m/s  m/s     m   sec   sec degT   hPa  degC  degC  degC  nmi  hPa    ft
2013 05 12 08 12  MM   MM   MM    MM    MM    MM  MM 1005.1  12.5  11.2    MM   MM   MM    MM
2013 05 12 08 06  MM   MM   MM    MM    MM    MM  MM 1005.3  12.3  11.2    MM   MM   MM    MM
2013 05 12 08 00  MM   MM   MM    MM    MM    MM  MM 1005.0  12.2  11.2    MM   MM -1.3    MM
2013 05 12 07 54  MM   MM   MM    MM    MM    MM  MM 1005.0  12.3  11.2    MM   MM   MM    MM
2013 05 12 07 48  MM   MM   MM    MM    MM    MM  MM 1005.1  12.1  11.2    MM   MM   MM    MM
2013 05 12 07 42  MM   MM   MM    MM    MM    MM  MM 1004.8  12.0  11.2    MM   MM   MM    MM
2013 05 12 07 36  MM   MM   MM    MM    MM    MM  MM 1004.6  12.1  11.2    MM   MM   MM    MM
2013 05 12 07 30  MM   MM   MM    MM    MM    MM  MM 1004.5  12.1  11.2    MM   MM   MM    MM
2013 05 12 07 24  MM   MM   MM    MM    MM    MM  MM 1004.6  12.0  11.2    MM   MM   MM    MM

The file is updated hourly or so and the latest entry will be on top. This is for a Raspberry Pi project so memory and CPU resources are limited.

I am able to access the CSV, although I believe I am having issues due to the formatting. I believe my code is not properly defining the columns. I am able to print rows and choose between them; however, I am unable to print a specified row and column.

In the code below, the last two print statements are ones I have used for testing, in an attempt to read the most current WTMP value, which should be close to row 3, column 15 based on how I am reading the txt file.

import csv
import urllib
import itertools    #limit loading to partial file

url="http://www.ndbc.noaa.gov/data/realtime2/MTKN6.txt"
webpage = urllib.urlopen(url)

datareader = csv.reader(webpage, delimiter='\t')
csvdata = []
for row in datareader:
    csvdata.append(row)
    #print (row)
print csvdata[5][0]

Also, if anyone can point me to a good beginner's guide to Python other than the python.org pages, it would be greatly appreciated.

Upvotes: 0

Views: 645

Answers (3)

shane bustin

Reputation: 21

I reviewed the answers provided and, with some work, was able to figure it out.

The key changes are these 2 lines:

    datareader = csv.reader(webpage, delimiter=' ', skipinitialspace=True)

The file is formatted so that columns are separated only by spaces, not tabs, so split on a space and skip any further spaces until the next column.

    strWTMP = data[14]  # extract the WTMP value (index 14, i.e. the 15th column) from "data" and store it as the string variable "strWTMP"
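
The effect of the delimiter settings can be checked on a single sample row without touching the network. This is a minimal sketch written for Python 3 (the code in this post is Python 2); `sample_row` and `parse` are names made up for the illustration.

```python
import csv

# One data row copied from the sample in the question.
sample_row = ("2013 05 12 08 12  MM   MM   MM    MM    MM    MM  MM "
              "1005.1  12.5  11.2    MM   MM   MM    MM")


def parse(line, **options):
    """Parse one line with csv.reader and return its list of fields."""
    return next(csv.reader([line], **options))


tab_fields = parse(sample_row, delimiter='\t')
space_fields = parse(sample_row, delimiter=' ')
clean_fields = parse(sample_row, delimiter=' ', skipinitialspace=True)

print(len(tab_fields))      # the row has no tabs, so it stays one big field
print('' in space_fields)   # each extra space yields an empty field
print(clean_fields[14])     # the WTMP field, at index 14
```

With the tab delimiter, indexing a column such as `[14]` fails because each row holds only one field, which matches the symptom described in the question.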

Final code:

import csv
import urllib
url="http://www.ndbc.noaa.gov/data/realtime2/MTKN6.txt"
webpage = urllib.urlopen(url)
datareader = csv.reader(webpage, delimiter=' ', skipinitialspace=True)
next(datareader) # skips the first header row (column names)
next(datareader) # skips the second header row (units)
data = next(datareader) # Our first data row, the third row
strWTMP = data[14] # WTMP is at index 14 (the 15th column)
#print data[14]

##### optional additional code to convert string to float
WTMP = float(strWTMP)  #convert value from string to float
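
For Python 3, where `urllib.urlopen` became `urllib.request.urlopen` and the response yields bytes, the same approach might look like the sketch below. The function names `latest_wtmp` and `fetch_latest_wtmp` are hypothetical, and treating `MM` as a missing-value marker is an assumption based on how the sample rows look. `itertools.islice` stops after the third line, so the rest of the file is never parsed, which should help with the limited resources on the Pi.

```python
import csv
import itertools
import urllib.request

URL = "http://www.ndbc.noaa.gov/data/realtime2/MTKN6.txt"
WTMP_INDEX = 14   # WTMP is the 15th column, so index 14
MISSING = "MM"    # assumption: "MM" marks a missing measurement


def latest_wtmp(lines):
    """Return the newest WTMP as a float, or None if it is unavailable.

    `lines` is any iterable of text lines: two header lines first,
    then data rows with the newest entry on top.
    """
    first_three = itertools.islice(lines, 3)
    reader = csv.reader(first_three, delimiter=' ', skipinitialspace=True)
    rows = list(reader)
    # Guard against a truncated file or a short data row.
    if len(rows) < 3 or len(rows[2]) <= WTMP_INDEX:
        return None
    value = rows[2][WTMP_INDEX]
    return None if value == MISSING else float(value)


def fetch_latest_wtmp(url=URL):
    """Open the URL and parse only the first three lines of the file."""
    with urllib.request.urlopen(url) as response:
        # The response yields bytes; decode each line to text lazily.
        text_lines = (raw.decode("ascii", "replace") for raw in response)
        return latest_wtmp(text_lines)
```

Calling `fetch_latest_wtmp()` performs the download, while `latest_wtmp` can be tried offline with a plain list of strings.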

Upvotes: 1

Mark Vrabel

Reputation: 85

The source file is not a CSV; there are no tabs (just spaces) in the one I downloaded. It appears to have fixed-width fields: the spaces are there to ensure each field is in the same column for each row.

I'd suggest looking at "How to efficiently parse fixed width files?".
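
In the sample shown, the fields are padded with spaces and contain no spaces themselves, so plain `str.split()` is one simple alternative to slicing at fixed character offsets. The header line can then supply the column position, so nothing has to be counted by hand. A rough sketch in Python 3, with `column_value` as a made-up helper name and the header and data lines copied from the question:

```python
def column_value(header_line, data_line, name):
    """Look up one field of a data row by its column name in the header."""
    # The header starts with '#' (for example "#YY  MM DD ..."); strip it.
    names = header_line.lstrip('#').split()
    # split() with no argument collapses runs of whitespace.
    fields = data_line.split()
    return fields[names.index(name)]


header = ("#YY  MM DD hh mm WDIR WSPD GST  WVHT   DPD   APD MWD   "
          "PRES  ATMP  WTMP  DEWP  VIS PTDY  TIDE")
row = ("2013 05 12 08 12  MM   MM   MM    MM    MM    MM  MM "
       "1005.1  12.5  11.2    MM   MM   MM    MM")

print(column_value(header, row, "WTMP"))
print(column_value(header, row, "PRES"))
```

This only works while every field is present as a token in each row; a truly blank field would shift the positions, in which case slicing by character offsets would be the safer route.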

Upvotes: 0

Burhan Khalid

Reputation: 174624

The csv module will read each row as one list on each iteration. Since you are only concerned with the first data row, try this version:

datareader = csv.reader(webpage, delimiter=' ', skipinitialspace=True) # the file is space-separated, not tab-separated
next(datareader) # skips the first row
next(datareader) # skips the second row
data = next(datareader) # Our first data row, the third row
print data[14]

Try learnpython.org for a basic introduction to the language.

Upvotes: 0
