Reputation: 147
I have a large amount of data in OpenOffice Calc: approximately 3600 entries for each of 4 different categories, in 3 different sets, and I need to run some calculations on it in Python. I want to create lists corresponding to each of the four categories. I am hoping someone can guide me to an easy-ish, efficient way to do this, whether by script or by importing the data. I am using Python 2.7 on a Windows 8 machine. Any help is greatly appreciated.
The method I am currently trying is to save the ODF file as CSV, then use genfromtxt (from numpy).
from numpy import genfromtxt
my_data = genfromtxt('C:\Users\tomdi_000\Desktop\Load modeling(WSU)\PMU Data\Data18-1fault-Alvey-csv trial.csv', delimiter=',')
print(my_data)
File "C:\Program Files (x86)\Wing IDE 101 5.0\src\debug\tserver\_sandbox.py", line 5, in <module>
File "c:\Python27\Lib\site-packages\numpy\lib\npyio.py", line 1352, in genfromtxt
fhd = iter(np.lib._datasource.open(fname, 'rbU'))
File "c:\Python27\Lib\site-packages\numpy\lib\_datasource.py", line 147, in open
return ds.open(path, mode)
File "c:\Python27\Lib\site-packages\numpy\lib\_datasource.py", line 496, in open
raise IOError("%s not found." % path)
IOError: C:\Users omdi_000\Desktop\Load modeling(WSU)\PMU Data\Data18-1fault-Alvey-csv trial.csv not found.
The error stems from this code in _datasource.py:
# NOTE: _findfile will fail on a new file opened for writing.
found = self._findfile(path)
if found:
    _fname, ext = self._splitzipext(found)
    if ext == 'bz2':
        mode.replace("+", "")
    return _file_openers[ext](found, mode=mode)
else:
    raise IOError("%s not found." % path)
Upvotes: 0
Views: 363
Reputation: 102852
Your problem is that your path string 'C:\Users\tomdi_000\Desktop\Load modeling(WSU)\PMU Data\Data18-1fault-Alvey-csv trial.csv' contains an escape sequence: \t. Since you are not using a raw string literal, the \t is being interpreted as a tab character, similar to the way a \n is interpreted as a newline. If you look at the line starting with IOError: in your traceback, you'll see a tab has been inserted in its place. You don't get this problem with UNIX-style paths, as they use forward slashes /.
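To see the effect in isolation, here is a minimal sketch. The path 'C:\temp\new.csv' is a made-up example (not your real file), chosen because both of its backslashes happen to start valid escape sequences:

```python
# Hypothetical path for illustration only; not the asker's real file.
plain = 'C:\temp\new.csv'   # \t becomes a tab, \n becomes a newline
raw = r'C:\temp\new.csv'    # raw literal: backslashes are kept as typed

# repr() makes the hidden control characters visible.
print(repr(plain))
print(repr(raw))

# The plain literal is two characters shorter, because each
# two-character escape collapsed into a single control character.
print(len(plain))
print(len(raw))
```

The plain version no longer contains any backslashes at all, which is why the file lookup fails.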
There are two ways around this. The first is to use a raw string literal:
r'C:\Users\tomdi_000\Desktop\Load modeling(WSU)\PMU Data\Data18-1fault-Alvey-csv trial.csv'
(note the r at the beginning). Raw string literals don't interpret backslashes \ as beginning an escape sequence.
The second way is to use a UNIX-style path with forward slashes as path delimiters:
'C:/Users/tomdi_000/Desktop/Load modeling(WSU)/PMU Data/Data18-1fault-Alvey-csv trial.csv'
Either form is fine when you're hard-coding paths into your code. Note that escape sequences are only processed in string literals written in source code: paths produced at runtime, such as the results of an os.listdir() call or lines read from a file, are already ordinary strings and need no special treatment.
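As a quick check that the two spellings name the same file, here is a small sketch. It uses ntpath, the standard-library module that implements Windows path rules (it is what os.path resolves to on Windows), so it can be run on any platform:

```python
import ntpath  # Windows path rules, importable on any OS

raw_style = r'C:\Users\tomdi_000\Desktop\Load modeling(WSU)\PMU Data\Data18-1fault-Alvey-csv trial.csv'
forward_style = 'C:/Users/tomdi_000/Desktop/Load modeling(WSU)/PMU Data/Data18-1fault-Alvey-csv trial.csv'

# normpath converts forward slashes to backslashes under Windows rules,
# so both spellings normalize to the same string.
same_path = ntpath.normpath(raw_style) == ntpath.normpath(forward_style)
print(same_path)
```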
If you're going to be using numpy to do the calculations on your data, then using np.genfromtxt() is fine. However, if you're just reading the data and storing it in plain Python lists, the csv module from the standard library is a simpler fit. It provides reader and writer objects that take care of delimiters and quoting and hand you each row as a list of strings, which you can then split into one list per category.
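Here is a rough sketch of the csv approach for building one list per category. The function name read_columns, the sample rows, and the assumptions that the file has exactly four numeric columns and no header row are mine, so adjust them to your actual data:

```python
import csv

def read_columns(lines, n_columns=4):
    """Return one list of floats per column (hypothetical helper).

    `lines` can be an open file or any iterable of text lines.
    Assumes no header row and purely numeric cells.
    """
    columns = [[] for _ in range(n_columns)]
    for row in csv.reader(lines):
        for i in range(n_columns):
            columns[i].append(float(row[i]))
    return columns

# Made-up sample standing in for the exported CSV file.
sample = ['1.0,2.0,3.0,4.0',
          '5.0,6.0,7.0,8.0']
cols = read_columns(sample)
print(cols[0])  # values of the first category
print(cols[3])  # values of the fourth category
```

With a real file you would pass the open file object instead of the sample list, for example read_columns(f) inside a with open(...) block. On Python 2.7 the csv documentation recommends opening the file in 'rb' mode; on Python 3, open it with newline=''.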
Upvotes: 1