Reputation: 1
I am trying to convert my csv file into a numpy array so I can manipulate the numbers and then graph them. I printed my csv file and got:
ra dec
0 15:09:11.8 -34:13:44.9
1 09:19:46.8 +33:44:58.452
2 05:15:43.488 +19:21:46.692
3 04:19:12.096 +55:52:43.32
.... there is more data (101 rows x 2 columns), but it is all numbers. I want to convert the ra and dec values from their current units to degrees, and I thought I could do this by turning each column into a numpy array. But when I ran:
import numpy as np
np_array = np.genfromtxt(r'C:\Users\nstev\Downloads\S190930t.csv',delimiter=".", skip_header=1, usecols=(4))
print(np_array)
I get:
[nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
nan nan nan nan nan nan nan nan nan nan]
I keep changing the delimiter. With a colon I got the same result; with a semicolon or a plus sign I got an error saying it found 2 columns instead of 1. I do not know how to change the call so that I stop getting this array of nan. Can someone help, please?
Upvotes: 0
Views: 421
Reputation: 231325
With a copy-n-paste of your file sample:
In [208]: data = np.genfromtxt('stack59761369.csv',encoding=None,dtype=None,names=True)
In [209]: data
Out[209]:
array([('15:09:11.8', '-34:13:44.9'), ('09:19:46.8', '+33:44:58.452'),
('05:15:43.488', '+19:21:46.692'),
('04:19:12.096', '+55:52:43.32')],
dtype=[('ra', '<U12'), ('dec', '<U13')])
With dtype=None and names=True I get a 1d structured array with 2 fields, which can be accessed by name:
In [210]: data['ra']
Out[210]:
array(['15:09:11.8', '09:19:46.8', '05:15:43.488', '04:19:12.096'],
dtype='<U12')
In [211]: np.char.split(data['ra'],':')
Out[211]:
array([list(['15', '09', '11.8']), list(['09', '19', '46.8']),
list(['05', '15', '43.488']), list(['04', '19', '12.096'])],
dtype=object)
This split gives an object dtype array of lists. They can be joined into one 2d array with vstack:
In [212]: np.vstack(np.char.split(data['ra'],':'))
Out[212]:
array([['15', '09', '11.8'],
['09', '19', '46.8'],
['05', '15', '43.488'],
['04', '19', '12.096']], dtype='<U6')
and converted to floats with:
In [213]: np.vstack(np.char.split(data['ra'],':')).astype(float)
Out[213]:
array([[15. , 9. , 11.8 ],
[ 9. , 19. , 46.8 ],
[ 5. , 15. , 43.488],
[ 4. , 19. , 12.096]])
In [214]: np.vstack(np.char.split(data['dec'],':')).astype(float)
Out[214]:
array([[-34. , 13. , 44.9 ],
[ 33. , 44. , 58.452],
[ 19. , 21. , 46.692],
[ 55. , 52. , 43.32 ]])
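The same split in pandas (imported as pd); sep=r'\s+' is used here because delim_whitespace is deprecated in recent pandas releases:
In [256]: df = pd.read_csv('stack59761369.csv', sep=r'\s+')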
In [257]: df
Out[257]:
ra dec
0 15:09:11.8 -34:13:44.9
1 09:19:46.8 +33:44:58.452
2 05:15:43.488 +19:21:46.692
3 04:19:12.096 +55:52:43.32
In [258]: df['ra'].str.split(':',expand=True).astype(float)
Out[258]:
0 1 2
0 15.0 9.0 11.800
1 9.0 19.0 46.800
2 5.0 15.0 43.488
3 4.0 19.0 12.096
In [259]: df['dec'].str.split(':',expand=True).astype(float)
Out[259]:
0 1 2
0 -34.0 13.0 44.900
1 33.0 44.0 58.452
2 19.0 21.0 46.692
3 55.0 52.0 43.320
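As a possible next step, the three pandas columns can be combined into one decimal value per row. This is only a sketch: sexagesimal_to_decimal is a hypothetical helper name, the sample rows are copied from the question and read from a StringIO instead of the real file, and the sign is taken from the text so that the minutes and seconds follow the sign of the first field.

```python
import io
import pandas as pd

# Sample rows copied from the question; a real file path would replace the StringIO.
sample = io.StringIO(
    "ra dec\n"
    "15:09:11.8 -34:13:44.9\n"
    "09:19:46.8 +33:44:58.452\n"
)
df = pd.read_csv(sample, sep=r"\s+")


def sexagesimal_to_decimal(col):
    """Hypothetical helper: 'dd:mm:ss.s' strings -> signed decimal values."""
    parts = col.str.split(":", expand=True).astype(float)
    # Read the sign from the text so that '-00:30:00' would stay negative.
    sign = col.str.startswith("-").map({True: -1.0, False: 1.0})
    return sign * (parts[0].abs() + parts[1] / 60 + parts[2] / 3600)


df["ra_hours"] = sexagesimal_to_decimal(df["ra"])   # still in hours if ra is hh:mm:ss
df["dec_deg"] = sexagesimal_to_decimal(df["dec"])
print(df)
```

The seconds are divided by 3600 because there are 3600 seconds in one degree (or one hour).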
Or with plain Python string handling:
In [279]: lines = []
In [280]: with open('stack59761369.csv') as f:
     ...:     header=f.readline()
     ...:     for row in f:
     ...:         alist = row.split()
     ...:         alist = [[float(i) for i in astr.split(':')] for astr in alist]
     ...:         lines.append(alist)
     ...:
In [281]: lines
Out[281]:
[[[15.0, 9.0, 11.8], [-34.0, 13.0, 44.9]],
[[9.0, 19.0, 46.8], [33.0, 44.0, 58.452]],
[[5.0, 15.0, 43.488], [19.0, 21.0, 46.692]],
[[4.0, 19.0, 12.096], [55.0, 52.0, 43.32]]]
In [282]: np.array(lines)
Out[282]:
array([[[ 15. , 9. , 11.8 ],
[-34. , 13. , 44.9 ]],
[[ 9. , 19. , 46.8 ],
[ 33. , 44. , 58.452]],
[[ 5. , 15. , 43.488],
[ 19. , 21. , 46.692]],
[[ 4. , 19. , 12.096],
[ 55. , 52. , 43.32 ]]])
In [283]: _.shape
Out[283]: (4, 2, 3)
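The first dimension is the number of rows, the second the 2 columns, and the third the 3 values in each column. To collapse the 3 values into one decimal number, weight them by 1, 1/60 and 1/3600: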
In [285]: _282@[1,1/60,1/3600]
Out[285]:
array([[ 15.15327778, -33.77086111],
[ 9.32966667, 33.74957 ],
[ 5.26208 , 19.36297 ],
[ 4.32002667, 55.8787 ]])
Oops, that -34 deg value is wrong; all terms of an element have to have the same sign.
Identify the elements with a negative degree:
In [296]: mask = np.sign(_282[:,:,0])
In [297]: mask
Out[297]:
array([[ 1., -1.],
[ 1., 1.],
[ 1., 1.],
[ 1., 1.]])
Adjust all 3 terms accordingly:
In [298]: x = np.abs(_282)*mask[:,:,None]
In [299]: x
Out[299]:
array([[[ 15. , 9. , 11.8 ],
[-34. , -13. , -44.9 ]],
[[ 9. , 19. , 46.8 ],
[ 33. , 44. , 58.452]],
[[ 5. , 15. , 43.488],
[ 19. , 21. , 46.692]],
[[ 4. , 19. , 12.096],
[ 55. , 52. , 43.32 ]]])
In [300]: x@[1, 1/60, 1/3600]
Out[300]:
array([[ 15.15327778, -34.22913889],
[ 9.32966667, 33.74957 ],
[ 5.26208 , 19.36297 ],
[ 4.32002667, 55.8787 ]])
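Two details may still matter for real data, so here is a hedged sketch rather than a finished solution. First, np.sign returns 0 for a first field of zero, so a declination such as -00:30:00 (which float parses as -0.0) would be wiped out; np.copysign reads the sign bit instead. Second, the question asks for degrees, and this sketch assumes ra is given in hours (hh:mm:ss), where 1 hour corresponds to 15 degrees. The second row below is a made-up example to show the zero case, and the seconds weight is 1/3600.

```python
import numpy as np

# Shape (n, 2, 3): rows, (ra, dec), (first field, minutes, seconds).
vals = np.array([
    [[15.0, 9.0, 11.8], [-34.0, 13.0, 44.9]],   # first row from the question
    [[9.0, 19.0, 46.8], [-0.0, 30.0, 0.0]],     # hypothetical dec '-00:30:00'
])

# copysign looks at the sign bit, so -0.0 gives -1.0 (np.sign would give 0).
sign = np.copysign(1.0, vals[:, :, 0])

# Apply the sign to all three terms, then weight minutes and seconds.
decimal = (np.abs(vals) * sign[:, :, None]) @ [1, 1 / 60, 1 / 3600]

# Assumption: ra is in hours, dec is already in degrees.
degrees = decimal * [15.0, 1.0]
print(degrees)
```

If ra in the file is already in degrees, the final multiplication by 15 should be dropped.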
Upvotes: 1
Reputation: 52848
The nan is probably NaN (Not a Number). Try setting the data type to None (dtype=None).
Also, try omitting delimiter. By default, any run of consecutive whitespace acts as the delimiter.
Not sure what you're expecting, but maybe this will be a better starting point...
import numpy as np
np_array = np.genfromtxt(r"C:\Users\nstev\Downloads\S190930t.csv", skip_header=1, dtype=None, encoding="utf-8", usecols=(1, 2))
print(np_array)
printed output...
[['15:09:11.8' '-34:13:44.9']
['09:19:46.8' '+33:44:58.452']
['05:15:43.488' '+19:21:46.692']
['04:19:12.096' '+55:52:43.32']]
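To illustrate why the original call printed nan, here is a small sketch using two rows from the question held in a StringIO (an assumption made so the example runs without the file). With the default float dtype, genfromtxt cannot parse a field like 15:09:11.8 and fills it with nan; asking for strings keeps the raw text.

```python
import io
import numpy as np

text = "ra dec\n15:09:11.8 -34:13:44.9\n09:19:46.8 +33:44:58.452\n"

# Default dtype is float: '15:09:11.8' is not a valid float, so every field is nan.
as_float = np.genfromtxt(io.StringIO(text), skip_header=1)

# Requesting strings keeps the text, ready to be split on ':' afterwards.
as_text = np.genfromtxt(io.StringIO(text), skip_header=1, dtype=str, encoding="utf-8")

print(as_float)
print(as_text)
```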
Disclaimer: I don't use numpy. I based my answer on https://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html
Upvotes: 0