kevin

Reputation: 151

python: import data from text

I tried importing floating-point numbers from the file P-I curve.txt, which contains my data. However, I get an error when converting the values to float. I used the following code:

import csv

with open('C:/Users/Kevin/Documents/4e Jaar/fotonica/Metingen/P-I curve.txt') as csvfile:
    data= csv.reader(csvfile, delimiter = '\t')
    current=[]

    P_15=[]
    P_20=[]
    P_25=[]
    P_30=[]
    P_35=[]
    P_40=[]
    P_45=[]
    P_50=[]

    for row in data:

        current.append(float(row[0].replace(',','.')))  
        P_15.append(float(row[2].replace(',','.')))
        P_20.append(float(row[4].replace(',','.')))
        P_25.append(float(row[6].replace(',','.')))
        P_30.append(float(row[8].replace(',','.')))
        P_35.append(float(row[10].replace(',','.')))
        P_40.append(float(row[12].replace(',','.')))
        P_45.append(float(row[14].replace(',','.')))
        P_50.append(float(row[16].replace(',','.')))

With this code I got the error below. I understand it to mean that row 2 is a string, but if so, why did this error not occur for row 1? Is there another way to import float numbers without using the csv module? I copied and pasted the data from Excel into a .txt file.

Returned error:

  File "C:/Users/Kevin/Documents/Python Scripts/P-I curves.py", line 29, in <module>
    P_15.append(float(row[2].replace(',','.')))

ValueError: could not convert string to float: 

I then tried the following code:

import pandas as pd

df=pd.read_csv('C:/Users/Kevin/Documents/4e Jaar/fotonica/Metingen/P-I curve.txt', decimal=',', sep='\t',header=0,names=['current','15','20','25','30','35','40','45','50']  )

current = df['current']
print(current)

The txt file has a header. The printed output looks like this:

1.8   1.9  0.4     1.9  0.4     1.9  0.4     1.9       0.4
3.8   1.9  1.3     1.9  1.3     1.9  1.3     1.9       1.2
5.8   2.0  2.5     2.0  2.4     2.0  2.3     2.0       2.2
7.8   2.0  3.7     2.0  3.6     2.0  3.5     2.0       3.4
9.8   2.1  5.2     2.0  5.1     2.0  4.9     2.0       4.7
11.8  2.1  6.9     2.1  6.7     2.1  6.4     2.1       6.1
13.8  2.1  9.0     2.0  8.6     2.1  8.2     2.1       7.8
15.8  2.1  11.5    2.1  10.8    2.1  10.2    2.1       9.7
17.8  2.2  14.7    2.2  13.7    2.2  12.7    2.2      11.8
19.8  2.2  19.5    2.2  17.5    2.2  15.9    2.2      14.5
21.8  2.2  28.9    2.2  23.6    2.2  20.3    2.2      17.9
23.8  2.3  125.8   2.2  38.4    2.2  27.8    2.2      22.8
25.8  2.3  1669.0  2.3  634.0   2.3  51.7    2.3      31.4
27.8  2.3  3142.0  2.3  2154.0  2.3  982.0   2.3      62.2
29.8  2.3  4560.0  2.3  3594.0  2.3  2460.0  2.3    1075.0
31.8  2.3  5950.0  2.3  5010.0  2.3  3872.0  2.3    2540.0
33.8  2.4  7320.0  2.4  6360.0  2.4  5230.0  2.3    3880.0
35.8  2.4  8670.0  2.4  7700.0  2.4  6550.0  2.4    5210.0
37.8  NaN  NaN     NaN  NaN     2.4  7850.0  2.4    6480.0
39.8  NaN  NaN     NaN  NaN     NaN  NaN     NaN       NaN
41.8  NaN  NaN     NaN  NaN     NaN  NaN     NaN       NaN
Name: current, dtype: float64

Python seems to be returning everything instead of just the first column, which is what I wanted when selecting by the header current. I only want this column so that I can save it as an array. How do I extract just the column with the header current from the data?

I am not sure why it returned everything, but I think something is wrong with the encoding because I copied and pasted the data from Excel.

Please see the image for how the .txt file looks when copied from Excel.

[image: the .txt file as copied from Excel]

I also tried another short piece of code (after manually deleting the header from the .txt file), shown below:

data=np.loadtxt('C:/Users/Kevin/Documents/4e Jaar/fotonica/Metingen/ttest.txt',delimiter='\t')

data=float(data.replace(',','.'))


print(data[0])

With this code, I get the following error:

ValueError: could not convert string to float: b'1,8'

I find this strange. Are float() and replace() not enough for this?
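
A note on the error above: np.loadtxt converts every field to float while it reads, so '1,8' is rejected before the replace line is ever reached (and a NumPy array has no replace method in any case). One possible workaround, sketched here, is to swap the decimal commas in the raw text first and then let np.genfromtxt parse it, which also turns empty cells into nan. The sample text and the helper name load_decimal_comma are made up for illustration; for a real file the text would come from open(path).read().

```python
import io
import numpy as np

def load_decimal_comma(text, delimiter='\t'):
    """Parse delimited text that uses ',' as the decimal mark (hypothetical helper)."""
    # Replace decimal commas before NumPy sees the text. This is only safe
    # because the field delimiter is a tab, not a comma.
    cleaned = text.replace(',', '.')
    # genfromtxt fills empty fields with nan instead of raising ValueError.
    return np.genfromtxt(io.StringIO(cleaned), delimiter=delimiter)

# Made-up sample in the same layout as the file in the question.
sample = "1,8\t0,4\t0,3\n3,8\t\t1,1\n5,8\t2,5\t1,9\n"
data = load_decimal_comma(sample)
print(data[:, 0])  # first column as a float array
```

The blanket replace would corrupt the data if the file were comma-delimited, so this sketch only fits tab-separated text like the file described here.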

Upvotes: 2

Views: 264

Answers (2)

jezrael

Reputation: 863246

I think you need to omit header=0:

df=pd.read_csv('C:/Users/Kevin/Documents/4e Jaar/fotonica/Metingen/P-I curve.txt', 
                decimal=',', 
                sep='\t',
                names=['current','15','20','25','30','35','40','45','50'])

EDIT:

df=pd.read_csv('ttest.txt', 
                decimal=',', 
                sep='\t',
                names=['current','15','20','25','30','35','40','45','50'])
print (df)
    current      15      20      25      30      35      40      45     50
0       1.8     0.4     0.4     0.4     0.4     0.4     0.4     0.3    0.3
1       3.8     1.3     1.3     1.3     1.2     1.2     1.1     1.1    1.1
2       5.8     2.5     2.4     2.3     2.2     2.2     2.1     2.0    1.9
3       7.8     3.7     3.6     3.5     3.4     3.3     3.1     3.0    2.9
4       9.8     5.2     5.1     4.9     4.7     4.5     4.3     4.1    4.0
5      11.8     6.9     6.7     6.4     6.1     5.9     5.6     5.3    5.1
6      13.8     9.0     8.6     8.2     7.8     7.4     7.0     6.6    6.3
7      15.8    11.5    10.8    10.2     9.7     9.1     8.6     8.0    7.6
8      17.8    14.7    13.7    12.7    11.8    11.0    10.3     9.6    9.0
9      19.8    19.5    17.5    15.9    14.5    13.3    12.2    11.3   10.5
10     21.8    28.9    23.6    20.3    17.9    16.0    14.5    13.2   12.2
11     23.8   125.8    38.4    27.8    22.8    19.6    17.2    15.4   14.1
12     25.8  1669.0   634.0    51.7    31.4    24.5    20.6    17.9   16.2
13     27.8  3142.0  2154.0   982.0    62.2    33.1    25.3    21.0   18.5
14     29.8  4560.0  3594.0  2460.0  1075.0    60.0    32.6    25.0   21.3
15     31.8  5950.0  5010.0  3872.0  2540.0   903.0    49.9    30.8   24.6
16     33.8  7320.0  6360.0  5230.0  3880.0  2294.0   387.0    40.9   28.8
17     35.8  8670.0  7700.0  6550.0  5210.0  3621.0  1733.0    71.0   34.8
18     37.8     NaN     NaN  7850.0  6480.0  4880.0  3026.0   751.0   44.6
19     39.8     NaN     NaN     NaN     NaN  6100.0  4240.0  1998.0   70.2
20     41.8     NaN     NaN     NaN     NaN     NaN     NaN  3161.0  650.0

# list of all values in column 15, including NaNs
L1 = df['15'].tolist()
print (L1)
[0.4, 1.3, 2.5, 3.7, 5.2, 6.9, 9.0, 11.5, 14.7, 19.5, 28.9, 125.8, 1669.0, 
 3142.0, 4560.0, 5950.0, 7320.0, 8670.0, nan, nan, nan]

# list of the values in column 15 with NaNs removed
L2 = df['15'].dropna().tolist()
print (L2)
[0.4, 1.3, 2.5, 3.7, 5.2, 6.9, 9.0, 11.5, 14.7, 19.5, 28.9, 125.8, 1669.0, 
 3142.0, 4560.0, 5950.0, 7320.0, 8670.0]

# convert all NaNs in all columns to 0
df = df.fillna(0)
print (df)
    current      15      20      25      30      35      40      45     50
0       1.8     0.4     0.4     0.4     0.4     0.4     0.4     0.3    0.3
1       3.8     1.3     1.3     1.3     1.2     1.2     1.1     1.1    1.1
2       5.8     2.5     2.4     2.3     2.2     2.2     2.1     2.0    1.9
3       7.8     3.7     3.6     3.5     3.4     3.3     3.1     3.0    2.9
4       9.8     5.2     5.1     4.9     4.7     4.5     4.3     4.1    4.0
5      11.8     6.9     6.7     6.4     6.1     5.9     5.6     5.3    5.1
6      13.8     9.0     8.6     8.2     7.8     7.4     7.0     6.6    6.3
7      15.8    11.5    10.8    10.2     9.7     9.1     8.6     8.0    7.6
8      17.8    14.7    13.7    12.7    11.8    11.0    10.3     9.6    9.0
9      19.8    19.5    17.5    15.9    14.5    13.3    12.2    11.3   10.5
10     21.8    28.9    23.6    20.3    17.9    16.0    14.5    13.2   12.2
11     23.8   125.8    38.4    27.8    22.8    19.6    17.2    15.4   14.1
12     25.8  1669.0   634.0    51.7    31.4    24.5    20.6    17.9   16.2
13     27.8  3142.0  2154.0   982.0    62.2    33.1    25.3    21.0   18.5
14     29.8  4560.0  3594.0  2460.0  1075.0    60.0    32.6    25.0   21.3
15     31.8  5950.0  5010.0  3872.0  2540.0   903.0    49.9    30.8   24.6
16     33.8  7320.0  6360.0  5230.0  3880.0  2294.0   387.0    40.9   28.8
17     35.8  8670.0  7700.0  6550.0  5210.0  3621.0  1733.0    71.0   34.8
18     37.8     0.0     0.0  7850.0  6480.0  4880.0  3026.0   751.0   44.6
19     39.8     0.0     0.0     0.0     0.0  6100.0  4240.0  1998.0   70.2
20     41.8     0.0     0.0     0.0     0.0     0.0     0.0  3161.0  650.0

# list from column 15 after filling NaNs with 0
L3 = df['15'].tolist()
print (L3)
[0.4, 1.3, 2.5, 3.7, 5.2, 6.9, 9.0, 11.5, 14.7, 19.5, 28.9, 125.8, 1669.0, 
 3142.0, 4560.0, 5950.0, 7320.0, 8670.0, 0.0, 0.0, 0.0]
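
To get a single column as an array rather than a list, something along these lines should work. This is a minimal sketch on made-up in-memory data with three columns instead of nine, so the names list is shortened accordingly; to_numpy() returns the column as a NumPy array.

```python
import io
import pandas as pd

# Made-up tab-separated sample with decimal commas and one empty cell.
sample = "1,8\t0,4\t0,3\n3,8\t1,3\t1,1\n5,8\t\t1,9\n"

df = pd.read_csv(io.StringIO(sample),
                 decimal=',',
                 sep='\t',
                 names=['current', '15', '50'])

current = df['current'].to_numpy()    # NumPy array of the first column
p_15 = df['15'].dropna().to_numpy()   # column 15 with the NaN row dropped
print(current)
print(p_15)
```

For the real file, io.StringIO(sample) would be replaced by the path to the .txt file and names by the full list of nine column names.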

Upvotes: 1

kevin

Reputation: 151

When importing data from a .txt file with csv.reader, the missing values have to be filled in first. In this case I manually added 0 to the empty cells in the .txt file and reran this code:

import csv

with open('C:/Users/Kevin/Documents/4e Jaar/fotonica/Metingen/P-I curve.txt') as csvfile:
    data = csv.reader(csvfile, delimiter='\t')
    current = []

    P_15 = []
    P_20 = []
    P_25 = []
    P_30 = []
    P_35 = []
    P_40 = []
    P_45 = []
    P_50 = []

    # read each row while the file is still open
    for row in data:
        current.append(float(row[0].replace(',', '.')))
        P_15.append(float(row[2].replace(',', '.')))

print(P_15)

It now works for any of the columns I print out.
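
Instead of editing the file by hand, the empty cells could also be handled in code. A rough sketch, assuming the empty cells arrive from csv.reader as empty strings (which is what float('') fails on); to_float is a hypothetical helper and the sample rows are made up:

```python
import csv
import io
import math

def to_float(cell):
    """Convert a decimal-comma string to float; empty cells become nan."""
    cell = cell.strip()
    if not cell:
        return math.nan
    return float(cell.replace(',', '.'))

# Made-up sample; a real file would be opened with open(path, newline='').
sample = "1,8\t2,0\t0,4\n37,8\t\t\n"

current, P_15 = [], []
for row in csv.reader(io.StringIO(sample), delimiter='\t'):
    current.append(to_float(row[0]))
    P_15.append(to_float(row[2]))

print(current, P_15)
```

Using nan rather than 0 keeps missing measurements distinguishable from real zero readings.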

Upvotes: 0
