Camille

Reputation: 31

Reading a text file with an hh:mm:ss column mixed with floats

I have a text file like this:

index   timestamp   polarisation current (A)    signal (V)  head temperature (°C)   head relat.humidity (%RH)   MUGS temperature (°C)   laser voltage (V)   laser current (A)   driver temperature (°C)
0   16:11:24    0.4 0.0006019   26.51   43.5    32.0    11.37   0.3922  26.5
1   16:11:29    0.402   0.0006286   26.51   43.5    32.5    11.41   0.3972  31.5
2   16:11:34    0.404   0.0005828   26.51   43.5    32.5    11.42   0.4048  32.5
3   16:11:38    0.406   0.0006139   26.51   43.5    32.5    11.39   0.3984  32.5

The full file is here ( https://www.dropbox.com/scl/fi/2izelcpjpqr8yckowtgat/35deg_400mA_640mA_120pts_-0.015mA_mod1_LAS5000Hz_LI10000Hz_meth2_0.15us_-14h02m33s_pulse20.txt?rlkey=3xwxthp6v48deyeob0gj6uuo7&st=xta4m25q&dl=0 )

I read that file with:

with open(universal_path,'rt'): 
        values = np.genfromtxt( universal_path, delimiter="", skip_header = 1, encoding='unicode_escape')

The problem is that the second column is full of NaN. I tried setting dtype = None, but then only the first column of the text file is read.

I also tried dtype = [int, str, float, float, float, float, float, float, float, float], but that reads only the first column too.

How can I read the second column as a string?

Upvotes: 0

Views: 87

Answers (2)

Camille

Reputation: 31

Based on what @Dan Masek said, I have this code that works:

import numpy as np

from pathlib import Path

def lecture_EMUGS(universal_path):
    # Read every column; column 1 (hh:mm:ss) is kept as text, so the
    # round trip through a list gives a 2-D array of strings.
    val_array = np.array(np.genfromtxt(universal_path, delimiter="",
        skip_header=1, encoding='unicode_escape',
        converters={0: int, 1: str}).tolist())

    # Split "hh:mm:ss" into three float columns, then convert to seconds.
    hh_min_ss = np.array([i.split(':', 2) for i in
        val_array[:, 1]]).astype(float)
    s = hh_min_ss[:, 0]*3600 + hh_min_ss[:, 1]*60 + hh_min_ss[:, 2]

    # Replace the text column with its value in seconds.
    val_array[:, 1] = s
    return val_array.astype(float)

universal_path = Path(r"35deg_400mA_640mA_120pts_~0.015mA_mod1_LAS5000Hz_LI10000Hz_meth2_0.15us_-14h02m33s_pulse20% (1).txt")
data = lecture_EMUGS(universal_path)

.tolist() converts what np.genfromtxt produced into a list, and np.array() converts this list into a NumPy array. hh_min_ss converts the one-column string array into a three-column float array. s is the number of seconds that each "hh:mm:ss" string represents. Finally, val_array[:,1] = s replaces the string column with the float column holding its value in seconds.
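For anyone who wants to try the idea without the full file, here is a minimal sketch of the same conversion on the first two rows from the question. It is not the exact code above: it assumes the rows are whitespace-separated, reads everything as text with dtype=str, and builds a new float array instead of writing the seconds back into the string array. The names sample, raw, hms, seconds and data are only for illustration.

```python
import io
import numpy as np

# First two data rows from the question (header omitted), assumed
# to be whitespace-separated.
sample = """\
0   16:11:24    0.4 0.0006019   26.51   43.5    32.0    11.37   0.3922  26.5
1   16:11:29    0.402   0.0006286   26.51   43.5    32.5    11.41   0.3972  31.5
"""

# Read every field as text so the hh:mm:ss column is not turned into NaN.
raw = np.genfromtxt(io.StringIO(sample), dtype=str)

# Split "hh:mm:ss" into three float columns and weight them into seconds.
hms = np.array([t.split(":") for t in raw[:, 1]], dtype=float)
seconds = hms @ np.array([3600.0, 60.0, 1.0])

# Assemble a purely numeric table: index, seconds, then the other columns.
data = np.column_stack([raw[:, 0].astype(float), seconds,
                        raw[:, 2:].astype(float)])
```

Building data with np.column_stack avoids assigning floats into a fixed-width string array, where long values could be truncated.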

Upvotes: 0

ticktalk

Reputation: 922

Here's a rudimentary conversion using DuckDB (I've mangled the column names into something 'reasonable', in my opinion). The resulting dictionary can be manipulated using NumPy functions.

import duckdb as ddb

conn = ddb.connect()  # in-memory db, disappears when closed

conn.execute("""create table camile as SELECT *
    FROM read_csv('test.csv',
    delim = '\t',
    header = true,
    columns = {
    'index': 'integer',
    'timestamp': 'time',
    'polarisation-current': 'double',
    'signal': 'double',
    'head-temp': 'double',
    'head-relat-humidity': 'double',
    'MUGS-temp': 'double',
    'laser-voltage': 'double',
    'laser-current': 'double',
    'driver-temp': 'double'
    } ); """
)

myData = conn.sql("SELECT * from camile").fetchnumpy()

conn.close()

# print first 5 items ... 
for key, values in myData.items():
    print(f"{key}: {values[:5]}")

#
#  read https://duckdb.org/docs/guides/python/export_numpy.html
#

Upvotes: 1
