Dametre Thunberg

Reputation: 1

Only importing up to the maximum value of one of my columns

I am making graphs with matplotlib and numpy from a .csv file that has three columns. Is there a way to import the data only up to the peak/lowest value of one of the columns?

Context: I am using Langmuir troughs with lipid monolayers, compressing and expanding the barriers to decrease/increase the area. I am trying to plot pressure and fluorescence against the area. However, the program that records this data performs a complete cycle of compression and expansion, and I cannot stop the data collection when the trough is at its minimum area. So I would like Python to import the data only until the area value reaches its lowest point.

Example of how my data looks:

Area  | Pressure | Intensity
12500 | 3        | 1
11500 | 6        | 12
etc   | 8        | 25
3000  | 12       | 38
3500  | 19       | 54   <== want it to stop importing here
4500  | 16       | 47

Is this possible?

I added what phi suggested, but it doesn't seem to work. All of the values still end up in my graphs. My code looks like this:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.read_csv("C:\\Users\\Owner\\Desktop\\Thunberg,Dametre\\5-29 Data and movies\\New folder (2)\\Data 2.csv", sep=',')
rowmin = df.area.idxmax()
df[:(1 + rowmin)]
fig, ax1 = plt.subplots()
area, pressure, pixel = np.loadtxt("C:\\Users\\Owner\\Desktop\\Thunberg,Dametre\\5-29 Data and movies\\New folder (2)\\Data 2.csv", delimiter=",", skiprows=1, unpack=True)
plt.plot(area,pressure, label='area/pressure!',color='b')

plt.xlabel('area', color='b')
plt.ylabel('Pressure', color='b')
ax1.tick_params('y', colors='b')
ax2 = ax1.twinx()
# this ax2 creates a second y axis that shares the same x axis
ax2.set_ylabel('Intensity (measured by average pixel value)', color='r')
# this labels the secondary axis and chooses its color
ax2.tick_params('y', colors='r')
# this chooses the color of the ticks on the axis
ax2.plot(area,pixel, color='r')
# this is what actually plots the second graph of area vs intensity
plt.title('Kibron Trough Pressure-Area-Intensity Graph')
plt.legend()
plt.show()

Upvotes: 0

Views: 50

Answers (2)

AGN Gazer

Reputation: 8378

My understanding is that the file is still being written while you read it, so you want to detect when the minimum has been reached. I think you can do this by watching for file changes. Below is the simplest approach; you could "fortify" it by adding some time-outs.

import os
import numpy as np

fname = 'yourfile.csv'  # placeholder: path to the CSV file being written
stat_prev = os.stat(fname)
while True:
    # np.int was removed from NumPy; the built-in int is the replacement
    data = np.genfromtxt(fname, dtype=int, delimiter=',', names=True)
    min_idx = np.argmin(data['Area'])
    # the minimum is confirmed once the row after it has a larger area
    if min_idx < len(data) - 1 and data['Area'][min_idx] < data['Area'][min_idx+1]:
        data = data[:min_idx + 1] # <-- remove +1 if min row is the last one
        break # exit main loop;
    # wait for the file to change
    stat_now = os.stat(fname)
    while stat_prev == stat_now: # add some time-out, if you want
        stat_prev = os.stat(fname)
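
The time-out mentioned above could look roughly like the following. This is only a sketch under assumptions: the function name `read_until_minimum` and the parameters `poll_s` and `timeout_s` are made up for illustration, the file is assumed to have a header row containing an `Area` column, and the file is simply re-read at a fixed interval instead of comparing `os.stat` results.

```python
import time
import numpy as np


def read_until_minimum(fname, column="Area", poll_s=0.5, timeout_s=60.0):
    """Re-read fname until the minimum of `column` is followed by a larger value.

    Returns the rows up to and including the minimum, or None on time-out.
    (Hypothetical helper; names and defaults are illustrative only.)
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        # atleast_1d guards against a file that holds a single data row
        data = np.atleast_1d(np.genfromtxt(fname, delimiter=",", names=True))
        if data.size > 1:
            values = data[column]
            min_idx = int(np.argmin(values))
            # The minimum counts as confirmed once a later row is larger.
            if min_idx < len(values) - 1 and values[min_idx] < values[min_idx + 1]:
                return data[: min_idx + 1]
        time.sleep(poll_s)  # sleep between reads instead of spinning
    return None
```

Sleeping between reads keeps the CPU idle while waiting, and returning `None` lets the caller decide what to do if the trough never starts expanding within the time limit.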

Also, if you want a plain array rather than a structured array, you can convert data using this recipe (it requires all fields to share the same dtype):

data.view(data.dtype[0]).reshape(data.shape + (-1,))
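
As a quick check of that recipe, here is a small sketch on a made-up structured array (the values are placeholders, not real measurements). It also shows `numpy.lib.recfunctions.structured_to_unstructured`, a library helper that does the same conversion; that alternative is an addition here, not part of the recipe above.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical structured array standing in for the genfromtxt result.
data = np.array(
    [(12500, 3, 1), (11500, 6, 12), (3000, 12, 38)],
    dtype=[("Area", int), ("Pressure", int), ("Intensity", int)],
)

# View-based recipe: reinterprets the same memory, so every field
# must have the same dtype for the result to make sense.
plain_view = data.view(data.dtype[0]).reshape(data.shape + (-1,))

# Library helper: builds a regular 2-D array from the fields.
plain_copy = rfn.structured_to_unstructured(data)

print(plain_view.shape)  # one row per record, one column per field
```

Both results are ordinary 2-D arrays, so columns can be selected with `plain_view[:, 0]` rather than by field name.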

Upvotes: 0

phi

Reputation: 11714

Before reading the whole file, you cannot be sure which value is the highest. The simplest solution is to read the whole file and then drop the rows after the maximum.

import pandas as pd
df = pd.read_csv('yourfile.csv', sep=',')
rowmax = df.Intensity.idxmax()
df = df[:(1 + rowmax)]  # assign the result; slicing alone does not modify df
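
A minimal sketch of how this might be applied to the data in the question, assuming the columns are named `Area`, `Pressure` and `Intensity` and using a few rows from the question's table as an in-memory stand-in for the real file. The point is that the truncated frame has to be kept and the plot arrays taken from it; re-reading the file afterwards with `np.loadtxt` brings all rows back.

```python
import io
import pandas as pd

# Sample rows standing in for the real CSV file (assumed column names).
csv_text = """Area,Pressure,Intensity
12500,3,1
11500,6,12
3000,12,38
3500,19,54
4500,16,47
"""

df = pd.read_csv(io.StringIO(csv_text))

# .loc slicing is label-inclusive, so the extreme row itself is kept.
up_to_min_area = df.loc[: df["Area"].idxmin()]
up_to_max_intensity = df.loc[: df["Intensity"].idxmax()]

# Take the arrays to plot from the truncated frame, not from a second read.
area = up_to_min_area["Area"].to_numpy()
pressure = up_to_min_area["Pressure"].to_numpy()
pixel = up_to_min_area["Intensity"].to_numpy()
```

Whether to cut at the minimum area or at the maximum intensity depends on which row should be the last one; with these sample rows the two choices differ by one row.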

Upvotes: 0
