Hi, I have strings that I want to convert to numeric values so that I can compute the difference in area under two curves on a graph.
Graph looks something like this:
Code I have tried:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({
"TimeStamp": TimeStamp,
"average_pv_conc": average_pv_conc.values,
"average_pv_green": average_pv_green.values
})
df['SortingTime'] = pd.to_datetime(df['TimeStamp'], format='%H:%M')
sorted_indices = df['SortingTime'].argsort()
df = df.loc[sorted_indices].reset_index(drop=True)
df = df.drop(columns='SortingTime')
# Convert to numeric types
df['average_pv_conc'] = pd.to_numeric(df['average_pv_conc'], errors='coerce')
df['average_pv_green'] = pd.to_numeric(df['average_pv_green'], errors='coerce')
# Use '.' as the marker for the scatter plots
plt.scatter(df['TimeStamp'], df['average_pv_conc'], label='PV conc', marker='.', color='b', s=marker_size)
plt.scatter(df['TimeStamp'], df['average_pv_green'], label = 'PV green', marker='.', color='c', s=marker_size)
# Adding labels and title
plt.xlabel('TimeStamp')
plt.ylabel('Values')
plt.title('Plot of Column1 and Column2 against TimeStamp')
# Adding a legend
plt.legend()
# Display the plot
plt.show()
# Calculate the area under the curves using the trapezoidal rule
area_column1 = np.trapz(df['average_pv_conc'], df['TimeStamp'])
area_column2 = np.trapz(df['average_pv_green'], df['TimeStamp'])
# Find the difference in areas
area_difference = abs(area_column1 - area_column2)
print(f"Difference in areas under the curves: {area_difference}")
The error is: TypeError: unsupported operand type(s) for -: 'str' and 'str'
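The message is consistent with the trapezoidal rule subtracting neighbouring x values: `df['TimeStamp']` still holds strings such as `'08:30'`, and two strings cannot be subtracted. Below is a minimal sketch of one possible fix, assuming the time stamps are `HH:MM` strings within a single day. The sample data and the `hours` column are hypothetical and only there for illustration; `np.trapezoid` is the newer NumPy name for `np.trapz`, so the sketch falls back to whichever one is available.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real measurements.
df = pd.DataFrame({
    "TimeStamp": ["03:00", "00:00", "02:00", "01:00"],
    "average_pv_conc": ["4.0", "1.0", "3.0", "2.0"],
    "average_pv_green": ["3.5", "0.5", "2.5", "1.5"],
})

# Parse the 'HH:MM' strings and express each one as hours since midnight,
# so the x axis is numeric (assumes all stamps fall on the same day).
times = pd.to_datetime(df["TimeStamp"], format="%H:%M")
df["hours"] = times.dt.hour + times.dt.minute / 60.0

# Convert the y values to numbers; unparseable entries become NaN.
for col in ["average_pv_conc", "average_pv_green"]:
    df[col] = pd.to_numeric(df[col], errors="coerce")

# The trapezoidal rule expects x in ascending order.
df = df.sort_values("hours").reset_index(drop=True)

# Use np.trapezoid where it exists, otherwise the older np.trapz.
trapezoid = np.trapezoid if hasattr(np, "trapezoid") else np.trapz

x = df["hours"].to_numpy()
area_conc = trapezoid(df["average_pv_conc"].to_numpy(), x)
area_green = trapezoid(df["average_pv_green"].to_numpy(), x)

area_difference = abs(area_conc - area_green)
print(f"Difference in areas under the curves: {area_difference}")
```

With this approach the areas come out in units of value times hours. If `errors='coerce'` produces any NaN values, the integral itself becomes NaN, so those rows would need to be dropped or filled first.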