Reputation: 117
I would like to create a plot of all the rows whose index is "p1", with x="State" and y="Score", but something is wrong with iloc and I get this error:
Traceback (most recent call last):
File "C:/Users/u/Documents/Projects/Python/TEST6.py", line 33, in <module>
df = f.iloc[["p1"]]
File "C:\Users\u\Anaconda3\envs\k\lib\site-packages\pandas\core\indexing.py", line 931, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
File "C:\Users\u\Anaconda3\envs\k\lib\site-packages\pandas\core\indexing.py", line 1557, in _getitem_axis
return self._get_list_axis(key, axis=axis)
File "C:\Users\u\Anaconda3\envs\k\lib\site-packages\pandas\core\indexing.py", line 1530, in _get_list_axis
return self.obj._take_with_is_copy(key, axis=axis)
File "C:\Users\u\Anaconda3\envs\k\lib\site-packages\pandas\core\generic.py", line 3625, in _take_with_is_copy
result = self.take(indices=indices, axis=axis)
File "C:\Users\u\Anaconda3\envs\k\lib\site-packages\pandas\core\generic.py", line 3613, in take
indices, axis=self._get_block_manager_axis(axis), verify=True
File "C:\Users\u\Anaconda3\envs\k\lib\site-packages\pandas\core\internals\managers.py", line 851, in take
else np.asanyarray(indexer, dtype="int64")
File "C:\Users\u\Anaconda3\envs\k\lib\site-packages\numpy\core\_asarray.py", line 171, in asanyarray
return array(a, dtype, copy=False, order=order, subok=True)
ValueError: invalid literal for int() with base 10: 'p1'
Process finished with exit code 1
My code:
import pandas as pd
import matplotlib.pyplot as plt
# Create a DataFrame
f = {
    'State':[1000,1002,1001,1003,1000,1003,1001],
    'Score':[62,47,55,74,31,50,60]}
f = pd.DataFrame(f,columns=['State','Score'])
# Create indexes
points_index= []
index_all = []
range_of_index=2
for i in range(len(f)):
    points_index.append(f"p"+str(int((i%range_of_index)+1)))
    index_all.append(i+1)
# Create IndexesFrame
data_indexes = pd.DataFrame({"Index_all": index_all,
                             "Points_index": points_index})
# Get the DataFrames together
f = pd.concat([data_indexes, f], axis=1)
# Set Indexes
f = f.set_index(['Index_all',"Points_index"])
print(f)
# Create Dataframe with only p1 indexes
df = f.iloc[["p1"]]
# Create figure with plot
fig, ax1 = plt.subplots()
ax1.scatter(df['State'],df['Score'])
fig.suptitle('')
plt.show()
The printed DataFrame with the new indexes looks the way I wanted:
State Score
Index_all Points_index
1 p1 1000 62
2 p2 1002 47
3 p1 1001 55
4 p2 1003 74
5 p1 1000 31
6 p2 1003 50
7 p1 1001 60
I can't figure out what is wrong in my code; I am new to pandas.
Upvotes: 0
Views: 269
Reputation: 35676
Use xs to get a cross-section of Points_index equal to p1 instead of iloc (iloc selects by integer position, so it cannot accept the label 'p1'):
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# Create a DataFrame
f = pd.DataFrame({
    'State': [1000, 1002, 1001, 1003, 1000, 1003, 1001],
    'Score': [62, 47, 55, 74, 31, 50, 60]
}, columns=['State', 'Score'])
# Simpler DataFrame index creation:
range_of_index = 2
r = pd.Series(np.arange(len(f)))
f = f.set_index(pd.MultiIndex.from_arrays(
    [r + 1, 'p' + ((r % range_of_index) + 1).astype(str)],
    names=['Index_all', "Points_index"]
))
# Create Dataframe with only p1 indexes
# (select the cross-section of p1 values)
df = f.xs(key='p1', level='Points_index')
# Create figure with plot
fig, ax1 = plt.subplots()
ax1.scatter(df['State'], df['Score'])
fig.suptitle('')
plt.show()
f:
State Score
Index_all Points_index
1 p1 1000 62
2 p2 1002 47
3 p1 1001 55
4 p2 1003 74
5 p1 1000 31
6 p2 1003 50
7 p1 1001 60
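After the xs call, df should then contain only the p1 rows, with the Points_index level dropped (values taken from the f shown above), along these lines:
State Score
Index_all
1 1000 62
3 1001 55
5 1000 31
7 1001 60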
Upvotes: 3
Reputation: 834
Another implementation, using a boolean mask built from the index level values:
df = f.iloc[f.index.get_level_values('Points_index')=="p1"]
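For context, a minimal sketch of what that line does, assuming the same f built above: get_level_values returns the Points_index label of every row, the comparison turns that into a boolean mask, and iloc (or equivalently loc) keeps the rows where the mask is True.
mask = f.index.get_level_values('Points_index') == "p1"
# mask -> [True, False, True, False, True, False, True] for the f shown above
df = f.loc[mask]  # selects the same rows as f.iloc[mask]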
Upvotes: 0