Reputation: 41
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
df = pd.read_csv("WorldCupMatches.csv")
print(df.head())
df['Year'].fillna(0)  # Found online: the data may contain blank entries, so NaN is filled with 0, but this still does not help.
df['Year'] = df['Year'].astype(int)  # Change the data type of the column to int
print(df.dtypes)
ax = sns.barplot(data=df, x="Year", y="Total Goals")
plt.show()
# The Year column in the dataset contains integers, but when it is loaded into the pandas DataFrame it becomes float. When I try to convert the column to int (the astype(int) line), it raises an error.
[enter image description here][1] [enter image description here][2] [enter image description here][3]
[1]: https://i.sstatic.net/Smws8.png
[2]: https://i.sstatic.net/Se11H.png
[3]: https://i.sstatic.net/RQm4R.png
Upvotes: 2
Views: 11561
Reputation: 30920
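You have non-finite values (NaN) in `Year`, which `astype(int)` cannot convert. Note that `df['Year'].fillna(0)` returns a new Series; since the result is not assigned back, the NaN values stay in the column. Try: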
df['Year'] = pd.to_numeric(df['Year'], errors='coerce').fillna(0).astype(int)
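To illustrate why the original code fails and what the one-liner does, here is a minimal sketch. The small DataFrame is a hypothetical stand-in for `WorldCupMatches.csv` (I am assuming the file has blank rows that pandas reads as NaN); the variable names are only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for WorldCupMatches.csv: a blank row is read as NaN,
# which forces the Year column to float.
df = pd.DataFrame({"Year": [1930.0, 1934.0, np.nan],
                   "Total Goals": [5.0, 3.0, np.nan]})

# fillna returns a new Series; without assignment, df is unchanged.
df["Year"].fillna(0)
still_has_nan = df["Year"].isna().any()

# Casting a float column that contains NaN to int raises an error
# (a ValueError or a subclass of it, depending on the pandas version).
try:
    df["Year"].astype(int)
    cast_failed = False
except ValueError:
    cast_failed = True

# The fix from above: coerce to numeric, fill NaN with 0, then cast.
year_int = pd.to_numeric(df["Year"], errors="coerce").fillna(0).astype(int)

# Alternative (an assumption about what you want): drop the empty rows
# instead, so the plot does not get a bar for a fake year 0.
clean = df.dropna(subset=["Year"]).copy()
clean["Year"] = clean["Year"].astype(int)
```

If the NaN rows are just empty lines at the end of the file, dropping them is probably the better choice for the bar plot, but that depends on your data.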
Upvotes: 8