Trying_hard

Reputation: 9501

X values in Seaborn

I have the following chart. The x-axis shows years from 1960 to 2020. I want to make the tick labels readable and can't find a good way to do this. I am plotting with:

ax = sns.barplot(x="Date",y="count",data=df1)

enter image description here

df:

    Date    count
0   1962    17
1   1963    2
2   1965    1
3   1966    14
4   1967    3
5   1968    4
6   1969    7
7   1970    24
8   1971    6
9   1973    25
10  1974    62
11  1975    23
12  1976    8
13  1977    3
14  1978    9
15  1979    9
16  1980    35
17  1981    15
18  1982    41
19  1983    19
20  1984    20
21  1985    9
22  1986    23
23  1987    62
24  1988    30
25  1989    15
26  1990    32
27  1991    20
28  1992    3
29  1993    4
30  1994    11
31  1995    2
32  1996    14
33  1997    38
34  1998    43
35  1999    52
36  2000    59
37  2001    60
38  2002    85
39  2003    34
40  2004    9
41  2005    4
42  2006    10
43  2007    29
44  2008    98
45  2009    68
46  2010    33
47  2011    54
48  2012    21
49  2013    6
50  2014    12
51  2015    26
52  2016    15
53  2018    29
54  2019    7
55  2020    19

I have tried:

ax = ax.set_xticks(np.arange(1960, 2021,1))
plt.xticks(ax.get_xticks(), ax.get_xticks()  *5)
plt.yticks([1960, 1970, 1980, 1990, 2000, 2010])

None of these works as desired.

Upvotes: 0

Views: 2630

Answers (3)

Trenton McKinney

Reputation: 62523

  • If each bar must have a tick label, increase the figure size, and rotate the labels.
import matplotlib.pyplot as plt
import seaborn as sns

# change the figure size
plt.figure(figsize=(20,5))

ax = sns.barplot(x="Date", y="count", data=df)

# rotate the labels
ax.set_xticklabels(ax.get_xticklabels(), rotation=45)
plt.show()

enter image description here

Upvotes: 3

JohanC

Reputation: 80509

Even though the column with years is numerical, Seaborn draws bar plots on a categorical x-axis (so the years are treated as strings, placed at positions 0, 1, 2, ...). You can thin out the ticks by first saving the tick positions and their labels into two arrays, and then showing only every fifth one:

from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

df1 = pd.DataFrame(
    {'Date': [1962, 1963, 1965, 1966, 1967, 1968, 1969, 1970, 1971, 1973, 1974, 1975, 1976, 1977, 1978, 1979,
              1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995,
              1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011,
              2012, 2013, 2014, 2015, 2016, 2018, 2019, 2020],
     'count': [17, 2, 1, 14, 3, 4, 7, 24, 6, 25, 62, 23, 8, 3, 9, 9, 35, 15, 41, 19, 20, 9, 23, 62, 30,
               15, 32, 20, 3, 4, 11, 2, 14, 38, 43, 52, 59, 60, 85, 34, 9, 4, 10, 29, 98, 68, 33, 54, 21,
               6, 12, 26, 15, 29, 7, 19]})
ax = sns.barplot(x="Date", y="count", data=df1)
ticks, labels = plt.xticks()
plt.xticks(ticks[::5], labels[::5])

plt.show()

seaborn plot

This can be somewhat confusing, because 3 years are missing, which isn't clear from the plot. Also, Seaborn's default colors can be confusing, as they relate only to the progression of time.
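If you want to stay with Seaborn, one way around the hidden gaps is to add the missing years with a count of 0 before plotting, and then label only the round decades. The following is a minimal sketch, not part of the original answer: `fill_missing_years` and `decade_ticks` are hypothetical helper names, and the data is only the first nine rows from the question.

```python
import pandas as pd

# First nine rows of the question's data (1964 is absent, as in the original).
df1 = pd.DataFrame({"Date": [1962, 1963, 1965, 1966, 1967, 1968, 1969, 1970, 1971],
                    "count": [17, 2, 1, 14, 3, 4, 7, 24, 6]})


def fill_missing_years(df, year_col="Date", value_col="count"):
    """Return one row per year between the first and last year; absent years get 0."""
    years = range(df[year_col].min(), df[year_col].max() + 1)
    full = df.set_index(year_col)[value_col].reindex(years, fill_value=0)
    return full.rename_axis(year_col).reset_index()


def decade_ticks(years, step=10):
    """Return the categorical positions and labels of years divisible by `step`."""
    positions = [i for i, year in enumerate(years) if year % step == 0]
    return positions, [years[i] for i in positions]


full = fill_missing_years(df1)
positions, labels = decade_ticks(full["Date"].tolist())
```

With these, you would plot `full` instead of `df1` using `sns.barplot(x="Date", y="count", data=full)`, and then call `ax.set_xticks(positions)` followed by `ax.set_xticklabels(labels)`. The missing years then show up as empty slots, and the ticks land on 1970, 1980, and so on.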

An alternative is to use matplotlib directly. In this case it's rather straightforward. Matplotlib treats the x-axis as numerical and automatically generates reasonable ticks. It also becomes clear that 1964, 1972 and 2017 are missing. If needed, the bars can still be colored by year.

from matplotlib import pyplot as plt
import numpy as np
import pandas as pd

# df1 = ...

plt.bar(df1["Date"], df1["count"], color='indigo', edgecolor='white')
plt.show()

matplotlib plot

PS: To color using the year:

cmap = plt.cm.rainbow
plt.bar(df1["Date"], df1["count"], color=[cmap((d-1962)/60) for d in df1["Date"]])

Upvotes: 2

Kate Melnykova

Reputation: 1873

Your data runs from 1962 to 2020, which means about 60 data points. Reading 60 labels is hard, so unless you have a very good reason to show them all, I suggest one of the following:

  1. Combine the data for 5 or 10 consecutive years. Then you have only 12 or 6 bars, which are easy to take in at a glance. It may also do some denoising (depending on the data itself).

  2. If the flow of time is important, e.g., you want to show the trend, consider a regular line plot with ticks. You may color the area below the line if you prefer.
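As a rough illustration of the first option, the counts can be summed per 5-year bin with pandas. This is only a sketch under assumptions: `bin_years` is a hypothetical helper name, each bin is labeled by its first year, and the data is just the first nine rows from the question.

```python
import pandas as pd

# First nine rows of the question's data, for illustration only.
df1 = pd.DataFrame({"Date": [1962, 1963, 1965, 1966, 1967, 1968, 1969, 1970, 1971],
                    "count": [17, 2, 1, 14, 3, 4, 7, 24, 6]})


def bin_years(df, width=5, year_col="Date", value_col="count"):
    """Sum the counts over consecutive bins of `width` years.

    Each bin is labeled by its first year, e.g. 1965 covers 1965-1969 for width=5.
    """
    bin_start = (df[year_col] // width) * width
    summed = df.groupby(bin_start)[value_col].sum()
    return summed.rename_axis("bin_start").reset_index()


binned = bin_years(df1, width=5)
print(binned)
```

The result could then be drawn with `plt.bar(binned["bin_start"], binned["count"], width=4, align="edge")`. For the second option, `plt.plot(df1["Date"], df1["count"])` followed by `plt.fill_between(df1["Date"], df1["count"], alpha=0.3)` gives a line with a shaded area below it.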

Upvotes: 3
