ClonedOne

Reputation: 107

Scatter plot with only categorical data

I would like to draw a scatter plot of data points of the form (string, string), where each coordinate is a string taken from a fixed set of values: one set for the X axis and one for the Y axis. I'm having trouble finding a library, preferably in Python, that can plot purely categorical data (no numeric values).

I have tried Seaborn's swarmplot, but it seems at least one coordinate must be numeric.

I know that points with the same two coordinates would collide, and I was hoping to find a library that draws such points adjacent to each other (cluster-like).

Thanks.

Upvotes: 1

Views: 4136

Answers (1)

neocortex

Reputation: 379

pandas is a great library for this.

You can create a DataFrame from your categorical variables (note the dtype='category' argument to the DataFrame constructor), then get the numerical codes for each categorical variable and make the scatter plot with pandas itself, matplotlib, or whatever you like.

Example:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'col1': list('abcab'), 'col2': list('acbbb')}, dtype='category')

In [3]: df
Out[3]:
  col1 col2
0    a    a
1    b    c
2    c    b
3    a    b
4    b    b

In [4]: df_num = df.apply(lambda x: x.cat.codes)

In [5]: df_num
Out[5]:
   col1  col2
0     0     0
1     1     2
2     2     1
3     0     1
4     1     1

In [6]: df_num.plot.scatter('col1', 'col2')
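Plotting the raw codes leaves numeric ticks on both axes, and identical (col1, col2) pairs are drawn on top of each other. A minimal sketch of one way around both, assuming matplotlib and NumPy are available: add a little random jitter to the codes and relabel the ticks with the category names. The helper name `categorical_scatter` and its `jitter` and `seed` parameters are made up for this illustration and are not part of pandas or matplotlib.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def categorical_scatter(df, x, y, jitter=0.15, seed=0, ax=None):
    """Scatter two categorical columns; jitter keeps duplicate points side by side.

    Hypothetical helper for illustration only.
    """
    xs = df[x].astype("category")
    ys = df[y].astype("category")
    rng = np.random.default_rng(seed)

    # The integer codes give each category a position on the axis; small
    # uniform noise separates points that share the same (x, y) pair.
    x_pos = xs.cat.codes.to_numpy() + rng.uniform(-jitter, jitter, len(df))
    y_pos = ys.cat.codes.to_numpy() + rng.uniform(-jitter, jitter, len(df))

    if ax is None:
        _, ax = plt.subplots()
    ax.scatter(x_pos, y_pos)

    # Show the category names on the ticks instead of the numeric codes.
    ax.set_xticks(range(len(xs.cat.categories)))
    ax.set_xticklabels(list(xs.cat.categories))
    ax.set_yticks(range(len(ys.cat.categories)))
    ax.set_yticklabels(list(ys.cat.categories))
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    return ax


df = pd.DataFrame({'col1': list('abcab'), 'col2': list('acbbb')})
ax = categorical_scatter(df, 'col1', 'col2')
# plt.show()  # uncomment when running interactively
```

Random jitter is only one option for colliding points. Counting each pair and scaling the marker size by the count would be an alternative if exact positions matter more than seeing every point.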


Upvotes: 3
