Get List of Unique String values per column in a dataframe using python

Question

here I go with another question

I have a large dataframe of about 20 columns by 400,000 rows. The dataset cannot contain strings, because the software that will process the data accepts only numeric values and nulls.

So the way I am thinking it might work is the following:

1. Go through each column.
2. Get the list of unique strings.
3. Replace each string with a value from 0 to X.
4. Repeat the process for the next column.
5. Repeat for the next dataframe.
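To make steps 1 and 2 concrete, here is a minimal sketch of collecting the unique strings per column. The small `df` is a made-up miniature of the real data, and the helper name `unique_strings_per_column` is hypothetical, not something pandas provides.

```python
import pandas as pd

# Hypothetical miniature of the real dataframe (which has about 20 columns).
df = pd.DataFrame({
    "FNRHP306H": ["NORMAL", "NORMAL", "NORMAL", "HIGH", "LOW", "NORMAL"],
    "FNRHP306HC": ["NORMAL", "NORMAL", "HIGH", "NORMAL", "NORMAL", "LOW"],
    "FNRHP306_2MEC_MAX": [1050] * 6,
})


def unique_strings_per_column(frame):
    """Map each column that holds strings to its unique string values,
    listed in order of first appearance. Columns without strings are omitted."""
    result = {}
    for name in frame.columns:
        col = frame[name]
        # Keep only the cells that are actual Python strings.
        strings = col[col.map(lambda v: isinstance(v, str))]
        if not strings.empty:
            result[name] = list(strings.unique())
    return result


uniques = unique_strings_per_column(df)
print(uniques)
```

Checking each cell with `isinstance` is slower than inspecting the column dtype, but it also works for columns that mix strings with numbers or nulls.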

This is what the dataframe looks like:

DATE        TIME    FNRHP306H   FNRHP306HC  FNRHP306_2MEC_MAX
7-Feb-15    0:00:00 NORMAL      NORMAL      1050
7-Feb-15    0:01:00 NORMAL      NORMAL      1050
7-Feb-15    0:02:00 NORMAL      HIGH        1050
7-Feb-15    0:03:00 HIGH        NORMAL      1050
7-Feb-15    0:04:00 LOW         NORMAL      1050
7-Feb-15    0:05:00 NORMAL      LOW         1050

This is the expected result:

DATE        TIME    FNRHP306H   FNRHP306HC  FNRHP306_2MEC_MAX
7-Feb-15    0:00:00 0           0           1050
7-Feb-15    0:01:00 0           0           1050
7-Feb-15    0:02:00 0           1           1050
7-Feb-15    0:03:00 1           0           1050
7-Feb-15    0:04:00 2           0           1050
7-Feb-15    0:05:00 0           2           1050
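For reference, the expected result above could be produced along these lines with `pd.factorize`, which numbers values in order of first appearance. This is only a sketch under some assumptions: the helper `encode_string_columns` and its `skip` parameter are hypothetical names, DATE and TIME are assumed to stay untouched as in the expected table, and each encoded column is assumed to hold only strings (or missing values).

```python
import pandas as pd

# The sample rows from the question.
df = pd.DataFrame({
    "DATE": ["7-Feb-15"] * 6,
    "TIME": ["0:00:00", "0:01:00", "0:02:00", "0:03:00", "0:04:00", "0:05:00"],
    "FNRHP306H": ["NORMAL", "NORMAL", "NORMAL", "HIGH", "LOW", "NORMAL"],
    "FNRHP306HC": ["NORMAL", "NORMAL", "HIGH", "NORMAL", "NORMAL", "LOW"],
    "FNRHP306_2MEC_MAX": [1050] * 6,
})


def encode_string_columns(frame, skip=("DATE", "TIME")):
    """Return a copy of `frame` with string columns replaced by integer codes,
    plus a dict mapping each encoded column to its labels (code i -> labels[i])."""
    encoded = frame.copy()
    mappings = {}
    for name in encoded.columns:
        if name in skip:
            continue
        col = encoded[name]
        # Only touch columns that contain at least one string.
        if col.map(lambda v: isinstance(v, str)).any():
            # Codes follow order of first appearance; missing values get -1.
            codes, labels = pd.factorize(col)
            encoded[name] = codes
            mappings[name] = list(labels)
    return encoded, mappings


encoded, mappings = encode_string_columns(df)
print(encoded)
print(mappings)
```

Note that each column gets its own numbering, so the same string may map to different codes in different columns. Because `pd.factorize` marks missing values with -1 rather than leaving them null, those codes would need to be turned back into nulls if the downstream software expects them.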

I am using Python 3.5 and the latest version of pandas.

Thanks in advance

JV
