Amatya

Reputation: 1243

Missing first row when constructing a Series from a DataFrame

I have a dictionary I call 'test_dict'

test_dict = {'OBJECTID': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5},
'Country': {0: 'Vietnam',
 1: 'Vietnam',
 2: 'Vietnam',
 3: 'Vietnam',
 4: 'Vietnam'},
'Location': {0: 'Nha Trang',
 1: 'Hue',
 2: 'Phu Quoc',
 3: 'Chu Lai',
 4: 'Lao Bao'},
'Lat': {0: 12.250000000000057,
 1: 16.401000000000067,
 2: 10.227000000000032,
 3: 15.406000000000063,
 4: 16.627300000000048},
'Long': {0: 109.18333300000006,
 1: 107.70300000000009,
 2: 103.96700000000004,
 3: 108.70600000000007,
 4: 106.59970000000004}}

which I convert to a DataFrame:

import pandas as pd

test_df = pd.DataFrame(test_dict)

and I get this:

   OBJECTID  Country   Location      Lat        Long
0         1  Vietnam  Nha Trang  12.2500  109.183333
1         2  Vietnam        Hue  16.4010  107.703000
2         3  Vietnam   Phu Quoc  10.2270  103.967000
3         4  Vietnam    Chu Lai  15.4060  108.706000
4         5  Vietnam    Lao Bao  16.6273  106.599700

I want to construct a Series of location names, with the column "OBJECTID" as the index. When I try it, I lose the first row.

pd.Series(test_df.Location, index=test_df.OBJECTID)

I get this:

OBJECTID
1         Hue
2    Phu Quoc
3     Chu Lai
4     Lao Bao
5         NaN
Name: Location, dtype: object

What I was hoping to get was this:

  OBJECTID
  1    Nha Trang
  2    Hue
  3    Phu Quoc
  4    Chu Lai
  5    Lao Bao

What am I doing wrong here? Why is the process of converting into a Series losing the first row?

Upvotes: 1

Views: 273

Answers (2)

René

Reputation: 4827

You can use:

pd.Series(test_df.Location).reindex(test_df.OBJECTID)

Result:

OBJECTID
1         Hue
2    Phu Quoc
3     Chu Lai
4     Lao Bao
5         NaN
Name: Location, dtype: object

Upvotes: 0

timgeb

Reputation: 78650

You can fix your code via

pd.Series(test_df.Location.values, index=test_df.OBJECTID)

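because the problem is that test_df.Location has an index of its own, running from 0 to 4. When you pass a Series together with index=, pandas looks up the new labels (1 to 5) in that existing index instead of assigning values by position. Label 0 is not requested, so 'Nha Trang' is dropped, and label 5 does not exist, so it becomes NaN. Using .values strips the old index, so the values are paired with OBJECTID by position.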
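To see the difference side by side, here is a minimal sketch. It assumes a trimmed two-column stand-in for the question's DataFrame, and the names misaligned and by_position are only illustrative:

```python
import pandas as pd

# Trimmed stand-in for the question's DataFrame (assumption: two columns suffice).
test_df = pd.DataFrame({
    "OBJECTID": [1, 2, 3, 4, 5],
    "Location": ["Nha Trang", "Hue", "Phu Quoc", "Chu Lai", "Lao Bao"],
})

# A Series passed with index= is reindexed by label: labels 1..5 are looked up
# in the existing 0..4 index, so label 0 is dropped and label 5 becomes NaN.
misaligned = pd.Series(test_df.Location, index=test_df.OBJECTID)

# The bare values carry no index, so they are paired with OBJECTID by position.
by_position = pd.Series(test_df.Location.values, index=test_df.OBJECTID)

print(misaligned)
print(by_position)
```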

Edit - my preferred alternative:

test_df.set_index('OBJECTID')['Location']
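As a rough sketch of what this alternative returns, again assuming a trimmed stand-in for the question's data (the name locations is only illustrative):

```python
import pandas as pd

# Trimmed stand-in for the question's DataFrame (assumption: two columns suffice).
test_df = pd.DataFrame({
    "OBJECTID": [1, 2, 3, 4, 5],
    "Location": ["Nha Trang", "Hue", "Phu Quoc", "Chu Lai", "Lao Bao"],
})

# set_index moves OBJECTID into the index row by row, with no label lookup;
# selecting 'Location' then returns that column as a Series.
locations = test_df.set_index("OBJECTID")["Location"]

print(locations)
```

Since set_index returns a new DataFrame by default, test_df itself is left unchanged.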

Upvotes: 3
