FooBar

Reputation: 16488

Construct DatetimeIndex if you only have year

I have data of the structure

     country  year        POP
606  Algeria  1966  12339.140
730  Algeria  1968  13146.267
793  Algeria  1969  13528.304
856  Algeria  1970  13931.846
924  Algeria  1971  14335.388

Now I want to compute first differences per country by year (the change from one year to the next). Note that the years are not contiguous: 1967 is missing. If it weren't for that gap, I'd do something along the lines of

df.sort_values(['country', 'year']).set_index(['country', 'year']).diff()

Instead, I guess I have to convert year with to_datetime() first. Is there a simple way to create a datetime from a column that contains only years? And is there a more natural approach to computing the differences over time?

Upvotes: 0

Views: 95

Answers (1)

filmor

Reputation: 32222

You could just do

import datetime
df.set_index(df.year.map(lambda x: datetime.datetime(x, 1, 1)))

That uses the concept of left-open intervals.
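A vectorized variant of the same idea, as a rough sketch: pd.to_datetime with format="%Y" parses each year as January 1 of that year. The frame is rebuilt from the sample in the question; the names stamps, indexed and the index name "date" are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "country": ["Algeria"] * 5,
    "year": [1966, 1968, 1969, 1970, 1971],
    "POP": [12339.140, 13146.267, 13528.304, 13931.846, 14335.388],
})

# Parse the years as strings; each one becomes January 1 of that year.
stamps = pd.to_datetime(df["year"].astype(str), format="%Y")

# Use the timestamps as the index ("date" is an arbitrary name).
indexed = df.set_index(pd.DatetimeIndex(stamps, name="date"))
```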

Another possibility is

df.set_index(df.year.map(pd.Period))

Both return equally well-defined indexes. In the latter case you might like the output of df.diff() better, since it actually states a year.
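For the per-country differences themselves, a date index may not even be needed. The sketch below is one possible approach, not the only one: it groups by country, differences POP, and also differences year so that rows following a gap (here the missing 1967) can be masked out. The column names POP_diff, year_gap and POP_diff_1y are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "country": ["Algeria"] * 5,
    "year": [1966, 1968, 1969, 1970, 1971],
    "POP": [12339.140, 13146.267, 13528.304, 13931.846, 14335.388],
})

df = df.sort_values(["country", "year"])

# Change in POP since the previous available observation of the same country.
df["POP_diff"] = df.groupby("country")["POP"].diff()

# Years elapsed since the previous observation (2 where 1967 is missing).
df["year_gap"] = df.groupby("country")["year"].diff()

# Keep only true year-over-year differences; rows after a gap become NaN.
df["POP_diff_1y"] = df["POP_diff"].where(df["year_gap"] == 1)
```

Whether a difference spanning a gap should be dropped, or instead divided by year_gap to give an average yearly change, depends on the analysis.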

Upvotes: 1
