user12550148

Reputation: 53

Clean up data in CSV without Pandas

I'm trying to plot a line chart with the following data (csv format) as attached.

  1. I want to consolidate all the quarters into a year. For example: 1990-Q1, 1990-Q2, 1990-Q3, and 1990-Q4 become 1990.

  2. Based on the year, I would like to consolidate the indexes, e.g. combine all 4 index values for that year.

  3. I want to filter the years to only 2007 - 2017, as the dataset has years/quarters from 1990 to 2019.

How do I do that without using Pandas?

I have added my partial code, but it seems like I might be heading in the wrong direction. Can someone please guide me?

Upvotes: 0

Views: 427

Answers (1)

hpaulj

Reputation: 231665

Are you happy with the data that you get from genfromtxt? A structured array like that should be nearly as useful for this as a pandas DataFrame. It has the same info.
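For reference, here is a minimal sketch of loading a CSV into a structured array with genfromtxt. The question's actual file and column names are not shown, so the header names `quarter` and `index` and the sample values are assumptions made for illustration.

```python
import io
import numpy as np

# Hypothetical stand-in for the CSV file; the real header names may differ.
csv_text = io.StringIO(
    "quarter,index\n"
    "1990-Q1,10.0\n"
    "1990-Q2,12.0\n"
)

# names=True takes field names from the header row;
# dtype=None lets genfromtxt infer a type per column.
data = np.genfromtxt(csv_text, delimiter=",", names=True,
                     dtype=None, encoding="utf-8")

print(data.dtype.names)   # field names taken from the header
print(data["index"])      # one column, accessed by field name
```

Each field of the structured array can be accessed by name, much like a DataFrame column.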

Looking at the .png, it looks like the quarters are in consecutive order, without gaps. If so,

data1 = data.reshape(-1,4)

should give an (n,4) array, with one year per row.

data1['index'].sum(axis=1)

should be the sum of index values for each year (or you might want the mean).
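As a rough illustration of the reshape idea, here is a small self-contained sketch. The sample values and the `quarter` field name are made up; only the `index` field name comes from the snippet above. It assumes the quarters are in order with exactly four per year.

```python
import numpy as np

# Hypothetical sample: two full years of quarterly values, in order, no gaps.
data = np.array(
    [("1990-Q1", 10.0), ("1990-Q2", 12.0), ("1990-Q3", 14.0), ("1990-Q4", 16.0),
     ("1991-Q1", 20.0), ("1991-Q2", 22.0), ("1991-Q3", 24.0), ("1991-Q4", 26.0)],
    dtype=[("quarter", "U7"), ("index", "f8")],
)

data1 = data.reshape(-1, 4)                  # shape (2, 4): one row per year
yearly_sum = data1["index"].sum(axis=1)      # total of the 4 quarters per year
yearly_mean = data1["index"].mean(axis=1)    # or the average, if preferred

# Year label for each row, parsed from the first quarter's string.
years = np.array([int(q[:4]) for q in data1["quarter"][:, 0]])

print(years, yearly_sum, yearly_mean)
```

If the data has a missing or extra quarter, the reshape either fails or silently misaligns the years, so it is worth checking that each row's four quarter strings share the same year.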

You can pick a range of years from data1 with data1[n:m], where you choose the range by counting/calculating or even parsing the year string.
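Both ways of choosing the range could look something like this. It is a sketch on synthetic data covering 1990 to 2019; the `quarter` field name and the constant index values are assumptions, not the question's real data.

```python
import numpy as np

# Hypothetical stand-in data: every quarter from 1990-Q1 to 2019-Q4.
labels = [f"{y}-Q{q}" for y in range(1990, 2020) for q in range(1, 5)]
data = np.zeros(len(labels), dtype=[("quarter", "U7"), ("index", "f8")])
data["quarter"] = labels
data["index"] = 1.0

data1 = data.reshape(-1, 4)        # one row per year

# Option 1: count rows from the first year in the file.
first_year = 1990
n = 2007 - first_year              # row of 2007
m = 2017 - first_year + 1          # one past the row of 2017
by_count = data1[n:m]

# Option 2: parse the year out of each row's first quarter string.
years = np.array([int(q[:4]) for q in data1["quarter"][:, 0]])
mask = (years >= 2007) & (years <= 2017)
by_parse = data1[mask]

print(by_count.shape, by_parse.shape)
```

Counting is shorter, but parsing the year string does not depend on knowing which year the file starts with.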

You could stick with splitting x on 'year' and 'quarter', and so on, but I think the reshape saves a lot of work.
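For comparison, the splitting approach might look roughly like this in plain Python, with no NumPy or Pandas. The CSV layout, the column names `quarter` and `index`, and the sample values are assumptions, since the question's file is not shown.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for the CSV file.
sample = io.StringIO(
    "quarter,index\n"
    "2006-Q4,90.0\n"
    "2007-Q1,100.0\n"
    "2007-Q2,102.0\n"
    "2007-Q3,104.0\n"
    "2007-Q4,106.0\n"
    "2018-Q1,150.0\n"
)

totals = defaultdict(float)   # year -> sum of index values
counts = defaultdict(int)     # year -> number of quarters seen

for row in csv.DictReader(sample):
    year_text, _quarter = row["quarter"].split("-")   # "2007-Q1" -> "2007", "Q1"
    year = int(year_text)
    if 2007 <= year <= 2017:                          # keep only the wanted years
        totals[year] += float(row["index"])
        counts[year] += 1

yearly_mean = {year: totals[year] / counts[year] for year in sorted(totals)}
print(dict(totals), yearly_mean)
```

This version does not require the quarters to be consecutive or complete, at the cost of a little more bookkeeping than the reshape.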

Upvotes: 1
