sdhaus

Reputation: 1896

Adding a column to the end of a column within the same DataFrame

I currently have a DataFrame which I scraped from the internet using Beautiful Soup. However, it is set up as a grid rather than a continuous list: months as rows and years as columns.

I am trying to turn it into one continuous column, because this data will be plotted against other data, e.g. births vs deaths.

An example of the df I currently have is below:

         2010      2011      2013      2014
Jan  1.474071 -0.064034  0.781836 -1.282782
Feb -1.071357  0.441153  0.583787  2.353925
Mar  0.221471 -0.744471  1.729689  0.758527
Apr -0.964980 -0.845696  1.846883 -1.340896
May -1.328865  1.682706  0.888782 -1.717693
Jun  0.228440  0.901805  0.520260  1.171216
Jul -1.197071 -1.066969 -0.858447 -0.303421
Aug  0.306996 -0.028665  1.574159  0.384316
Sep -0.014805 -0.284319 -1.461665  0.650776
Oct  1.588931  0.476720 -0.242861  0.473424
Nov -0.014805 -0.284319 -1.461665  0.650776
Dec  0.964980 -0.845696  1.846883 -1.340896

However, when I try append (with ignore_index=True), I get:

 df[["2010"]].append(df[["2011"]], ignore_index=True)

  00    1.474071    NaN  
  01   -1.071357    NaN  
  02    0.221471    NaN  
  03   -0.964980    NaN 
  04   -1.328865    NaN  
  05    0.228440    NaN  
  06   -1.197071    NaN 
  07    0.306996    NaN 
  08   -0.014805    NaN
  09    1.588931    NaN
  11   -0.014805    NaN 
  12    NaN         -0.064034 
  13    NaN          0.441153  
  14    NaN         -0.744471 
  15    NaN         -0.845696 
  16    NaN          1.682706  

What I want is the whole dataset in one continuous column, e.g.

  00    1.474071   
  01   -1.071357    
  02    0.221471     
  03   -0.964980   
  04   -1.328865    
  05    0.228440    
  06   -1.197071   
  07    0.306996   
  08   -0.014805   
  09    1.588931  
  11   -0.014805   
  12   -0.064034 
  13    0.441153  
  14   -0.744471 
  15   -0.845696 
  16    1.682706 

How do I get all four columns into one single column?

Upvotes: 3

Views: 5115

Answers (2)

Vidhya G

Reputation: 2330

Another way to do this is to unstack the DataFrame. Then reset the index to the default integer index with reset_index(drop=True):

df.unstack().reset_index(drop=True)
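
To see what this does, here is a minimal sketch on a small made-up frame (the values and the three-month, two-year shape are assumptions for illustration, not the asker's data). On a DataFrame with a plain, single-level index, unstack returns a Series keyed by (column, row), so the values come out column by column:

```python
import pandas as pd

# Hypothetical stand-in for the scraped table: months as rows, years as columns.
df = pd.DataFrame(
    {"2010": [1.0, 2.0, 3.0], "2011": [4.0, 5.0, 6.0]},
    index=["Jan", "Feb", "Mar"],
)

# unstack() moves the column labels into the index,
# giving a Series keyed by (year, month).
stacked = df.unstack()

# Dropping that index leaves one continuous column in year-then-month order.
flat = stacked.reset_index(drop=True)
print(flat.tolist())
```

If the year and month labels are needed later for plotting, keeping `stacked` (or calling reset_index without drop=True) preserves them.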

Upvotes: 6

EdChum

Reputation: 394389

You can build a list of the columns and pass it to concat. Each df[col] is a Series (squeeze leaves it unchanged here), so concat stacks them end to end instead of aligning on column names as it does with DataFrames. Passing ignore_index=True creates a new integer index; otherwise you'll get the month names repeated as index values:

In [228]:

cols = [df[col].squeeze() for col in df]
pd.concat(cols, ignore_index=True)
Out[228]:
0     1.474071
1    -1.071357
2     0.221471
3    -0.964980
4    -1.328865
5     0.228440
6    -1.197071
7     0.306996
8    -0.014805
9     1.588931
10   -0.014805
11    0.964980
12   -0.064034
13    0.441153
14   -0.744471
15   -0.845696
16    1.682706
17    0.901805
18   -1.066969
19   -0.028665
20   -0.284319
21    0.476720
22   -0.284319
23   -0.845696
24    0.781836
25    0.583787
26    1.729689
27    1.846883
28    0.888782
29    0.520260
30   -0.858447
31    1.574159
32   -1.461665
33   -0.242861
34   -1.461665
35    1.846883
36   -1.282782
37    2.353925
38    0.758527
39   -1.340896
40   -1.717693
41    1.171216
42   -0.303421
43    0.384316
44    0.650776
45    0.473424
46    0.650776
47   -1.340896
dtype: float64
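
The effect of ignore_index can be checked with a small sketch like the one below. The two-by-two frame and its values are made up for illustration, and the names with_months and renumbered are hypothetical:

```python
import pandas as pd

# Hypothetical miniature of the month-by-year grid.
df = pd.DataFrame(
    {"2010": [1.0, 2.0], "2011": [3.0, 4.0]},
    index=["Jan", "Feb"],
)

# Each df[col] is a Series, so concat appends them one after another
# rather than aligning on column names.
cols = [df[col] for col in df]

# Without ignore_index the month labels are carried over and repeat.
with_months = pd.concat(cols)

# With ignore_index=True the result gets a fresh 0..n-1 index.
renumbered = pd.concat(cols, ignore_index=True)

print(list(with_months.index))
print(list(renumbered.index))
```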

Upvotes: 2
