Iterate over columns to slice dataset

Question

I have the following dataset. The columns named 2, 3, 4, ..., 9 are filled with topic names that overlap with each other. Pageviews is the outcome variable.

        2                           3                       Pageviews
0       Financial Services          Consumer Products       4106.0
1       Consumer Products           ...                     3368.0
2       Consumer Products           ...                     1025.0
3       Collaboration               ...                     7840.0
4       Future of Supply Chains     ...                     2076.0

I would like to slice each topic column (2, 3, 4, ...) together with Pageviews and append the slices, so that I end up with a single dataframe with one topic column and Pageviews.

I am used to looping in Stata, where you can loop through the column names using x, but I understand it works quite differently in Python.

I started with

for x in range(2, 9):
    df_x = df[['Pageviews',  df.x]]

but Python does not recognize df.x

How do you loop through column names? And is it possible to use the iterator to create new dataframes?

Thanks!

EDIT

My desired output is

                                       Col        Pageviews
0                           Financial Services      4106.0
1                            Consumer Products      3368.0
2                            Consumer Products      1025.0
3                                 Collaboration     7840.0
4                      Future of Supply Chains      2076.0
5                          Future of Reporting      2123.0
6                    Sustainability Management     15576.0
7                                 Human Rights        52.0
8                                      BSR News      903.0
9                       Energy and Extractives      1232.0
10                                  HERproject       616.0
11                   Sustainability Management     10697.0

where Col is the result of appending columns 2, 3, 4, ... and Pageviews is the result of appending the respective Pageviews columns.

BENY · Accepted Answer

Using melt

df.melt('Pageviews').drop(columns='variable')
Out[644]: 
    Pageviews                 value
0        1210      ConsumerProducts
1        1528         Collaboration
2        1716     FinancialServices
3        1403         Collaboration
4        1090      ConsumerProducts
5        1210      ConsumerProducts
6        1528  FutureofSupplyChains
7        1716      ConsumerProducts
8        1403     FinancialServices
9        1090  FutureofSupplyChains
10       1210     FinancialServices
11       1528     FinancialServices
12       1716         Collaboration
13       1403  FutureofSupplyChains
14       1090     FinancialServices
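
To answer the looping part of the question as well: `df.x` looks for a column literally named "x", so inside a loop you index with the loop variable instead (`df[x]`), and you collect the slices in a list rather than trying to create names like `df_x`. Below is a minimal sketch of both approaches side by side. The sample frame is made up for illustration (only two topic columns, with values borrowed from the question), and it assumes the topic columns are labelled with integers and that missing topics should be dropped.

```python
import pandas as pd

# Hypothetical sample data shaped like the question's frame:
# integer-labelled topic columns plus the Pageviews outcome.
df = pd.DataFrame({
    2: ["Financial Services", "Consumer Products", "Collaboration"],
    3: ["Consumer Products", "Human Rights", "BSR News"],
    "Pageviews": [4106.0, 3368.0, 7840.0],
})

# Option 1: melt, keeping Pageviews as the identifier column.
# value_name renames the stacked topic column to "Col".
melted = (
    df.melt(id_vars="Pageviews", value_name="Col")
      .drop(columns="variable")      # drop the original column label
      .dropna(subset=["Col"])        # assumption: rows without a topic are unwanted
      .reset_index(drop=True)
)

# Option 2: explicit loop over the column names.
# df[col] (not df.col) selects the column whose label is stored in col.
topic_cols = [c for c in df.columns if c != "Pageviews"]
pieces = [
    df[[col, "Pageviews"]].rename(columns={col: "Col"})
    for col in topic_cols
]
looped = (
    pd.concat(pieces, ignore_index=True)
      .dropna(subset=["Col"])
      .reset_index(drop=True)
)

print(melted[["Col", "Pageviews"]])
print(looped)
```

Both options should give the same rows in the same order (all of column 2, then all of column 3). melt is shorter, while the loop is closer to how you would write it in Stata.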
