Reputation: 3193
I have the following dataframe:
dataframe = pd.DataFrame({'Date': ['2017-04-01 00:24:17','2017-04-01 00:54:16','2017-04-01 01:24:17'] * 1000, 'Luminosity':[2,3,4] * 1000})
Printing dataframe gives this:
                  Date  Luminosity
0  2017-04-01 00:24:17           2
1  2017-04-01 00:54:16           3
2  2017-04-01 01:24:17           4
. . .
. . .
I want to select just the Luminosity column. Using Python slices, I have the following:
X = dataframe.iloc[:, 1].values
# Reshape the 1-D values into a single-column 2-D array
X = X.reshape(-1, 1)
And the output of X is the following numpy array:
array([[2],
       [3],
       [4],
       ...,
       [2],
       [3],
       [4]])
I have the same situation, but with a new dataframe that has 76 columns.
This is the output when I read it.
In total, the dataframe has 76 columns. I want to select just 25 of them: the columns named PORVL2N1, PORVL2N2, PORVL4N1 and so on, up to the last column, PORVL24N2, which is the 76th column.
For the moment, my solution is to create a new dataframe containing only the columns of interest:
a = df[['PORVL2N1', 'PORVL2N2', 'PORVL4N1', 'PORVL5N1', 'PORVL6N1', 'PORVL7N1',
'PORVL9N1', 'PORVL9N1', 'PORVL10N1', 'PORVL13N1', 'PORVL14N1', 'PORVL15N1',
'PORVL16N1', 'PORVL16N2', 'PORVL18N1', 'PORVL18N2', 'PORVL18N3','PORVL18N4',
'PORVL21N1', 'PORVL21N2', 'PORVL21N3', 'PORVL21N4', 'PORVL21N5', 'PORVL24N1',
'PORVL24N2']]
And the output is:
I want to do the same thing, selecting just the columns of interest, but using Python slices with iloc to index and select by position, as I did at the beginning of my question.
I know this is possible with slices, but I don't understand the slice syntax well enough to do it.
How can I use iloc and Python slices to select the columns I'm interested in?
Upvotes: 0
Views: 553
Reputation: 23753
Use regular slice notation...
>>> df
   a  b  c  d  e
0  1  1  1  1  1
1  2  2  2  2  2
2  3  3  3  3  3
3  4  4  4  4  4
4  5  5  5  5  5
>>> df.iloc[:,2:]
   c  d  e
0  1  1  1
1  2  2  2
2  3  3  3
3  4  4  4
4  5  5  5
>>> df.iloc[:,-2:]
   d  e
0  1  1
1  2  2
2  3  3
3  4  4
4  5  5
>>>
Slice objects also work:
>>> last3 = slice(-3,None)
>>> df.iloc[:,last3]
   c  d  e
0  1  1  1
1  2  2  2
2  3  3  3
3  4  4  4
4  5  5  5
>>>
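A single slice only covers one contiguous run of columns. If the columns of interest sit in several separate runs, one option is to join several slices into one array of positions with NumPy's np.r_ and pass that to iloc. This is only a sketch: the 10-column frame, the names c0..c9 and the chosen positions are made up for illustration and are not the real layout of the 76-column dataframe.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real one; the column names are hypothetical.
df = pd.DataFrame(
    np.arange(40).reshape(4, 10),
    columns=[f"c{i}" for i in range(10)],
)

# np.r_ concatenates slices and single integers into one array of positions.
positions = np.r_[0:2, 4:6, 9]

# iloc accepts that array as the column indexer.
subset = df.iloc[:, positions]

# Same idea as in the question: convert the selection to a NumPy array.
X = subset.to_numpy()
```

The slices inside np.r_ need explicit start and stop values, because np.r_ does not know how many columns the dataframe has.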
Upvotes: 2
Reputation: 8631
Assuming your data is in a dataframe df, you can do the following:
cols = list(df.columns)
pos_cols = [ i for i, word in enumerate(cols) if word.startswith('PORVL') ]
df.iloc[:, pos_cols]
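If the exact column names are already known, their positions can also be looked up instead of counted by hand. The sketch below uses Index.get_indexer for that; the small frame and the wanted list are placeholders, not the real data.

```python
import pandas as pd

# Small stand-in frame; only the PORVL-style names come from the question.
df = pd.DataFrame({
    "Date": ["2017-04-01 00:24:17"],
    "PORVL2N1": [1.0],
    "OTHER": [0.0],
    "PORVL2N2": [2.0],
})

# Hypothetical list of names to keep.
wanted = ["PORVL2N1", "PORVL2N2"]

# get_indexer returns the integer position of each name, or -1 if it is missing.
pos = df.columns.get_indexer(wanted)
if (pos == -1).any():
    raise KeyError("some requested columns are not in the dataframe")

# Positional selection with iloc, as asked for in the question.
selected = df.iloc[:, pos]
```

Checking for -1 matters here, because iloc would otherwise treat it as "the last column" and silently return the wrong data.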
Alternatively, you can use .filter() with the regex parameter.
df.filter(regex=("PORVL.*"))
Have a look at the docs for more information.
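One thing to keep in mind is that filter matches the pattern anywhere in the label (it uses re.search), so "PORVL.*" also keeps a column that merely contains PORVL. A minimal sketch of the difference follows; the frame and the XPORVL1 column are invented for the example, and the anchored pattern assumes the names always look like PORVL<number>N<number>.

```python
import pandas as pd

# Stand-in frame; XPORVL1 is a made-up column to show the loose match.
df = pd.DataFrame({
    "Date": ["2017-04-01 00:24:17"],
    "PORVL2N1": [1.0],
    "PORVL24N2": [2.0],
    "XPORVL1": [3.0],
})

# Unanchored pattern: keeps every label that contains "PORVL".
loose = df.filter(regex="PORVL.*")

# Anchored pattern: keeps only labels of the assumed PORVL<number>N<number> form.
strict = df.filter(regex=r"^PORVL\d+N\d+$")

# Convert to a NumPy array, as in the question.
X = strict.to_numpy()
```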
Upvotes: 3