Reputation: 33
I am new to using Python with data sets and am trying to exclude a column ("id") from the output of describe(). How can I do this using describe() and its exclude parameter?
Upvotes: 3
Views: 5918
Reputation: 1
You can also do it by dropping the columns that are not relevant.
E.g., if you don't want describe() output for the columns id and CustomerID because they won't provide any relevant information, you can simply drop them:
columns_to_describe = df.drop(columns=['id','CustomerID']).columns
df[columns_to_describe].describe()
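As a small self-contained sketch of the same idea, you can also call describe() directly on the result of drop(). The sample data and the amount and quantity columns below are made up for illustration; only id and CustomerID come from the example above.

```python
import pandas as pd

# Hypothetical sample data, invented for this sketch.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "CustomerID": [101, 102, 103, 104],
    "amount": [10.0, 20.0, 30.0, 40.0],
    "quantity": [1, 2, 3, 4],
})

# drop() returns a new DataFrame, so df itself keeps all of its columns.
summary = df.drop(columns=["id", "CustomerID"]).describe()
print(summary)
```

Since drop() does not modify df here, the id and CustomerID columns stay available for later steps.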
I hope that helps.
Thanks
Upvotes: 0
Reputation: 1
Although somebody already responded with an example from the official docs, which is more than enough, I'd just like to add this, since it might help a few people:
If your DataFrame is large (say, hundreds of columns), removing one or two columns might not be enough. Instead, create a smaller DataFrame holding only what you're interested in and go from there.
Example of removing 2+ columns:
# Set difference: every column of the big DataFrame except the unwanted ones
columns_you_want = set(your_bigger_data_frame.columns) - {'column_1', 'column_2', 'column_3', 'etc'}
your_new_smaller_data_frame = your_bigger_data_frame[list(columns_you_want)]
your_new_smaller_data_frame.describe()
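One thing to be aware of: a Python set does not keep the original column order, so the describe() output may come back with its columns shuffled. If the order matters, a list comprehension is one possible alternative. This is only a sketch; the names big_df, keep_a and keep_b are placeholders I made up.

```python
import pandas as pd

# Hypothetical data; column names are placeholders.
big_df = pd.DataFrame({
    "column_1": [1, 2, 3],
    "keep_a": [0.5, 1.5, 2.5],
    "column_2": [4, 5, 6],
    "keep_b": [10, 20, 30],
})

unwanted = {"column_1", "column_2"}

# Iterating over big_df.columns keeps the original left-to-right order.
columns_to_keep = [col for col in big_df.columns if col not in unwanted]

smaller_df = big_df[columns_to_keep]
summary = smaller_df.describe()
print(summary)
```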
If your DataFrame is small or medium sized, you already know every column, and you only need a few of them, just create a new DataFrame and then apply describe().
Here is an example that reads a .csv file and then keeps only the part of that DataFrame you need:
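import pandas as pd
# Raw string, so the backslashes in the Windows path are not read as escape sequences
df = pd.read_csv(r'.\docs\project\file.csv')
df = df[['column_1', 'column_2', 'column_3', 'etc']]
df.describe()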
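If you already know which columns you need before loading, read_csv also has a usecols parameter that loads only those columns. Here is a minimal sketch; it reads from an in-memory string instead of a real file, and the CSV contents are invented for illustration.

```python
import io

import pandas as pd

# Stand-in for a CSV file on disk; the contents are made up.
csv_text = "id,column_1,column_2\n1,10,0.5\n2,20,1.5\n3,30,2.5\n"

# usecols tells read_csv to load only the listed columns, so 'id' is never read.
df = pd.read_csv(io.StringIO(csv_text), usecols=["column_1", "column_2"])

summary = df.describe()
print(summary)
```

With a real file you would pass the path instead of the StringIO object.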
Upvotes: 0
Reputation: 1
You can do that by slicing your original DataFrame to remove the 'id' column. One way is through .iloc. Suppose the column 'id' is the first column of your DataFrame; then you could do this:
df.iloc[:,1:].describe()
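The part before the comma selects the rows (a bare colon means all of them), and the part after the comma selects the columns (1: means every column from position 1 onward, which skips the first one).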
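Because .iloc works purely by position, it is worth checking that 'id' really is the first column before slicing. A small sketch with made-up data (the height and weight columns are my own placeholders):

```python
import pandas as pd

# Hypothetical data with 'id' as the first column.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "height": [1.5, 1.6, 1.7],
    "weight": [60, 70, 80],
})

# Guard: the positional slice below only drops 'id' if it sits at position 0.
if df.columns[0] != "id":
    raise ValueError("expected 'id' to be the first column")

# All rows, every column from position 1 onward.
summary = df.iloc[:, 1:].describe()
print(summary)
```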
Upvotes: 0
Reputation: 1068
describe filters by data type: its include and exclude parameters accept dtypes, not column names. If your id column is the only column with its data type, then you can use
df.describe(exclude=[datatype])
or, if you just want to leave the column(s) out of the describe output, try this:
cols = set(df.columns) - {'id'}
df1 = df[list(cols)]
df1.describe()
Ta-da, it's done. For more info on describe, see the pandas documentation.
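To make the dtype-based option concrete, here is a minimal sketch. It assumes id is the only int64 column and everything else is float64; the price and weight columns are invented for the example.

```python
import pandas as pd

# Hypothetical data: 'id' is int64, the other columns are float64.
df = pd.DataFrame({
    "id": pd.Series([1, 2, 3], dtype="int64"),
    "price": [9.5, 12.0, 7.25],
    "weight": [0.5, 1.0, 1.5],
})

# exclude takes dtypes, so this removes every int64 column, not just 'id'.
summary = df.describe(exclude=["int64"])
print(summary)
```

Keep in mind that if another column shares the same dtype as id, it will be excluded too, which is why the column-based approach is usually the safer choice.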
Upvotes: 5