Print results of if-statement into original data frame

Question

Below I am trying to identify the type of the columns based on my if-statement:

Code:

import pandas as pd
import numpy as np

df2 = pd.read_csv("https://data.calgary.ca/api/views/848s-4m4z/rows.csv?accessType=DOWNLOAD&bom=true&format=true")
df2 = df2.iloc[:, 0:3]

num_cols = list(df2.select_dtypes(include=[np.number]).columns.values)
categorical_text_cols = list(df2.select_dtypes(exclude=[np.number]).columns.values)
for x in df2.columns:
    if x in categorical_text_cols and df2[x].str.len().mean() >= 50:
        print('categorical')
    elif x in categorical_text_cols and df2[x].str.len().mean() <= 50:
        print('text')
    elif x in num_cols:
        print('numerical')

output:

text
text
text

I am having a hard time figuring out how to store these results in a data frame with the same columns as the original dataset.

Desired output (in a data frame)

Sector  Community Name  Group Category
text    text            text

Farid Jafri · Accepted Answer

I am assuming your loop runs once per column, so each column gets exactly one label. Collect the labels in a dictionary keyed by column name instead of printing them:

init_df = {}
for x in df2.columns:
    if x in categorical_text_cols and df2[x].str.len().mean() >= 50:
        init_df[x] = 'categorical'
    elif x in categorical_text_cols and df2[x].str.len().mean() <= 50:
        init_df[x] = 'text'
    elif x in num_cols:
        init_df[x] = 'numerical'

And then do:

print(pd.DataFrame([init_df]))
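
If you want to try this without downloading the CSV, here is a minimal self-contained sketch of the same idea. The helper name `classify_columns`, its `threshold` parameter, and the small `sample` frame are made up for illustration and are not part of the original code. The 50-character cutoff and the label assignment are copied from the question as-is.

```python
import numpy as np
import pandas as pd


def classify_columns(df, threshold=50):
    """Hypothetical helper: return a one-row DataFrame labelling each column."""
    # Same numeric test as in the question: select_dtypes with np.number.
    num_cols = set(df.select_dtypes(include=[np.number]).columns)
    labels = {}
    for col in df.columns:
        if col in num_cols:
            labels[col] = "numerical"
        else:
            # Compute the mean string length once per column.
            # Assumes non-numeric columns hold strings, as in the question.
            mean_len = df[col].str.len().mean()
            labels[col] = "categorical" if mean_len >= threshold else "text"
    # Wrapping the dict in a list gives one row, with the keys as columns.
    return pd.DataFrame([labels])


# Made-up sample data standing in for the downloaded CSV.
sample = pd.DataFrame({
    "Sector": ["NORTH", "SOUTH"],
    "Community Name": ["THORNCLIFFE", "WOODBINE"],
    "Count": [3, 7],
})

result = classify_columns(sample)
print(result)
```

Passing the dictionary inside a list is what makes pandas treat it as a single row, so the resulting frame keeps the original column names in the original order.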
