samman

Reputation: 613

Using user input statements to create data tables using pandas

This is a two-part question.

1) I am creating data tables based on user input. Is there a way to design the code so that the script doesn't break if the user doesn't enter anything? For example:

A=input('some input\n')
B=input('some input\n')
df1=pd.read_csv(A, sep=r'\s+', header=None)
df2=pd.read_csv(B, sep=r'\s+', header=None)

# This works really well if you have inputs for both A and B, but if you don't have an input for B, you get the error that B is undefined

What I'd like is to set up some kind of loop so that if there is no input, it just skips it. For example:

A=input('some input\n')
if A:  # an empty string is falsy, so a blank entry is skipped
    df1=pd.read_csv(A, sep=r'\s+', header=None)

B=input('some input\n')
if B:
    df2=pd.read_csv(B, sep=r'\s+', header=None)
This, however, runs into a second issue. 2) Later on, I concatenate these tables (imagine I defined column names for the data tables above).

df3=df1.loc[:,'Column_4']
df4=df2.loc[:,'Column_4']
df5=pd.concat([df3,df4],axis=1)

So if the user does not input anything for B in the loop above, df2 is never created, which means there is no df4 either. I can move df4 into the loop as well so that I don't get the error that df4 is not defined, but that still leaves a problem with building df5, which I cannot put in any of the loops above.

Finally, as an aside, is there any way to simplify this process? Ideally I'd like to let the user enter maybe 10 or 20 inputs, but that means a lot of lines of code for A=input(), B=input(), C=input(), and each of those inputs with its own pd.read_csv and .loc lines really adds up (especially if I'm creating a conditional for every single input as well).

Upvotes: 2

Views: 951

Answers (1)

Joules

Reputation: 580

I'm not sure I completely understood what you're trying to accomplish, but from the information in your post I'm assuming you're entering a list of paths to CSV files that need to be processed. I came up with this script to simplify the process: it collects the file paths, parses each CSV into a dataframe and stores it in a list, extracts the "Column_4" column from each dataframe, and then concatenates them all into one final dataframe. Just enter the CSV file paths until you're done, then enter q, quit or done, and it will run with whatever file paths you provided!

import pandas as pd

csv_file_paths = []
data_frames = []
column_4_frames = []
path_input = ''

print('Enter path to CSV file. When done, enter q, quit or done to stop.')
# collect file paths to be processed until q, quit or done is typed and entered
while True:
    path_input = input().strip()
    if path_input.lower() in ['q', 'quit', 'done']:
        break
    if path_input:  # ignore blank entries instead of treating them as paths
        csv_file_paths.append(path_input)

# create dataframes for each file, append them to a list
for csv_file in csv_file_paths:
    try:
        df = pd.read_csv(csv_file, sep=r'\s+', header=None)
        data_frames.append(df)
    except Exception as e:
        # report the file that failed (e.g. it does not exist) and keep going
        print('Error loading file: '+str(e))

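# collect "Column_4" from each dataframe that was read
# (this assumes the columns have been named as described in the question;
# with header=None alone the column labels are the integers 0, 1, 2, ...)
for df in data_frames:
    column_4_frames.append(df.loc[:,'Column_4'])

# concatenate all "Column_4" columns into one dataframe
# (pd.concat raises ValueError on an empty list, so check first)
if column_4_frames:
    concatenated_df = pd.concat(column_4_frames, axis=1)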
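
If you'd rather not depend on typing paths in one at a time, the same steps could be wrapped in a function that takes a list of paths. This is only a rough sketch: the name load_column is made up for illustration, and because header=None leaves the columns labelled 0, 1, 2, ..., I'm assuming the column you want is the fourth one and selecting it by position (3) instead of by the name 'Column_4'. Blank entries and unreadable files are skipped, so any number of inputs can be missing.

```python
import pandas as pd


def load_column(paths, column=3):
    """Read whitespace-delimited files and return one column from each, side by side.

    load_column is a hypothetical helper, not part of pandas. `column` is the
    integer label produced by header=None (3 is the fourth column).
    """
    selected = {}
    for path in paths:
        path = path.strip()
        if not path:
            continue  # blank entry: nothing to read
        try:
            df = pd.read_csv(path, sep=r'\s+', header=None)
        except (OSError, pd.errors.EmptyDataError, pd.errors.ParserError) as e:
            # missing, empty or malformed file: report it and move on
            print(f'Skipping {path}: {e}')
            continue
        if column not in df.columns:
            print(f'Skipping {path}: no column {column}')
            continue
        selected[path] = df[column]

    if not selected:
        # pd.concat raises ValueError when given nothing, so return an empty frame
        return pd.DataFrame()
    # the dict keys (the file paths) become the column names of the result
    return pd.concat(selected, axis=1)
```

You could feed it the csv_file_paths list collected above, or something like input('Paths, separated by commas: ').split(','). Using the path as the column name is just one choice; it makes it easy to tell which file each column came from.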

Upvotes: 1
