Reputation: 1471
I am trying to define a function with two arguments : df (dataframe), and an integer (employerID) as my arguments. this function will return the full name of the employer.
If the given ID does not belong to any employee, I want to return the string "UNKNOWN" / If no middle name is given only return "LAST, FIRST". / If only the middle initial is given the return the full name in the format "LAST, FIRST M." with the middle initial followed by a '.'.
def getFullName(df, int1):
df = pd.read_excel('/home/data/AdventureWorks/Employees.xls')
newdf = df[(df['EmployeeID'] == int1)]
print("'" + newdf['LastName'].item() + "," + " " + newdf['FirstName'].item() + " " + newdf['MiddleName'].item() + "." + "'")
getFullName('df', 110)
I wrote this code but came up with two problems : 1) if I don't put quotation mark around df, it will give me an error message, but I just want to take a data frame as an argument not a string.
2) this code can't deal with someone with out middle name.
I am sorry but I used pd.read_excel to read the excel file which you can not access. I know it will be hard for you to test the codes without the excel file, if someone let me know how to create a random data frame with the column names, I will go ahead and change it. Thank you,
Upvotes: 1
Views: 275
Reputation: 2378
I created some fake data for this:
EmployeeID FirstName LastName MiddleName
0 0 a a a
1 1 b b b
2 2 c c c
3 3 d d d
4 4 e e e
5 5 f f f
6 6 g g g
7 7 h h h
8 8 i i i
9 9 j j None
EmployeeID
9 has no middle name, but everyone else does. The way I would do it is to break up the logic into two parts. The first, for when you cannot find the EmployeeID
. The second manages the printing of the employee's name. That second part should also have two sets of logic, one to control if the employee has a middle name, and the other for if they don't. You could likely combine a lot of this into single line statements, but you will likely sacrifice clarity.
I also removed the pd.read_excel
call from the function. If you want to pass the dataframe in to the function, then the dataframe should be created oustide of it.
def getFullName(df, int1):
newdf = df[(df['EmployeeID'] == int1)]
# if the dataframe is empty, then we can't find the give ID
# otherwise, go ahead and print out the employee's info
if(newdf.empty):
print("UNKNOWN")
return "UNKNOWN"
else:
# all strings will start with the LastName and FirstName
# we will then add the MiddleName if it's present
# and then we can end the string with the final '
s = "'" + newdf['LastName'].item() + ", " +newdf['FirstName'].item()
if (newdf['MiddleName'].item()):
s = s + " " + newdf['MiddleName'].item() + "."
s = s + "'"
print(s)
return s
I have the function returning values in case you want to manipulate the string further. But that was just me.
If you run getFullName(df, 1)
you should get 'b, b b.'
. And for getFullName(df, 9)
you should get 'j, j'
.
So in full, it would be:
df = pd.read_excel('/home/data/AdventureWorks/Employees.xls')
getFullName(df, 1) #outputs 'b, b b.'
getFullName(df, 9) #outputs 'j, j'
getFullName(df, 10) #outputs UNKNOWN
Fake data:
d = {'EmployeeID' : [0,1,2,3,4,5,6,7,8,9],
'FirstName' : ['a','b','c','d','e','f','g','h','i','j'],
'LastName' : ['a','b','c','d','e','f','g','h','i','j'],
'MiddleName' : ['a','b','c','d','e','f','g','h','i',None]}
df = pd.DataFrame(d)
Upvotes: 1