Reputation: 79
I have a function called GetWebsite(id). It takes one argument, an ID number, and returns a website address. That function isn't the issue. The problem is that once I have the website, I cannot write it to the new column in the dataframe.
The data I am using is company info with 100 columns or so. I've tried and searched repeatedly, and I'm just lost at this point.
Here is the code:
df = pd.read_csv('10_records.csv')
df['WebsiteURL'] = ''
for position, i in enumerate(df.Id):
    #GET THE JSON AND PROCESS THE RESPONSE
    print(f'Currently getting the JSON for {i}.')
    ID = GetWebsite(i)
    if not ID:
        print(f'No Domain Found for {i}')
    else:
        #APPEND THE FILE TO THE CSV AFTER ADDING IT TO THE DATAFRAME
        print(f'Appending the website ({ID}) to the file.')
        df.insert(position,'WebsiteURL',ID)
    #MOVE ON TO THE NEXT FILE AFTER A RANDOM SLEEP
    print('Done. Moving on to the next record in a moment.')
    sleep(random.randint(0, 10))
#SEND EVERYTHING TO CSV FILE NOW
df.to_csv("output.csv", index=False)
print('Project successfully exported to CSV.')
What I think I'm doing here is before the for loop, I'm adding that new column to the dataframe with blank values for each row.
Then in the loop I'm calling my GetWebsite function and it's returning the domain. That works as expected.
What I now want to do is add that domain to the current row and move on to the next one. When the loop is over, I want to export the dataframe to a CSV.
This is my latest attempt, where I am obviously using Pandas incorrectly: I enumerate the IDs and then use df.insert to get the value in there. I've also tried df.at() and several other ways. I'm just losing it now.
EDIT: This is what I am going for...
Output Example:
   RCount  PCount  ...  Email_1                    WebsiteURL
0    1436       0  ...      NaN  www.eachcompanieswebsite.com
1    1436       0  ...      NaN  www.eachcompanieswebsite.com
2    1436       0  ...      NaN  www.eachcompanieswebsite.com
3    1436       0  ...      NaN  www.eachcompanieswebsite.com
4    1436       0  ...      NaN  www.eachcompanieswebsite.com
5    1436       0  ...      NaN  www.eachcompanieswebsite.com
6    1436       0  ...      NaN  www.eachcompanieswebsite.com
7    1436       0  ...      NaN  www.eachcompanieswebsite.com
8    1436       0  ...      NaN  www.eachcompanieswebsite.com
9    1436       0  ...      NaN  www.eachcompanieswebsite.com
Here eachcompanieswebsite.com stands for a website that is unique to each row.
NOTE: I added unnecessary comments to the code just to be super clear with my thinking because I'm a noob and clearly doing something that is probably very obvious.
Upvotes: 2
Views: 305
Reputation: 8508
Can you try something like this please:
def website(i):
    ID = GetWebsite(i)
    return ID if ID else 'No website found'

df['WebsiteURL'] = df['Id'].apply(lambda x: website(x))
You don't need a for loop. Just load the file into a dataframe and then add these steps.
I think that's what you are trying to do with the loop.
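For context on why the original loop did not work: df.insert(loc, column, value) adds a whole new column at column position loc. It does not write into a row, and because 'WebsiteURL' already exists it should raise a ValueError by default. Here is a minimal, self-contained sketch of both ways to fill the column. The GetWebsite stub, the _FAKE_SITES lookup, and the three-row frame are made up so the example runs on its own; your real function and CSV would take their place.

```python
import pandas as pd

# Hypothetical stand-in for the asker's GetWebsite(id).
# The real function fetches the domain from somewhere else.
_FAKE_SITES = {101: "www.alpha.example", 103: "www.gamma.example"}

def GetWebsite(company_id):
    # Returns None when no domain is known for this ID.
    return _FAKE_SITES.get(company_id)

def website(company_id):
    site = GetWebsite(company_id)
    return site if site else "No website found"

# Tiny made-up frame standing in for pd.read_csv('10_records.csv').
df = pd.DataFrame({"Id": [101, 102, 103], "RCount": [1436, 1436, 1436]})

# Option 1: build the whole column in one step with apply.
df["WebsiteURL"] = df["Id"].apply(website)

# Option 2: keep an explicit loop and write one cell at a time with df.at,
# which is handy if you still want to sleep between requests.
looped = df[["Id", "RCount"]].copy()
looped["WebsiteURL"] = ""
for row_label, company_id in looped["Id"].items():
    looped.at[row_label, "WebsiteURL"] = website(company_id)
    # a sleep(...) call between lookups could go here

print(df)
```

Both options end up with the same column. Note that df.at is an indexer used with square brackets, df.at[row, column], not a method called with parentheses. After either option, df.to_csv("output.csv", index=False) writes the result out as in the question.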
Upvotes: 1