Reputation: 47
I'm analyzing a company data set that stores 'Company Name' and 'Company Profit'. I also have another data set that has '# of Employees' and 'Feedback (Negative or Positive)'. I want to find out whether companies with higher profit have more positive employees. The problem is that 'Company Profit' is in the millions or billions, while the number of employees is quite small.
So, can I scale the data, or should I do something else here?
Suggestions are welcome.
Upvotes: 0
Views: 56
Reputation: 662
If you have a table that looks like this:
  Company Name  Company Profit  # of Employees  Feedback (Negative or Positive)
0         Alpha         1000000              10  Positive
1         Bravo        13000000             210  Positive
2       Charlie         2300000              16  Negative
3         Delta          130000               1  Negative
and want a table that looks like this:
  Company Name  Company Profit (Million)  # of Employees  Feedback (Negative or Positive)
0         Alpha                      1.00              10  Positive
1         Bravo                     13.00             210  Positive
2       Charlie                      2.30              16  Negative
3         Delta                      0.13               1  Negative
Then you can use the apply method and a lambda function to rescale the data.
# This part creates the original table.
import pandas as pd

columns = ['Company Name', 'Company Profit', '# of Employees', 'Feedback (Negative or Positive)']
df = pd.DataFrame([('Alpha', 1000000, 10, 'Positive'),
                   ('Bravo', 13000000, 210, 'Positive'),
                   ('Charlie', 2300000, 16, 'Negative'),
                   ('Delta', 130000, 1, 'Negative')], columns=columns)

# This part makes the modification: add a column with the profit in millions,
# then keep only the columns shown in the second table.
df['Company Profit (Million)'] = df['Company Profit'].apply(lambda x: x / 1000000)
df = df[['Company Name', 'Company Profit (Million)', '# of Employees', 'Feedback (Negative or Positive)']]
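As a possible alternative, assuming the profit column is numeric, the same rescaling can be done with plain vectorized division instead of apply. The sketch below also shows z-score standardization, one common way to put profit and headcount on comparable scales, and the average profit per feedback group. The standardization and the groupby comparison are illustrative assumptions rather than part of the steps above, and the names numeric_cols, standardized, and mean_profit_by_feedback are made up for this example.

```python
import pandas as pd

# Same toy table as above.
columns = ['Company Name', 'Company Profit', '# of Employees', 'Feedback (Negative or Positive)']
df = pd.DataFrame([('Alpha', 1000000, 10, 'Positive'),
                   ('Bravo', 13000000, 210, 'Positive'),
                   ('Charlie', 2300000, 16, 'Negative'),
                   ('Delta', 130000, 1, 'Negative')], columns=columns)

# Vectorized division: gives the same values as .apply(lambda x: x / 1000000).
df['Company Profit (Million)'] = df['Company Profit'] / 1_000_000

# Assumption: z-score standardization (subtract the mean, divide by the sample
# standard deviation) so both numeric columns end up on a comparable scale.
numeric_cols = ['Company Profit', '# of Employees']
standardized = (df[numeric_cols] - df[numeric_cols].mean()) / df[numeric_cols].std()

# Assumption: a first look at the question by comparing the average profit
# (in millions) of companies with positive and negative feedback.
mean_profit_by_feedback = (
    df.groupby('Feedback (Negative or Positive)')['Company Profit (Million)'].mean()
)

print(standardized)
print(mean_profit_by_feedback)
```

Note that a measure such as the Pearson correlation does not change when a column is divided by a positive constant, so this kind of rescaling mainly helps readability and methods that are sensitive to magnitude.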
Upvotes: 1