Alex

Reputation: 1537

Python Dataframe

I am a Java programmer and I am learning Python for data science and analysis purposes.

I wish to clean the data in a DataFrame, but I am confused by the pandas logic and syntax.

What I wish to achieve is something like the following Java code:

for (String name : names) {
    if (name == "test") {
        name = "myValue";
    }
}

How can I do this with Python and a pandas DataFrame? I tried the following, but it does not work:

import pandas as pd
import numpy as np

df = pd.read_csv('Dataset V02.csv')

array = df['Order Number'].unique()

# On average, how many items does one order have?

for value in array:
    count = 0
    if df['Order Number'] == value:
        ......

I get an error at df['Order Number'] == value. How can I identify the specific values and edit them?

In short, I want to:
- Check all the entries of the 'Order Number' column
- Execute an action (for example, replace the value or count the value) each time the record is equal to a given value (for example, the order code)

Upvotes: 1

Views: 99

Answers (2)

EdChum

Reputation: 394279

Just use the vectorised form for replacement:

df.loc[df['Order Number'] == 'test', 'Order Number'] = 'myValue'

This compares the entire column against a specific value; where the comparison is True, it replaces just those rows with the new value.
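As a minimal sketch of the same idea on made-up data (the column values below are invented for illustration and are not taken from the asker's CSV), the boolean mask selects the matching rows and the assignment touches only those:

```python
import pandas as pd

# Toy data standing in for the real CSV; the values are hypothetical.
df = pd.DataFrame({"Order Number": ["A1", "test", "B2", "test"]})

# Boolean mask: True where the column equals 'test'.
mask = df["Order Number"] == "test"

# Assign the new value only to the rows where the mask is True.
df.loc[mask, "Order Number"] = "myValue"

result = df["Order Number"].tolist()
print(result)
```

If the goal is only a straight value-for-value swap, df['Order Number'].replace('test', 'myValue') should give the same result; the .loc form is more general because the mask can be any condition.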

For the second part, the if statement doesn't understand boolean arrays; it expects a scalar result, which is why df['Order Number'] == value raises an error there. If you're just doing a unique value/frequency count then just do:

df['Order Number'].value_counts()
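To illustrate both points, here is a small sketch on invented data. It assumes each row of the frame is one item, so the number of rows per order number is the number of items in that order; that assumption is mine and is not stated in the question.

```python
import pandas as pd

# Hypothetical data: one row per item, several rows may share an order number.
df = pd.DataFrame({"Order Number": ["A1", "A1", "B2", "A1", "C3", "B2"]})

# Comparing a column with a scalar yields a boolean Series, one entry per row.
mask = df["Order Number"] == "A1"

# A plain `if mask:` raises ValueError because the truth value of a
# Series is ambiguous; reduce it to a single value first.
has_a1 = bool(mask.any())   # True if at least one row matches
n_a1 = int(mask.sum())      # number of matching rows

# Rows per order number, then the average over all orders.
items_per_order = df["Order Number"].value_counts()
average_items = float(items_per_order.mean())

print(has_a1, n_a1, average_items)
```

This avoids the explicit loop over unique() entirely: value_counts() already does the per-value counting in one pass.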

Upvotes: 1

D Sai Krishna

Reputation: 178

The code goes this way:

import pandas as pd
df = pd.read_csv("Dataset V02.csv")
array = df['Order Number'].unique()
for value in array:
    count = 0
    # "in" on a Series checks the index labels, so test against .values
    if value in df['Order Number'].values:
        .......

You need to use "in" to check for presence. Did I understand your problem correctly? If not, please comment and I will try to understand further.
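One caveat worth checking before relying on this: applied directly to a pandas Series, "in" tests the index labels rather than the stored values, so a membership test on the values needs .values or Series.isin. A small sketch on invented data:

```python
import pandas as pd

# Hypothetical Series with the default integer index 0, 1, 2.
orders = pd.Series(["A1", "test", "B2"])

in_index = 1 in orders                      # looks at the index labels
in_series = "test" in orders                # also looks at the index labels
in_values = bool("test" in orders.values)   # looks at the stored values

# Element-wise membership test, one boolean per row.
matches = orders.isin(["test"]).tolist()

print(in_index, in_series, in_values, matches)
```

Since the loop iterates over unique() of the same column, every value is present by construction, so the membership test alone does not count or replace anything; a boolean mask or value_counts() is usually the more direct route.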

Upvotes: 0
