mark fitzpatrick

Reputation: 3320

Query by Variable Not Working in Pandas Dataframe

I have a dataframe in a scheduling application that has 4 columns: I, S, F and C.

Here is a sample set in a dataframe I call bldf:

| |I|S|F|C|
|--| --: | --: | --: | --: |
|0|1|1.0|1.4|5|
|1|2|1.1|1.5|5|
|2|3|1.2|1.6|3|
|3|4|1.3|1.7|5|
|4|5|1.4|1.5|4|
|5|6|1.5|1.6|12|
|6|7|1.6|2.1|4|
|7|8|1.7|2.2|4|

I have a function that looks at how many concurrent jobs will be running with each new request.

def ccj(x):
    return len( bldf[ (bldf['I']<=bldf['I'][x]) &
                      (bldf['S']<bldf['F'][x]) & (bldf['F']>bldf['S'][x]) ] )

x is the row index of the job being checked.

This works fine as you can see from the new column constructed with ccj as the output:

| |I|S|F|C|ccJobs|
|--| --: | --: | --: | --: | --: |
|0|1|1.0|1.4|5|1|
|1|2|1.1|1.5|5|2|
|2|3|1.2|1.6|3|3|
|3|4|1.3|1.7|5|4|
|4|5|1.4|1.5|4|4|
|5|6|1.5|1.6|12|3|
|6|7|1.6|2.1|4|2|
|7|8|1.7|2.2|4|2|
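
For anyone who wants to reproduce the table above, here is a minimal self-contained sketch. The data is copied from the sample table; the readings of the columns in the comments (I as request order, S as start, F as finish) are my assumption, and `cc_jobs` is just an illustrative name for the resulting counts.

```python
import pandas as pd

# Sample data copied from the table in the question.
bldf = pd.DataFrame({
    "I": [1, 2, 3, 4, 5, 6, 7, 8],
    "S": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7],
    "F": [1.4, 1.5, 1.6, 1.7, 1.5, 1.6, 2.1, 2.2],
    "C": [5, 5, 3, 5, 4, 12, 4, 4],
})

def ccj(x):
    """Count the rows that pass all three filters relative to row x."""
    overlapping = (
        (bldf["I"] <= bldf["I"][x])    # assumed: requested at or before job x
        & (bldf["S"] < bldf["F"][x])   # assumed: starts before job x finishes
        & (bldf["F"] > bldf["S"][x])   # assumed: finishes after job x starts
    )
    return len(bldf[overlapping])

# One count per row, in index order.
cc_jobs = [ccj(x) for x in bldf.index]
print(cc_jobs)  # [1, 2, 3, 4, 4, 3, 2, 2]
```

The printed list matches the ccJobs column shown above.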

But I need to know this by requested capacity (or lower), i.e. if a job is requested at C = 3, then any machine with a capacity of 3 or more can do that job. So, for example, I want a function that lets me look at each row in sequence and count how many jobs would be running concurrently at a capacity of 4 or lower, 5 or lower, and so on. Here is my function, and it is not working correctly:

def ccjTC(x,TC):
    return len( bldf[ (bldf['I']<=bldf['I'][x]) &
                      (bldf['S']<bldf['F'][x]) & (bldf['F']>bldf['S'][x]) &
                      (TC >= bldf['C'][x]) ] )

TC is the Target Capacity

The (TC >= bldf['C'][x]) condition is not working properly.

If I break open the function and look at the filtered dataframe (remove the len()), it shows that the filter is malfunctioning.

Using:

def ccjTC(x,TC):
    return ( bldf[ (bldf['I']<=bldf['I'][x]) &
                      (bldf['S']<bldf['F'][x]) & (bldf['F']>bldf['S'][x]) &
                      (TC >= bldf['C'][x]) ] )

with ccjTC(3,3) or ccjTC(3,4) I would expect:

| |I|S|F|C|
|--| --: | --: | --: | --: |
|2|3|1.2|1.6|3|

but I get an empty frame. Only when I increment TC to 5 do I get what I expected.

| |I|S|F|C|
|--| --: | --: | --: | --: |
|0|1|1.0|1.4|5|
|1|2|1.1|1.5|5|
|2|3|1.2|1.6|3|
|3|4|1.3|1.7|5|

Row 2 has a requested capacity (C) of 3, yet it does not show up until TC reaches 5.

Why?

Upvotes: 0

Views: 186

Answers (2)

mark fitzpatrick

Reputation: 3320

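Paul pointed out my error - I accidentally put the [x] row reference in the capacity comparison, so TC was compared only against the C value of row x (a single True/False applied to every row) instead of against each row's own C.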

The code should have been:

def ccjTC(x,TC):
    return len( bldf[ (bldf['I']<=bldf['I'][x]) &
                      (bldf['S']<bldf['F'][x]) & (bldf['F']>bldf['S'][x]) &
                      (TC >= bldf['C']) ] )
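
To make the difference concrete, here is a small sketch comparing the two forms of the capacity condition on the sample data from the question. It keeps the `<=` comparison on `I` from the question; `scalar_cond` and `series_cond` are illustrative names, not anything from the original code.

```python
import pandas as pd

# Sample data copied from the table in the question.
bldf = pd.DataFrame({
    "I": [1, 2, 3, 4, 5, 6, 7, 8],
    "S": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7],
    "F": [1.4, 1.5, 1.6, 1.7, 1.5, 1.6, 2.1, 2.2],
    "C": [5, 5, 3, 5, 4, 12, 4, 4],
})

x, TC = 3, 3

# With [x], a single value is selected, so the comparison is one boolean
# that gets applied to every row of the mask.
scalar_cond = TC >= bldf["C"][x]
# Without [x], TC is compared against every row's C, giving a boolean Series.
series_cond = TC >= bldf["C"]

def ccjTC(x, TC):
    """Count rows that pass the overlap filters and have C at or below TC."""
    return len(bldf[(bldf["I"] <= bldf["I"][x])
                    & (bldf["S"] < bldf["F"][x]) & (bldf["F"] > bldf["S"][x])
                    & (TC >= bldf["C"])])

print(bool(scalar_cond))     # False, because row 3 has C = 5
print(series_cond.tolist())  # True only for row 2
print(ccjTC(3, 3), ccjTC(3, 4), ccjTC(3, 5))  # 1 1 4
```

Because row 3 has C = 5, the scalar version is False for any TC below 5, which wipes out the whole mask. That is why the frame was empty until TC reached 5.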

Upvotes: 2

Bullet

Reputation: 28

What does this line do?

return ( bldf[ (bldf['I']<=bldf['I'][x]) &
                  (bldf['S']<bldf['F'][x]) & (bldf['F']>bldf['S'][x]) &
                  (TC >= bldf['C'][x]) ] )

Does it return True or False?
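
As far as I can tell, it returns neither: the expression inside `bldf[...]` is a boolean Series (one True/False per row), and indexing with it gives back a DataFrame holding the rows where the mask is True. A quick sketch to check this, assuming the sample data from the question (`mask` and `result` are illustrative names):

```python
import pandas as pd

# Sample data copied from the table in the question.
bldf = pd.DataFrame({
    "I": [1, 2, 3, 4, 5, 6, 7, 8],
    "S": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7],
    "F": [1.4, 1.5, 1.6, 1.7, 1.5, 1.6, 2.1, 2.2],
    "C": [5, 5, 3, 5, 4, 12, 4, 4],
})

x, TC = 3, 5

# Same expression as in the question, split into mask and selection.
mask = ((bldf["I"] <= bldf["I"][x])
        & (bldf["S"] < bldf["F"][x]) & (bldf["F"] > bldf["S"][x])
        & (TC >= bldf["C"][x]))
result = bldf[mask]

print(type(mask).__name__)    # Series
print(type(result).__name__)  # DataFrame
print(result.index.tolist())  # [0, 1, 2, 3]
```

With TC = 5 this selects rows 0 to 3, the same frame the question shows for that case.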

Upvotes: 0
