Reputation: 5759
I'm trying to make some comparisons between a dictionary and a pandas DataFrame.
The DataFrame looks like this:
A B C
0 'a' 'x' 0
1 'b' 'y' 1
2 'c' 'z' 4
The dictionary looks like this:
{
'a-x': [1],
'b-y': [2],
'c-z': [3]
}
The goal is to use the dictionary keys to identify matching rows in the DataFrame (for example, key 'a-x' matches the row at index 0, where column A is 'a' and column B is 'x') and then keep only the rows whose value in column C is greater than the associated dictionary value.
So:
key 'a-x' matches index 0 of column A and column B, but the value 0 in C is less than 1, so exclude
key 'b-y' matches index 1 of column A and column B, but the value 1 in C is less than 2, so exclude
key 'c-z' matches index 2 of column A and column B, and the value 4 in C is greater than 3, so include
The filtered DataFrame would then only include the entry at index 2 and look like this:
A B C
2 'c' 'z' 4
In case some of the details matter, here is a sample of my actual data.
DataFrame:
Chrom Loc WT Var Change ConvChange AO DP VAF IntEx Gene Upstream Downstream Individual ID
0 chr1 115227854 T A T>A T>A 2 17224 0.0116117 TIII TIIIa NaN NaN 1 113.fastq/onlyProbedRegions.vcf
Dictionary:
rates =
{
'chr1-115227854-T-A': [0.0032073647185113397]
}
Code:
return df[(df.Chrom+'-'+df.Loc.astype(str)+'-'+df.WT+'-'+df.Var).map(pd.Series(rates).str[0])<df.VAF]
Upvotes: 0
Views: 2698
Reputation: 323266
Create a pd.Series from the dictionary, then use map to look up the value for each row's key and build a Boolean index from the comparison:
d={
'a-x': [1],
'b-y': [2],
'c-z': [3]
}
pd.Series(d)
Out[335]:
a-x [1]
b-y [2]
c-z [3]
dtype: object
df[(df.A+'-'+df.B).map(pd.Series(d).str[0])<df.C]
Out[340]:
A B C
2 c z 4
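For reference, here is a self-contained sketch of the same idea that can be run as a script. The names `thresholds`, `row_thresholds`, `rate_by_key` and `real` are illustrative and not from the original post, and the second half rebuilds only the columns of the real data that the filter needs. `Loc` is numeric, so it is converted with `.astype(str)`. Calling the built-in `str()` on a whole Series would return the printed representation of the column instead of one string per row.

```python
import pandas as pd

# Toy data from the question.
df = pd.DataFrame({"A": ["a", "b", "c"], "B": ["x", "y", "z"], "C": [0, 1, 4]})
d = {"a-x": [1], "b-y": [2], "c-z": [3]}

# Unwrap the one-element lists so each key maps to a plain number.
thresholds = pd.Series({key: value[0] for key, value in d.items()})

# Build the lookup key for every row, then fetch its threshold.
# Keys missing from the dictionary map to NaN, and a comparison with NaN
# is False, so rows without a dictionary entry are dropped.
row_thresholds = (df["A"] + "-" + df["B"]).map(thresholds)
filtered = df[df["C"] > row_thresholds]
print(filtered)

# Same pattern on a cut-down version of the real data (assumed columns only).
real = pd.DataFrame({
    "Chrom": ["chr1"],
    "Loc": [115227854],
    "WT": ["T"],
    "Var": ["A"],
    "VAF": [0.0116117],
})
rates = {"chr1-115227854-T-A": [0.0032073647185113397]}
rate_by_key = pd.Series({key: value[0] for key, value in rates.items()})

# Convert the integer position to strings element-wise before concatenating.
keys = real["Chrom"] + "-" + real["Loc"].astype(str) + "-" + real["WT"] + "-" + real["Var"]
real_filtered = real[real["VAF"] > keys.map(rate_by_key)]
print(real_filtered)
```

The dictionary comprehension is just an alternative to `pd.Series(d).str[0]`. Both produce a Series of plain numbers indexed by key.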
Upvotes: 1