Reputation: 835
I have working R code that I want to port to Python. In R, I use the apply function to calculate the minor allele frequency (MAF) for each row. What would the equivalent code look like in Python? I am reading the data with pandas.
###R code
###Reading file
var_freq <- read_delim("./cichlid_subset.frq", delim = "\t",
col_names = c("chr", "pos", "nalleles", "nchr", "a1", "a2"), skip = 1)
# find minor allele frequency
var_freq$maf <- var_freq %>% select(a1, a2) %>% apply(1, function(z) min(z))
I have read the file using pandas but I am struggling with the second part.
###Python code
###Reading file
import pandas as pd
var_freq = pd.read_csv("./cichlid_subset.frq", sep='\t', header=None)
column_indices = [0, 1, 2, 3, 4, 5]
new_names = ["chr", "pos", "nalleles", "nchr", "a1", "a2"]
old_names = var_freq.columns[column_indices]
###Finding minor allele frequency
Insights will be appreciated.
Upvotes: 0
Views: 112
Reputation: 4929
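Pass the column names to read_csv through names (with skiprows=1 to drop the file's own header row), then take the row-wise minimum of a1 and a2: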
import pandas as pd
# Read file, skipping the original header row and assigning our own column names
colnames = ["chr", "pos", "nalleles", "nchr", "a1", "a2"]
var_freq = pd.read_csv('./cichlid_subset.frq', sep='\t', header=None, skiprows=1, names=colnames)
# Get MAF
var_freq['maf'] = var_freq[['a1','a2']].min(axis=1)
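To check the approach without the real file, here is a minimal sketch that runs on a small in-memory table. The sample rows and their header names are made up for illustration and only assume the same six-column, tab-separated layout as cichlid_subset.frq. It also shows a row-wise apply, which is the closer analogue of the R apply(1, ...) call but is usually slower than min(axis=1) on large frames.

```python
import io

import pandas as pd

# Hypothetical stand-in for cichlid_subset.frq: one header row, then
# tab-separated values. The numbers are invented for illustration.
sample = io.StringIO(
    "CHROM\tPOS\tN_ALLELES\tN_CHR\tFREQ_A1\tFREQ_A2\n"
    "chr1\t100\t2\t32\t0.75\t0.25\n"
    "chr1\t250\t2\t32\t0.5\t0.5\n"
    "chr2\t900\t2\t30\t0.1\t0.9\n"
)

colnames = ["chr", "pos", "nalleles", "nchr", "a1", "a2"]
var_freq = pd.read_csv(sample, sep="\t", header=None, skiprows=1, names=colnames)

# Vectorised row-wise minimum of the two allele-frequency columns.
var_freq["maf"] = var_freq[["a1", "a2"]].min(axis=1)

# Row-by-row version, closest in spirit to R's apply(1, function(z) min(z)).
maf_apply = var_freq[["a1", "a2"]].apply(lambda row: row.min(), axis=1)

print(var_freq["maf"].tolist())  # [0.25, 0.5, 0.1]
```

Both versions give the same values; the vectorised form is generally the one to prefer in pandas.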
Upvotes: 2