Reputation: 287
I have the R code below. How can I write equivalent Python code?
c1 <- c(7.15,7.45,8.15,8.45,9.15,9.45,10.15,10.45,11.15,11.45,12.15,12.45,13.15,13.45,14.15,14.45,15.15,15.45,16.15,16.45,17.15,17.45,18.15,18.45,19.15,19.45,20.15)
numeric_vector <- c(12.15,12.45,13.15,13.45,14.15,14.45,15.15,15.45,16.15,16.45,17.15,17.45,18.15)
data <- data.frame(matrix(nrow = 1,ncol = length(c1)))
colnames(data) <- c(c1)
data[1,] <- 0
data[1,colnames(data)[(colnames(data) %in% as.character(numeric_vector))]] = data[1,colnames(data)[(colnames(data) %in% as.character(numeric_vector))]] + 1
df <- tibble::rownames_to_column(data.frame(t(data)), "col1")
This is what I have tried in Python:
data = pd.DataFrame(index=np.arange(0), columns=np.arange(len(c1)))
data.columns = c1
data[0,] = 0
d1 = pd.DataFrame(numeric_vector)
d1.columns = ['col1']
d1['count'] =d1.apply(lambda x: 1, axis=1)
d1['col1'] = d1['col1'].astype('category')
add_col1 = set(c1) - set(d1['col1'].unique())
d1['col1'] = d1['col1'].cat.add_categories(add_col1)
otData = d1['col1'].value_counts().reset_index()
Please help me convert these lines to Python. My attempt gives different output from the R code.
Upvotes: 0
Views: 256
Reputation: 4929
R:
df <- data.frame(col1=c1)
df$col2 <- as.integer(df$col1 %in% numeric_vector)
Python:
import pandas as pd
df = pd.DataFrame({'col1': c1})
df['col2'] = df.col1.isin(numeric_vector).astype(int)
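One caveat: `isin` compares floats for exact equality, which works here because both lists are typed-in literals. If the values came out of a calculation, a tolerance-based match may be safer. Below is a minimal sketch of that idea; the helper name `flag_members` and the tolerance `tol` are my own assumptions for illustration, not part of the question.

```python
import numpy as np
import pandas as pd

def flag_members(values, targets, tol=1e-9):
    """Hypothetical helper: col2 is 1 where a value lies within tol of any target."""
    values = np.asarray(values, dtype=float)
    targets = np.asarray(targets, dtype=float)
    # Build an (n_values, n_targets) boolean matrix of approximate matches
    close = np.isclose(values[:, None], targets[None, :], rtol=0.0, atol=tol)
    # A value is flagged if it is close to at least one target
    return pd.DataFrame({"col1": values, "col2": close.any(axis=1).astype(int)})

# 0.1 + 0.2 is not exactly 0.3 in floating point, so exact isin would miss it
demo = flag_members([0.1 + 0.2, 0.5, 1.0], [0.3, 1.0])
print(demo)
```

For the data in the question this should give the same flags as the `isin` version, since every value matches exactly.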
Comparing outputs:
First, in R:
c1 <- c(7.15,7.45,8.15,8.45,9.15,9.45,10.15,10.45,11.15,11.45,12.15,12.45,13.15,13.45,14.15,14.45,15.15,15.45,16.15,16.45,17.15,17.45,18.15,18.45,19.15,19.45,20.15)
numeric_vector = c(12.15,12.45,13.15,13.45,14.15,14.45,15.15,15.45,16.15,16.45,17.15,17.45,18.15)
df <- data.frame(col1=c1)
df$col2 <- as.integer(df$col1 %in% numeric_vector)
write.csv(df, 'df.csv', row.names = FALSE)
Then, in Python:
c1 = [7.15,7.45,8.15,8.45,9.15,9.45,10.15,10.45,11.15,11.45,12.15,12.45,13.15,13.45,14.15,14.45,15.15,15.45,16.15,16.45,17.15,17.45,18.15,18.45,19.15,19.45,20.15]
numeric_vector = [12.15,12.45,13.15,13.45,14.15,14.45,15.15,15.45,16.15,16.45,17.15,17.45,18.15]
import pandas as pd
df = pd.DataFrame({'col1': c1})
df['col2'] = df.col1.isin(numeric_vector).astype(int)
# Compare if all values are equal
df_R = pd.read_csv('df.csv')
print((df_R == df).values.all())
True
# Merge and compare outputs:
print(df.add_suffix('_Python').join(df_R.add_suffix('_R')))
    col1_Python  col2_Python  col1_R  col2_R
0          7.15            0    7.15       0
1          7.45            0    7.45       0
2          8.15            0    8.15       0
3          8.45            0    8.45       0
4          9.15            0    9.15       0
5          9.45            0    9.45       0
6         10.15            0   10.15       0
7         10.45            0   10.45       0
8         11.15            0   11.15       0
9         11.45            0   11.45       0
10        12.15            1   12.15       1
11        12.45            1   12.45       1
12        13.15            1   13.15       1
13        13.45            1   13.45       1
14        14.15            1   14.15       1
15        14.45            1   14.45       1
16        15.15            1   15.15       1
17        15.45            1   15.45       1
18        16.15            1   16.15       1
19        16.45            1   16.45       1
20        17.15            1   17.15       1
21        17.45            1   17.45       1
22        18.15            1   18.15       1
23        18.45            0   18.45       0
24        19.15            0   19.15       0
25        19.45            0   19.45       0
26        20.15            0   20.15       0
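If what you actually want is a count per value rather than a 0/1 flag (the `value_counts` attempt in the question seems to be heading that way), a rough sketch is to count the occurrences and then reindex on `c1`, so the rows keep the order of `c1` and missing values get 0. The short lists below are made-up sample data, not the ones from the question. Note this differs from the R code when `numeric_vector` contains duplicates: `%in%` adds 1 only once, while this counts every occurrence.

```python
import pandas as pd

# Made-up sample data for illustration (8.15 appears twice on purpose)
c1 = [7.15, 7.45, 8.15, 8.45]
numeric_vector = [7.45, 8.15, 8.15]

counts = (
    pd.Series(numeric_vector)
    .value_counts()               # occurrences of each value, sorted by count
    .reindex(c1, fill_value=0)    # restore the order of c1, fill absent values with 0
    .rename_axis("col1")
    .reset_index(name="count")    # two columns: col1 and count
)
print(counts)
```

The `reindex` step is the part the attempt in the question was missing: `value_counts` sorts by frequency, so without it the rows do not line up with `c1`.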
Upvotes: 1