J_p

Reputation: 475

Find how many days are included in each category?

For example:

head(software_data)
id   installation   software_v
 1   2011-12-01     v12
 2   2011-12-01     v12
 3   2011-12-01     v12
 4   2011-12-01     v12
 5   2011-12-02     v12
 6   2011-12-02     v12

How can I find how many days each version was active?

One clumsy way is to run summary(software_data[software_data$software_v=="v12",]) and check the min and max values of the installation field, changing the version by hand each time.

Upvotes: 1

Views: 44

Answers (2)

Onyambu

Reputation: 79208

Why can't you simply use the table function? It gives the frequency of each installation date per version. I will add more rows to your data:

df2 <- read.table(text = "
id   installation   software_v
 1   2011-12-01     v12
 2   2011-12-01     v12
 3   2011-12-01     v12
 4   2011-12-01     v12
 5   2011-12-02     v12
 6   2011-12-02     v12
 7   2011-12-01     v13
 8   2011-12-01     v13
 9   2011-12-02     v13
10   2011-12-02     v13", header = TRUE, stringsAsFactors = FALSE)

colSums(with(df2, table(installation, software_v)) > 0)
v12 v13 
  2   2

We see that v12 was active for 2 days and v13 was also active for 2 days.
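
As a rough alternative sketch, the same count could also be obtained by counting the distinct installation dates per version with tapply. The data frame is rebuilt inline here only so the snippet runs on its own, and the name days_per_version is just illustrative:

# Sample data mirroring df2 above, rebuilt so the snippet is self-contained
df2 <- data.frame(
  id = 1:10,
  installation = c(rep("2011-12-01", 4), rep("2011-12-02", 2),
                   rep("2011-12-01", 2), rep("2011-12-02", 2)),
  software_v = c(rep("v12", 6), rep("v13", 4)),
  stringsAsFactors = FALSE
)

# For each version, count how many different dates appear
days_per_version <- tapply(df2$installation, df2$software_v,
                           function(d) length(unique(d)))
days_per_version

This counts only the days on which at least one installation was recorded, so gaps between dates are not included.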

Upvotes: 1

Eugene Brown

Reputation: 4362

Here is a way to do it using the data.table package:

# Install the package if you don't have it already
# install.packages("data.table")

# Load the package
library(data.table)

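# Convert the data.frame to a data.table
software_data <- data.table(software_data)

# Make sure installation is a Date so the subtraction below yields days
software_data[, installation := as.Date(installation)]

days_active_by_v <- software_data[, .(
  min_date = min(installation), max_date = max(installation)
), by = .(software_v)][, days_active := max_date - min_date]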

The days_active column gives the difference in days between the earliest and latest installation date for each version.
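
Note that a max minus min difference is not the same as a count of days: a version seen on two consecutive days has a difference of 1. A minimal base R sketch of that distinction, assuming installation is stored as text in YYYY-MM-DD form (the sample data and the names span_days and inclusive_days are made up for illustration):

# Hypothetical sample data; installation is stored as text here
software_data <- data.frame(
  id = 1:6,
  installation = c(rep("2011-12-01", 4), rep("2011-12-02", 2)),
  software_v = "v12",
  stringsAsFactors = FALSE
)

# Convert to Date so that subtraction yields a number of days
software_data$installation <- as.Date(software_data$installation)

# Split the dates by version, then take last date minus first date
dates_by_v <- split(software_data$installation, software_data$software_v)
span_days <- sapply(dates_by_v, function(d) as.numeric(max(d) - min(d)))

# Add 1 if both the first and the last day should be counted
inclusive_days <- span_days + 1

Which of the two numbers is wanted depends on whether "active for N days" should include both endpoints.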

Upvotes: 0
