Nicolas
Nicolas

Reputation: 27

Repeat a successful loop for all columns

I have got a problem in R. As I am still learning to code successfully, I hope you can help me.

I have a data frame where the header is defined as the equity ISIN and below I have the daily equity prices in a numeric format.

ISIN1 ISIN 2 ISIN3 ...
2.35 10.10 0.90 ...
2.45 9.98 0.85 ...
2.40 10.15 0.70 ...
... ... ... ...

Now I don't need the equity prices in the columns below the ISIN but I need the percentage change. In the following format:

ISIN1 ISIN 2 ISIN3 ...
NA NA NA ...
0.04255 -0.01188 -0.05555 ...
-0.02041 0.01703 -0.12647 ...
... ... ... ...

I managed to achieve this with the first column (ISIN1), however, I could not manifold the code to the other columns. I assume there must be an easy way without copying & pasting the code for every ISIN manually. I tried the different options of apply, sapply, loop, for, ... but I did not manage to find the right one or I did not enter it correctly.

Following you can find my code for the first ISIN. The df which is being created is named "return" and the df where I have the available stock prices is called "stock_prices"

for (i in 2:nrow(return)) {
return$`ISIN1`[(i)] = ((stock_prices$`ISIN1`[(i)])/stock_prices$`ISIN1`[(i-1)])-1
    }

I hope you can help me and that I have posed the question in an understandable way. Thank you!!

Upvotes: 1

Views: 136

Answers (1)

r2evans
r2evans

Reputation: 160417

The canonical way to do this in R is lapply, something like:

running_change <- function(x) c(NA, diff(x) / x[-length(x)])
changes <- lapply(dat[,1:3], running_change)
changes
# $ISIN1
# [1]          NA  0.04255319 -0.02040816
# $ISIN.2
# [1]          NA -0.01188119  0.01703407
# $ISIN3
# [1]          NA -0.05555556 -0.17647059

That returns a list; if you want to add that to data, then

cbind(dat, changes)
#   ISIN1 ISIN.2 ISIN3 ...       ISIN1      ISIN.2       ISIN3
# 1  2.35  10.10  0.90 ...          NA          NA          NA
# 2  2.45   9.98  0.85 ...  0.04255319 -0.01188119 -0.05555556
# 3  2.40  10.15  0.70 ... -0.02040816  0.01703407 -0.17647059

If you want to replace the data, though, you can do

dat[1:3] <- changes
dat
#         ISIN1      ISIN.2       ISIN3 ...
# 1          NA          NA          NA ...
# 2  0.04255319 -0.01188119 -0.05555556 ...
# 3 -0.02040816  0.01703407 -0.17647059 ...

or, in the original step, do

dat[,1:3] <- lapply(dat[,1:3], running_change)

My use of 1:3 here is to subset the data to just the columns needed to operate. It might not be necessary in your data if all columns are numeric and you need the calc on all of them; in this case, I kept the ... fake-data you had, so needed to not try to calculate fractional changes on strings. You can use whatever method you want to subset your columns, including integer indexing as here, or by-name.

By the way, if you want to do this on all columns, then

dat[] <- lapply(dat, running_change)

The dat[] <- ensures that the columns are replaced without changing from a data.frame to a list. It's a trick; other ways work too, but this is the most direct (and code-golf/shortest) I've seen.


Data

dat <- structure(list(ISIN1 = c(NA, 0.0425531914893617, -0.0204081632653062), ISIN.2 = c(NA, -0.0118811881188118, 0.0170340681362725), ISIN3 = c(NA, -0.0555555555555556, -0.176470588235294), ... = c("...", "...", "...")), row.names = c(NA, -3L), class = "data.frame")

Upvotes: 1

Related Questions