Reputation: 5
stocks <- read.delim("stocks.txt")
the.tickers <- unique(stocks$ticker)
lows <- c()
highs <- c()
for (ticker in the.tickers) {
  look.at <- stocks$ticker == ticker  ## I do not know why the code is written like this. Does anyone know?
  lows <- append(lows, min(stocks$low[look.at], na.rm = TRUE))     ## lowest value in the 'low' column for each ticker
  highs <- append(highs, max(stocks$high[look.at], na.rm = TRUE))  ## highest value in the 'high' column for each ticker
}
the.tickers
My question is about the variable look.at. It is a boolean vector that appears to contain only FALSE, because nothing seems to match ticker. How can stocks$low[look.at] return numerical values even though look.at is all FALSE?
Here is a brief summary of the structure of the stocks.txt data:
str(stocks)
'data.frame': 70061 obs. of 8 variables:
$ ticker : Factor w/ 23 levels "AAPL","AMGN",..: 12 12 12 12 12 12 12 12 12 12 ...
$ industry: Factor w/ 7 levels "Banks","Biotechnology",..: 7 7 7 7 7 7 7 7 7 7 ...
$ date : Factor w/ 3086 levels "1-Apr-02","1-Apr-03",..: 1291 1086 985 683 580 478 373 269 2948 2843 ...
$ open : num 817 811 805 818 827 ...
$ high : num 818 819 813 820 827 ...
$ low : num 811 806 801 813 817 ...
$ close : num 815 811 808 814 822 ...
$ volume : num 1464122 2098176 1838552 3099791 1651111 ...
As you can see, look.at is a boolean vector, and it is all FALSE:
head(look.at,10)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
This is the return value:
head(stocks$low[look.at],10)
[1] 14.94 15.06 15.04 15.12 15.20 15.25 14.95 14.41 14.50 14.58
It does not make sense to me that stocks$low[look.at] can return values while look.at is all FALSE.
The variable low is numeric and contains only numbers and NA values.
head(stocks$low, 10)
[1] 811.44 806.45 801.47 813.34 817.39 822.31 823.67 831.50 825.05 829.58
Sample of the data:
> head(stocks,10)
ticker industry date open high low close volume
1 GOOG Technology 20-Mar-13 816.83 817.51 811.44 814.71 1464122
2 GOOG Technology 19-Mar-13 811.24 819.25 806.45 811.32 2098176
3 GOOG Technology 18-Mar-13 805.00 812.76 801.47 807.79 1838552
4 GOOG Technology 15-Mar-13 818.50 820.30 813.34 814.30 3099791
5 GOOG Technology 14-Mar-13 826.99 826.99 817.39 821.54 1651111
6 GOOG Technology 13-Mar-13 827.90 830.69 822.31 825.31 1641413
7 GOOG Technology 12-Mar-13 830.71 831.89 823.67 827.61 2008979
8 GOOG Technology 11-Mar-13 831.69 839.70 831.50 834.82 1595678
9 GOOG Technology 8-Mar-13 834.50 834.92 825.05 831.52 2912283
10 GOOG Technology 7-Mar-13 834.06 836.62 829.58 832.60 2054238
the.tickers is as follows:
> the.tickers <- unique(stocks$ticker)
> the.tickers
[1] GOOG AAPL Msft C KEY WFC JPM SO DUK D HE EIX LUV AMGN GILD CELG BIIB CAT DE IMO MRO HES YPF
Levels: AAPL AMGN BIIB C CAT CELG D DE DUK EIX GILD GOOG HE HES IMO JPM KEY LUV MRO Msft SO WFC YPF
Upvotes: 0
Views: 71
Reputation: 10483
If you are trying to get low and high price per ticker, here is how I would do it with the dplyr package:
library(dplyr)
stocks %>% group_by(ticker) %>% summarise(low = min(low, na.rm = TRUE), high = max(high, na.rm = TRUE))
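For readers without dplyr, roughly the same per-ticker summary can be sketched in base R with tapply. The data frame toy below is made up for illustration (it is not the real stocks.txt) and only mirrors the ticker, low and high columns from the question.

```r
# Made-up data standing in for stocks.txt (assumption: same column names).
toy <- data.frame(
  ticker = c("GOOG", "GOOG", "YPF", "YPF"),
  low    = c(811.44, 806.45, 14.94, NA),
  high   = c(817.51, 819.25, 15.30, 15.41)
)

# tapply splits the first argument by the second and applies the function
# to each group; na.rm = TRUE is passed through to min() and max().
lows  <- tapply(toy$low,  toy$ticker, min, na.rm = TRUE)
highs <- tapply(toy$high, toy$ticker, max, na.rm = TRUE)

lows
highs
```

Both results are vectors named by ticker, so a single value can be looked up with, for example, lows["YPF"].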
Upvotes: 0
Reputation: 3392
My question is, variable look.at is a boolean value, how can stocks$low pick up a boolean value, and return a numerical value?
look.at is a boolean vector, not a single value. The bracket operator pairs each element of the boolean vector with the element of the numeric vector at the same position, and returns a new vector containing the values of stocks$low that correspond to TRUE values within look.at. Example:
> stockprice <- c(1,2,3)
> look.at <- c(T,F,T)
> stockprice[look.at]
[1] 1 3
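One thing worth checking, assuming look.at was inspected after the loop had finished: at that point it holds the mask for the last ticker only, and head() shows just its first few elements. The sketch below uses a made-up two-ticker data frame (toy, not the real stocks.txt) to show how the head of look.at can be all FALSE while the subset still returns values.

```r
# Made-up data: the first rows belong to GOOG, the later rows to YPF.
toy <- data.frame(
  ticker = c("GOOG", "GOOG", "GOOG", "YPF", "YPF", "YPF"),
  low    = c(811.44, 806.45, 801.47, 14.94, 15.06, 15.04)
)

# Same pattern as in the question: look.at is overwritten on every pass,
# so after the loop it is the mask for the last ticker (here "YPF").
for (ticker in unique(toy$ticker)) {
  look.at <- toy$ticker == ticker
}

head(look.at, 3)   # only the GOOG rows, so all FALSE
sum(look.at)       # number of TRUE values in the whole vector
which(look.at)     # positions of the TRUE values
toy$low[look.at]   # the YPF lows
```

Calling sum(look.at) or table(look.at) on the real data would show whether the vector is really all FALSE or only starts with FALSE values.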
Mirosław Zalewski makes the great point in a comment that (contrary to what I thought) the bracket operation works even if look.at is longer or shorter than the stockprice vector. A shorter index is recycled, and a longer one yields NA for the positions that do not exist:
> stockprice <- c(1,2,3,4,5)
> look.at <- c(T,F,T,T)
> stockprice[look.at]
[1] 1 3 4 5
> stockprice <- c(1,2,3)
> look.at <- c(T,F,T,T)
> stockprice[look.at]
[1] 1 3 NA
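To make the recycling explicit, here is a small sketch that reuses the vectors above. rep_len() is base R; the names recycled and short are only illustrative.

```r
stockprice <- c(1, 2, 3, 4, 5)
look.at <- c(TRUE, FALSE, TRUE, TRUE)

# A logical index shorter than the vector is recycled to the vector's length,
# so the fifth element reuses the first value of look.at (TRUE).
recycled <- rep_len(look.at, length(stockprice))
identical(stockprice[look.at], stockprice[recycled])

# A logical index longer than the vector asks for a position that does not
# exist; a TRUE in that position yields NA.
short <- c(1, 2, 3)
short[look.at]
```

Because recycling happens silently, comparing length(look.at) with length(stockprice) before subsetting is a cheap way to catch a mismatched index.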
Upvotes: 1