Nigel Stackhouse
Nigel Stackhouse

Reputation: 364

R Creating new data.table with specified rows of a single column from an old data.table

I have the following data.table:

    Month Day  Lat Long        Temperature
 1:    10  01 80.0  180 -6.383330333333309
 2:    10  01 77.5  180 -6.193327999999976
 3:    10  01 75.0  180 -6.263328333333312
 4:    10  01 72.5  180 -5.759997333333306
 5:    10  01 70.0  180 -4.838330999999976
---                                       
117020:    12  31 32.5  310 11.840003833333355
117021:    12  31 30.0  310 13.065001833333357
117022:    12  31 27.5  310 14.685003333333356
117023:    12  31 25.0  310 15.946669666666690
117024:    12  31 22.5  310 16.578336333333358

For every location (given by Lat and Long), I have a temperature for each day from 1 October to 31 December.

There are 1,272 locations consisting of each pairwise combination of Lat:

    Lat
1   80.0
2   77.5
3   75.0
4   72.5
5   70.0
--------
21  30.0
22  27.5
23  25.0
24  22.5

and Long:

Long
1   180.0
2   182.5
3   185.0
4   187.5
5   190.0
---------
49  300.0
50  302.5
51  305.0
52  307.5
53  310.0

I'm trying to create a data.table that consists of 1,272 rows (one per location) and 92 columns (one per day). Each element of that data.table will then contain the temperature at that location on that day.

Any advice about how to accomplish that goal without using a for loop?

Upvotes: 0

Views: 495

Answers (2)

Sam Firke
Sam Firke

Reputation: 23014

First create a date value using the lubridate package (I assumed year = 2014, adjust as necessary):

library(lubridate)
df$datetext <- paste(df$Month,df$Day,"2014",sep="-")
df$date <- mdy(df$datetext)

Then one option is to use the tidyr package to spread the columns:

library(tidyr)
spread(df[,-c(1:2,6)],date,Temperature)

    Lat Long 2014-10-01 2014-12-31
1  22.5  310         NA   16.57834
2  25.0  310         NA   15.94667
3  27.5  310         NA   14.68500
4  30.0  310         NA   13.06500
5  32.5  310         NA   11.84000
6  70.0  180  -4.838331         NA
7  72.5  180  -5.759997         NA
8  75.0  180  -6.263328         NA
9  77.5  180  -6.193328         NA
10 80.0  180  -6.383330         NA

Upvotes: 0

BrodieG
BrodieG

Reputation: 52637

Here we use ChickWeights as the data, where we use "Chick-Diet" as the equivalent of your "lat-lon", and "Time" as your "Date":

dcast.data.table(data.table(ChickWeight), Chick + Diet ~ Time)

Produces:

     Chick Diet 0 2  4  6  8 10 12 14 16 18 20 21
 1:    18    1 1 1 NA NA NA NA NA NA NA NA NA NA
 2:    16    1 1 1  1  1  1  1  1 NA NA NA NA NA
 3:    15    1 1 1  1  1  1  1  1  1 NA NA NA NA
 4:    13    1 1 1  1  1  1  1  1  1  1  1  1  1
 5:   ... 46 rows omitted

You will likely need to lat + lon ~ Month + Day or some such for your formula.

In the future, please make your question reproducible as I did here by using a built-in data set.

Upvotes: 5

Related Questions