Konstantinos

Reputation: 4406

Vectorized Data Frame creation?

I would like to create a data.frame (I know a matrix would be faster, but I need a data.frame), but my current approach takes more than 30 minutes. I am sure there is a better way than what I have tried so far.

I have a large object (well, not that large: about 100 MB on disk before read.csv()) that looks like this:

           Date    City V3 V4
1    2008-12-30 NewYork 15 54
2    2008-12-31 NewYork 16 34
[...]
4001 2008-12-30 London  12 12
4002 2008-12-31 London  16 44
[...]
9001 2008-12-30 Madrid  26 54
9002 2008-12-31 Madrid  64 23

...imagine many cities (more than 500) and many dates (20 years of daily data). The time series are sometimes irregular; for example, Madrid may be the only city with an observation on 2001-01-01.

What I want is to arrange them in a data.frame whose row names are the dates and whose column names are the city names, like this:

            NewYorkV3  LondonV3  MadridV3
2008-12-30         15        12        26
2008-12-31         16        16        64

What I have tried (pre-allocating a list to avoid growing the final object) is:

uniqs <- unique(city.data[, 2])

city.list <- vector('list', length(uniqs))

for (i in seq_along(uniqs)) {
  city.list[[i]] <- subset(city.data, City == as.character(uniqs[i]))[, 3]
}

city.df <- do.call('cbind', city.list)

I am sure there is a more efficient way, but what is it?

Can I load the object as xts? How? I am getting errors that I cannot understand... Can the Date column then contain repeated values?

Can I melt and reshape the object? How? (I get errors there too.)

Thank you!

Upvotes: 1

Views: 83

Answers (3)

Konstantinos

Reputation: 4406

You guys are very fast.

I had some problems with the Date column, but I did

current.city.data$V1 <- as.character(current.city.data$V1)

and that solved everything (the conversion could also be done while reading the file).
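For the "while reading" variant, a minimal sketch, assuming the file has the four columns shown in the question. The inline `csv_text` is a made-up stand-in for the real file, and `city.data` is just the name used in the question. In R versions before 4.0, read.csv turns strings into factors by default, which is one likely source of the Date trouble.

```r
# Inline text stands in for the real CSV file (hypothetical sample).
csv_text <- "Date,City,V3,V4
2008-12-30,NewYork,15,54
2008-12-31,NewYork,16,34"

# colClasses fixes each column's type at read time, so Date and City
# arrive as character vectors rather than factors.
city.data <- read.csv(text = csv_text,
                      colClasses = c("character", "character",
                                     "numeric", "numeric"))

str(city.data)
```

With a real file, the same colClasses argument can be passed alongside the file path instead of `text =`.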

Thank you.

Upvotes: 0

mnel

Reputation: 115505

You may also be interested in data.table and its dcast.data.table function, which extends reshape2's dcast.

This requires data.table version 1.8.11 (from R-forge)

library(reshape2)
library(data.table)

dcast(x, Date ~ City, value.var = 'V3')
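A self-contained sketch of the same call on toy data, assuming a recent CRAN release of data.table (where dcast has a method for data.table objects) rather than the R-forge version mentioned above. The extra 2009-01-01 Madrid row and its value are invented to mimic an irregular series.

```r
library(data.table)

# Toy data in the question's long layout; Madrid has one extra date.
x <- data.table(
  Date = c("2008-12-30", "2008-12-31", "2008-12-30", "2008-12-31",
           "2008-12-30", "2008-12-31", "2009-01-01"),
  City = c("NewYork", "NewYork", "London", "London",
           "Madrid", "Madrid", "Madrid"),
  V3   = c(15, 16, 12, 16, 26, 64, 30)
)

# One row per Date, one column per City, filled with V3.
# Date/City combinations that do not occur in x become NA.
wide <- dcast(x, Date ~ City, value.var = "V3")
print(wide)
```

The result is a data.table with Date as an ordinary column; as.data.frame(wide) converts it if a plain data.frame is required.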

Upvotes: 3

Matthew Lundberg

Reputation: 42689

reshape works for this:

reshape(x, direction="wide", timevar="City", idvar="Date")
        Date V3.NewYork V4.NewYork V3.London V4.London V3.Madrid V4.Madrid
1 2008-12-30         15         54        12        12        26        54
2 2008-12-31         16         34        16        44        64        23
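To get only the V3 columns with the dates as row names, as the question asks, the input can be subset before reshaping. This is a sketch on toy data, not a benchmark on the full 100 MB object; the extra 2009-01-01 Madrid row and its values are invented to show how an irregular series is handled.

```r
# Toy data mirroring the question's layout; Madrid has one extra date.
x <- data.frame(
  Date = c("2008-12-30", "2008-12-31", "2008-12-30", "2008-12-31",
           "2008-12-30", "2008-12-31", "2009-01-01"),
  City = c("NewYork", "NewYork", "London", "London",
           "Madrid", "Madrid", "Madrid"),
  V3   = c(15, 16, 12, 16, 26, 64, 30),
  V4   = c(54, 34, 12, 44, 54, 23, 10),
  stringsAsFactors = FALSE
)

# Keep only the columns needed, then go wide: one V3 column per city.
wide <- reshape(x[, c("Date", "City", "V3")],
                direction = "wide", timevar = "City", idvar = "Date")

# Move the dates into the row names and sort by date.
rownames(wide) <- wide$Date
wide$Date <- NULL
wide <- wide[order(rownames(wide)), ]

print(wide)
```

Cities with no observation on a given date get NA in that row, so the irregular series line up without any manual padding.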

Upvotes: 2
