Reputation: 41
I am new to R and I have the data in the following way these are the 2 columns
Broker_ID_Buy Broker_ID_Sell
638 423
546 728
423 321
546 423
and it goes on there are around 28 different brokers appearing in buying or selling positions on different time
I need the data to be arranged like this
Broker_ID_638 Broker_ID_423 Broker_ID_546
BP SP IP
IP IP BP
IP BP IP
IP SP BP
Where BP= Buying Position, SP=Selling Position ,IP=Idle Position
I want to use these three different states for predicting using a Markov Chain
Upvotes: 1
Views: 103
Reputation: 57210
Here's another possible solution:
# create your table
txt <-
"Broker_ID_Buy,Broker_ID_Sell
638,423
546,728
423,321
546,423"
dt1 <- read.csv(text=txt)
# turn "Time, Broker_ID_Buy, Broker_ID_Sell" data.frame
# into "Time, Broker_ID, Position"
buyers <- data.frame(Time=1:nrow(dt1),
Broker_ID=dt1$Broker_ID_Buy,
Position="BP",
stringsAsFactors=F)
sellers <- data.frame(Time=1:nrow(dt1),
Broker_ID=dt1$Broker_ID_Sell,
Position="SP",
stringsAsFactors=F)
longDT <- rbind(buyers,sellers)
# pivot the brocker ids on the columns
wideDT <- reshape(data=longDT,direction="wide",
timevar="Broker_ID", idvar="Time", v.names="Position")
# well-format column names and turn NAs into "IP"
names(wideDT) <- sub(x=names(wideDT),pattern="Position.","Broker_ID_")
wideDT[is.na(wideDT)] <- "IP"
Result:
> wideDT
Time Broker_ID_638 Broker_ID_546 Broker_ID_423 Broker_ID_728 Broker_ID_321
1 1 BP IP SP IP IP
2 2 IP BP IP SP IP
3 3 IP IP BP IP SP
4 4 IP BP SP IP IP
Upvotes: 1
Reputation: 69171
This seems to get you in the right ballpark:
library(reshape2)
x <- data.frame(BP = c(638,546,423,546), SP = c(423, 728, 321, 423))
x$index <- 1:nrow(x)
x.m <- melt(x, id.vars = "index")
out <- dcast(index ~ value, data = x.m, value.var="variable")
out[is.na(out)] <- "IP"
out
#---
index 321 423 546 638 728
1 1 IP SP IP BP IP
2 2 IP IP BP IP SP
3 3 SP BP IP IP IP
4 4 IP SP BP IP IP
Upvotes: 2