Reputation: 1
I need to draw a market profile (also known as a volume profile) chart in R.
Above is an example of what I want. The horizontal axis is the date and the vertical axis is the level. At every date and every level I also need a horizontal bar that shows volume (bar to the right) and count (bar to the left).
My data looks like this. I have date and level columns that I want to use as groups, and volume and count columns to show as values.
date level volume count
1: 2019-03-04 00:00:00 0.4 50193087 51
2: 2019-03-04 00:00:00 0.1 30030902 50
3: 2019-03-04 00:00:00 -0.3 33674196 53
4: 2019-03-04 00:00:00 0.6 43566324 64
5: 2019-03-04 00:00:00 -0.5 74949678 66
6: 2019-03-04 00:00:00 -0.4 35799917 58
I don't even know where to start with this. It seems that I can't use any existing chart type, or even a combination of them. A stacked bar chart isn't going to work, because each bar's width needs to be adjusted according to volume/count. I was thinking about using a population pyramid, but I'm not sure whether I can get a proper x-axis (date) that way. The left bar would also be almost invisible, because the right bar has much larger values and they share the same axis.
Does anybody know how I can draw this chart in R? Preferably using plotly or ggplot2.
UPDATE: My data contains several dates, so the chart should actually look like this
And here is a new sample of the data
date,level,volume,count
2019-03-04,0.4,50193087,51
2019-03-04,0.1,30030902,50
2019-03-04,-0.3,33674196,53
2019-03-04,0.6,43566324,64
2019-03-04,-0.5,74949678,66
2019-03-04,-0.4,35799917,58
2019-03-04,-0.1,99431328,46
2019-03-05,0.8,85373468,45
2019-03-05,0.5,76080717,51
2019-03-05,-0.7,45250685,48
2019-03-05,-0.9,47862662,48
2019-03-05,-0.2,43731758,48
2019-03-05,0.3,43375430,45
Upvotes: 0
Views: 343
Reputation: 38053
Okay, this is going to be my best guess at what is being asked, though I'm not totally sure.
First I read in your data, which you can probably skip, but it may help others reproduce it:
zz <- "date,time,level,volume,count
2019-03-04,00:00:00,0.4,50193087,51
2019-03-04,00:00:00,0.1,30030902,50
2019-03-04,00:00:00,-0.3,33674196,53
2019-03-04,00:00:00,0.6,43566324,64
2019-03-04,00:00:00,-0.5,74949678,66
2019-03-04,00:00:00,-0.4,35799917,58"
df <- read.table(header = TRUE, text = zz, sep = ",")
Then I copy your data to two separate data.frames, giving each an extra facetting variable:
df1 <- df
df1$facet <- factor("count", levels = c("volume","count"))
df2 <- df
df2$facet <- factor("volume", levels = c("volume","count"))
And then we build the plot:
library(ggplot2)
ggplot(df1, aes(y = as.factor(level))) +
# We have to call geom_tile twice since we work with two data.frames, y is inherited
geom_tile(data = df1,
aes(x = 0.5 * count, width = count, height = 0.6, fill = level > 0)) +
# The trick is to map the volume to negative values
geom_tile(data = df2,
aes(x = -0.5 * volume, width = volume, height = 0.6, fill = level > 0)) +
# Then we give some colours to the bars
scale_fill_manual(values = c("TRUE" = "limegreen", "FALSE" = "red")) +
# Now we make sure the labelling is sensible on the x-axis, date is given as axis title.
scale_x_continuous(expand = c(0, 0, 0, 0),
labels = function(x){ifelse(x < -1e6, paste0(abs(x)/1e6, "M"), x)},
name = as.character(df1$date[1])) +
scale_y_discrete(name = "level") +
# Now we're making facets out of count/volume and set 'scales = "free_x"'
# to let them scale independently
facet_grid(~ facet, scales = "free_x", switch = "x") +
# Add a fake y-axis
geom_vline(xintercept = 0) +
# Fiddle around with themes
# strip.placement and 'switch = "x"' above let volume/count labels take place of x-axis
# Panel spacing is set to zero to let the facets appear as if it were one
theme_minimal() +
theme(strip.placement = "outside",
panel.spacing.x = unit(0, "mm"),
axis.line.x = element_line(colour = "black"))
And the result:
Is that anywhere near what you had in mind?
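To make the geom_tile() arithmetic above explicit, here is a small base R sketch. The helper tile_range() is a hypothetical name made up for this illustration, not part of ggplot2; it only reproduces the rule that a tile is centred on x and extends half its width to either side.

```r
# Hypothetical helper: the horizontal extent of a tile centred on x
tile_range <- function(x, width) {
  data.frame(xmin = x - width / 2, xmax = x + width / 2)
}

# First three rows of the sample data
count  <- c(51, 50, 53)
volume <- c(50193087, 30030902, 33674196)

# Centre at +count/2 with width count: the bar runs from 0 to count
count_bars <- tile_range(x = 0.5 * count, width = count)

# Centre at -volume/2 with width volume: the bar runs from -volume to 0
volume_bars <- tile_range(x = -0.5 * volume, width = volume)

print(count_bars)
print(volume_bars)
```

So every bar is anchored at zero and grows away from it, which is why the vertical line at x = 0 can act as a shared axis.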
EDIT: A solution for multiple dates (sort of) on the x-axis. First I reshaped the data to long format and duplicated it to get more dates in there:
# df from previous example
df <- reshape2::melt(df, id.vars = c("date","level", "time"))
df2 <- cbind(date = "2019-03-05", df[,-1])
df3 <- cbind(date = "2019-03-06", df[,-1])
df <- rbind(df, df2, df3)
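If you would rather start from the updated sample in the question than duplicate one day, a long-format frame with the same date/level/variable/value columns can probably be built along these lines. This is only a sketch in base R: the names csv, wide and df_long are made up here, there is no time column, and date is stored as a factor on the assumption that later code may rely on factor levels.

```r
# Updated sample copied from the question
csv <- "date,level,volume,count
2019-03-04,0.4,50193087,51
2019-03-04,0.1,30030902,50
2019-03-04,-0.3,33674196,53
2019-03-04,0.6,43566324,64
2019-03-04,-0.5,74949678,66
2019-03-04,-0.4,35799917,58
2019-03-04,-0.1,99431328,46
2019-03-05,0.8,85373468,45
2019-03-05,0.5,76080717,51
2019-03-05,-0.7,45250685,48
2019-03-05,-0.9,47862662,48
2019-03-05,-0.2,43731758,48
2019-03-05,0.3,43375430,45"
wide <- read.csv(text = csv, stringsAsFactors = FALSE)

# Stack volume and count: one row per date / level / variable
df_long <- rbind(
  data.frame(wide[c("date", "level")], variable = "volume", value = wide$volume),
  data.frame(wide[c("date", "level")], variable = "count",  value = wide$count)
)

# Assumption: factors keep the facet order predictable
df_long$date     <- factor(df_long$date)
df_long$variable <- factor(df_long$variable, levels = c("volume", "count"))

str(df_long)
```

Such a df_long could then stand in for df in the plotting code, as long as nothing refers to the missing time column.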
Next, it's going to look a lot like the previous plot, with the addition of a geom_blank() that ensures every volume/count panel has the same x-axis range, and with date as a facetting variable.
ggplot(df) +
geom_tile(data = df[df$variable == "count",],
aes(y = as.factor(level), x = 0.5 * value, width = value, fill = level > 0),
height = 2/(1 + sqrt(5))) +
geom_tile(data = df[df$variable == "volume",],
aes(y = as.factor(level), x = -0.5 * value, width = value, fill = level > 0),
height = 2/(1 + sqrt(5))) +
# This controls x scale range to get uniform x-axis between dates
geom_blank(data = data.frame(x = c(-max(df$value[df$variable == "volume"]),
max(df$value[df$variable == "count"])),
y = 0, variable = c("volume", "count")),
aes(x = x * 1.1, y = y)) +
geom_vline(xintercept = 0) +
# Drop the name
scale_x_continuous(expand = c(0,0,0,0),
labels = function(x){abs(x)},
name = "") +
# Now facet over date and variable
facet_grid(~ date + variable, switch = "x", scales = "free_x") +
theme_minimal() +
theme(strip.placement = "outside",
# You can also set all spacing to unit(0,"mm") for a continuous look.
# length(unique()) also works when date is a character column (nlevels() would return 0)
panel.spacing.x = unit(rep_len(c(0, 5.5), 2 * length(unique(df$date)) - 1), "pt"),
axis.line.x = element_line(colour = "black"))
Which looks like this:
You'll notice that the dates aren't particularly well placed, and we can't swap them with the variable in the facet formula, otherwise the panels would be grouped by count/volume instead of by date. There is also no easy way to de-duplicate the dates. In my defence, mapping three vastly different variables to the same axis is a bit much. But if you really want the date labels to look pretty, I suggest you have a look at this question: Nested facets in ggplot2 spanning groups, or edit them outside R with an image editing program.
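As a different route to a single date axis, and not the facet approach above, one could rescale volume and count into a fixed slot per date and draw them as rectangles. The following is a rough sketch under stated assumptions: the slot half-width of 0.45, the bar half-height of 0.04 (which assumes levels are 0.1 apart) and all variable names are made up for illustration, and only four rows of the sample are used. The plotting part only runs if ggplot2 is installed.

```r
csv <- "date,level,volume,count
2019-03-04,0.4,50193087,51
2019-03-04,-0.5,74949678,66
2019-03-05,0.8,85373468,45
2019-03-05,-0.7,45250685,48"
d <- read.csv(text = csv, stringsAsFactors = FALSE)

dates <- sort(unique(d$date))
d$pos <- match(d$date, dates)   # slot 1, 2, ... one per date
half  <- 0.45                   # assumed longest bar, in slot units

# Rescale each measure by its own maximum so both stay visible
d$vol_len <- half * d$volume / max(d$volume)
d$cnt_len <- half * d$count / max(d$count)

# Volume grows to the right of the date position, count to the left
rects <- rbind(
  data.frame(measure = "volume", xmin = d$pos, xmax = d$pos + d$vol_len,
             level = d$level),
  data.frame(measure = "count", xmin = d$pos - d$cnt_len, xmax = d$pos,
             level = d$level)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(rects) +
    geom_rect(aes(xmin = xmin, xmax = xmax,
                  ymin = level - 0.04, ymax = level + 0.04,
                  fill = measure)) +
    scale_x_continuous(breaks = seq_along(dates), labels = dates,
                       name = "date") +
    ylab("level") +
    theme_minimal()
  print(p)
}
```

The trade-off is that the x-axis then shows dates only, so bar lengths are relative to the largest volume or count and the absolute values can no longer be read from the axis.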
Upvotes: 1