Ju Ko

Reputation: 508

R: incorrect y-axis in ggplot2's geom_bar()

I have a data frame of Wikipedia edits. For each edit it records the user's edit number (1st edit, 2nd edit, and so on), the timestamp of the edit, and how many words were added.

In the actual dataset, I have up to 20,000 edits per user, and some edits add up to 30,000 words.

However, here is a downloadable small example dataset to exemplify my problem. The header looks like this:

[image: header of the example dataset]

I am trying to plot the distribution of added words across the edit progression and across time. If I use the regular R barplot, it works just as expected:

barplot(UserFrame3$NoOfAdds, names.arg = UserFrame3$EditNo)

[image: base R bar plot of NoOfAdds by EditNo]

But I want to do it in ggplot for nicer graphics and more customizing options.

If I plot this as a scatterplot, I get the same result:

ggplot(data = UserFrame3, aes(x = EditNo, y = NoOfAdds)) + geom_point(size = 0.1)

[image: ggplot scatterplot of NoOfAdds by EditNo]

The same goes for a line graph:

ggplot(data = UserFrame3, aes(x = EditNo, y = NoOfAdds)) + geom_line(size = 0.1)

[image: ggplot line graph of NoOfAdds by EditNo]

But when I try to plot it as a bar graph in ggplot, I get this result:

ggplot(data = UserFrame3, aes(x = EditNo, y = NoOfAdds)) + geom_bar(stat = "identity", position = "dodge")

[image: ggplot bar graph of NoOfAdds by EditNo, with gaps and a lower maximum]

There appear to be a lot more holes on the x-axis, and the maximum is nowhere close to where it should be (y = 317).

I suspect that ggplot somehow groups the bars and uses means instead of the actual values, despite the "dodge" parameter. How can I avoid this? And how would I go about plotting the time progression as a bar graph as well, without ggplot averaging over multiple edits?

Upvotes: 0

Views: 1651

Answers (1)

neilfws

Reputation: 33782

You should expect more x-axis "holes" with bars than with lines. A line connects each point to the next, so it passes through the zero values, whereas a zero-height bar draws nothing and leaves a gap.
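
To illustrate the difference, here is a minimal sketch on made-up data (the demo data frame is an assumption for illustration only, not your dataset). The zero-valued edits leave gaps in the bar version, while the line version still draws a segment through them:

```r
library(ggplot2)

# Made-up data for illustration: most edits add nothing, a few add many words
demo <- data.frame(
  EditNo   = 1:20,
  NoOfAdds = c(0, 0, 12, 0, 0, 0, 317, 0, 0, 5, 0, 0, 0, 0, 40, 0, 0, 0, 0, 8)
)

# Line plot: consecutive points are joined, so the zeros still leave visible ink
p_line <- ggplot(demo, aes(EditNo, NoOfAdds)) + geom_line()

# Bar plot: a zero-height bar has no area, so it shows up as a gap
p_bar <- ggplot(demo, aes(EditNo, NoOfAdds)) + geom_col()

print(p_line)
print(p_bar)
```

With thousands of edits the bars also become very narrow, so the gaps tend to be more noticeable than in this small example.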

I used geom_col with your downloaded data, and the plot looks as expected:

library(dplyr)    # provides the %>% pipe
library(ggplot2)
UserFrame3 %>%
  ggplot(aes(EditNo, NoOfAdds)) + geom_col()

[image: geom_col bar plot of NoOfAdds by EditNo]
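
If you want to confirm that no averaging takes place, one option is to read back the values ggplot2 actually draws. The sketch below is only an illustration under assumptions: the demo data frame is made up, and the Timestamp column name is a guess, since I don't know what your real timestamp column is called. It uses layer_data() to inspect the bar heights, then repeats the plot with time on the x-axis:

```r
library(ggplot2)

# Made-up data for illustration; the Timestamp column name is an assumption
demo <- data.frame(
  EditNo    = 1:6,
  Timestamp = as.POSIXct("2015-01-01 12:00:00", tz = "UTC") + (0:5) * 86400,
  NoOfAdds  = c(3, 0, 317, 12, 0, 45)
)

# Bars by edit number
p_edit <- ggplot(demo, aes(EditNo, NoOfAdds)) + geom_col()

# The data ggplot2 computed for the bar layer: one row per bar
bars <- layer_data(p_edit)
max(bars$ymax)  # tallest bar, to compare against max(demo$NoOfAdds)

# Same idea with time on the x-axis
p_time <- ggplot(demo, aes(Timestamp, NoOfAdds)) + geom_col()

print(p_edit)
print(p_time)
```

Note that geom_col() stacks by default, so if several rows share the same x value (for example, identical timestamps), the bar height is their sum rather than a mean.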

Upvotes: 1
