Humberto R

Reputation: 79

R Extract Table Function from Tabulizer to Data Frame

I'm trying to extract tables from PDFs using the Tabulizer package. I extracted the 1st page with no issue and converted it to a data frame. After that, I trimmed the edges of each data frame to keep only the information I need.

When I try to extract the 2nd page and convert it to a data frame, R reports "arguments imply differing number of rows". The suggested solution I found is to use the "reshape2" package, but it did not help me.
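For reference, the error can be reproduced without the PDF. extract_tables() returns a list of matrices by default, and calling as.data.frame() on a list tries to bind its elements side by side as columns, which fails when they have different numbers of rows. The matrices below are made-up stand-ins (an assumption for illustration), not data from the PDF:

```r
# Hypothetical stand-ins for two tables detected on the same page
tables <- list(
  matrix(c("a", "b", "c", "d", "e", "f"), nrow = 3),  # 3 rows x 2 columns
  matrix(c("g", "h", "i", "j"), nrow = 2)             # 2 rows x 2 columns
)

# as.data.frame() on a list binds the elements column-wise,
# so mismatched row counts raise an error; capture its message
result <- tryCatch(
  as.data.frame(tables),
  error = function(e) conditionMessage(e)
)
print(result)
```

If page 1 produced a single table, or tables with equal row counts, that would explain why the same call worked there.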

Is there any way to fill the missing cells with NAs so that I can convert it to a data frame?

This is how it looks in RStudio:

[Screenshot: 5 Lists]

I had no problem converting the 1st page to a data frame. This is the code:

tables <- extract_tables(proform, pages = 1) %>% as.data.frame() 

This is the PDF I'm trying to convert:

https://drive.google.com/file/d/1aOdXkj6W_Y7sQaoLq8HtYxGBMA7IAtuW/view?usp=sharing

Upvotes: 0

Views: 291

Answers (1)

Humberto R

Reputation: 79

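I was able to solve the problem using the "extract_areas" function from the Tabulizer package. It opens each requested page in an interactive viewer where you draw a rectangle around the table you want, so only the selected region is extracted.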
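A minimal sketch of that approach, assuming proform holds the path to the PDF as in the question. The helper tables_to_data_frames is a hypothetical name introduced here, not part of the package. It converts each extracted matrix separately, so tables with different row counts never have to fit into one data frame:

```r
# Hypothetical helper (not part of tabulizer): convert each extracted
# matrix on its own instead of binding them all into one data frame
tables_to_data_frames <- function(tables) {
  lapply(tables, as.data.frame, stringsAsFactors = FALSE)
}

# extract_areas() is interactive, so only call it in a live session
if (interactive() && requireNamespace("tabulizer", quietly = TRUE)) {
  # proform: path to the PDF, as in the question (assumed to exist)
  areas <- tabulizer::extract_areas(proform, pages = 2)
  page2 <- tables_to_data_frames(areas)
  str(page2)
}
```

Each element of page2 can then be trimmed separately, the same way as the data frame from page 1.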

Upvotes: 0
