Reputation: 401
I want to convert only the cells that has digits to numeric and save it to an excel file.
For example, The dataframe is similar to this.
df <- data.frame(row1 = c('agri','1','-', NA),row2 = c('3','import','6', 'food'), stringsAsFactors = F)
After I save it to an excel file with
openxlsx::write.xlsx(df, "df.xlsx", row.names=FALSE)
the digits in the excel file are characters.
It is because in the dataframe these columns are 'char'.
> str(df)
'data.frame': 4 obs. of 2 variables:
$ row1: chr "agri" "1" "-" NA
$ row2: chr "3" "import" "6" "food"
I tried mutate() and also transform(). But those only works with the entire column.
Is it possible to do it in R?
Any help is much appreciated.
Upvotes: 0
Views: 1238
Reputation: 13
A bit late response but I was also looking for this. I know exactly what you wanted (We want the numbers in the excel sheets to be numbers, so they can be sorted as numbers, and the text as characters), and there is a way described here Convert Number Stored as Text in Excel File Using R
It is painstakingly slow, and I am looking for a faster way to do it. So, if anyone is more experienced in this, please have a stab.
Upvotes: 0
Reputation: 50668
A late comment/response.
tldr; I don't think this is possible.
openxlsx
does allow for cell-level formatting (via addStyle
), and you could be tempted to use addStyle
to define different formats for different cells à la
library(openxlsx)
wb <- createWorkbook()
addWorksheet(wb, "sheet1")
writeData(wb, sheet = 1, df)
addStyle(wb, sheet = 1, createStyle(numFmt = "TEXT"), row = 1, col = 1)
addStyle(wb, sheet = 1, createStyle(numFmt = "NUMBER"), row = 3, col = 1)
saveWorkbook(wb, "example.xlsx", overwrite = TRUE)
Inspecting the Excel file reveals that despite the fact that Excel reports cell (row=3, col=1) to have category "Number", the cell is in fact still of category "Text" (as indicated by the little warning triangle).
I can't say whether this is a bug with openxlsx
or not.
In my personal opinion, this is for the better. Mixing different types of data in one column is bad practice, goes against the fundamentals of R's basic data structures (as entries of elements/columns of a list
/data.frame
must all have the same data type), and is the opposite of "tidy data".
Upvotes: 2