Anubhav Dikshit

Reputation: 1829

Convert a string to data frame, including column names

I have a string whose structure and length can vary, for example:

Input:

X <- "A=12&B=15&C=15"
Y <- "A=12&B=15&C=15&D=32&E=53"

I want to convert each such string to a data frame.

Output Expected:

Dataframe X

 A  B  C
 12 15 15

and Dataframe Y

 A  B  C  D  E
 12 15 15 32 53

What I tried was this:

X <- as.data.frame(strsplit(X, split="&"))

But this didn't work for me: it created only one column, and the column name was garbled.

P.S.: I cannot hard-code the column names because they can vary, and any given string will contain only one row.

Upvotes: 4

Views: 163

Answers (2)

akrun

Reputation: 886938

One option is to extract the numeric part of the string and read it with read.table. The pattern [^0-9]+ matches one or more characters that are not digits; the first gsub replaces each such run with a space, and read.table parses the result. The column names are passed through the col.names argument: the second gsub replaces every run of characters that are not upper-case letters with a space, and scan splits what remains into the individual names.

f1 <- function(str1) {
  read.table(text = gsub("[^0-9]+", " ", str1),
             col.names = scan(text = trimws(gsub("[^A-Z]+", " ", str1)),
                              what = "", sep = " ", quiet = TRUE))
}

f1(X)
#   A  B  C
#1 12 15 15
f1(Y)
#   A  B  C  D  E
#1 12 15 15 32 53
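
To see what each gsub call contributes, it may help to look at the intermediate strings on their own. This is only an illustration of the two patterns used above; the variable names nums and nms are made up for the example.

```r
X <- "A=12&B=15&C=15"

# Every run of non-digit characters ("A=", "&B=", "&C=") becomes one space
nums <- gsub("[^0-9]+", " ", X)
nums
# [1] " 12 15 15"

# Every run of characters that are not upper-case letters becomes one space;
# trimws() drops the trailing space left by the final "=15"
nms <- trimws(gsub("[^A-Z]+", " ", X))
nms
# [1] "A B C"
```

Note that both patterns assume the names consist only of upper-case letters and the values only of digits. A name such as A1 or a value such as -3 or 1.5 would be split in the wrong place.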

Upvotes: 5

Sandipan Dey

Reputation: 23101

You can try this too:

library(stringr)
res <- str_match_all(X, "([A-Z]+)=([0-9]+)")[[1]]
df <- as.data.frame(matrix(as.integer(res[,3]), nrow=1))
names(df) <- res[,2]

df
   A  B  C
1 12 15 15
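
If the stringr dependency is not wanted, or the names are not limited to upper-case letters, a similar result can probably be had in base R by splitting on & and then on =. The following is a minimal sketch under the assumption that every piece has the form name=value; the function name kv_to_df is hypothetical and not from any package.

```r
# Hypothetical helper: turn "A=12&B=15" into a one-row data frame
kv_to_df <- function(s) {
  # Split into "name=value" pieces, then split each piece at "="
  pairs <- strsplit(strsplit(s, "&", fixed = TRUE)[[1]], "=", fixed = TRUE)
  keys <- vapply(pairs, `[`, character(1), 1)
  vals <- vapply(pairs, `[`, character(1), 2)
  # type.convert() turns the strings into integers/numerics where possible
  df <- as.data.frame(matrix(type.convert(vals, as.is = TRUE), nrow = 1))
  names(df) <- keys
  df
}

kv_to_df("A=12&B=15&C=15")
kv_to_df("A=12&B=15&C=15&D=32&E=53")
```

Because the values pass through a single matrix, they all end up with one common type, which is fine for the all-integer strings in the question but would coerce mixed values to character.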

Upvotes: 3
