How to import single columned multiple observations data in SAS/R?

Question

There is a given dummy dataset. I am trying to import it into the SAS environment and i also tried it in RStudio as well. But i was unable to grab the correct output. the dataset provided is as follows:

1    "name, age, salary, zipcode"
2    "A, 1, 100, 10010 B, 2, 200, 10011 C, 3, 300, 10012 D, 4, 400, 10014"

i copied the data from a CSV file. The line 1 was in the First cell of the CSV and Line 2 was in the second cell just below the first cell.

PS : 10010 is the zipcode where as B is the Name that is B is the starting of next observation and similiarly C is the next observation and D is the next observation.

The Desired output is:

Name   Age   Salary   Zipcode
 A      1     100      10010
 B      2     200      10011
 C      3     300      10012
 D      4     400      10014

i am stuck with this problem since a day ago. I am just able to move around the dataset in order to move the observations. Moreover the double qouted values are making this problem very complex. A working Solution on SAS or R would work great.

EDIT: The following code has been tried by me in SAS. Thanks SMW for pointing. :)

data new;
length Name $2 Age 8. Salary 8. Zipcode 8.;
infile 'book1.csv' dsd dlm = ',' firstobs = 2 LRECL =17 ;
input Name $ Age Salary Zipcode @@;
run;

Regards

PereG · Accepted Answer

Import the column names and prepare it

nam <- read.table("csv.csv", header=F, nrow=1, stringsAsFactors=F)
nam <- gsub(",", "", nam)
nam <- nam[2]                             # resolve " " in original data 
nam <- strsplit(nam, split=" ")           #

Capitalize the colnames

library(Hmisc)                   # edited
nam <- capitalize(nam)           # edited

Import the remaining file

dd <- read.table("csv.csv", skip=1, sep="", stringsAsFactors=F)

Reorder the file, extract first column

col_1 <- gsub(",", "", as.vector(dd[seq(from=1, to=16,by=4)])[1,])

Extract the remaining columns

col_all <- as.data.frame(t(matrix(dd, ncol=4)))

Clean commas

col_all <- apply(apply(col_all[-1], 2, gsub, patt=",", replace=""), 2, as.numeric)

and set the final object

data <- as.data.frame(cbind(col_1, col_all))
names(data) <- nam
data  # ouput

  Name Age Salary Zipcode
1    A   1    100   10010
2    B   2    200   10011
3    C   3    300   10012
4    D   4    400   10014

Edited: Capitalize only with base functions. Avoid library(Hmisc)and the following code line.

nam <- unlist(lapply(nam, function(x) {                    
   paste(toupper(substring(x, 1, 1)), substring(x, 2, max(nchar(nam))), sep="")
}))

How to import single columned multiple observations data in SAS/R?

Answers (2)

Related Questions