Reputation: 21
I'm trying to import the following text file:
"year" "sex" "name" "n" "prop"
"1" 1880 "F" "Mary" 7065 0.0723835869064085
"2" 1880 "F" "Anna" 2604 0.0266789611187951
"3" 1880 "F" "Emma" 2003 0.0205214896777829
"4" 1880 "F" "Elizabeth" 1939 0.0198657855642641
"5" 1880 "F" "Minnie" 1746 0.0178884278469341
"6" 1880 "F" "Margaret" 1578 0.0161672045489473
"7" 1880 "F" "Ida" 1472 0.0150811946109318
"8" 1880 "F" "Alice" 1414 0.0144869627580554
"9" 1880 "F" "Bertha" 1320 0.0135238973413247
"10" 1880 "F" "Sarah" 1288 0.0131960452845653
and I don't have any problems using:
data <- read.table("~/Documents/baby_names.txt", header = TRUE, sep = "\t")
However, I haven't figured out how to do it with readr. The following command fails:
data2 <- read_tsv("~/Documents/baby_names.txt")
I know the problem is that the first row contains five elements (the headings) while every other row contains six, but I don't know how to tell readr to ignore the "1", "2", "3" and so on. Any suggestions?
Upvotes: 2
Views: 1017
Reputation: 5063
You can read in the body and the column names separately and then combine them:
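library(readr)
# read the body without the header line; readr names the columns X1, X2, ...
df <- read_tsv("baby_names.txt", col_names = FALSE, skip = 1)
# read only the header line
col_names <- read.table("baby_names.txt", header = FALSE, sep = "\t", nrows = 1)
# drop the row-number column, then apply the header
df$X1 <- NULL
names(df) <- col_names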
Result:
> head(df)
1 1 1 1 1
1 1880 FALSE Mary 7065 0.07238359
2 1880 FALSE Anna 2604 0.02667896
3 1880 FALSE Emma 2003 0.02052149
4 1880 FALSE Elizabeth 1939 0.01986579
5 1880 FALSE Minnie 1746 0.01788843
6 1880 FALSE Margaret 1578 0.01616720
I don't think read_tsv() offers an easy way to set row names, as read.table() does with row.names, but this should be a sufficient workaround.
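In the output above, the header reads 1 1 1 1 1 and sex shows FALSE instead of F. The first is most likely because read.table() returned factor columns (the default before R 4.0), so names() received the factor codes; the second because readr guessed a column of "F" values as logical. A minimal sketch that avoids both, assuming a tab-separated file shaped like the one in the question (the temporary file and its two rows are a hypothetical stand-in):

```r
library(readr)

# Hypothetical sample: header has 5 fields, each data row has 6
path <- tempfile(fileext = ".txt")
writeLines(c(
  "\"year\"\t\"sex\"\t\"name\"\t\"n\"\t\"prop\"",
  "\"1\"\t1880\t\"F\"\t\"Mary\"\t7065\t0.0723835869064085",
  "\"2\"\t1880\t\"F\"\t\"Anna\"\t2604\t0.0266789611187951"
), path)

# Header line as a plain character vector rather than a data frame
col_names <- unlist(
  read.table(path, header = FALSE, sep = "\t", nrows = 1,
             stringsAsFactors = FALSE),
  use.names = FALSE
)

# Body: give the row-number column a placeholder name and skip it ("_");
# declare sex as character ("c") so "F" is not guessed as logical
df <- read_tsv(path, skip = 1,
               col_names = c("row", col_names),
               col_types = "_iccid")

names(df)
```

Spelling out col_types is a design choice here: it removes the type guessing that turned F into FALSE, at the cost of having to know the column layout in advance.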
Upvotes: 0
Reputation: 56149
We can read in two steps (not tested):
# read the header line as data, convert to character vector
myNames <- unlist(read_tsv(file = "myFile.tsv", col_names = FALSE, n_max = 1), use.names = FALSE)
# read the data, skip 1st row, then drop the 1st column
myData <- read_tsv(file = "myFile.tsv", skip = 1, col_names = FALSE)[, -1]
# assign column names
colnames(myData) <- myNames
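Before dropping the first column, it may be worth confirming that the file really has this shape, with the header one field shorter than the data rows. One way is base R's count.fields(); a sketch on a hypothetical temporary file standing in for myFile.tsv:

```r
# Hypothetical sample: header has 5 fields, each data row has 6
path <- tempfile(fileext = ".tsv")
writeLines(c(
  "\"year\"\t\"sex\"\t\"name\"\t\"n\"\t\"prop\"",
  "\"1\"\t1880\t\"F\"\t\"Mary\"\t7065\t0.0723835869064085",
  "\"2\"\t1880\t\"F\"\t\"Anna\"\t2604\t0.0266789611187951"
), path)

# Number of tab-separated fields on each line of the file
counts <- count.fields(path, sep = "\t")

# TRUE when every data row has exactly one more field than the header,
# i.e. the extra leading column is a row label written without a heading
has_row_labels <- all(counts[-1] == counts[1] + 1)
has_row_labels
```

If has_row_labels is FALSE, the first column is probably real data and should not be dropped.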
Upvotes: 1