Noah Webb
Noah Webb

Reputation: 1

Can I append a header row of data with a different number of columns in R?

EmployerID | 1.0    | 20220801
Dairy      | Milk   |   1.99   | Mary    | 2x
Produce    | Tomato |   1.49   | John    | 1x
Produce    | Orange |   0.99   | Nilofar | 3x

I'm working with a pipe-delimited text file that has a header row with a different number of columns. For example, the header in the data above has 3 columns while the rest of the data has 5 columns. My goal is to read the file into RStudio, cleanse the data, and upload the data to a folder in the same format.

I was able to read the data into R using 2 dataframes by separating the header from the rest of the data. Then, I used rbind.fill() of the "plyr" package to attach the header to the dataframe. However, the header defaults to the same number of columns as the rest of the data like this:

EmployerID | 1.0    | 20220801 ||
Dairy      | Milk   |   1.99   | Mary    | 2x
Produce    | Tomato |   1.49   | John    | 1x
Produce    | Orange |   0.99   | Nilofar | 3x

How can I append the header row of data while maintaining only 3 columns (or only 2 pipes)?

Here is what I have so far

dataframe <- read.csv(
                  "file_name.txt",
                  sep = "|",
                  skip = 1,
                  header = FALSE)

header <- read.csv(
               "file_name.txt",
               sep = "|",
               nrows = 1,
               header = FALSE)

newfile <- rbind.fill(header, dataframe)


write.table(newfile, file = "new_file_name.txt",
              sep = "|",
              col.names = FALSE,
              row.names = FALSE)

Upvotes: 0

Views: 301

Answers (1)

harre
harre

Reputation: 7287

In the moment you row-bind, the extra columns will be created. You could keep your data separate and use append in write.table. If you need to keep track of header and data, you could put the data frames into a list.

write.table(header, file = "new_file_name.txt",
            sep = "|",
            col.names = FALSE,
            row.names = FALSE)

write.table(dataframe, file = "new_file_name.txt",
            append = TRUE,
            sep = "|",
            col.names = FALSE,
            row.names = FALSE)

Output:

EmployerID | 1.0    | 20220801
Dairy      | Milk   |   1.99   | Mary    | 2x
Produce    | Tomato |   1.49   | John    | 1x
Produce    | Orange |   0.99   | Nilofar | 3x

Upvotes: 1

Related Questions