Keelin

Reputation: 397

R lapply function across a list of data frames

There are similar solutions to this question on the forum; however, I can't get the code to work and need to ask a new question.

I have about 20 really wide CSV files that I imported into the global environment. I need to remove special characters from the column names and then rename columns using a lookup table that I pull from a CSV.

Here is example code that creates two data frames and a lookup table, and then builds a list:

df1 <- data.frame("ï.ID" = 1, "Q.1" = 2, Q1.1 = 3)
df2 <- data.frame("ï.ID." = 2, "Q.1a" = 3, Q1.1 = 4)
Qs  <- data.frame("Original.Question" = "Q1a", "Question" = "Q.1")

dflist <- lapply(ls(), function(x) if (class(get(x)) == "data.frame") get(x))

When I import the files, a BOM character (an i with two dots over it, ï) appears in front of the ID column name. I use the following code on individual data frames, as my attempts to use lapply over dflist have all failed.

names(df1) <- gsub("[^A-Za-z0-9]", "", names(df1))

The second thing I want to do is rename columns using a lookup table read from a CSV. Again, I don't seem to have the correct function for this to work. The specific code I want to modify to loop across all data frames is:

names(df1)[names(df1) %in% Qs$Original.Question] <-
  Qs$Question[match(names(df1)[names(df1) %in% Qs$Original.Question],
                    Qs$Original.Question)]

This allows me to use a CSV to rename all of the question columns, as they must be renamed before merging the data frames into a single data frame. Again, I can't seem to apply the lapply function properly.

My apologies for needing to ask a similar question again. I have tried to adapt existing code but have failed miserably.

Upvotes: 0

Views: 1105

Answers (2)

Onyambu

Reputation: 79228

Your lapply call returns NULL for every object that is not a data frame, so you would first need to filter out the NULL elements. You could do:

dflist <- Filter(Negate(is.null), dflist)
lapply(dflist, function(x) setNames(x, gsub("[^A-Za-z0-9]", "", names(x))))
[[1]]
   sex  school daysmissed
1    M   north          5
2    F   north          1
3    M central          2
4    M   south          0
5    F   south          7
6    F   south          1
7    F central          3
8    M   north          2
9    M   north          4
10   F   south         15

[[2]]
  ID Q1 Q11
1  1  2   3

[[3]]
  ID Q1a Q11
1  2   3   4

[[4]]
  OriginalQuestion Question
1              Q1a      Q.1
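
Note that in the output above the lookup table Qs was cleaned as well (its column became OriginalQuestion), which would break a later Qs$Original.Question lookup. A minimal sketch of one way around this, assuming the objects are named as in the question: collect only data frames into a named list, leave Qs out, then clean the names. The helper name clean_names is hypothetical, and check.names = FALSE is used here only to keep the raw example headers.

```r
# Example data from the question; check.names = FALSE keeps the raw headers
df1 <- data.frame("\u00ef.ID" = 1, "Q.1" = 2, "Q1.1" = 3, check.names = FALSE)
df2 <- data.frame("\u00ef.ID." = 2, "Q.1a" = 3, "Q1.1" = 4, check.names = FALSE)
Qs  <- data.frame(Original.Question = "Q1a", Question = "Q.1")

# Collect every object in the workspace as a named list, keep only data frames
objs    <- mget(ls())
all_dfs <- Filter(is.data.frame, objs)

# Leave the lookup table out so its column names stay untouched
dflist <- all_dfs[setdiff(names(all_dfs), "Qs")]

# Hypothetical helper: strip everything except letters and digits from the names
clean_names <- function(x) setNames(x, gsub("[^A-Za-z0-9]", "", names(x)))

dflist <- lapply(dflist, clean_names)
names(dflist$df1)
names(dflist$df2)
```

Because mget returns a named list, each cleaned data frame can still be reached by its original object name, for example dflist$df2.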

Upvotes: 2

Ronak Shah

Reputation: 388982

You can select the data frames by a pattern in their names. In your example they are called df1 and df2, so you can collect them all in a list using the pattern 'df' followed by a number. Use mget to get them in a named list, then lapply over the list to clean the column names.

list_df <- mget(ls(pattern = 'df\\d+'))
dflist <- lapply(list_df, function(x) {
  names(x) <- gsub("[^A-Za-z0-9]", "", names(x))
  x
})
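
The second step from the question (renaming columns from the Qs lookup table) can be applied over the list in the same way. This is only a sketch under some assumptions: the names have already been cleaned, Qs has the columns Original.Question and Question as in the question, and the helper name rename_from_lookup is made up for illustration.

```r
# Lookup table as in the question; stringsAsFactors = FALSE matters on R < 4.0,
# where factor columns would otherwise be assigned as integer codes
Qs <- data.frame(Original.Question = "Q1a", Question = "Q.1",
                 stringsAsFactors = FALSE)

# Stand-in for the list after the special characters were removed
dflist <- list(
  df1 = data.frame(ID = 1, Q1 = 2, Q11 = 3),
  df2 = data.frame(ID = 2, Q1a = 3, Q11 = 4)
)

# Hypothetical helper: rename only the columns that appear in the lookup table
rename_from_lookup <- function(x, lookup) {
  hit   <- match(names(x), lookup$Original.Question)  # NA where there is no match
  found <- !is.na(hit)
  names(x)[found] <- lookup$Question[hit[found]]
  x
}

dflist <- lapply(dflist, rename_from_lookup, lookup = Qs)
names(dflist$df2)
```

Columns without an entry in the lookup table keep their current names, so the same call is safe for data frames that contain none of the listed questions.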

Also, you might be interested in the question "R's read.csv prepending 1st column name with junk text", which covers how to avoid getting the BOM character in the first column name when importing.
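
One commonly suggested way to avoid the problem at import time is to tell read.csv that the file starts with a byte order mark via fileEncoding = "UTF-8-BOM". The sketch below assumes the files are UTF-8 with a BOM; it writes a small temporary file for demonstration, so the file contents are made up.

```r
# Create a small demo CSV that starts with the UTF-8 byte order mark (EF BB BF)
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("ID,Q.1\n1,2\n"), con)
close(con)

# fileEncoding = "UTF-8-BOM" makes R drop the BOM while reading
df <- read.csv(tmp, fileEncoding = "UTF-8-BOM")
names(df)

unlink(tmp)
```

For many files, the same argument can be passed through lapply, for example lapply(files, read.csv, fileEncoding = "UTF-8-BOM"), where files is a hypothetical character vector of paths.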

Upvotes: 1
