Extracting numbers of a certain length from a list

Question

I have a list of thousands of elements. Some elements contain years, which are strings of four digits; others contain random numbers that I need to get rid of.

I need to extract from the list only the numbers that are four digits long and remove all other numbers. In the end I need a data frame with 20 rows (one per list element) and columns containing the years that are nested in the list. For example, for the sample below I need a table that looks like this.

> sample_years
   element year year.1 year.2 year.3
1        1   NA     NA     NA     NA
2        2   NA   1918     NA     NA
3        3   NA     NA     NA     NA
4        4   NA     NA     NA     NA
5        5   NA   1912   1913     NA
6        6   NA   1893   1898   1925
7        7   NA   1820   1830   1899
8        8   NA     NA     NA     NA
9        9   NA   1808   1810   1854
10      10   NA     NA     NA     NA
11      11   NA     NA     NA     NA
12      12   NA   1885     NA     NA
13      13   NA   1900     NA     NA
14      14   NA   1926   1933     NA
15      15   NA     NA     NA     NA
16      16   NA     NA     NA     NA
17      17   NA   1870     NA     NA
18      18   NA     NA   1923     NA
19      19   NA     NA     NA     NA
20      20   NA     NA     NA     NA


> dput(sample)
list(c("", "2"), c("", "1918"), "", "", c("", "1912", "1913"), 
    c("", "1893", "1898", "1925", "1993"), c("", "1820", "1830", 
    "1899", "1900"), "", c("", "1808", "1810", "1854", "1905", 
    "1907"), "", "", c("", "1885"), c("", "1900"), c("", "1926", 
    "1933"), "", "", c("", "1870"), c("", "1", "1923"), "", "")

Sotos · Accepted Answer

We can use rbind.fill from the plyr package to bind the list elements into a data frame (lst1 is your list), and then grepl to set every value that is not a four-digit number to NA:

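library(plyr)
# lst1 is the list from the question; each element becomes one row, padded with NA
df <- rbind.fill(lapply(lst1, function(i) as.data.frame(t(i))))
# anchor the pattern so that only strings of exactly four digits are kept
df[!apply(df, 1:2, function(i) grepl('^[0-9]{4}$', i))] <- NA
head(df)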
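#    V1   V2   V3   V4   V5   V6
#1 <NA> <NA> <NA> <NA> <NA> <NA>
#2 <NA> 1918 <NA> <NA> <NA> <NA>
#3 <NA> <NA> <NA> <NA> <NA> <NA>
#4 <NA> <NA> <NA> <NA> <NA> <NA>
#5 <NA> 1912 1913 <NA> <NA> <NA>
#6 <NA> 1893 1898 1925 1993 <NA>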
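
If you would rather avoid the plyr dependency, the same idea can be sketched in base R. This is only a rough alternative, not the method above: the names lst1, years, padded, year_mat and sample_years are chosen here for illustration, and unlike the table in the question it packs the years to the left instead of keeping their original positions (so element 18 gets 1923 in the first year column).

```r
# The list from the question (renamed from `sample` to avoid masking base::sample)
lst1 <- list(c("", "2"), c("", "1918"), "", "", c("", "1912", "1913"),
    c("", "1893", "1898", "1925", "1993"), c("", "1820", "1830",
    "1899", "1900"), "", c("", "1808", "1810", "1854", "1905",
    "1907"), "", "", c("", "1885"), c("", "1900"), c("", "1926",
    "1933"), "", "", c("", "1870"), c("", "1", "1923"), "", "")

# Keep only the strings that consist of exactly four digits
years <- lapply(lst1, function(x) x[grepl("^[0-9]{4}$", x)])

# Pad every element with NA up to the length of the longest one
n_max <- max(lengths(years))
padded <- lapply(years, function(x) as.integer(c(x, rep(NA, n_max - length(x)))))

# One row per list element, one column per year slot
year_mat <- do.call(rbind, padded)
colnames(year_mat) <- paste0("year.", seq_len(n_max))
sample_years <- data.frame(element = seq_along(lst1), year_mat)
```

The anchors ^ and $ matter here: without them the pattern would also accept longer digit strings such as "12345", because grepl only checks whether a match occurs somewhere in the string.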
