Reputation: 121
Hi, after already getting so much helpful advice here, I'd like to ask a question about detecting NAs: is there a way to view the ROWS of a data frame that contain NAs? The problem is that my data frame is really huge, so
is.na(data_frame)
doesn't show me all the rows (not even close). I only know how many NAs there are and which columns they are in, but I would really like to know whether they are in the same rows, which would mean that they possibly cause each other. As an example of my data, here is the head of the data frame; if you need more, just tell me.
transect_id year day month LST precipitation Quarter
1 TR001 2010 191 7 30.62083 0 3
2 TR001 2010 191 7 30.62083 0 3
3 TR001 2010 191 7 30.62083 0 3
4 TR001 2010 191 7 30.62083 0 3
5 TR001 2010 191 7 30.62083 0 3
6 TR001 2010 191 7 30.62083 0 3
SumPre average.temp MinTemp MaxTemp prev.temp prev.Precip
1 1.895143 30.78058 27.73995 33.54386 30.43515 0
2 1.895143 30.78058 27.73995 33.54386 30.43515 0
3 1.895143 30.78058 27.73995 33.54386 30.43515 0
4 1.895143 30.78058 27.73995 33.54386 30.43515 0
5 1.895143 30.78058 27.73995 33.54386 30.43515 0
6 1.895143 30.78058 27.73995 33.54386 30.43515 0
species regional_gam prop_pheno_sampled
1 Pontia daplidice 0.00000 0.4496937
2 Polyommatus icarus zelleri 0.00000 0.3952952
3 Gonepteryx cleopatra 1.30963 0.4731522
4 Anaphaeis aurota 0.00000 0.3731392
5 Carcharodus alceae 0.00000 0.1646973
6 Euchloe belemia 1.40654 0.3373209
If I could see the rows with NAs, I could, for example, check whether the NAs in LST (landscape surface temperature) are in the same rows as the NAs in MaxTemp, so it would be obvious that one causes the other.
I hope I made my question clear :-) Thanks in advance!
Upvotes: 0
Views: 111
Reputation: 4474
What I always use is
df[rowSums(is.na(df))>0,]
This gives you all rows with at least one NA. It should also be fairly efficient, since rowSums is a really fast base function.
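To see it on a toy example (the data frame below is made up; only the column names are borrowed from the question), something like this should work. which() is added as a suggestion, since row numbers are easier to read than the full output on a huge data frame:

```r
# Toy data frame; the values are invented for illustration only
df <- data.frame(
  transect_id = c("TR001", "TR001", "TR002", "TR002"),
  LST         = c(30.6, NA, 29.1, NA),
  MaxTemp     = c(33.5, NA, 32.0, 31.2)
)

# TRUE for every row that contains at least one NA
has_na <- rowSums(is.na(df)) > 0

# The rows themselves (original row names are kept)
na_rows <- df[has_na, ]

# Just the row numbers, handy when the data frame is too big to print
na_idx <- which(has_na)
```

An alternative from base R that should give the same rows is df[!complete.cases(df), ].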
Or, for all columns with at least one NA:
df[,colSums(is.na(df))>0]
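For the specific check from the question (do the NAs in LST sit in the same rows as the NAs in MaxTemp?), a rough sketch could look like the following. The data are invented, and the names co_na, both_na and one_na are just placeholders:

```r
# Toy data; the values are invented for illustration only
df <- data.frame(
  LST     = c(30.6, NA, 29.1, NA, 28.4),
  MaxTemp = c(33.5, NA, 32.0, 31.2, NA)
)

# 2x2 cross-tabulation of missingness in the two columns
co_na <- table(LST_missing     = is.na(df$LST),
               MaxTemp_missing = is.na(df$MaxTemp))

# Row numbers where both values are missing
both_na <- which(is.na(df$LST) & is.na(df$MaxTemp))

# Row numbers where exactly one of the two is missing
one_na <- which(xor(is.na(df$LST), is.na(df$MaxTemp)))
```

If one_na is empty, the two columns are always missing together. That only shows the NAs co-occur; it does not by itself prove that one causes the other.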
Upvotes: 1