Reputation: 377
I have a text file that contains image pixel values as below:
#1 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,1.225475199490155542e-01,1.044848965437222138e-01,1.237544502265838786e-01,1.715363333404669177e-01,1.922596029233400172e-01,1.809632738682011854e-01,1.797130234316194342e-01,1.738541208375123936e-01,1.444294554581726231e-01,1.321258390981746855e-01,1.344635498234532101e-01,1.436132527743466947e-01,1.395290556225499690e-01,1.374780604935658956e-01,1.346506483347080507e-01,1.280550646990075425e-01,1.248504215497622527e-01,1.178248061901537996e-01,1.298443201619972898e-01,1.553180115989083732e-01,1.580724143044860419e-01,1.784962367422186780e-01,1.907025124594779186e-01,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
#2 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,1.029349857154064074e-01,9.448919637788849579e-02,1.059611529861169132e-01,1.123315418475298866e-01,1.044427274454799576e-01,1.201363996329007783e-01,1.282688456719490722e-01,1.251468493081038524e-01,1.305904505950917782e-01,1.166948019212366294e-01,1.099250506785318382e-01,1.136641770357243175e-01,1.130515076243375772e-01,1.184654413023679964e-01,1.445082878208643895e-01,1.663965434098903795e-01,1.663395733842590318e-01,1.752476275152526075e-01,1.685796922638230499e-01,1.482366311004082449e-01,1.309908022384465853e-01,1.261424559469170870e-01,1.268358150633545067e-01,1.255352810594060065e-01,1.259829554332418666e-01,1.289792505226832475e-01,1.297540150693830274e-01,1.209480533761810861e-01,1.285694058734546119e-01,1.369298058593048373e-01,1.461700389952401702e-01,1.431042116739904002e-01,1.712214395634834019e-01,1.818925300859868255e-01,2.010257021882600748e-01,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
#3 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,9.861446120242163549e-02,9.304676318969960780e-02,9.864122376278822157e-02,1.075597393605647739e-01,1.131419975961711483e-01,1.146375133556569031e-01,1.204342658911874697e-01,1.228412754806565282e-01,1.240924670494341492e-01,1.163476394083799020e-01,1.073797480686657368e-01,1.017817224886293226e-01,1.131027905414023760e-01,1.114406335131803705e-01,1.227824308916071611e-01,1.329011478552513670e-01,1.441114715371090704e-01,1.604792748573601047e-01,1.527513461191236099e-01,1.380147589010027598e-01,1.288032806310404343e-01,1.338005227090968141e-01,1.255554854466473802e-01,1.173452604805394900e-01,1.143985402480809654e-01,1.202454679138123678e-01,1.267178125929230847e-01,1.241315491837501339e-01,1.347653795894559747e-01,1.349437732217280139e-01,1.301418957926175068e-01,1.313508293861232468e-01,1.742619338497762571e-01,1.858488867892321983e-01,1.877861224975270471e-01,1.803044688712685528e-01,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
#4 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,8.736886296542736852e-02,8.908654375958220684e-02,8.620668521597033007e-02,9.500020858506200150e-02,1.126136404574935440e-01,1.187951696788534656e-01,1.168147013779436694e-01,1.109278058355442492e-01,1.128276541805584010e-01,1.173942164407098532e-01,1.152133179543410046e-01,1.111828410014303326e-01,1.192855572113103724e-01,1.157219419285210882e-01,1.051462987022579870e-01,1.042841664852307976e-01,1.263179021208208075e-01,1.543027512926945510e-01,1.531517647661817527e-01,1.370377223529097022e-01,1.217984978313198102e-01,1.340931752979427627e-01,1.274053299614930634e-01,1.206931794950223541e-01,1.149389700113669505e-01,1.083743218115938711e-01,1.135429261076744967e-01,1.224571336189042570e-01,1.316256830092336905e-01,1.296892050846524258e-01,1.220541991422918193e-01,1.251462726710364792e-01,1.475487955738131740e-01,1.8
.
.
.
.
The file holds a matrix of values, one per pixel. Each row of the matrix is stored as one comma-separated line, as shown above (scroll to the right to see the numeric values).
When I read the file in R as:
txt <- read.table("ndvi_20180102_081439_1005_3B.txt")
It produces a data.frame as below:
V1
#1 nan,nan,nan,nan,-0.131231,nan,nan,nan,....
#2 nan,nan,nan,1.2323,nan,nan,-1,2313,nan,....
.
.
.
#187 nan,nan,nan,1.12323,nan,nan,...
#188 nan,nan,0.2323,nan,nan,...
I want it in this form, with one value per row, so that I can calculate the mean of the pixel values:
#1 nan
#2 nan
#3 -1,23232
#4 nan
.
.
.
.
I tried to split the column with tidyr::separate, but I don't want to work out the number of columns by hand, since I need to do this for about 439 files in a loop.
In the end, I want a table in this form, with one column per file:
#file1 #file2 #file3 ...... #file439
#1 nan nan nan
#2 nan nan nan
#3 nan -1,32 nan
#4 -1,3 0,22 nan
.
.
.
How can I convert the text into the desired form?
Upvotes: 1
Views: 69
Reputation: 377
This is how I solved it, based on the answer from @Rage:
library(data.table)
library(tidyverse)

# Return the mean of all non-missing pixel values in one file
proc_txt <- function(f) {
  txt <- fread(file = f, sep = ",")
  txt <- gather(txt)    # reshape to long format: one row per value
  txt <- na.omit(txt)   # drop missing values
  mean(txt$value)
}

# Apply the function to every .txt file in the working directory
txt_files <- list.files(path = ".", pattern = "\\.txt$")
df_list <- lapply(txt_files, proc_txt)
final_df <- do.call(rbind, df_list)
The final output is a one-column table holding the mean of all pixel values in each file. For example, given:
n1 <- "nan, nan, 2, 3, 4, nan"
n2 <- "nan, 1, 2, 3, nan"
n3 <- "nan, 3, 4, 5, nan"
The code above produces a table such as:
n1 3
n2 2
n3 4
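The n1/n2/n3 example can be checked with a minimal base R sketch. It assumes each string stands in for the contents of one file, and `mean_of_line()` is a hypothetical helper introduced here for illustration, not part of the code above.

```r
# Toy inputs: each string stands in for the contents of one file (assumption).
inputs <- c(
  n1 = "nan, nan, 2, 3, 4, nan",
  n2 = "nan, 1, 2, 3, nan",
  n3 = "nan, 3, 4, 5, nan"
)

# Hypothetical helper: split on commas, treat "nan" as missing,
# and average the remaining values.
mean_of_line <- function(s) {
  parts <- trimws(strsplit(s, ",", fixed = TRUE)[[1]])
  parts[parts == "nan"] <- NA_character_
  mean(as.numeric(parts), na.rm = TRUE)
}

# One mean per input, kept as a one-column table with the input names as row names.
means <- vapply(inputs, mean_of_line, numeric(1))
final_df <- data.frame(mean = means)
print(final_df)
```

Marking "nan" as missing explicitly, rather than relying on how `as.numeric()` parses that string, keeps the behaviour easy to see.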
Upvotes: 0
Reputation: 33417
data.table's fread, mentioned by @Rage, is a good choice. It takes a little extra effort to deal with the first column, or rather the row label in "#1 nan", which is separated from the first value by a space:
library(data.table)
x <- "#1 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,1.225475199490155542e-01,1.044848965437222138e-01,1.237544502265838786e-01,1.715363333404669177e-01,1.922596029233400172e-01,1.809632738682011854e-01,1.797130234316194342e-01,1.738541208375123936e-01,1.444294554581726231e-01,1.321258390981746855e-01,1.344635498234532101e-01,1.436132527743466947e-01,1.395290556225499690e-01,1.374780604935658956e-01,1.346506483347080507e-01,1.280550646990075425e-01,1.248504215497622527e-01,1.178248061901537996e-01,1.298443201619972898e-01,1.553180115989083732e-01,1.580724143044860419e-01,1.784962367422186780e-01,1.907025124594779186e-01,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
#2 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,1.029349857154064074e-01,9.448919637788849579e-02,1.059611529861169132e-01,1.123315418475298866e-01,1.044427274454799576e-01,1.201363996329007783e-01,1.282688456719490722e-01,1.251468493081038524e-01,1.305904505950917782e-01,1.166948019212366294e-01,1.099250506785318382e-01,1.136641770357243175e-01,1.130515076243375772e-01,1.184654413023679964e-01,1.445082878208643895e-01,1.663965434098903795e-01,1.663395733842590318e-01,1.752476275152526075e-01,1.685796922638230499e-01,1.482366311004082449e-01,1.309908022384465853e-01,1.261424559469170870e-01,1.268358150633545067e-01,1.255352810594060065e-01,1.259829554332418666e-01,1.289792505226832475e-01,1.297540150693830274e-01,1.209480533761810861e-01,1.285694058734546119e-01,1.369298058593048373e-01,1.461700389952401702e-01,1.431042116739904002e-01,1.712214395634834019e-01,1.818925300859868255e-01,2.010257021882600748e-01,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
#3 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,9.861446120242163549e-02,9.304676318969960780e-02,9.864122376278822157e-02,1.075597393605647739e-01,1.131419975961711483e-01,1.146375133556569031e-01,1.204342658911874697e-01,1.228412754806565282e-01,1.240924670494341492e-01,1.163476394083799020e-01,1.073797480686657368e-01,1.017817224886293226e-01,1.131027905414023760e-01,1.114406335131803705e-01,1.227824308916071611e-01,1.329011478552513670e-01,1.441114715371090704e-01,1.604792748573601047e-01,1.527513461191236099e-01,1.380147589010027598e-01,1.288032806310404343e-01,1.338005227090968141e-01,1.255554854466473802e-01,1.173452604805394900e-01,1.143985402480809654e-01,1.202454679138123678e-01,1.267178125929230847e-01,1.241315491837501339e-01,1.347653795894559747e-01,1.349437732217280139e-01,1.301418957926175068e-01,1.313508293861232468e-01,1.742619338497762571e-01,1.858488867892321983e-01,1.877861224975270471e-01,1.803044688712685528e-01,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
#4 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,8.736886296542736852e-02,8.908654375958220684e-02,8.620668521597033007e-02,9.500020858506200150e-02,1.126136404574935440e-01,1.187951696788534656e-01,1.168147013779436694e-01,1.109278058355442492e-01,1.128276541805584010e-01,1.173942164407098532e-01,1.152133179543410046e-01,1.111828410014303326e-01,1.192855572113103724e-01,1.157219419285210882e-01,1.051462987022579870e-01,1.042841664852307976e-01,1.263179021208208075e-01,1.543027512926945510e-01,1.531517647661817527e-01,1.370377223529097022e-01,1.217984978313198102e-01,1.340931752979427627e-01,1.274053299614930634e-01,1.206931794950223541e-01,1.149389700113669505e-01,1.083743218115938711e-01,1.135429261076744967e-01,1.224571336189042570e-01,1.316256830092336905e-01,1.296892050846524258e-01,1.220541991422918193e-01,1.251462726710364792e-01,1.475487955738131740e-01,1.8"
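# Read the text, treating "nan" as missing and padding short rows
DT <- fread(x, fill=TRUE, na.strings="nan")
# Split the first field ("#1 nan") into the row label (V0) and the first value (V1)
DT[, c("V0", "V1") := tstrsplit(V1, " ", fixed=TRUE)]
set(DT, which(DT[["V1"]]=="nan"),"V1", NA)
DT[, V1 := as.numeric(V1)]
# Keep the row labels, then transpose so that each original row becomes a column
cnames <- DT$V0
DT[, V0 := NULL]
DT <- transpose(DT)
DT <- na.omit(DT)
setnames(DT, names(DT), cnames)
print(head(DT))
# Mean of each column, i.e. of each original row
DTmean <- DT[, lapply(.SD, mean)]
print(DTmean)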
Results:
> print(head(DT))
#1 #2 #3 #4
1: 0.1225475 0.1136642 0.1017817 0.1111828
2: 0.1044849 0.1130515 0.1131028 0.1192856
3: 0.1237545 0.1184654 0.1114406 0.1157219
4: 0.1715363 0.1445083 0.1227824 0.1051463
5: 0.1922596 0.1663965 0.1329011 0.1042842
6: 0.1809633 0.1663396 0.1441115 0.1263179
> print(DTmean)
#1 #2 #3 #4
1: 0.1477638 0.1407628 0.1330294 0.1976434
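To get from a single file to the table with one column per file that the question asks for, one possible approach is sketched below in base R. It assumes every file contains the same number of pixels and may carry a leading "#1 " style row label. `read_pixels()` and the two small temporary files are hypothetical stand-ins, not part of the original data or code.

```r
# Hypothetical helper: read one file into a flat numeric vector of pixel values.
read_pixels <- function(path) {
  lines <- readLines(path)
  # Drop a leading "#1 " style row label, if present (assumption about the format).
  lines <- sub("^#[0-9]+[[:space:]]+", "", lines)
  parts <- trimws(unlist(strsplit(lines, ",", fixed = TRUE)))
  parts[parts == "nan"] <- NA_character_
  as.numeric(parts)
}

# Two small temporary files stand in for the real NDVI files (assumption).
tmp_dir <- tempfile("ndvi_demo_")
dir.create(tmp_dir)
writeLines(c("#1 nan,0.1,0.2", "#2 0.3,nan,0.4"), file.path(tmp_dir, "file1.txt"))
writeLines(c("#1 nan,nan,0.5", "#2 0.6,0.7,nan"), file.path(tmp_dir, "file2.txt"))

# One column per file, one row per pixel.
txt_files <- list.files(tmp_dir, pattern = "\\.txt$", full.names = TRUE)
pixels <- do.call(cbind, lapply(txt_files, read_pixels))
colnames(pixels) <- basename(txt_files)
print(pixels)

# Mean pixel value per file, ignoring missing pixels.
file_means <- colMeans(pixels, na.rm = TRUE)
print(file_means)
```

With 439 real files the same pattern applies: point `list.files()` at the data directory instead of the temporary one.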
Upvotes: 0
Reputation: 323
An easy workaround is to use the fread function from the data.table package.
txt <- fread(file = "ndvi_20180102_081439_1005_3B.txt",sep = ",")
To get the mean of each column, ignoring missing values, you can use
txt[, lapply(X = .SD, FUN = mean, na.rm = TRUE), .SDcols = colnames(txt)]
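As a small illustration of the difference between per-column means and a single mean for the whole file, here is a sketch using `fread()` on inline text. The two toy rows are made up for the example, and passing `na.strings = "nan"` is an assumption about how the missing pixels should be handled.

```r
library(data.table)

# Toy data: two rows of three pixels, with "nan" marking missing pixels (assumption).
toy <- fread(text = c("nan,1,2", "3,nan,4"),
             sep = ",", header = FALSE, na.strings = "nan")

# Mean of each column, skipping missing values.
col_means <- toy[, lapply(.SD, mean, na.rm = TRUE)]
print(col_means)

# A single mean over every pixel in the file, skipping missing values.
overall_mean <- mean(unlist(toy), na.rm = TRUE)
print(overall_mean)
```

Without `na.rm = TRUE`, any column that contains a missing pixel would yield a missing mean.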
Hope that helps
Upvotes: 1