Reputation: 49
I have been saving the output of the Android command top into a text file using Python (bear with me, as this is an R question).
Unfortunately, I was missing a parameter that makes top write plain, parseable output without any ANSI escape sequences (see https://unix.stackexchange.com/questions/409053/how-to-disable-color-in-output-of-top-command). I was able to fix that problem using the suggestion in the link.
Since I discovered the fix recently, I have quite a few older non-processed files with these extra characters (see sample output below).
Copying part of the output from the actual text file:
[s[999C[999B[6n[u[H[J[?25l[H[J[s[999C[999B[6n[uTasks: 279 total, 5 running, 274 sleeping, 0 stopped, 0 zombie
Mem: 1702176K total, 1661708K used, 41439232 free, 11583488 buffers
Swap: 425540K total, 345512K used, 81948672 free, 487176K cached
400%cpu 138%user 3%nice 235%sys 5%idle 0%iow 0%irq 19%sirq 0%host
[7m PID USER PR NI VIRT RES SHR S[%CPU] %MEM TIME+ ARGS [0m
I have an R program that reads these text files and does some post-processing, which works fairly well most of the time. In other cases, the extra characters in a given file require manual cleanup.
Is there a way to modify my read.table command so that R ignores these sequences wherever they are found?
My code to read the text file and store the output in a CSV file is below. There is additional post-processing after that, which is not relevant to the question:
i <- 0  # file counter, assumed to start at 0
for (file in data_files)
{
  i <- i + 1
  path <- file.path(folder, file)
  # The widest row determines how many columns read.table should create
  ncol <- max(count.fields(path, sep = ""))
  Top_Data_Frame <- read.table(path, header = FALSE, fill = TRUE, col.names = paste0("V", seq_len(ncol)))
  write.csv(Top_Data_Frame, file = file.path(folder, paste0("The_Whole_File", i, ".csv")), row.names = FALSE)
}
Suggestions are appreciated.
Upvotes: 0
Views: 418
Reputation: 12558
Try this:
library(stringr)  # str_remove_all(), str_match()
library(dplyr)    # %>%, select()
example <- "[s[999C[999B[6n[u[H[J[?25l[H[J[s[999C[999B[6n[uTasks: 279 total, 5 running, 274 sleeping, 0 stopped, 0 zombie
Mem: 1702176K total, 1661708K used, 41439232 free, 11583488 buffers
Swap: 425540K total, 345512K used, 81948672 free, 487176K cached
400%cpu 138%user 3%nice 235%sys 5%idle 0%iow 0%irq 19%sirq 0%host
[7m PID USER PR NI VIRT RES SHR S[%CPU] %MEM TIME+ ARGS [0m"
example <- str_remove_all(example, "(?ms)^(.*(?=Tasks))|([\\[a-zA-Z0-9]+ ])|[^[:ascii:]]")
str_match(example, "(?s)Tasks: (\\d+) total, (\\d+) running, (\\d+) sleeping, (\\d+) stopped, (\\d+) zombie\n+Mem: ([0-9K]+) total, ([0-9K]+) used, ([0-9K]+) free, ([0-9K]+) buffers\n+Swap: ([0-9K]+) total, ([0-9K]+) used, ([0-9K]+) free, ([0-9K]+) cached\n+([0-9%]+)cpu ([0-9%]+)user ([0-9%]+)nice ([0-9%]+)sys ([0-9%]+)idle ([0-9%]+)iow ([0-9%]+)irq ([0-9%]+)sirq ([0-9%]+)host\n+\\[([0-9a-z]+) PID USER PR NI VIRT RES SHR S\\[%CPU\\] %MEM TIME\\+ ARGS \\[([0-9a-z]+)") %>%
as.data.frame() %>%
select(-1) %>%
setNames(c("Tasks_total", "Tasks_running", "Tasks_sleeping", "Tasks_stopped", "Tasks_zombie", "Mem_total", "Mem_used", "Mem_free", "Mem_buffers", "Swap_total", "Swap_used", "Swap_free", "Swap_cached", "cpu", "user", "nice", "sys", "idle", "iow", "irq", "sirq", "host", "PID", "CPU"))
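If the goal is only to keep the existing read.table workflow, another option is to strip the escape sequences from each line before parsing. The following is a minimal sketch, not a tested drop-in: strip_ansi and read_top_file are hypothetical helper names, and it assumes the files still contain the raw ESC byte (0x1B) in front of each "[" sequence, which the pasted sample does not show.

```r
# Hypothetical helper (not from the original post): remove ANSI/VT100 CSI
# sequences such as ESC[999C, ESC[?25l or ESC[7m from a character vector.
# Assumes the raw ESC byte (0x1B) is still present in the files.
strip_ansi <- function(lines) {
  gsub("\\x1B\\[[0-9;?]*[A-Za-z]", "", lines, perl = TRUE)
}

# Hypothetical wrapper: clean a file in memory, then parse it the same way
# as the loop in the question (whitespace-separated, ragged rows filled).
read_top_file <- function(path) {
  lines <- strip_ansi(readLines(path, warn = FALSE))
  con <- textConnection(lines)
  on.exit(close(con))
  n_col <- max(count.fields(con, sep = ""))
  read.table(text = lines, header = FALSE, fill = TRUE,
             col.names = paste0("V", seq_len(n_col)))
}
```

Inside the loop, the read.table call could then become Top_Data_Frame <- read_top_file(file.path(folder, file)). Because the pattern requires the ESC byte, ordinary text such as S[%CPU] is left alone.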
Upvotes: 0