Reputation: 11
I have three .txt files, one for each of the three gene enrichment categories, that I downloaded from the GO platform, and I can't read them into R. I think it's because the files have an inconsistent number of columns.
First I tried using skip:
BP_results <- read.table("Data/analysisBP.txt", header = TRUE, sep = "\t", stringsAsFactors = FALSE, skip = 10, fill = TRUE)
It didn't work, so I tried converting the file to .csv and then splitting the data into columns, but that put each word of the category names into its own column. I think the problem lies in the inconsistent columns of the .txt files I downloaded from GO. I also checked whether GO offers the data in another file format, but I'm unfamiliar with the XML and JSON options. How can I fix this? Do I have to edit the files manually?
Any help is appreciated, thanks.
Upvotes: 1
Views: 79
Reputation: 11
The files are fine, and you don't need to edit them. I just changed the skip argument to 11 and it worked. I used the example file from the Gene Ontology webpage:
read.table("DATA/analysis.txt", header = TRUE, sep = "\t", stringsAsFactors = FALSE, skip = 11, fill = TRUE)
You can also keep it simpler with read.delim, which already defaults to header = TRUE, sep = "\t" and fill = TRUE:
read.delim("DATA/analysis.txt", stringsAsFactors = FALSE, skip = 11)
Or, if you have readr (part of the tidyverse) installed:
readr::read_tsv("DATA/analysis.txt", skip = 11)
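If the number of metadata lines differs between your three files, you could work out skip from the file instead of hard-coding it. The following is only a sketch under assumptions: find_header_skip is a hypothetical helper (not part of any package), it assumes the column-header row is the first line with the largest number of tab-separated fields, and the demo file is made up for illustration and is not a real GO export.

```r
# Hypothetical helper: guess how many lines to skip so that the
# column-header row becomes the first line read.
# Assumption: the header row is the first line among the first `n_max`
# lines that has the largest number of tab-separated fields.
find_header_skip <- function(path, sep = "\t", n_max = 50) {
  lines <- readLines(path, n = n_max, warn = FALSE)
  # Number of fields per line; blank lines count as 0 fields.
  n_fields <- lengths(strsplit(lines, sep, fixed = TRUE))
  # which.max() returns the first line with the maximum field count;
  # everything before that line is skipped.
  which.max(n_fields) - 1L
}

# Made-up demo file: a few metadata lines, a blank line, then the table.
demo_path <- tempfile(fileext = ".txt")
writeLines(c(
  "Analysis Type:\tExample test",
  "Reference List:\tExample list",
  "",
  "Term\tCount\tP.value",
  "term A\t12\t0.001",
  "term B\t7\t0.04"
), demo_path)

skip_n <- find_header_skip(demo_path)
results <- read.delim(demo_path, skip = skip_n, stringsAsFactors = FALSE)
str(results)
```

It is still worth looking at the top of each file with readLines(path, n = 15) to confirm that the guessed header row is the one you expect before relying on it.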
Upvotes: 1