Julieta González
My .txt file can't be read by R for Gene Ontology analysis

I have three .txt files, one for each of the three gene enrichment categories, that I downloaded from the GO platform. R can't read any of them, and I think it's because the number of columns is inconsistent from line to line.

This is how my .txt files look: [image]

This is how it looks in my console in R: [image]

First I tried using skip:

BP_results <- read.table("Data/analysisBP.txt", header = TRUE, sep = "\t", stringsAsFactors = FALSE, skip = 10, fill = TRUE)

It didn't work, so I tried converting the file to .csv and then separating the data into columns, but that put each word of the category names into its own column. I think the problem lies in the inconsistent columns of the .txt files I downloaded from GO. I also checked whether GO offers this data in another file format, but I'm unfamiliar with the XML and JSON options. How can I fix this? Do I have to edit the files manually?

Any help is appreciated, thanks.

Upvotes: 1

Views: 79

Answers (1)

Your approach is fine. I just changed the skip argument to 11 and it worked. I used the example file from the Gene Ontology webpage:

read.table("DATA/analysis.txt", header = TRUE, sep = "\t", stringsAsFactors = FALSE, skip = 11, fill = TRUE)

You can also keep it simpler, since read.delim() already defaults to header = TRUE, sep = "\t" and fill = TRUE:

read.delim("DATA/analysis.txt", stringsAsFactors = FALSE, skip = 11)

Or, if you have readr installed (it is part of the tidyverse):

readr::read_tsv("DATA/analysis.txt", skip = 11)
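If you would rather not hard-code skip (the number of metadata lines could differ between exports), here is a minimal sketch that locates the header row first. The helper name read_go_export is hypothetical, and it rests on an assumption: that the table header is the first line with the largest number of tab-separated fields, i.e. that the metadata lines above it have fewer columns than the results table.

```r
# Hypothetical helper: find the header row instead of hard-coding `skip`.
read_go_export <- function(path) {
  lines <- readLines(path, warn = FALSE)

  # Count the tab-separated fields on every line (empty lines count as 0).
  n_fields <- lengths(strsplit(lines, "\t", fixed = TRUE))

  # Assumption: the header is the first line with the maximum field count.
  header_line <- which(n_fields == max(n_fields))[1]

  # Skip everything above the header; keep the column names as written.
  read.delim(path,
             skip = header_line - 1,
             stringsAsFactors = FALSE,
             check.names = FALSE)
}
```

You would call it as BP_results <- read_go_export("Data/analysisBP.txt"). It is worth checking head(BP_results) afterwards, since the field-count heuristic can pick the wrong line if a metadata line happens to contain many tabs.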

Upvotes: 1
