Reputation: 1
I want to find novel and known RNAs and transcripts in a sequence of about 10 kb. What is the easiest way to start, using bioinformatics tools, if that sequence is not well annotated in the Ensembl and UCSC browsers? Are spliced ESTs and RNA sequencing data an option? I am new to bioinformatics, so your suggestions would be useful to me.
Thanks in advance
Upvotes: 0
Views: 98
Reputation: 63
Do you have a Linux server or computer, or are you relying on web and Windows-based programs?
To align RNA-seq reads, people generally use splice-aware read aligners such as TopHat, although BLAST would probably work too.
Initially I wrote a long response explaining how to do this in Linux, but I've just realised that Galaxy might be a much easier solution for a beginner. Galaxy is an online bioinformatics tool with a very user-friendly interface; it's particularly designed for beginners. You can sign up and log in at this website: https://main.g2.bx.psu.edu/
There are tutorials on how to do things (see 'Help' menu) but my basic workflow for your experiment would go something like this:
Regarding viewing the output, I'm not sure what is available for a custom reference sequence on Windows; you might have to do a bit of research. For Linux/Mac, I'd recommend IGV.
Upvotes: 0
Reputation: 3441
I am a bit unclear on what exactly your desired end-product or output would look like, but I might suggest doing multiple sequence alignments and looking for those with high scores. Chances are that this 10 kb sequence will contain some of those known sequences, but they won't match exactly, so I think you want a program that gives you alignment scores and not just simple matches. I use Perl in combination with Clustal to make alignments. Basically, you will need to make .fasta or .aln files containing both the 10 kb sequence and a known sequence of interest, following those file formats' respective conventions. You can use the GUI version of Clustal if you are not too programming savvy. If you want to use Perl, here is a script I wrote for aligning a whole directory of .fasta files. It can perform many alignments in one fell swoop. NOTE: you must edit the clustal executable path in the last line (system call) to match its location on your computer for this script to function.
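#!/usr/bin/perl
use strict;
use warnings;
# Ask for the directory that holds the input files.
print "Please type the path of the directory containing the fasta files to align (end the directory path with a / or this will fail!): ";
my $directory = <STDIN>;
chomp $directory;
my $add = "_align.fasta";
# List the directory, skipping hidden entries ('.', '..'), anything that is
# not a plain file, and alignments written by an earlier run of this script.
opendir(my $dh, $directory) or die "Cannot open $directory: $!";
my @files = grep { !/^\./ && -f "$directory$_" && !/\Q$add\E$/ } readdir $dh;
closedir $dh;
foreach my $file (@files) {
    my $infile = "$directory$file";
    # Strip the file extension to build the output file name.
    (my $fileprefix = $infile) =~ s/\.[^.]+$//;
    my $outfile = "$fileprefix$add";
    # Align this file with ClustalW and write the result in FASTA format.
    system "/Users/Wes/Desktop/eggNOG_files/clustalw-2.1-macosx/clustalw2 -INFILE=$infile -OUTFILE=$outfile -OUTPUT=FASTA";
}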
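For the step of putting the 10 kb sequence and a known sequence into one .fasta file, something like the following rough sketch might help. The sequences, record names, and the output file name `pair_example.fasta` are all made up for illustration, and the line width of 60 is just a common convention, so adjust these to your own data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Format one FASTA record: a '>' header line followed by the sequence
# wrapped at a fixed width (60 by default; a convention, not a requirement).
sub fasta_record {
    my ($name, $seq, $width) = @_;
    $width //= 60;
    $seq =~ s/\s+//g;    # drop whitespace and line breaks inside the sequence
    my $record = ">$name\n";
    while (length $seq) {
        # Remove the first $width characters from $seq and append them as a line.
        $record .= substr($seq, 0, $width, '') . "\n";
    }
    return $record;
}

# Write the query and one known sequence into a single file for alignment.
sub write_pair {
    my ($path, $query_name, $query_seq, $known_name, $known_seq) = @_;
    open(my $out, '>', $path) or die "Cannot write $path: $!";
    print {$out} fasta_record($query_name, $query_seq);
    print {$out} fasta_record($known_name, $known_seq);
    close($out) or die "Cannot close $path: $!";
}

# Hypothetical example: toy sequences and a made-up output file name.
write_pair('pair_example.fasta',
           'query_10kb_region', 'ACGTACGTACGTACGT',
           'known_rna',         'ACGTTCGTACGAACGT');
```

Writing one such file per known sequence into a single directory gives the script above a set of inputs to align in one go.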
Upvotes: 1