What is the best tool for alternative splicing detection from RNA-seq data?
There are various aligners that are capable of identifying splicing events: TopHat, MapSplice, SpliceMap, HMMsplicer, GSNAP, STAR, RUM, SoapSplice, and HISAT.
I personally prefer differential splicing tools such as MISO, DEXSeq, and MATS. MISO is relatively new but quite popular.
I have some experience with MISO as well. Definitely a good tool, just one caveat to be aware of…
MISO creates a very large number of output files per sample, so if you’re analyzing a large cohort you could end up with hundreds of thousands of small files. This shouldn’t be a problem as long as your server is set up to handle it (i.e. adequate inode allocation). It’s a good idea to run a single sample first to estimate how many files your project will generate, then let your sysadmin know so they can make appropriate adjustments or recommendations. MISO also has a compression tool called miso_zip that will free up the inodes, and some space as well, for long-term storage, but make sure you’ve finished your downstream analysis first.
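For the "run a single sample first" estimate, here is a minimal sketch in Python. It only walks an output directory and counts entries, so it makes no assumptions about MISO's file layout. The function names and the example path are made up for illustration, and the linear extrapolation assumes every sample produces roughly the same number of files.

```python
import os


def count_inodes(root):
    """Count files and subdirectories under root (each uses roughly one inode)."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Every subdirectory and every non-directory entry counts once.
        total += len(dirnames) + len(filenames)
    return total


def estimate_cohort_inodes(inodes_per_sample, n_samples):
    """Naive linear extrapolation from one sample to the whole cohort."""
    return inodes_per_sample * n_samples


# Hypothetical usage, assuming one finished sample lives in "miso_out/sample1":
# per_sample = count_inodes("miso_out/sample1")
# print(estimate_cohort_inodes(per_sample, n_samples=500))
```

Compare the resulting number against the free inodes on the target filesystem (df -i on most Linux systems) before launching the full cohort.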
Be careful with DEXSeq: it claims to perform exonic differential expression, but in reality it breaks overlapping exons into unique, non-overlapping fragments, and the resulting pseudo-exons never get mapped back to real exons. At least that was my experience with the tool ~3 years ago.
You can also look into tools recently developed by Lior Pachter’s group (https://pachterlab.github.io/), including Sleuth and Kallisto. He recently gave a VanBUG talk, recorded here (https://www.youtube.com/watch?v=j53GbHE2eP8&feature=youtu.be) that examines transcript expression in great detail, from both a philosophical and algorithmic standpoint.
In this talk he identifies many issues with current approaches, and cautions against using tools like DESeq2, edgeR, etc. for differential expression of transcripts (and genes, for that matter).
I use Kallisto+Sleuth very regularly and find that they outperform DESeq2 and other popular tools for most analyses, especially as they handle ambiguous mapping via bootstrap analysis of quantification error. I typically use an organism’s NCBI RefSeq transcript set (which includes isoforms) as the reference data set. I have a number of tutorials on using Sleuth and Kallisto in different situations at http://achri.blogspot.ca if anyone’s interested in giving them a go.
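To make the bootstrap point concrete: the bootstraps have to be requested at the quantification step, via kallisto quant's -b option. Below is a rough Python sketch that assembles such a command for paired-end reads. The index name, output directory, FASTQ names, and the defaults of 100 bootstraps and 4 threads are all placeholders, not recommendations from the thread; the helper function itself is hypothetical.

```python
import shlex


def kallisto_quant_cmd(index, out_dir, fastqs, bootstraps=100, threads=4):
    """Build a `kallisto quant` command line for paired-end reads.

    All argument values are supplied by the caller; nothing is executed here.
    """
    if not fastqs or len(fastqs) % 2 != 0:
        raise ValueError("paired-end mode expects an even number of FASTQ files")
    return [
        "kallisto", "quant",
        "-i", index,            # index built beforehand with `kallisto index`
        "-o", out_dir,          # one output directory per sample
        "-b", str(bootstraps),  # number of bootstrap samples
        "-t", str(threads),
        *fastqs,
    ]


# Placeholder file names, for illustration only.
cmd = kallisto_quant_cmd(
    "refseq_transcripts.idx", "quant/sample1",
    ["sample1_R1.fastq.gz", "sample1_R2.fastq.gz"],
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Sleuth then reads each sample's output directory and uses the bootstraps to estimate the quantification error mentioned above.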