Examining an RNA-Seq differential expression list for signature themes

Now that we have a list of differentially expressed genes, we can examine it for themes. To do this, we will use the GOSeq module. This module is based on the R package GOSeq, by Matthew Young. It is designed to find enriched gene groups in length-biased data, such as RNA-Seq data. It plays a role comparable to that of tools like EASE for microarray data.
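
To make the idea of "enrichment" concrete, here is a minimal sketch of a plain hypergeometric enrichment test in Python. This is not the GOSeq method itself: GOSeq additionally corrects for transcript-length bias, which this simplified test ignores. The function name and the numbers are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(population, category_size, selected, overlap):
    """Probability of seeing at least `overlap` gene-set members when
    `selected` genes are drawn without replacement from `population` genes,
    of which `category_size` belong to the gene set."""
    max_overlap = min(category_size, selected)
    total = comb(population, selected)
    tail = sum(
        comb(category_size, k) * comb(population - category_size, selected - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical example: 20 genes in total, 5 of them in a signature,
# 4 differentially expressed genes, 2 of which are in the signature.
p_value = hypergeom_enrichment_p(population=20, category_size=5,
                                 selected=4, overlap=2)
```

A small p-value would suggest that the signature is over-represented among the selected genes. With length-biased data, long transcripts are more likely to be called differentially expressed, which is why a length-aware method such as GOSeq is preferred over this plain test.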

  1. From the Statistics drop-down menu, choose the item Gene Ontology analysis for RNA-seq.
     Figure: The GOSeq initialization dialog.
  2. Click the Cluster Analysis tab at the top of the Initialization Dialog.
  3. Leave the GOSeq parameters Significance Level: Alpha, Number of Permutations and Number of Genes per Transcript Length Bin set at their default values.
  4. You should have a cluster pre-selected in the cluster selector dialog. If you have more than one cluster available in this dialog, choose the one you want to examine for geneset enrichment.
  5. Choose Download from GeneSigDb from the drop-down menu. Click the Download button.
  6. Check that the Choose Annotation Type drop-down menu is set at GENE_SYMBOL.
  7. Leave the File Location field blank.
  8. Click Ok. GOSeq will run.
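
The Number of Genes per Transcript Length Bin parameter relates to how length bias is estimated. The Python sketch below shows one plausible reading of that idea, not MeV's actual implementation: genes are sorted by transcript length, cut into bins of a fixed size, and the fraction of differentially expressed genes is computed per bin. The function name and the data are hypothetical.

```python
from statistics import median


def de_fraction_per_length_bin(lengths, is_de, genes_per_bin):
    """Sort genes by transcript length, cut them into consecutive bins of
    `genes_per_bin` genes, and return (median_length, fraction_de) per bin."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    bins = []
    for start in range(0, len(order), genes_per_bin):
        idx = order[start:start + genes_per_bin]
        bin_lengths = [lengths[i] for i in idx]
        fraction_de = sum(1 for i in idx if is_de[i]) / len(idx)
        bins.append((median(bin_lengths), fraction_de))
    return bins


# Hypothetical transcript lengths (in bases) and differential expression calls.
lengths = [500, 800, 1200, 2500, 4000, 6000]
is_de = [False, False, False, True, True, True]
length_bins = de_fraction_per_length_bin(lengths, is_de, genes_per_bin=2)
```

If the fraction of differentially expressed genes rises with the median length of the bin, as it does in this made-up data, the list is length-biased and an uncorrected enrichment test would favor gene sets made up of long transcripts.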

 

 

Signature theme results

In the Result Tree, you will see a new result node named GOSEQ.

GOSeq Output: Gene signatures, published in GeneSigDb, with enrichment in the list of selected genes. Future plans include adding links from this display directly to the gene signature web page, where the list of genes in the signature and the source publication can be found.

  1. Open this node and select the node labeled Results Table. This table contains the complete list of gene lists downloaded from the GeneSigDb database, along with a rating for each list indicating whether its contents are enriched in the group of differentially expressed genes used to run GOSeq.
  2. Double-click on the header labeled p-value to sort the list. Those gene lists with low p-values, like Human StemCell_Brendel05_21genes, listed here, are enriched in the set of differentially expressed genes we found in our previous edgeR analysis. You can explore this gene list by going to the GeneSigDb website.
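
If the results table is handled outside MeV, the same sort-and-inspect step can be sketched in Python as follows. The row layout, the signature names, and the 0.05 cutoff are assumptions made for illustration; they are not taken from MeV's output format or its default settings.

```python
def significant_signatures(results, alpha):
    """Return (name, p_value) rows with p_value <= alpha, lowest p-value first."""
    ranked = sorted(results, key=lambda row: row[1])
    return [row for row in ranked if row[1] <= alpha]


# Hypothetical results rows: (signature name, p-value).
rows = [("signature_A", 0.20), ("signature_B", 0.001), ("signature_C", 0.03)]
top = significant_signatures(rows, alpha=0.05)
```

The rows at the top of the sorted list are the signatures most worth looking up on the GeneSigDb website.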

 

From here, you can continue examining gene signatures of interest by searching the GeneSigDb website, or move on to another analysis by selecting it from one of the drop-down menus. For this pilot, most of the standard MeV modules are available to use. A few of them, like the EASE and GSEA modules, require specific annotation files that are currently only available for DNA microarray data. Part of the full RNA-Seq implementation project will be to adapt MeV to support RNA-Seq analysis in all modules. However, that support is not yet available.