A new MeV publication

MeV has been featured in a chapter of the book Biomedical Informatics for Cancer Research, published by Springer Publishing. The chapter is aptly called MeV: MultiExperiment Viewer. In it, we describe the features of MeV, give an overview of how the software is used, and highlight several of MeV's analysis tools.

 

Howe E, Holton K, Nair S, Schlauch D, Sinha R, Quackenbush J: MeV: MultiExperiment Viewer. In Biomedical Informatics for Cancer Research. 2010:267-277.  The book is available at Amazon.

MeV v4.5.1 is released

This bugfix release addresses several issues reported by our users. The MeV team recommends that all MeV users upgrade to this release.

 

Fixes in v4.5.1

  • Data matrices received as a gaggle broadcast with data type “intensity” are now loaded into the correct internal data structure.

  • When data is cleared using File->Clear Data, the Sample and Gene cluster managers are now reset properly so they will reappear on loading of new data.

  • GSEA Experiment viewer for individual gene sets now displays the correct genes.

  • Updated SupportFileDefinition to check for appropriate suffix in filename when attempting to match files in the repository.

  • Missing header for SOTA and SOM modules is fixed.

  • Mislabeling in NMF’s consensus viewer is resolved.

  • Mislabeled group numbers in LIMMA are fixed.

  • Bug where EASE and BN dialogs would clear the list of arrays supported for the default-selected organism has been fixed.

MeV and R

As of v4.5, MeV for Mac OS requires R v2.9, and will not function properly if another version is installed. This means that any MeV for Mac users who have installed R v2.10 will find that MeV is unable to run the LIMMA module, as it depends on R. When version 2.10.1 of R is released (currently scheduled for December 14th by the R project), we will assemble a version of MeV that works with R v2.10. We will also provide an upgrade script that allows previous installations of MeV to work with R v2.10.x.

Release History

MeV v4.8.1

December 16, 2011

Bug Fixes

  • nEASE table viewer link shortcuts work properly.
  • Handles error when saving large heatmaps from Experiment Viewer.
  • Analysis options are customized to include data-specific options.

MeV v4.8

November 18, 2011

New CLValid Module
A new module for cluster validation, CLVALID, has been added to MeV’s existing clustering tools. CLVALID uses the R package "clValid" to compare the relative properties of 10 different clustering methods across several different numbers of clusters. This module aims to help choose the method whose clusters are most compact, well-separated, connected, and stable. It can also optionally use Bioconductor annotation packages to biologically validate the results.

New Annotation Support Files
The MeV team has built a new pipeline for producing the annotation files that are used to support modules like EASE and to display chromosomal location information and GO terms in the various gene table views throughout MeV. The new annotations are collected from Bioconductor v2.6, and are more complete than before. The files also include many new arrays, such as Affymetrix's Exon (ST) arrays.

Complete list of supported arrays


In the future, we will be able to easily add new arrays as the Bioconductor team releases them, and to easily update these annotations when a new Bioconductor version is integrated into MeV. As a rule, MeV will provide annotation from the version of Bioconductor that is supported by MeV's currently-supported version of R. Currently, that version is R v2.11 and Bioconductor v2.6. This coordination is to ensure that annotation used internally by R modules is consistent with any annotation MeV displays.

MeV v4.7.3

July 15, 2011


New Features

* The EASE module has been re-enabled for use with RNA-Seq data.

MeV v4.7.2

July 11, 2011

New Features

* Custom annotation loading for RNA-Seq data.

Bugfixes

* Restored the MeV.exe icon.

* Fixed the broken link to HBGB Genome Browser.

* Fixed a Mac-specific bug in the GOSeq module that prevented the module from running.

* Fixed bug for zero-variance genes using Pearson Correlation.

* Enabled top-panel resizing.


Other Changes

* The GOSeq Module has been moved to the Meta Analysis toolbar.


MeV v4.7.1

May 19, 2011

Bugfixes

* Added a few new validation checks to the RNA-Seq file loader.

* Fixed a bug that showed up if the input RNA-Seq file was incomplete.

MeV v4.7

May 16, 2011

New RNA-Seq Features

MeV is now capable of loading and analyzing RNA-Seq data.

New File Loader
MeV can now load summarized RNA-Seq data from a simple, tab-delimited file format. This format is fully described in the appendix of the MeV user manual. The loader accepts count data, RPKM or FPKM values, or a combination of the two data types.
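For readers unfamiliar with the normalized units mentioned above, here is a minimal sketch of the commonly used RPKM definition (reads per kilobase of transcript per million mapped reads). The function name, gene names and numbers are hypothetical, and this is not MeV's loader code. FPKM follows the same formula with fragments counted instead of reads.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical read counts and gene lengths for one sample.
counts = {"geneA": 500, "geneB": 1500, "geneC": 0}
lengths_bp = {"geneA": 2000, "geneB": 1000, "geneC": 3000}
library_size = 10_000_000  # total mapped reads in the sample (assumed)

rpkm_values = {g: rpkm(counts[g], lengths_bp[g], library_size) for g in counts}
print(rpkm_values)
```

Count-based modules generally expect raw counts, so a normalized value like this is mainly useful for display and comparison across genes of different lengths.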

GOSeq: GO term enrichment detection for RNA-Seq data (Young et al., 2010).
GOSeq is a technique for identifying differentially expressed sets of genes, such as GO terms, while accounting for the biases inherent to sequencing data.

EdgeR: differential expression analysis of digital gene expression data  (Robinson et al., 2010).
EdgeR is a Bioconductor software package for examining differential expression of replicated count data. An over-dispersed Poisson model is used to account for both biological and technical variability. Empirical Bayes methods are used to moderate the degree of over-dispersion across transcripts, improving the reliability of inference. The methodology can be used even with the most minimal levels of replication, provided at least one phenotype or experimental condition is replicated.

DESeq: Digital gene expression analysis based on the negative binomial distribution (Anders and Huber, 2010).
The BioC package DESeq provides a powerful tool to estimate the variance in count data and test for differential expression. It can estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.

DEGseq: An R package to identify differentially expressed genes from RNA-Seq data (L. Wang et al., 2010).
DEGseq identifies differentially expressed genes from RNA-Seq data. RNA sequencing is modeled as a random sampling process, in which each read is sampled independently and uniformly from every possible nucleotide in the sample. Under this assumption, the number of reads coming from a gene (or transcript isoform) follows a binomial distribution, which can be approximated by a Poisson distribution. Based on this statistical model, Fisher’s exact test, a likelihood ratio test and two other methods were proposed to identify differentially expressed genes.
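To make the sampling model above concrete, here is a minimal sketch of a two-sided Fisher's exact test applied to one gene's read counts in two samples. It is a generic textbook implementation, not the code of the R package or of MeV; the function names and the counts are hypothetical.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def log_p(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(col1, row1)
    log_observed = log_p(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(
        exp(log_p(x))
        for x in range(lowest, highest + 1)
        if log_p(x) <= log_observed + 1e-7
    )
    return min(1.0, total)


# Hypothetical gene: 30 of 1000 reads in sample 1, 10 of 1000 reads in sample 2.
# Rows are samples; columns are "reads from this gene" and "all other reads".
p_value = fisher_exact_two_sided(30, 970, 10, 990)
print(p_value)
```

Real libraries contain millions of reads, so the loop over possible tables would be much longer there; the log-space arithmetic is used here only to keep the probabilities numerically stable.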


Other New Features

Expression Graphs
In the Sample Cluster Manager, a new graph view is available, called Expression Graphs. This option allows the creation of boxplots and bar charts of individual genes or groups of genes, compared across sample groups.

Major updates to GSEA user interface
Simpler, easier UI allows more intuitive use of the Gene Set Enrichment Module.  Several calculation improvements and algorithm fixes have been applied to the newest release.

Import File feature added to List Import option in Cluster Manager
Clusters can now be created by loading a file containing a gene list.

New MeV User Manual
We have updated the MeV manual to a web-based format. Now, the help buttons within MeV link directly to a local copy of the user manual. Full information about the linked module is available immediately.

R 2.11
All R-dependent MeV functions call R version 2.11 by default. 

R package auto-download
MeV now automatically downloads R support packages after installation. The packages no longer have to be included in the initial download.

"Set as Data Source" Option Removed
We have removed a feature of the MeV result tree. Previously, the right-click context menu for cluster nodes in the result tree contained an option called "Set as Data Source". Choosing this option would cause the genes in the selected cluster to be treated as the entire MeV source dataset. All subsequent modules and filters run in MeV would be applied only to that subset of the data. We have removed this option because it was redundant and not particularly stable. Users who want to work with only a subset of their data have two options, both of which are more robust and fit better with the MeV data metaphor.
Option 1. Launch as new Viewer:
Create a gene or sample cluster from the result node of interest, by right-clicking on the viewer window and choosing Store Entire Cluster.
Go to the appropriate Cluster Manager (Gene Cluster Manager or Sample Cluster Manager, in the result tree under the node Cluster Manager).
Right-click on the cluster you just created, and choose Open/Launch -> Launch MeV Session. This will create a new Multiple Array Viewer containing only the data from the selected cluster. All analyses executed in this MAV will only apply to the selected data.

Option 2. Select data cluster during module execution
Create a gene or sample cluster from the result node of interest, by right-clicking on the viewer window and choosing Store Entire Cluster.
In the module execution dialog, select that cluster from the cluster selection panel. The module will apply its analysis to only the genes/samples in that cluster.


Bugfixes

* The viewers for single-color Affymetrix data now handle the heatmap display of zero values properly
* GSEA bugfixes
* Missing HCL header bug resolved.
* NMF Plotviewer error fixed.
* The GSEA p-value graph viewer now saves and restores state correctly.
* The Windows version of MeV v4.6.2 shipped without a MeV executable. While the program was still usable via the tmev.bat file, this was inconvenient. The MeV.exe file has been restored.
* Agilent file loader fixed for loading of 1-color data.
* Agilent file loader fixed for loading of multiple samples simultaneously.



MeV v4.6.2 bugfix release

November 23, 2010

  • Hierarchical Clustering header trees now appear when HCL is auto-launched by another module.
  • Improved RHook default/cached property loading, remote access, and exception handling and forwarding.
  • 64 Bit Java bypass added to TMEV.bat launch file.
  • BN progress bar now updates properly when the module is run more than once.
  • Added placeholder file to data/BN_files/kegg directory to ensure the folders always appear regardless of which unzip utility is used.
  • Removed Cluster Analysis option for modules when Clusters have not been created.
  • Several small fixes pertaining to loading and unloading data.
  • Default setting changed in SAM when using R.
  • Error handling in Non-Negative Matrix Factorization improved, Plotviewer label issues resolved.

MeV v4.6.1 bugfix release

August 12, 2010

 

  • Errors fixed in selection of EASE file system.
  • Default distance metric for HCL run after LIMMA is now Pearson Correlation.
  • Loading annotation after expression data now stores organism and array data.
  • Manually loaded annotation files are now correctly flagged.
  • Search function disabled when no data is loaded.
  • MAGE-TAB file loader displays and processes files in preview window correctly.
  • MeV manual link now redirects correctly.
  • Toolbar resets to disabled correctly on clear data command.
  • SAM default settings updated.
  • Improvement to display of MeV banner in progress dialogs.
  • Hierarchical tree view is no longer displayed for all nEASE sub-results.
  • Improvement to "Save EASE table" menu options in EASE and nEASE results.
  • State-saving improvements to EASE module.
  • More compact, simpler nEASE results.
  • Improvements to the Minet documentation.
  • UI tweaks in BN dialog.

MeV v4.6 release

July 2, 2010


MeV v4.6 includes a large number of new features, including several new modules and large improvements to existing favorites.

Major additions


Attract Module

The Attract algorithm identifies the core gene expression modules that are differentially activated between cell types or different sample groups, and elucidates the set of expression profiles which describe the range of transcriptional behavior within each module.


Global Ancova Module

A technique for identifying differentially expressed gene sets, based on the calculation of an F-test between groups of samples. Analyses are typically run in a two-class format but may also be applied to additional groups. Global Ancova fits linear models to the data and compares them using the extra sum of squares principle. The result table includes p-values, permutation p-values and asymptotic p-values.
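As an illustration of the extra sum of squares principle described above, the sketch below compares a full model (one mean per sample group for each gene) with a reduced model (one overall mean per gene) and sums the residuals over all genes in the set. This is a simplified sketch under assumed degrees of freedom, not the code of the Global Ancova package or of MeV; the function name and data are hypothetical.

```python
def gene_set_f_statistic(expression, groups):
    """F-like statistic for a gene set.

    expression: list of rows, one per gene, each with one value per sample.
    groups: group label for each sample, in the same column order.
    """
    labels = sorted(set(groups))
    n_samples, n_groups, n_genes = len(groups), len(labels), len(expression)
    rss_full = 0.0     # residuals around per-group means
    rss_reduced = 0.0  # residuals around the overall mean
    for row in expression:
        grand_mean = sum(row) / n_samples
        rss_reduced += sum((v - grand_mean) ** 2 for v in row)
        for label in labels:
            values = [v for v, g in zip(row, groups) if g == label]
            group_mean = sum(values) / len(values)
            rss_full += sum((v - group_mean) ** 2 for v in values)
    df_extra = n_genes * (n_groups - 1)             # assumed degrees of freedom
    df_residual = n_genes * (n_samples - n_groups)  # assumed degrees of freedom
    # Raises ZeroDivisionError if the full model fits the data perfectly.
    return ((rss_reduced - rss_full) / df_extra) / (rss_full / df_residual)


# Hypothetical two-gene set measured in two groups of three samples.
groups = ["A", "A", "A", "B", "B", "B"]
gene_set = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 2.5, 3.0, 2.4, 2.9, 3.1],
]
f_stat = gene_set_f_statistic(gene_set, groups)
print(f_stat)
```

A permutation p-value could be obtained by shuffling the group labels many times and counting how often the statistic is at least as large as the observed one.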


Minet Module

For a given dataset, minet infers the network in two steps. First, the mutual information between all pairs of variables in the dataset is computed according to the estimator argument. Then the algorithm given by the method argument uses the estimated mutual information to build the network.
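The two steps can be sketched as follows. The first step uses a simple empirical estimator on equal-width bins; the second step is a naive threshold, used here only as a stand-in for the inference algorithms that the minet package actually provides. All names, the bin count and the threshold are assumptions made for this illustration.

```python
from collections import Counter
from itertools import combinations
from math import log


def discretize(values, bins=3):
    """Assign each value to one of `bins` equal-width bins."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [0] * len(values)
    width = (highest - lowest) / bins
    return [min(int((v - lowest) / width), bins - 1) for v in values]


def mutual_information(x, y):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(x)
    count_x, count_y, count_xy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log(c * n / (count_x[a] * count_y[b]))
        for (a, b), c in count_xy.items()
    )


# Hypothetical expression profiles across six samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    "geneC": [2.0, 9.0, 1.0, 8.0, 3.0, 7.0],
}

# Step 1: mutual information between all pairs of variables.
binned = {name: discretize(values) for name, values in profiles.items()}
mi = {
    (a, b): mutual_information(binned[a], binned[b])
    for a, b in combinations(sorted(binned), 2)
}

# Step 2 (stand-in): keep pairs whose mutual information exceeds a threshold.
edges = [pair for pair, value in mi.items() if value > 0.5]
print(edges)
```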


SURV Module

The Survival (SURV) module contains two functions for the analysis of censored survival data. The first is a basic comparison of the survival curves of two groups of samples. The second is the creation of a Cox proportional hazards model based on the loaded gene expression data, using survival time as the reporting value.
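For context on what a survival curve for censored data looks like, the sketch below computes Kaplan-Meier estimates for two groups. The text does not say which estimator or test the module uses, so treat this as a generic illustration; the function name and the data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: follow-up time for each sample.
    events: True if the event was observed, False if the sample was censored.
    Returns a list of (time, survival_probability) at each observed event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = censored = 0
        # Collect every sample that shares this time point.
        while i < len(data) and data[i][0] == current_time:
            if data[i][1]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= deaths + censored
    return curve


# Hypothetical follow-up times (months) for two sample groups.
treated_curve = kaplan_meier([5, 8, 8, 12, 15], [True, True, False, True, False])
control_curve = kaplan_meier([3, 4, 6, 9, 10], [True, True, True, False, True])
print(treated_curve)
print(control_curve)
```

Comparing the two curves formally would require an additional test, such as a log-rank test, which is not shown here.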


EASE UI Rewrite

The EASE UI is simpler and easier to use now.


Updates to the BN module

Network Seed allows the user to provide a file representing a network. The network seed can be used in one of three ways:

1.    Using the user network seed alone, bypassing literature-based network seeding altogether.

2.    Using the user network seed along with the literature mining seed.

3.    Using the user-provided network as a complete network. The network structure is not learned; only the Conditional Probability Tables (CPTs) associated with the network are learned for downstream exploration.


A node named “CLASS” appears in the network and captures the effect of sample groups on the network. Once the network is displayed, the “CLASS” node behaves like, and can be treated as, any other node in the network.


Updates to the GSEA module

Two new viewers are now provided: a p-value graph viewer and a gene set membership plot. Gene sets can now be automatically downloaded from GeneSigDB and MSigDB.


Updates to the SAM module

A new addition to the SAM module integrates RHook to make a newer version of SAM available to users.  MeV’s SAM now makes use of serially correlated time-course data in the exploration of statistically significant gene expression.


Hierarchical Clustering Trees

MeV now displays hierarchical trees with meaningful and proportional node heights along with an optional scale tailored to the chosen distance metric used in constructing the tree.


Other Changes

Rama significance testing for spotted array data has been disabled, along with the Bridge module. These functions never worked properly and have been unsupported for some time. They are still available in older versions of MeV. The most recent version of MeV that contains these features is MeV v4.5.1.


We have also retired the Single Array Viewer.

Minor Additions

  • FDR calculation is displayed in the TTEST module.
  • Annotation can now be auto-loaded by MeV after expression data has already been loaded.
  • Agilent file loader has been updated to work with the latest file formats.
  • Pearson correlation coefficients are now the default distance metric for most analysis modules.

 

Known Issues

MeV's R-driven modules (LIMMA, Attract, Surv, SAM, etc.) will not be accessible when MeV is launched via Java Webstart. This is due to difficulty with including the required R libraries with the Webstart download. Until we identify a solution to this problem, the workaround is to download MeV and run it locally.

 

MeV v4.5.1 bugfix release

December 17, 2009

Bugfixes

  • Data matrices received as a gaggle broadcast with data type “intensity” are now loaded into the correct internal data structure.

  • When data is cleared using File->Clear Data, the Sample and Gene cluster managers are now reset properly so they will reappear on loading of new data.

  • GSEA Experiment viewer for individual gene sets now displays the correct genes.

  • Updated SupportFileDefinition to check for appropriate suffix in filename when attempting to match files in the repository.

  • Missing header for SOTA and SOM modules is fixed.

  • Mislabeling in NMF’s consensus viewer is resolved.

  • Mislabeled group numbers in LIMMA are fixed.

  • Bug where EASE and BN dialogs would clear the list of arrays supported for the default-selected organism has been fixed.

MeV v4.5 release

November 13, 2009

MeV is now provided under the conditions of the Artistic License v2.0. It was previously released under the Artistic License v1.0.

Major additions

R project

MeV has integrated an R (CRAN) hook whereby R functions and libraries can be called from within the Java instance using shared libraries. To make this happen, the MeV team developed a library around the JRI (rForge) API, which uses JNI. In most cases the user will not have to set up or configure anything to run R-dependent modules, and the integration should be completely transparent (see the notes below for exceptions). Please also note that MeV does not provide a command line interface for running R commands. This integration was done to leverage the R environment, where many algorithms are readily available and do not need to be re-implemented in Java. MeV will use R internally as appropriate, and the user should not expect to see any change in MeV's behavior.

**Notes**

  1. On Windows and Mac OS X (10.6), where the default JVM is 64 bit, the user will see a warning asking them to change to a 32 bit JVM. This is required because the API is not yet ready for 64 bit environments; we are working on a solution. Once the default JVM is set to 32 bit, the R-dependent modules will run. To help users set up the 32 bit JVM, we will provide help and assistance with the following:
    a. On Windows: installing 32 bit Java if it is not already installed, and modifying the launch script TMEV.bat to point to it.
    b. On Mac OS X: setting up the 32 bit JVM as the default. The default in OS X 10.6 (Snow Leopard) is 64 bit.
  2. On Mac we expect the user to have R 2.9.x universal binary installed in the Application Framework. We do not expect such for Windows and Linux systems.
  3. Please use MeV SourceForge page for submitting queries, questions and issues. We have set up a page for this particular JVM issue named R MeV Integration, JVM issues.

LIMMA Module

Linear Models for Microarray Data, a statistical framework for the analysis of gene expression microarray data, using linear models for analyzing designed experiments and the assessment of differential expression, was implemented as a new module into MeV.

This module was implemented using the R framework described above and without writing a single line of Java code for the numerical analysis.

NMF Module

Non-negative Matrix Factorization, a technique that decomposes a large data matrix by parts into a small number of relevant metagenes, has been implemented as a new MeV module. NMF’s ability to identify expression patterns and make class discoveries has been shown to be more robust than that of popular clustering techniques such as HCL and SOM.

GSEA UI Rewrite

The GSEA user interface has undergone a major revamp in this release. The slick new UI is more intuitive and user friendly. Two additional viewers have also been added: “Leading Edge Graph”, which displays the subset of genes contributing most to the gene-set-level metric, and “Test Statistic Graph”, which shows how genes within a gene set contribute to the overall gene-set-level metric.

Nested EASE

Nested EASE (nEASE) is an extension of the EASE module. The nEASE algorithm includes a second, sub-level, iterative Fisher’s Exact Test on significantly enriched GO terms identified in a first-level EASE analysis. This sub-classification approach provides increased sensitivity for detecting enriched GO terms and thus affords a deeper understanding of possible mechanisms underlying a given condition under study.

A tutorial describing how to use the new nested EASE feature is available on the MeV website, mev.tm4.org.

Venn Diagrams

A new addition to the Cluster Manager allows users to view relationships between two or three clusters of samples or genes in the form of a Venn Diagram. For two-cluster diagrams, this interactive addition also displays a p-value representing the likelihood of the given overlap occurring in a random sampling of identical size.
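One common way to obtain such a p-value is the upper tail of the hypergeometric distribution. The sketch below assumes that approach; the text does not state the exact calculation MeV uses, and the function name and numbers are hypothetical.

```python
from math import comb


def overlap_p_value(universe, size_a, size_b, overlap):
    """Probability of an overlap at least this large between cluster A and a
    random cluster of size_b drawn from `universe` items."""
    total = comb(universe, size_b)
    largest_possible = min(size_a, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, largest_possible + 1)
    )
    return tail / total


# Hypothetical example: 1000 genes loaded, clusters of 50 and 40 genes
# that share 10 genes.
p_value = overlap_p_value(universe=1000, size_a=50, size_b=40, overlap=10)
print(p_value)
```

The exact integer arithmetic in `math.comb` avoids rounding problems, at the cost of speed for very large datasets.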

Minor Additions

  • EASE analysis results now save much more efficiently.
  • Most File Choosers now default to the last-used data directory.
  • The Gaggle interface now supports metadata as part of its broadcast. See the Gaggle page on the MeV website for details.
  • Chromosomal location annotation will now be rendered as links to the UCSC Genome browser in MeV’s standard table views.
  • Expression data can be viewed in the Institute for Systems Biology’s Genome Browser, through a new right-click menu option in many result viewers. This feature requires that chromosomal location information has been loaded.
  • Enabled multiple selecting for automatic cluster creation.

Bugfixes

  • USC no longer throws an error when a result set of size zero is returned.
  • PCA, COA and TRN modules can now save and load analysis results regardless of whether Java3D is installed.
  • The title bar in error message dialogs now reads the appropriate “Algorithm Exception” instead of “Out of Memory Exception”.
  • A bug in EASE prevented the use of the Trim Options checkbox. This has been fixed.
  • The Save Analysis progress dialog now responds to the Cancel and close window buttons.
  • The state-saving functions of the SAM and TTEST modules have been re-written for greater speed and stability.
  • Cluster Manager heatmaps and expression viewers no longer fail when clusters are created on data that has been filtered.
  • Rank Products no longer misreports p-values for datasets larger than 40 samples.
  • Inverted “Show Color” checkbox in Cluster Manager is fixed.
  • Fixed null pointer exception in Original Data viewer when launching new session.
  • PPI seeding in the BN module was not reading the PPI file correctly, leading to missed interactions. This has now been fixed.
  • The BN module would not run on Windows if the install path contained spaces (e.g. ‘Documents and Settings’, ‘Program Files’ etc.). This problem has been fixed.
  • GEO GDS and Series Matrix file loaders have Affymetrix as default selection
  • Null pointer exception no longer thrown on clicking heat map cells after loading GEO files
  • MeV file loader loads data even if annotations are missing

Known Issues

• The LIMMA module will not be accessible when MeV is launched via Java Webstart. This is due to difficulty with including the required R libraries with the Webstart download. We expect to address this problem for the next major MeV release in May. Until then, the workaround is to download MeV and run it from the local desktop.

MeV v4.4.1 release

June 30, 2009

  • State-saving optimizations in the TTEST and SAM modules.
  • Fixed problem in Rank Products analysis that prevented two-class unpaired from running on more than 40 samples.
  • Fixed GUI issues with Cluster Manager.
  • Changed display of node heights for Hierarchical Clustering.
  • Fixed cluster storing dialog box bug: the dialog box could not be cancelled.
  • When launched via Java Webstart, MeV now loads data with correct row indexing
  • Table links to external GO databases now support multiple terms.
  • Chromosomal location end-coordinates are now loaded fully, rather than with a truncated last digit
  • Annotation fields are now re-loaded in the same order they were saved in.
  • Hypergeometric distribution is now calculated correctly.

MeV v4.5 is released

The MeV development team is proud to announce that MeV v4.5 is now available for download. This release includes many new features and improvements to several existing systems, including state-saving and the annotation model.

 

Full Release Notes

 

R project: an under-the-hood improvement that allows MeV to use pre-built R libraries

New module: Linear Models for Microarray Data (LIMMA)

New module: Non-Negative Matrix Factorization (NMF)

A re-write of the GSEA module

A new feature in the EASE module, Nested EASE

Venn diagram displays of gene cluster membership

Many bugfixes and minor enhancements

Features

Hierarchical Clustering display of a K-means generated cluster. Samples are color-coded by disease state and cancer subtype.

MeV's strength lies in its easy user-interface coupled with a powerful suite of statistical tools.

  • Load a variety of data types, such as expression, SNP, exon, PPI and copy number data
  • Test for differential expression, template matching, and functional enrichment of groups of features.
  • Group and label features with color-coded tags and track those features through different analyses.
  • Automatically download annotation data for arrays made by many manufacturers, such as Affymetrix, Illumina and Agilent.

Gaggle Metadata Settings Used by MeV

MeV is capable of sending and receiving expression and annotation data within the Gaggle framework of bioinformatics applications. As of v4.5, MeV will broadcast certain pre-defined metadata tags in addition to the matrix, namelist and network data. It will also look for these tags when receiving broadcasts.

Example Data Files

MeV is capable of loading genomic data from many different types of files. Loaders for Affymetrix, Agilent, Illumina, GenePix and other formats are available. MeV also supports several platform-independent file formats such as TDMS, MAGE-TAB and GEO.

The Tab-Delimited Multiple Sample (TDMS) file loader.

Tab-delimited Multiple Sample files

Download an example file

These files are loaded with the TDMS file loader. This file format is a flexible, vendor-independent format where each row contains a record for one gene, with samples arranged by column. Any number of gene or sample annotation rows are allowed. MeV will attempt to "guess" which rows and columns contain annotation and will color-code them accordingly. Please verify that MeV has guessed well. If it has not, click to select the upper-leftmost cell that contains expression data. MeV will re-color the cells to reflect your selection. Gene annotations are colored blue/purple, sample annotations blue-green, and sample annotation headers yellow. Expression values are striped in blue and white.
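The role of the upper-leftmost expression cell can be sketched in a few lines: everything above it is treated as sample annotation, everything to its left as gene annotation, and the remainder as expression values. The file content, function name and the handling of empty cells below are assumptions made for this illustration, not MeV's loader behavior.

```python
import csv
import io


def parse_tdms(text, first_data_row, first_data_col):
    """Split a tab-delimited table at the upper-leftmost expression cell.

    first_data_row and first_data_col are zero-based indices of that cell.
    Empty expression cells become None (an assumption of this sketch).
    """
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    sample_annotation = [r[first_data_col:] for r in rows[:first_data_row]]
    gene_annotation = [r[:first_data_col] for r in rows[first_data_row:]]
    expression = [
        [float(cell) if cell.strip() else None for cell in r[first_data_col:]]
        for r in rows[first_data_row:]
    ]
    return sample_annotation, gene_annotation, expression


# Hypothetical file: two gene annotation columns, two sample annotation rows.
example = (
    "GeneID\tSymbol\tS1\tS2\tS3\n"
    "\tGroup\tctrl\tctrl\ttreated\n"
    "g1\tTP53\t1.5\t2.0\t-0.5\n"
    "g2\tBRCA1\t0.0\t\t3.25\n"
)

sample_annotation, gene_annotation, expression = parse_tdms(example, 2, 2)
print(sample_annotation)
print(gene_annotation)
print(expression)
```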


Gene Set Enrichment Analysis (GSEA)

What is GSEA?

Gene Set Enrichment Analysis (GSEA) is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. phenotypes).
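As a generic illustration of the idea, the sketch below computes a simple unweighted running-sum enrichment score over a ranked gene list. It is not MeV's GSEA implementation, which this page does not describe; the function name and gene names are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum score: step up at each gene in the set,
    step down otherwise, and report the largest deviation from zero."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_misses = len(ranked_genes) - n_hits
    if n_hits == 0 or n_misses == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1 / n_hits if is_hit else -1 / n_misses
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical genes ranked from most up-regulated to most down-regulated.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
score_top = enrichment_score(ranked, {"g1", "g2"})     # set sits at the top
score_bottom = enrichment_score(ranked, {"g5", "g6"})  # set sits at the bottom
print(score_top, score_bottom)
```

Significance is usually assessed by recomputing the score after permuting sample labels or gene labels, which is omitted here.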

MeV and Gaggle

MeV has supported the Gaggle framework since September 2008 (MeV v4.2). Gaggle is a powerful communications system that connects supported programs (geese), allowing them to seamlessly transmit data to one another without the need for intermediate flat files. MeV can use the Gaggle to send and receive data with other systems biology platforms, such as R, Cytoscape, and various web databases.

 

 
