December 16, 2011
- nEASE table viewer link shortcuts work properly.
- Handles error when saving large heatmaps from Experiment Viewer.
- Analysis options are customized to include data-specific options.
November 18, 2011
New CLValid Module
A new module for cluster validation, CLVALID, has been added to MeV’s
existing clustering tools. CLVALID uses the R package "clValid" to
compare the relative properties of 10 different clustering methods
across several different numbers of clusters. This module aims to help
choose a method that is most compact, well-separated, connected, and
stable. It also optionally makes use of Bioconductor annotation packages
to biologically validate the results.
New Annotation Support Files
The MeV team has built a new pipeline for producing the annotation files
that are used to support modules like EASE and to display chromosomal
location information and GO terms in the various gene table views
throughout MeV. The new annotations are collected from Bioconductor
v2.6, and are more complete than before. The files also include many new
arrays, such as Affymetrix's Exon (ST) arrays.
Complete list of supported arrays
In the future, we will be able to easily add new arrays as the
Bioconductor team releases them, and to easily update these annotations
when a new Bioconductor version is integrated into MeV. As a rule, MeV
will provide annotation from the version of Bioconductor that is
supported by MeV's currently-supported version of R. Currently, that
version is R v2.11 and Bioconductor v2.6. This coordination is to ensure
that annotation used internally by R modules is consistent with any
annotation MeV displays.
July 15, 2011
* The EASE module has been re-enabled for use with RNA-Seq data.
July 11, 2011
* Custom annotation loading for RNA-Seq data.
* Restored the MeV.exe icon.
* Fixed the broken link to HBGB Genome Browser.
* Fixed a Mac-specific bug in the GOSeq module that prevented the module from running.
* Fixed bug for zero-variance genes using Pearson Correlation.
* Enabled top-panel resizing.
* The GOSeq Module has been moved to the Meta Analysis toolbar.
May 19, 2011
* Added a few new validation checks to the RNA-Seq file loader.
* Fixed a bug that showed up if the input RNA-Seq file was incomplete.
May 16, 2011
New RNA-Seq Features
MeV is now capable of loading and analyzing RNA-Seq data.
New File Loader
MeV can now load summarized RNA-Seq data from a simple, tab-delimited file format. This format is fully described in the appendix of the MeV user manual. The loader can load count data, RPKM or FPKM, or combinations of the two data types.
GOSeq: GO term enrichment detection for RNA-Seq data (Young et al., 2010).
GOSeq is a technique for identifying differentially expressed sets of genes, such as GO terms, while accounting for the biases inherent to sequencing data.
EdgeR: differential expression analysis of digital gene expression data (Robinson et al., 2010).
EdgeR is a Bioconductor software package for examining differential expression of replicated count data. An over-dispersed Poisson model is used to account for both biological and technical variability. Empirical Bayes methods are used to moderate the degree of over-dispersion across transcripts, improving the reliability of inference. The methodology can be used even with the most minimal levels of replication, provided at least one phenotype or experimental condition is replicated.
DESeq: Digital gene expression analysis based on the negative binomial distribution (Anders and Huber, 2010).
The BioC package DESeq provides a powerful tool to estimate the variance in count data and test for differential expression. It can estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.
DEGseq: An R package to identify differentially expressed genes from RNA-Seq data (L. Wang et al., 2010).
This module identifies differentially expressed genes from RNA-Seq data. RNA sequencing is modeled as a random sampling process, in which each read is sampled independently and uniformly from every possible nucleotide in the sample. Under this assumption, the number of reads coming from a gene (or transcript isoform) follows a binomial distribution, which can be approximated by a Poisson distribution. Based on this statistical model, Fisher’s exact test, a likelihood ratio test and two other methods were proposed to identify differentially expressed genes.
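As a rough illustration of the Fisher's exact test mentioned above, the following Python sketch compares the read counts of one gene between two samples. It is not the R package's implementation and not MeV code; the function names and the 2x2 table layout (reads on the gene versus all other reads, per sample) are assumptions made for this example.

```python
from math import exp, lgamma


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def hypergeom_pmf(a, row1, col1, total):
    """P(top-left cell == a) for a 2x2 table with fixed margins."""
    return exp(log_choose(col1, a)
               + log_choose(total - col1, row1 - a)
               - log_choose(total, row1))


def fisher_exact_two_sided(gene_a, total_a, gene_b, total_b):
    """Two-sided Fisher's exact test (hypothetical helper).

    gene_a / gene_b: reads mapped to the gene in sample A / B.
    total_a / total_b: all mapped reads in sample A / B (gene reads included).
    """
    row1 = total_a                 # reads in sample A
    col1 = gene_a + gene_b         # reads on the gene, both samples
    total = total_a + total_b
    lo = max(0, col1 - total_b)    # smallest feasible top-left cell
    hi = min(col1, row1)           # largest feasible top-left cell
    p_obs = hypergeom_pmf(gene_a, row1, col1, total)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    # The small relative tolerance guards against floating-point ties.
    p = sum(pk for pk in (hypergeom_pmf(a, row1, col1, total)
                          for a in range(lo, hi + 1))
            if pk <= p_obs * (1 + 1e-7))
    return min(1.0, p)
```

For example, fisher_exact_two_sided(3, 4, 1, 4) evaluates to 34/70, roughly 0.486. Real RNA-Seq totals are in the millions, which is why the sketch works with log-gamma rather than explicit factorials.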
Other New Features
In the Sample Cluster Manager, a new graph view is available, called Expression Graphs. This option allows the creation of boxplots and bar charts of individual genes or groups of genes, compared across sample groups.
Major updates to GSEA user interface
Simpler, easier UI allows more intuitive use of the Gene Set Enrichment Module. Several calculation improvements and algorithm fixes have been applied to the newest release.
Import File feature added to List Import option in Cluster Manager
Clusters can now be created by loading a file containing a gene list.
New MeV User Manual
We have updated the MeV manual to a web-based format. Now, the help buttons within MeV link directly to a local copy of the user manual. Full information about the linked module is available immediately.
All R-dependent MeV functions call R version 2.11 by default.
R package auto-download
MeV now automatically downloads R support packages after installation. The packages no longer have to be included in the initial download.
"Set as Data Source" Option Removed
We have removed a feature of the MeV result tree. Previously, the right-click context menu for cluster nodes in the result tree contained an option called "Set as Data Source". Choosing this option would cause the genes in the selected cluster to be treated as the entire MeV source dataset. All subsequent modules and filters run in MeV would be applied only to that subset of the data. We have removed this option because it was redundant and not particularly stable. Users who want to work with only a subset of their data have two options, both of which are more robust and fit better with the MeV data metaphor.
Option 1. Launch as new Viewer:
Create a gene or sample cluster from the result node of interest, by right-clicking on the viewer window and choosing Store Entire Cluster.
Go to the appropriate Cluster Manager (Gene Cluster Manager or Sample Cluster Manager, in the result tree under the node Cluster Manager).
Right-click on the cluster you just created, and choose Open/Launch -> Launch MeV Session. This will create a new Multiple Array Viewer containing only the data from the selected cluster. All analyses executed in this MAV will only apply to the selected data.
Option 2. Select data cluster during module execution
Create a gene or sample cluster from the result node of interest, by right-clicking on the viewer window and choosing Store Entire Cluster.
In the module execution dialog, select that cluster from the cluster selection panel. The module will apply its analysis to only the genes/samples in that cluster.
* The viewers for single-color Affymetrix data now handle the heatmap display of zero values properly
* GSEA bugfixes
* Missing HCL header bug resolved.
* NMF Plotviewer error fixed.
* The GSEA p-value graph viewer now saves and restores state correctly.
* The Windows version of MeV v4.6.2 shipped without a MeV executable. While the program was still usable with the tmev.bat file, it was annoying. The MeV.exe file has been restored.
* Agilent file loader fixed for loading of 1-color data.
* Agilent file loader fixed for loading of multiple samples simultaneously.
MeV v4.6.2 bugfix release
November 23, 2010
- Hierarchical Clustering header trees now appear when HCL is auto-launched by another module.
- RHook: default/cached property loading, remote access, and other exception handling and forwarding.
- 64 Bit Java bypass added to TMEV.bat launch file.
- BN progress bar now updates properly when the module is run more than once.
- Added placeholder file to data/BN_files/kegg directory to ensure the folders always appear regardless of which unzip utility is used.
- Removed Cluster Analysis option for modules when Clusters have not been created.
- Several small fixes pertaining to loading and unloading data.
- Default setting changed in SAM when using R.
- Error handling in Non-Negative Matrix Factorization improved, Plotviewer label issues resolved.
MeV v4.6.1 bugfix release
August 12, 2010
- Errors fixed in selection of EASE file system.
- Default distance metric for HCL run after LIMMA is now Pearson Correlation.
- Loading annotation after expression data now stores organism and array data.
- Manually loaded annotation files are now correctly flagged.
- Search function disabled when no data is loaded.
- MAGE-TAB file loader displays and processes files in preview window correctly.
- MeV manual link now redirects correctly.
- Toolbar resets to disabled correctly on clear data command.
- SAM default settings updated.
- Improvement to display of MeV banner in progress dialogs.
- Hierarchical tree view is no longer displayed for all nEASE sub-results.
- Improvement to "Save EASE table" menu options in EASE and nEASE results.
- State-saving improvements to EASE module.
- More compact, simpler nEASE results.
- Improvements to the Minet documentation.
- UI tweaks in BN dialog.
MeV v4.6 release
July 2, 2010
MeV v4.6 includes a large number of new features, including several new modules and large improvements to existing favorites.
The Attract algorithm identifies the core gene expression modules that are differentially activated between cell types or different sample groups, and elucidates the set of expression profiles which describe the range of transcriptional behavior within each module.
Global Ancova Module
A technique for identifying differentially expressed gene sets based on the calculation of an F-test between groups of samples. Analyses are typically run in a two-class format but may also be applied to additional groups. Global Ancova fits linear models to the data and compares them using the extra sum of squares principle. The result table includes p-values, permutation p-values and asymptotic p-values.
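The extra sum of squares principle can be sketched in a few lines of Python. This is a simplified illustration, not the Global Ancova code: it assumes the reduced model is a single mean per gene, the full model is one mean per sample group, and the residual sums of squares are pooled over all genes in the set. The function name and data layout are hypothetical, and the p-value step (permutation or asymptotic) is omitted.

```python
def gene_set_f_statistic(expr, groups):
    """Extra-sum-of-squares F statistic for a gene set (hypothetical helper).

    expr:   list of genes, each a list of expression values (one per sample).
    groups: list of group labels, one per sample, in the same order.
    """
    n = len(groups)
    labels = sorted(set(groups))
    q_full, q_reduced = len(labels), 1   # parameters per gene in each model
    rss_reduced = 0.0
    rss_full = 0.0
    for row in expr:
        # Reduced model: every sample shares the gene's overall mean.
        grand_mean = sum(row) / n
        rss_reduced += sum((x - grand_mean) ** 2 for x in row)
        # Full model: each sample group has its own mean.
        for label in labels:
            values = [x for x, g in zip(row, groups) if g == label]
            group_mean = sum(values) / len(values)
            rss_full += sum((x - group_mean) ** 2 for x in values)
    extra = (rss_reduced - rss_full) / (q_full - q_reduced)
    residual = rss_full / (n - q_full)
    return extra / residual
```

With a single gene and two groups, this statistic equals the square of the pooled-variance two-sample t statistic, which is a convenient sanity check.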
For a given dataset, minet infers the network in two steps. First, the mutual information between all pairs of variables in the dataset is computed according to the estimator argument. Then the algorithm given by the method argument uses the estimated mutual information to build the network.
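The two steps can be illustrated with a small Python sketch. It is not the minet implementation: equal-width binning with an empirical mutual information estimate, followed by a plain threshold on the mutual information matrix, are assumptions standing in for whatever estimator and network-building method are actually selected. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def discretize(values, bins=3):
    """Map continuous values to equal-width bin indices 0..bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0   # avoid division by zero for constant rows
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def infer_network(data, threshold, bins=3):
    """data: dict mapping variable name -> list of values across samples.

    Step 1: mutual information for every pair of variables.
    Step 2: keep the pairs whose mutual information reaches the threshold.
    """
    disc = {name: discretize(vals, bins) for name, vals in data.items()}
    mim = {(a, b): mutual_information(disc[a], disc[b])
           for a, b in combinations(sorted(disc), 2)}
    edges = [pair for pair, mi in mim.items() if mi >= threshold]
    return mim, edges
```

Thresholding is the simplest possible second step; it is used here only to show how the mutual information matrix feeds the network construction.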
The Survival (SURV) module contains two functions for the analysis of censored survival data. The first is a basic comparison of the survival curves of two groups of samples. The second feature of the module is the creation of a Cox proportional hazards model based on the loaded gene expression data, using survival time as the reporting value.
EASE UI Rewrite
The EASE UI is simpler and easier to use now.
Updates to the BN module
Network Seed allows the user to provide a file representing a network. Network Seed can be used in one of three ways:
1. Using the user network seed alone and bypassing literature based network seeding altogether.
2. Using the user network seed along with Literature mining seed.
3. The user-provided network is used as a complete network and the network structure is not learned; only the Conditional Probability Tables (CPTs) associated with the network are learned for downstream exploration.
A node by the name of “CLASS” shows up in the network and captures the effect of sample groups on the network. Once the network is displayed, the “CLASS” node behaves like, and can be treated as, any other node in the network.
Updates to the GSEA module
Two new viewers are now provided: a p-value graph viewer and a gene set membership plot. Gene sets can now be automatically downloaded from GeneSigDB and MSigDB.
Updates to the SAM module
A new addition to the SAM module integrates RHook to make a newer version of SAM available to users. MeV’s SAM now makes use of serially correlated time-course data in the exploration of statistically significant gene expression.
Hierarchical Clustering Trees
MeV now displays hierarchical trees with meaningful and proportional node heights along with an optional scale tailored to the chosen distance metric used in constructing the tree.
Rama significance testing for spotted array data has been disabled, along with the Bridge module. These functions never worked properly and have been unsupported for some time. They are still available in older versions of MeV. The most recent version of MeV that contains these features is MeV v4.5.1.
We have also retired the Single Array Viewer.
- FDR calculation is displayed in the TTEST module.
- Annotation can now be auto-loaded by MeV after expression data has already been loaded.
- Agilent file loader has been updated to work with the latest file formats.
- Pearson correlation coefficients are now the default distance metric for most analysis modules.
MeV's R-driven modules (LIMMA, Attract, Surv, SAM, etc) will not be accessible when MeV is launched via Java Webstart. This is due to difficulty with including the required R libraries with the Webstart download. Until we identify a solution to this problem, the workaround is to download MeV and run it locally.
MeV v4.5.1 bugfix release
December 17, 2009
Data matrices received as a gaggle broadcast with data type “intensity” are now loaded into the correct internal data structure.
When data is cleared using File->Clear Data, the Sample and Gene cluster managers are now reset properly so they will reappear on loading of new data.
GSEA Experiment viewer for individual gene sets now displays the correct genes.
Updated SupportFileDefinition to check for appropriate suffix in filename when attempting to match files in the repository.
Missing header for SOTA and SOM modules is fixed.
Mislabeling in NMF’s consensus viewer is resolved.
Mislabeled group numbers in LIMMA are fixed.
Bug where EASE and BN dialogs would clear the list of arrays supported for the default-selected organism has been fixed.
MeV v4.5 release
November 13, 2009
MeV is now provided under the conditions of the Artistic License v2.0. It was previously released under the Artistic License v1.0.
MeV has integrated an R (CRAN) hook whereby R functions and libraries can be called within the Java instance using shared libraries. The MeV team developed a library around the JRI (rForge) API, which uses JNI, to make this happen. In most cases the user does not have to set up or configure anything to run R-dependent modules; the integration should be completely transparent. Please also note that MeV does not provide a command line interface for running R commands. This integration was done to leverage the R environment, where many algorithms are readily available and do not need to be re-implemented in Java. MeV uses R internally as appropriate, and the user should not expect to see any change in MeV's behavior.
- On Windows and Mac OS X (10.6), where the default JVM is 64 bit, the user will be shown a warning to change to a 32 bit JVM. This is required because the API is not yet ready for a 64 bit environment; we are working on a solution. Once the default JVM is set to 32 bit, the R-dependent modules will run. To help users set up the 32 bit JVM, we will provide help and assistance with the following:
a. On Windows: installing 32 bit Java, if not already installed, and modifying the launch script TMEV.sh to point to it.
b. On Mac OS X: setting up the 32 bit JVM as the default. The default in OS X 10.6 (Snow Leopard) is 64 bit.
- On Mac we expect the user to have R 2.9.x universal binary installed in the Application Framework. We do not expect such for Windows and Linux systems.
- Please use MeV SourceForge page for submitting queries, questions and issues. We have set up a page for this particular JVM issue named R MeV Integration, JVM issues.
Linear Models for Microarray Data, a statistical framework for the analysis of gene expression microarray data, using linear models for analyzing designed experiments and the assessment of differential expression, was implemented as a new module into MeV.
This module was implemented using the R framework described above and without writing a single line of Java code for the numerical analysis.
Non-negative Matrix Factorization, a technique which makes use of an algorithm based on decomposition by parts of an extensive data matrix into a small number of relevant metagenes, has been implemented as a new module in MeV. NMF’s ability to identify expression patterns and make class discoveries has been shown to be more robust than popular clustering techniques such as HCL and SOM.
GSEA UI Rewrite
The GSEA user interface has undergone a major revamp in this release. The slick new UI is more intuitive and user friendly. Two additional viewers have also been added: “Leading Edge Graph”, displaying the subset of genes contributing most to the gene-set-level metric, and “Test Statistic Graph”, showing how genes within a gene set contribute to the overall gene-set-level metric.
Nested EASE (nEASE) is an extension of the EASE module. The nEASE algorithm includes a second, sub-level, iterative Fisher’s Exact Test on significantly enriched GO terms identified in a first-level EASE analysis. This sub-classification approach provides increased sensitivity for detecting enriched GO terms and thus affords a deeper understanding of possible mechanisms underlying a given condition under study.
A tutorial describing how to use the new nested EASE feature is available on the MeV website, mev.tm4.org.
A new addition to the Cluster Manager allows users to view relationships between two or three clusters of samples or genes in the form of a Venn Diagram. This interactive addition also displays a p-value for two-cluster diagrams representing the likelihood of the given overlap occurring in a random sampling of identical size.
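One common way to compute such an overlap p-value is the upper tail of the hypergeometric distribution. The Python sketch below assumes that approach; how MeV actually computes the value is not stated here, and the function name and the population parameter (the total number of genes or samples the clusters were drawn from) are hypothetical.

```python
from math import comb


def overlap_p_value(population, size_a, size_b, overlap):
    """P(overlap >= observed) when cluster B is a random draw of size_b
    items from `population`, of which size_a belong to cluster A."""
    max_overlap = min(size_a, size_b)
    total_draws = comb(population, size_b)
    # Count the draws that share at least `overlap` items with cluster A.
    favourable = sum(comb(size_a, k) * comb(population - size_a, size_b - k)
                     for k in range(overlap, max_overlap + 1))
    return favourable / total_draws
```

For example, two clusters of four genes each, drawn from eight genes and sharing three, give overlap_p_value(8, 4, 4, 3) = 17/70, roughly 0.243.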
- EASE analysis results now save much more efficiently.
- Most File Choosers now default to the last-used data directory.
- The Gaggle interface now supports metadata as part of its broadcast. See the Gaggle page on the MeV website for details.
- Chromosomal location annotation will now be rendered as links to the UCSC Genome browser in MeV’s standard table views.
- Expression data can be viewed in the Institute for Systems Biology’s Genome Browser, through a new right-click menu option in many result viewers. This feature requires that chromosomal location information has been loaded.
- Enabled multiple selecting for automatic cluster creation.
- USC no longer throws an error when a result set of size zero is returned.
- PCA, COA and TRN modules can now save and load analysis results regardless of whether Java3D is installed.
- The title bar in error message dialogs now reads an appropriate “Algorithm Exception” instead of “Out of Memory Exception”.
- A bug in EASE prevented the use of the Trim Options checkbox. This has been fixed.
- The Save Analysis progress dialog now responds to the Cancel and close window buttons.
- The state-saving functions of the SAM and TTEST modules have been re-written for greater speed and stability.
- Cluster Manager heatmaps and expression viewers no longer fail when clusters are created on data that has been filtered.
- Rank Products no longer misreports p-values for datasets larger than 40 samples.
- Inverted “Show Color” checkbox in Cluster Manager is fixed.
- Fixed null pointer exception in Original Data viewer when launching new session.
- PPI seeding in the BN module was not reading the ppi file correctly, leading to missed interactions. This has now been fixed.
- The BN module would not run on Windows if the install path contained spaces (e.g. ‘Documents Settings’, ‘Program Files’ etc). This problem has been fixed.
- GEO GDS and Series Matrix file loaders have Affymetrix as default selection
- Null pointer exception no longer thrown on clicking heat map cells after loading GEO files
- MeV file loader loads data even if annotations are missing
- The LIMMA module will not be accessible when MeV is launched via Java Webstart. This is due to difficulty with including the required R libraries with the Webstart download. We expect to address this problem for the next major MeV release in May. Until then, the workaround is to download MeV and run it from the local desktop.
MeV v4.4.1 release
June 30, 2009
- State-saving optimizations in the TTEST and SAM modules.
- Fixed problem in Rank Products analysis that prevented two-class unpaired from running more than 40 samples.
- Fixed GUI issues with Cluster Manager.
- Changed display of node heights for Hierarchical Clustering.
- Fixed a bug in the cluster storing dialog box: the dialog box could not be cancelled.
- When launched via Java Webstart, MeV now loads data with correct row indexing
- Table links to external GO databases now support multiple terms.
- Chromosomal location end-coordinates are now loaded fully, rather than with a truncated last digit
- Annotation fields are now re-loaded in the same order they were saved in.
- Hypergeometric distribution is now calculated correctly.