The MeV Team welcomes help from outside the core group. We have collected here some useful information for those interested in developing for MeV.
MeV is a free, open-source Java desktop application for the analysis of genomic data. It is organized into modules, which are easy to write and plug into the MeV framework. The source code is available in a Subversion repository at http://mev-tm4.sourceforge.net.
If you are interested in developing for MeV, please have a look at the Code Contribution Guidelines. The developer docs folder in the svn repository has several useful documents, including the package overview and a module writing guide.
We welcome help with documentation, tutorials and feedback as well. Please talk to us in the Developer or Help forums if you are interested in writing code, upgrading documentation, or just sharing how you use MeV.
MeV can be launched and run from the web via Java Webstart. It can also launch and automatically load a specific datafile on startup. Together, these two features let us provide simple weblinks that, when clicked, launch MeV and load a dataset of choice.
Launch MeV with pre-loaded sample data.
Launch MeV with no data loaded.
-help
Print help text and exit MeV.
-gaggle
If this flag is present, MeV will automatically try to connect to the Gaggle boss on startup. It will start the boss if one is not already started. This option requires an internet connection.
-fileType TYPE
This flag specifies the type of datafile to be loaded. Options are:
tdms: Tab Delimited, Multiple Sample Files (TDMS) (*.*)
mev: MeV Files (*.mev and *.ann)
tav: TIGR ArrayViewer Files (*.tav)
affy-gcos: Affymetrix GCOS(using MAS5) Files
dchip: dChip/DFCI_Core Format Files
gw-affy: GW Affymetrix Files
bioconductor-mas5: Bioconductor(using MAS5) Files
rma: RMA Files
cgh: CGH Tab Delimited, Multiple Sample
affy-gp: GEO SOFT Affymetrix Format Files
two-channel-gpl: GEO SOFT Two Channel Format Files
genepix: GenePix Format Files
agilent: Agilent Files
geo-series-matrix: GEO Series Matrix Files
geo-gds: GEO GDS Format Files
-fileUrl URL
The URL of the data file to be preloaded. This must be a complete url, including the http:// or ftp:// protocol indicators. Relative urls are not supported. Local filesystem files are also not supported at this time.
-firstRow
The index of the first row of expression data (rather than annotation) in a TDMS-like file. If both this and the firstColumn flag are set, MeV can load TDMS-like datafiles without interaction from the user. This number is zero-indexed.
-firstColumn
The index of the first column of expression data (rather than annotation) in a TDMS-like file. If both this and the firstRow flag are set, MeV can load TDMS-like datafiles without interaction from the user. This number is zero-indexed.
-arrayType
This value specifies the name of the array that the data in -fileUrl came from. The string must exactly match one of the list of currently-supported arrays, found at ftp://occams.dfci.harvard.edu/pub/bio/tgi/data/Resourcerer/pipeline/supp.... The species name is not required.
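Before building a launch link, it can help to check that a -fileUrl value meets the rules listed above. The following is a minimal sketch, not part of MeV: the class and method names are hypothetical, and it assumes that only absolute http:// and ftp:// URLs are acceptable, as the option description states.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper, not part of MeV: checks a -fileUrl value
// before it is placed in a Webstart launch link.
class FileUrlCheck {

    /** Returns true only for absolute http:// or ftp:// URLs. */
    static boolean isAcceptableFileUrl(String value) {
        if (value == null) {
            return false;
        }
        try {
            URI uri = new URI(value.trim());
            String scheme = uri.getScheme();
            if (scheme == null || uri.getHost() == null) {
                return false; // relative URL, or no host part
            }
            scheme = scheme.toLowerCase(Locale.ROOT);
            return scheme.equals("http") || scheme.equals("ftp");
        } catch (URISyntaxException e) {
            return false; // e.g. a Windows path containing backslashes
        }
    }

    public static void main(String[] args) {
        System.out.println(isAcceptableFileUrl("http://example.org/data/sample.txt"));
        System.out.println(isAcceptableFileUrl("data/sample.txt"));
        System.out.println(isAcceptableFileUrl("file:///tmp/sample.txt"));
    }
}
```

The example URLs are placeholders. Relative paths and local file: URLs are rejected because the option does not support them.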
MeV requires Java v1.6 to run (v1.5 for Mac OS). Please ensure that you have the correct version before attempting to launch with Java Webstart. You can download Java v1.6 on the Java website.
MeV's command line options are relatively simple. To use them, you must edit the MeV launch file (Windows: tmev.bat; *nix: tmev.sh; Mac: ; Webstart: mev.jnlp).
The options are:
-help
Print this help text and exit MeV.
-gaggle
If this flag is present, MeV will automatically try to
connect to the Gaggle boss on startup. It will start the boss if
one is not already started. This option requires an internet
connection.
-fileType TYPE
This flag specifies the type of datafile to be loaded.
Options are:
tdms Tab Delimited, Multiple Sample Files (TDMS) (*.*)
mev MeV Files (*.mev and *.ann)
tav TIGR ArrayViewer Files (*.tav)
affy-gcos Affymetrix GCOS(using MAS5) Files
dchip dChip/DFCI_Core Format Files
gw-affy GW Affymetrix Files
bioconductor-mas5 Bioconductor(using MAS5) Files
rma RMA Files
cgh CGH Tab Delimited, Multiple Sample
affy-gp GEO SOFT Affymetrix Format Files
two-channel-gpl GEO SOFT Two Channel Format Files
genepix GenePix Format Files
agilent Agilent Files
geo-series-matrix GEO Series Matrix Files
geo-gds GEO GDS Format Files
-fileUrl URL
The URL of the data file to be preloaded. This must be a complete
url, including the http:// or ftp:// protocol indicators. Relative
urls are not supported. Local filesystem files are also not
supported at this time.
-firstRow
The index of the first row of expression data (rather than annotation) in a TDMS-like file.
If both this and the firstColumn flag are set, MeV can load TDMS-like datafiles
without interaction from the user. This number is zero-indexed.
-firstColumn
The index of the first column of expression data (rather than annotation) in a TDMS-like file.
If both this and the firstRow flag are set, MeV can load TDMS-like datafiles
without interaction from the user. This number is zero-indexed.
-arrayType
This value specifies the name of the array that the data in -fileUrl came from. The
string must exactly match one of the list of currently-supported arrays, found at
ftp://occams.dfci.harvard.edu/pub/bio/tgi/data/Resourcerer/kingdom_speci....
The species name is not required.
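To illustrate what the zero-indexed -firstRow and -firstColumn values refer to, here is a small sketch that extracts the expression block from a TDMS-like table. It is not MeV's loader: the class name, the sample data and the assumption that the header line counts as row 0 are all illustrative.

```java
import java.util.Arrays;

// Hypothetical illustration, not MeV's file loader.
class TdmsSlice {

    /**
     * Parses the numeric block that starts at (firstRow, firstColumn).
     * Both indices are zero-based; everything before them is annotation.
     */
    static double[][] expressionBlock(String[] lines, int firstRow, int firstColumn) {
        double[][] values = new double[lines.length - firstRow][];
        for (int r = firstRow; r < lines.length; r++) {
            String[] cells = lines[r].split("\t", -1);
            double[] row = new double[cells.length - firstColumn];
            for (int c = firstColumn; c < cells.length; c++) {
                row[c - firstColumn] = Double.parseDouble(cells[c]);
            }
            values[r - firstRow] = row;
        }
        return values;
    }

    public static void main(String[] args) {
        String[] lines = {
            "ID\tSymbol\tSampleA\tSampleB", // row 0: header, annotation only
            "p1\tTP53\t1.5\t2.0",           // row 1: first expression row
            "p2\tBRCA1\t0.5\t3.25"
        };
        // Columns 0 and 1 are annotation, so the data starts at column 2.
        System.out.println(Arrays.deepToString(expressionBlock(lines, 1, 2)));
    }
}
```

For this sample table, the matching flags would be -firstRow 1 and -firstColumn 2, under the assumption stated above.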
MeV is open-source software, released under the Artistic License v2.0 and hosted at Sourceforge.net. You are welcome to check out the source code and build and modify it to your heart's content, as long as you conform to the terms of the license.
MeV is a complex package. Here are a few steps to follow that should get you started building the package.
The MeV team welcomes contributions of bugfixes, UI enhancements and new modules to the MeV project. Contact the MeV team in the Developers forum if you are interested in being a part of the project.
All code contributed by an outside developer must be reviewed by the MeV team. If MeV contains unit tests that test the modified code, all tests must pass before the new changes are accepted. After they are accepted, the MeV team will merge them into the source repository and they will be released with the next major MeV version. These releases are done twice per year, in November and May.
Specifications
New code submitted to the MeV project should include, either within the code itself or in a separate document, specifications describing what the new changes do or will do. This document can be submitted with code changes or ahead of time. If you want to ensure your contribution is accepted into the standard MeV distribution, submitting this ahead of time is the best bet.
Developer documentation
Code contributed to the project should include complete Javadoc-style code comments. Small changes that are not suitable for Javadoc should be briefly described with inline comments.
User Documentation
New features or changes to existing features must be documented in the MeV user manual. Complex new modules or systems can also be written up tutorial-style for inclusion in the tutorials section of the MeV website.
Code and UI conventions
When loading data from flat files for use in MeV, the file chooser should be opened in MeV's default directory.
Getting the source code
MeV source code can be checked out from the MeV subversion repository. If you would like your contribution to be included in future releases, we recommend that you check out the trunk.
If you do not plan to contribute your changes to the MeV project, please keep in mind that you will not be able to distribute your work unless you also include complete source code, and you must abide by all terms of the Artistic License v2.0. That said, we recommend you choose the latest bugfix branch and make your changes there.
MeV is capable of sending and receiving expression and annotation data within the Gaggle framework of bioinformatics applications. As of v4.5, MeV will broadcast certain pre-defined metadata tags in addition to the matrix, namelist and network data. It will also look for these tags when receiving broadcasts.
See the general information on using MeV in the Gaggle framework
Example metadata structure for a matrix broadcast:
metadata-root
|
|-> identifier-type: PROBE_ID
|-> MeV-metadata
|--> data-type: intensity
|--> array-name: affy_HG-U133A
|--> algorithm-source: HCL
|--> log-status: unlogged
Example metadata structure for a namelist broadcast:
metadata-root
|
|-> identifier-type: ENTREZ_ID
|-> user-interactive: false
|-> MeV-metadata
|--> algorithm-source: KMC
For all broadcasts, MeV will include within the Metadata tuple a single Tuple called "MeV-metadata". All MeV-specific metadata will reside in this Tuple. Some metadata items will be included only in matrix broadcasts, while others will be included in namelists as well. Currently, no additional Metadata beyond a placeholder "MeV-metadata" item is included in Network broadcasts. When receiving a broadcast, MeV will attempt to locate and use these metadata items as well. The function of each item is described in its individual section, below.
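The nesting in the matrix example can be sketched with plain Java maps. This is only an illustration of the layout: it uses java.util.Map as a stand-in and does not use the Gaggle Tuple classes, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the metadata layout using plain maps.
// The real broadcast uses Gaggle tuples, which are not modelled here.
class MetadataSketch {

    /** Builds the root metadata for a matrix broadcast, with MeV items nested under "MeV-metadata". */
    static Map<String, Object> matrixMetadata(String identifierType, String dataType,
                                              String arrayName, String algorithmSource,
                                              String logStatus) {
        Map<String, Object> mev = new LinkedHashMap<>();
        mev.put("data-type", dataType);
        mev.put("array-name", arrayName);
        mev.put("algorithm-source", algorithmSource);
        mev.put("log-status", logStatus);

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("identifier-type", identifierType);
        root.put("MeV-metadata", mev); // all MeV-specific items live in this one entry
        return root;
    }

    public static void main(String[] args) {
        System.out.println(matrixMetadata("PROBE_ID", "intensity", "affy_HG-U133A", "HCL", "unlogged"));
    }
}
```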
Data type (matrix only, inbound and outbound broadcasts)
data-type: "ratio" | "intensity"
This field indicates the type of data included in a matrix broadcast. An incoming broadcast including the "ratio" value will be interpreted as a two-color array, and loaded into the appropriate data structure. A broadcast with a data type of "intensity" will be treated as a single color, or intensity-based array, such as an Affymetrix array. The main noticeable difference between these two will be in the color scaling that MeV applies to its heatmaps. When MeV accepts an incoming matrix broadcast, it assumes that the data is of type "intensity" unless the data-type value indicates otherwise.
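The defaulting rule for incoming matrices can be expressed in a few lines. This is a sketch under assumptions rather than MeV's actual code: the enum and method names are hypothetical, and the exact, case-sensitive match on "ratio" is an assumption.

```java
// Hypothetical illustration of the data-type defaulting rule.
class DataTypeDefault {

    enum DataType { RATIO, INTENSITY }

    /** Anything other than an explicit "ratio" value, including a missing value, is treated as intensity. */
    static DataType resolve(String dataTypeValue) {
        if ("ratio".equals(dataTypeValue)) {
            return DataType.RATIO;
        }
        return DataType.INTENSITY;
    }

    public static void main(String[] args) {
        System.out.println(resolve("ratio"));
        System.out.println(resolve("intensity"));
        System.out.println(resolve(null)); // no data-type item in the broadcast
    }
}
```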
Array name (matrix only, outbound broadcast only)
array-name: affy_HG-U133A | affy_HG_U95E | TIGR_25K_Mouse_Set | etc.
This field indicates the name of the array that MeV has stored for the broadcast dataset. This field will only be included if MeV's automatic annotation loader has been used to load the data. MeV currently does not act on this data if it is received as part of a broadcast. In the future, we hope to allow MeV to auto-load annotation for an incoming set of data using this array name. The names should match an array available on the Resourcerer ftp site.
Algorithm source (matrix and namelists, outbound broadcast only)
algorithm-source: KMC | HCL | etc.
This value indicates the source (within MeV) of the dataset being broadcast. This will most often be an algorithm name, though it will sometimes not be populated.
User Interactive (Namelists only, inbound broadcast only)
user-interactive: true | false
When receiving an incoming namelist broadcast, this flag determines whether MeV will ask the user to validate the incoming list of genes or will attempt to coerce the genes in the namelist into a cluster without user interaction. Cluster colors and names will be automatically assigned, and the identifier is required.
Identifier Type (matrix and namelists, inbound and outbound broadcasts)
identifier-type: ENTREZ_ID | UNIGENE_ID | GENE_SYMBOL | REFSEQ_ACC | etc.
For incoming broadcasts, this field indicates which of the supported MeV annotation types is used as an identifier in the incoming broadcast. This value will often be the same as the row titles title of the broadcast. This value will be used to inform MeV of the identifier type of incoming broadcasts and bypass certain user-interactions when receiving namelists.
Log-status (matrix only, inbound and outbound broadcasts)
log-status: unlogged | log2 | log10
This field indicates whether the data in the matrix has been log-transformed. The information is stored in MeV's data model. It is currently unused but in the future will be used by several of MeV's component modules.
Module Development
By convention, each module has a long name (e.g., One-way Anova) and a shorter acronym (OWA). The module is generally referred to by this acronym, which is also used as the algorithm class name and the package name. The class name for the main Algorithm implementation should be all caps (OWA.java), whereas the packages should be lowercase (org.tigr.microarray.mev.cluster.algorithm.impl.owa.* and org.tigr.microarray.mev.cluster.gui.impl.owa.*).
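As a quick illustration of the naming convention, the sketch below derives the algorithm class name and package from a module acronym. The helper class is hypothetical and exists only to show the upper-case and lower-case rules; the package root is the one named in the text.

```java
import java.util.Locale;

// Hypothetical helper that applies the module naming convention.
class ModuleNames {

    static final String ALGORITHM_ROOT = "org.tigr.microarray.mev.cluster.algorithm.impl";

    /** The main algorithm class is the acronym in all caps, e.g. OWA. */
    static String algorithmClassName(String acronym) {
        return acronym.toUpperCase(Locale.ROOT);
    }

    /** The algorithm package is the acronym in lowercase under the algorithm root. */
    static String algorithmPackage(String acronym) {
        return ALGORITHM_ROOT + "." + acronym.toLowerCase(Locale.ROOT);
    }

    /** Fully qualified name of the main algorithm class. */
    static String qualifiedAlgorithmClass(String acronym) {
        return algorithmPackage(acronym) + "." + algorithmClassName(acronym);
    }

    public static void main(String[] args) {
        System.out.println(qualifiedAlgorithmClass("Owa"));
    }
}
```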
A new module requires the creation of the following components.
GUI classes:
Included are all GUI components for the module, e.g. expression viewers, tables, init dialogs and **GUI.java.
-An IClusterGUI implementation.
-An initialization dialog for parameter collection.
-Extension of base IViewer implementations for basic cluster viewers.
-Optional module specific viewers
-Optional helper classes
All GUI classes are contained in the package org.tigr.microarray.mev.cluster.gui.impl.<module-package>
source\org\tigr\microarray\mev\cluster\gui\impl
Algorithm Engine:
org.tigr.microarray.mev.cluster.algorithm.impl
This package contains all AbstractAlgorithm extensions that correspond
to module implementations. These are the classes that do the computational
work during algorithm execution. The main method is AlgorithmData execute(AlgorithmData), where the sole argument is the parameter container and the returned AlgorithmData holds the accumulated results.
An algorithm class takes in an AlgorithmData object and returns an object of the same class to the module's GUI class.
source\org\tigr\microarray\mev\cluster\algorithm\impl\**.java
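The execute contract described above can be sketched as follows. To keep the example self-contained, it uses simplified stand-ins instead of MeV's real AlgorithmData and AbstractAlgorithm classes, whose APIs are richer. The module name RMN, the matrix keys and the stand-in methods are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical module "RMN" (row means); follows the all-caps class naming convention.
class RMN extends AlgorithmStandIn {

    /** Takes the input container and returns a new container holding the results. */
    @Override
    AlgorithmDataStandIn execute(AlgorithmDataStandIn data) {
        float[][] expression = data.getMatrix("experiment");
        float[][] means = new float[expression.length][1];
        for (int r = 0; r < expression.length; r++) {
            float sum = 0f;
            for (float v : expression[r]) {
                sum += v;
            }
            means[r][0] = expression[r].length == 0 ? Float.NaN : sum / expression[r].length;
        }
        AlgorithmDataStandIn result = new AlgorithmDataStandIn();
        result.addMatrix("row-means", means);
        return result;
    }

    public static void main(String[] args) {
        AlgorithmDataStandIn input = new AlgorithmDataStandIn();
        input.addMatrix("experiment", new float[][] {{1f, 2f, 3f}, {4f, 6f}});
        float[][] means = new RMN().execute(input).getMatrix("row-means");
        System.out.println(means[0][0] + " " + means[1][0]);
    }
}

// Stand-in for MeV's AbstractAlgorithm (assumption: reduced to the single execute method).
abstract class AlgorithmStandIn {
    abstract AlgorithmDataStandIn execute(AlgorithmDataStandIn data);
}

// Stand-in for MeV's AlgorithmData (assumption: only named matrices are modelled).
class AlgorithmDataStandIn {
    private final Map<String, float[][]> matrices = new HashMap<>();

    void addMatrix(String name, float[][] matrix) {
        matrices.put(name, matrix);
    }

    float[][] getMatrix(String name) {
        return matrices.get(name);
    }
}
```

In a real module, the GUI class would fill the input container from the initialization dialog and read the results from the returned container to build its viewers.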
Parameters info:
Each module provides an html page describing the use of the module's parameters. This short tutorial is not designed to describe the algorithm in any depth, but walk a user through the types of analysis available for their data.
source\org\tigr\microarray\mev\cluster\gui\impl\dialogs\dialogHelpUtil\dialogHelpPages\**_parameters.html
To create the help window, add the appropriate module key to
source\org\tigr\microarray\mev\cluster\gui\impl\dialogs\dialogHelpUtil\HelpWindow.java
Build Script:
The MeV build script, located in the \build_script directory of the source tree, needs the following edits:
1.) Change Module Selection Properties by adding:
<property name="**" value="y"/>
2.) Add algorithm dependency by adding algorithm name to:
<target name="algorithm-modules"
depends=
and
<target name="modules-only"
depends=
-note: With new toolbar layout, target order is no longer essential for correct categorizing of modules.
3.) Add build target for algorithm:
(example)
<target name="**" depends="**-GUI" if="**">
<javac debug="${debug}" target="${java.target.version}" sourcepath="" srcdir="${alg.impl.dir}" destdir="${dest.dir}">
<include name="**.java"/>
<classpath>
<pathelement location="${lib.dir}/JSciCore.jar"/>
<!-- jars to support module compilation -->
<pathelement location="${lib.dir}/mev-util.jar"/>
<pathelement location="${lib.dir}/mev-gui-impl.jar"/>
<pathelement location="${lib.dir}/mev-gui-support.jar"/>
<pathelement location="${lib.dir}/mev-algorithm-impl.jar"/>
<pathelement location="${lib.dir}/mev-algorithm-support.jar"/>
<pathelement location="${lib.dir}/mev-base.jar"/>
</classpath>
</javac>
<propertyfile file="${alg.properties.file}">
<entry key="**" value="org.tigr.microarray.mev.cluster.algorithm.impl.**"/>
</propertyfile>
</target>
4.) Add build target for GUI. Specify the Module Category (Clustering, statistics, etc.) to which the module belongs.
(example)
<target name="**-GUI">
<javac debug="${debug}" target="${java.target.version}" srcdir="${gui.impl.dir}/**" destdir="${dest.dir}">
<classpath refid="module.build.class.path"/>
</javac>
<propertyfile file="${gui.properties.file}">
<entry key="gui.names" value="**:" operation="+"/>
<entry key="**.name" value="**"/>
<entry key="**.class" value="org.tigr.microarray.mev.cluster.gui.impl.**.**GUI"/>
<entry key="**.category" value="${STATISTICS}"/>
<entry key="**.smallIcon" value="analysis.gif"/>
<entry key="**.largeIcon" value="**_button.gif"/>
<entry key="**.tooltip" value="Example Algorithm"/>
</propertyfile>
</target>
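The propertyfile entries above produce a colon-separated gui.names list plus one group of keys per module. The sketch below shows one way such a file could be read back. It is an assumption about how the entries might be consumed, not MeV's actual loading code, and the sample keys (HCL, OWA) and class values are illustrative.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical reader for the kind of properties the GUI build target writes.
class GuiPropertiesSketch {

    /** Parses properties text; wraps the checked exception for convenience. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Splits the colon-separated gui.names value into module keys, skipping empty entries. */
    static List<String> moduleKeys(Properties props) {
        List<String> keys = new ArrayList<>();
        for (String key : props.getProperty("gui.names", "").split(":")) {
            if (!key.isEmpty()) {
                keys.add(key);
            }
        }
        return keys;
    }

    /** Looks up the GUI class registered for a module key, or null if none is set. */
    static String guiClass(Properties props, String moduleKey) {
        return props.getProperty(moduleKey + ".class");
    }

    public static void main(String[] args) {
        Properties props = parse(
            "gui.names=HCL:OWA:\n"
            + "OWA.name=OWA\n"
            + "OWA.class=org.tigr.microarray.mev.cluster.gui.impl.owa.OWAGUI\n");
        for (String key : moduleKeys(props)) {
            System.out.println(key + " -> " + guiClass(props, key));
        }
    }
}
```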
Icon
Add icon button gif to
source\org\tigr\microarray\mev\cluster\gui\impl\images
and
source\org\tigr\images
The button icon is a gif, 32 pixels x 32 pixels.
State-saving
If the module uses only the standard viewers to display its results, no state-saving work need be done. If the module includes custom viewers, however, please consult the state-saving documentation for information on how to ensure that the new viewer will save and load completely.
Documentation
There should be a section of the MeV user manual for each module. It should describe what the module does and how to use it. Include plenty of screenshots.
If there is a paper reference for the module, add it to the MeV manual in the References section at the end and in the manual section devoted to the module. Also add the reference to the MeV page on the TM4 website. Also write a short description of the module to be included in the release notes when the new module is released with MeV.