Once the ProtPlot program is started, it loads the set of PRP files that you downloaded with the ProtPlot program. The virtual protein data for each tissue is used to construct a Master Protein Index, in which a given protein may be present for some tissues and not for others. The data is presented in a pseudo 2D-gel image with the estimated isoelectric point (pI) on the horizontal axis and the molecular mass (Mw) on the vertical axis. Sliders on each of the axes allow you to control the minimum and maximum values of pI and Mw displayed, and thus the zoom region of the Mw vs. pI scatterplot. Clicking on a spot in the scatterplot displays information on that protein and also defines it as the current protein. The current protein is used in some of the clustering methods, in protein-specific reports (Expression Profile report), and in the Expression Profile plot. If you have enabled the popup Genomic-ID Web browser and you are connected to the Internet, clicking a spot also pops up a Web page from the selected Genomic database for that protein.
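As an illustration of the idea, the following Python sketch merges per-tissue protein tables into a single index and then selects the proteins that fall inside a pI/Mw zoom region. It is not ProtPlot's own code: the data layout, the function names `build_master_index` and `zoom`, and the sample values are all assumptions made for this example.

```python
def build_master_index(samples):
    """Merge per-tissue protein tables into one index keyed by protein ID.

    `samples` maps a tissue name to {protein_id: (pI, Mw, expression)}.
    This layout is hypothetical; it is not the PRP file format.
    """
    index = {}
    for tissue, proteins in samples.items():
        for pid, (pi, mw, expr) in proteins.items():
            entry = index.setdefault(pid, {"pI": pi, "Mw": mw, "expression": {}})
            # A protein only gets an entry for the tissues where it was seen.
            entry["expression"][tissue] = expr
    return index


def zoom(index, pi_range, mw_range):
    """Return the IDs of proteins whose (pI, Mw) lie inside both ranges."""
    (pi_lo, pi_hi), (mw_lo, mw_hi) = pi_range, mw_range
    return sorted(
        pid for pid, e in index.items()
        if pi_lo <= e["pI"] <= pi_hi and mw_lo <= e["Mw"] <= mw_hi
    )


# Made-up example data for two tissues.
samples = {
    "liver": {"P1": (5.2, 42000.0, 10.0), "P2": (7.8, 150000.0, 3.0)},
    "brain": {"P1": (5.2, 42000.0, 4.0), "P3": (9.1, 21000.0, 8.0)},
}
index = build_master_index(samples)
visible = zoom(index, pi_range=(4.0, 8.0), mw_range=(10000.0, 100000.0))
```

Here `P1` is present in both tissues, while `P2` and `P3` each appear in only one, which mirrors the "present for some tissues and not for others" behavior described above.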
You select various options from the pull-down menus. Some of the more commonly used options are replicated as check-boxes at the bottom of the window.
ratio = (expression X / expression Y)
where expression X (expression Y) is the expression of the corresponding protein in sample X (sample Y). Alternatively, you may compute the ratio of the mean expression of two different sets of samples (the X set and the Y set). The X and Y sets may be thought of as experimental conditions, with the members of each set being "replicates" in some sense. For example, the X set could be cancer samples and the Y set could be normal samples. The ratio of the X/Y sets for each corresponding protein is computed as
ratio = (mean X-set expression / mean Y-set expression)
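The two ratio formulas can be sketched in Python as follows. This is a minimal illustration, not ProtPlot's implementation: representing a missing expression value as `None`, skipping missing values when averaging, and returning `None` when a ratio is undefined are all assumptions made here.

```python
from statistics import mean


def single_ratio(expr_x, expr_y):
    """ratio = expression X / expression Y for one protein.

    Returns None if either value is missing or Y is zero (assumed convention).
    """
    if expr_x is None or expr_y is None or expr_y == 0:
        return None
    return expr_x / expr_y


def set_ratio(x_values, y_values):
    """ratio = mean X-set expression / mean Y-set expression for one protein.

    Missing values (None) are skipped before averaging (assumed convention).
    """
    xs = [v for v in x_values if v is not None]
    ys = [v for v in y_values if v is not None]
    if not xs or not ys:
        return None
    mean_y = mean(ys)
    if mean_y == 0:
        return None
    return mean(xs) / mean_y


r_single = single_ratio(6.0, 3.0)
r_sets = set_ratio([4.0, 8.0, None], [2.0, 4.0])
```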
The following shows one of the (Mw vs. pI) scatterplots when the display mode was set to (X-set/Y-set) ratio mode:
It is also possible to create an (X vs Y) scatter plot or (Mean X-set vs. Mean Y-set) scatterplot when the corresponding ratio display mode is set. The following window shows the (Mean X-set vs. Mean Y-set) scatterplot:
The following table summarizes the four types of display modes:
Display Mode | Current sample | Single X/Y | X-set/Y-set | EP-set |
---|---|---|---|---|
Expression | yes | no | no | no |
Single samples ratio | no | yes | no | no |
X-set and Y-set samples ratio | no | no | yes | no |
Mean Expression | no | no | no | yes |
This may be invoked either from the File menu or the pull-down sample selector at the lower-left corner of the main window.
For example, you invoke this chooser for the specific tissue sample you want to view by using the (File menu | Select samples | Select Current PRP sample) command. For X (Y) data, you invoke the choosers using the (File menu | Select samples | Select X (Y) PRP sample(s)) command. You may switch between single (X/Y) and (X set/Y set) mode using the (File menu | Select samples | Use Sample X and Y sets else single X and Y samples [CB]) command.
There is an alternative display called the 'Expression Profile' (EP) plot, which displays the expression of the currently selected protein across a subset of the PRP samples. You may also display the scatterplot of the mean EP data for all proteins. The EP samples are specified using the (File menu | Select samples | Select Expression List of samples) command.
In the (Filter menu | State | Protein Sets) submenu there are a number of commands to manipulate protein set files. You may individually save (or restore) any particular saved filtered set to (or from) a set file in the "Set" folder. There are also commands to compute the set intersection, union or difference between two protein set files and leave the resulting protein set in the saved Filter set.
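The three set operations can be illustrated with Python's built-in `set` type. This is only a sketch of the logic: the function name `combine` and the protein IDs are hypothetical, and the on-disk format of the files in the "Set" folder is not described here, so the example works on in-memory sets.

```python
def combine(set_a, set_b, op):
    """Combine two protein sets with the named operation.

    `op` is one of "intersection", "union" or "difference" (A minus B).
    """
    if op == "intersection":
        return set_a & set_b
    if op == "union":
        return set_a | set_b
    if op == "difference":
        return set_a - set_b
    raise ValueError(f"unknown set operation: {op!r}")


# Two made-up protein sets, e.g. the results of two different filters.
set_a = {"P1", "P2", "P3"}
set_b = {"P2", "P3", "P4"}

both = combine(set_a, set_b, "intersection")
either = combine(set_a, set_b, "union")
only_a = combine(set_a, set_b, "difference")
```

Note that the difference is not symmetric: swapping the two arguments gives the proteins that are only in the second set.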
indicate that the command
Filter Name | Current sample | Single X/Y | X-set/Y-set | EP-set |
---|---|---|---|---|
> 200K Daltons | yes | yes | yes | yes |
Tissue type | yes | yes | yes | yes |
Expression (Ratio) range | expression | ratio | ratio | expression |
X/Y (inside/outside) range | no | yes | yes | no |
(X-set, Y-set) t-Test | no | yes | yes | no |
(X-set, Y-set) KS-Test | no | yes | yes | no |
(X-set, Y-set) Missing data | no | yes | yes | no |
At Most (Least) N samples | no | no | yes | yes |
AND of saved cluster set | yes | yes | yes | yes |
AND of saved filter set | yes | yes | yes | yes |
Starting ProtPlot by clicking on the ProtPlot startup icon will not read the state file when it starts up. However, if you have saved a state, clicking on the state file or a shortcut to the state file will cause it to be read when ProtPlot starts up.
You may save the current state using either the (File | State | Save State) command to save it under the current name, or the (File | State | Save As State) command to save it under a new name that you specify. You may also change the current state using the (File | State | Open Statefile) command.
You may scroll the scatterplot in both the pI and Mw axes by adjusting the end-point scrollbars on the corresponding axes. You may display the scatterplot with a log transform of MW by toggling the log MW switch.
The popup plots and scatterplot may be saved as .gif image files which are put into the project's "Report" folder. Similarly, reports are saved as tab-delimited .txt text files in the "Report" folder. Because it prompts you for a file name, you may browse your file system and save the file in another disk location.
The cluster distance metric is the 'distance' between two proteins based on their expression profiles. The metric may be selected in the Cluster menu. Currently, there is one clustering method: cluster the proteins most similar to the current protein (specified by clicking on a spot in the scatterplot or using the Find Protein by name command in the File menu). It requires you to specify a) the current protein, and b) the threshold distance cutoff. The threshold distance is specified interactively by the "Distance Threshold T" slider. The 'Similar Proteins Cluster' Report will be updated if you change either the current protein or the cluster distance.
The cluster distance metric must be computed in a way that takes missing data into account, since a simple Euclidean distance cannot be used with the type of sparse data present in the ProtPlot database. ProtPlot has several ways to compute the distance metric using various models for handling missing data.
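The following Python sketch shows one possible way to combine a missing-data-aware distance with a threshold cutoff around a current protein. It is not one of ProtPlot's actual models, which are not specified here. The assumptions are: missing values are `None`, the distance is the root-mean-square difference over the samples present in both profiles, and two proteins with no shared samples have no defined distance. The names `distance` and `similar_proteins` are hypothetical.

```python
import math


def distance(profile_a, profile_b):
    """Root-mean-square difference over samples present in both profiles.

    Dividing by the number of shared samples keeps proteins with few
    shared samples from looking artificially close. Returns None when
    the two profiles share no samples.
    """
    pairs = [(a, b) for a, b in zip(profile_a, profile_b)
             if a is not None and b is not None]
    if not pairs:
        return None
    sum_sq = sum((a - b) ** 2 for a, b in pairs)
    return math.sqrt(sum_sq / len(pairs))


def similar_proteins(profiles, current, threshold):
    """Return (protein_id, distance) pairs within `threshold` of `current`,
    sorted from most to least similar."""
    reference = profiles[current]
    hits = []
    for pid, profile in profiles.items():
        if pid == current:
            continue
        d = distance(reference, profile)
        if d is not None and d <= threshold:
            hits.append((d, pid))
    return [(pid, d) for d, pid in sorted(hits)]


# Made-up expression profiles over four samples; None marks missing data.
profiles = {
    "P1": [1.0, 2.0, None, 4.0],
    "P2": [1.0, 2.5, 3.0, None],
    "P3": [9.0, None, 1.0, 0.0],
    "P4": [None, None, 5.0, None],
}
cluster = similar_proteins(profiles, current="P1", threshold=1.0)
```

With this data only `P2` falls within the threshold of `P1`; `P3` is too far away, and `P4` is skipped because it shares no samples with `P1`.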
You may save the set of proteins created by the current clustering settings by pressing the "Save Cluster Results" button in the lower-right of the cluster report window. This set of proteins is available for use in future data filtering using the (Filter menu | Filter by AND of Saved Clustered proteins [CB]). When you save the state of the ProtPlot database (Filter menu | State | Save State), it will also save the set of saved clustered proteins in the database "Set" folder. You may restore any particular saved clustered set file.
You may bring up the EP plot window by clicking on the "EP Plot" button and then click on any spot in the scatterplot to see its expression profile. Clicking on the "Scroll Cluster EP Plots" button brings up a scrollable list of expression profiles for just the clustered proteins sorted by similarity.
The following window illustrates the scrollable list of EP plots sorted by the current cluster report similarity.
You may mark the proteins belonging to the cluster in the scatterplot with black boxes by selecting the "View cluster boxes" checkbox at the lower left of the cluster report window. This is illustrated in the following window:
Report Name | Current sample | Single X/Y | X-set/Y-set | EP-set |
---|---|---|---|---|
Statistics of proteins passing filter | SP-ACC/ID, pI, Mw, expression | SP-ACC/ID, pI, Mw, X/Y, X, Y expr, Tissues | SP-ACC/ID, pI, Mw, mnX/mnY, (mn,sd,cv,n) expr for X- & Y-sets, Tissues. If using t-test then (dF, t-stat, F-stat). If using KS-test then (dF, D-stat) | SP-ACC/ID, pI, Mw, (mn,sd,cv,n) expr for EP-set, Tissues |
Expression profiles of proteins passing filter | SP-ACC/ID, expr data EP-set | SP-ACC/ID, expr data EP-set | SP-ACC/ID, expr data EP-set | SP-ACC/ID, expr data EP-set |
X & Y sets of missing proteins passing filter | no | no | SP-ACC/ID, (mn,sd,cv,n) for X- & Y-sets | no |
EP set statistics of proteins passing filter | no | no | no | SP-ACC/ID, (mn,sd,cv,n) for EP-set |
List of samples in current EP profile | {Nbr, sample-name, expression} | {Nbr, sample-name, expression} | {Nbr, sample-name, expression} | {Nbr, sample-name, expression} |
List of all sample assignments | Current, X, Y, X-set, Y-set, EP-set | Current, X, Y, X-set, Y-set, EP-set | Current, X, Y, X-set, Y-set, EP-set | Current, X, Y, X-set, Y-set, EP-set |
List of # proteins/sample | {Sample-name, # proteins in sample} | {Sample-name, # proteins in sample} | {Sample-name, # proteins in sample} | {Sample-name, # proteins in sample} |
ProtPlot state | State | State | State | State |
Djamel Medjahed, LMT, SAIC-Frederick
Revised: 08-26-2004