
FAQS:

Leukemia Atlases

[Figure: overall schematic]

The LPA contains a repository of proteomic data generated using Reverse Phase Protein Array (RPPA) technology on samples from patients with leukemia, together with the corresponding computational interpretation of these data. Patient samples come from pediatric and adult populations. The samples are whole-cell lysates taken from bone marrow biopsies and blood at the time of diagnosis.

What are the Leukemia Protein Atlases (LPA)?

Currently, data from three arrays are shown on this website:

Adult AML Patients aged 15-87. Number = 235. Antibody # = 231

Pediatric AML Patients aged 0-18. Number = 95. Antibody # = 194

Pediatric ALL Patients aged 0-18. Number = 73. Antibody # = 194

 

Arrays coming soon:
Adult APL Patients aged 14-71. Number = 21. Antibody # = 231
Pediatric AML from COG 1031 trial. Number = 505. Antibody # = 292
T-cell ALL. Number = 360 pediatric, 85 adult. Antibody # = 325

 

What arrays are shown?

Samples are printed in 5 serial dilutions onto a nitrocellulose membrane in 2 micron dots. The first dot has about 85 cell equivalents of protein; the fifth has about 5 cell equivalents. Slides are stained with a primary antibody against the target protein, then with a secondary antibody conjugated to biotin. DAB is then precipitated, leaving a brown coloration on the slide. The slide is then scanned with a program that generates a numerical value for each dot. To convert the 5 serial dilution values into a single value, several normalization steps are applied to account for background (topographical normalization) and, for multi-slide sets, for differences between slides (variable slope normalization), and then Supercurve is applied. Values are then median normalized. Finally, values are compared to the expression data from 10 normal bone marrow derived (non-GCSF stimulated) CD34+ cell samples, with the median of the normal cells set to 0. Data are expressed in log2 and are less than zero when expression is below the median of the normal CD34+ cells, or greater than zero when it is above that median.
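The final step described above, setting the median of the normal CD34+ samples to 0, can be sketched as follows. This is only an illustration in Python, not the site's R code: the function name, the protein names and all values are hypothetical, the inputs are assumed to be already in log2, and only three normal samples are used here instead of ten.

```python
from statistics import median

def center_on_normals(patient_samples, normal_samples):
    """Subtract the per-protein median of the normal samples (all values in log2).

    Each sample is a dict mapping protein name -> log2 expression value.
    Hypothetical helper for illustration; not taken from the LPA code.
    """
    proteins = list(normal_samples[0])
    # Per-protein baseline: median across the normal CD34+ samples.
    baseline = {p: median(s[p] for s in normal_samples) for p in proteins}
    # After subtraction, 0 means "equal to the normal median".
    centered = [{p: s[p] - baseline[p] for p in proteins} for s in patient_samples]
    return centered, baseline

# Made-up example values.
normals = [
    {"ProteinA": 1.0, "ProteinB": 0.5},
    {"ProteinA": 2.0, "ProteinB": 0.5},
    {"ProteinA": 3.0, "ProteinB": 1.5},
]
patients = [{"ProteinA": 3.5, "ProteinB": 0.0}]

centered, baseline = center_on_normals(patients, normals)
print(centered[0])  # {'ProteinA': 1.5, 'ProteinB': -0.5}
```

In this toy case ProteinA is above the normal median (positive value) and ProteinB is below it (negative value), which is how the values on the site should be read.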

What is the reverse phase protein array methodology?

To include known biological relationships in our analysis, we developed a novel computational approach called “MetaGalaxy”, which incorporates existing knowledge regarding relationships between proteins. First, the antibodies are divided into Protein Functional Groups (PFG), based on known association in a common pathway or function (e.g. apoptosis induction, cell cycle regulation, signal transduction pathway). For the Adult AML array there were 31 PFG. Clustering is then performed on each PFG, and this forms the global MetaGalaxy plot. The optimal number of clusters is determined using the “Progeny Clustering” algorithm (Hu et al., 2015). There are a finite number of PFG clusters (n=154 for the Adult AML array), but each patient will be a member of 31 (one for each PFG) and not a member of the other 123. Clustering is then performed again, based on PFG cluster membership, and an algorithm similar to Chi-square is used to select the optimal number of groupings. When several PFG clusters repeatedly co-segregate with each other we call that a “Constellation”. On MetaGalaxy plots, these are seen horizontally and appear as a blue block. When recurrent patterns of constellation membership are seen in several patients, they form a protein expression “Signature”; signatures are seen vertically in a MetaGalaxy plot.
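The membership bookkeeping described above (each patient belongs to exactly one cluster per PFG and to none of the others) can be sketched as a binary matrix. This is a minimal illustration in Python, not the published R implementation: the function name, the PFG names, the cluster counts and the patient assignments are all made up, and the clustering that produces the assignments is assumed to have been done already.

```python
def membership_matrix(assignments, clusters_per_pfg):
    """Build a binary patient-by-PFG-cluster membership matrix.

    assignments: one dict per patient, mapping PFG name -> cluster index (0-based).
    clusters_per_pfg: dict mapping PFG name -> number of clusters found for that PFG.
    Hypothetical helper for illustration only.
    """
    # One column per (PFG, cluster) pair, across all PFG.
    columns = [(pfg, c) for pfg, k in clusters_per_pfg.items() for c in range(k)]
    # 1 = patient is a member of that PFG cluster, 0 = not a member.
    matrix = [
        [1 if patient[pfg] == c else 0 for pfg, c in columns]
        for patient in assignments
    ]
    return columns, matrix

# Toy example: 3 PFG with 2 + 3 + 2 = 7 clusters in total.
clusters_per_pfg = {"Apoptosis": 2, "CellCycle": 3, "Signaling": 2}
patients = [
    {"Apoptosis": 0, "CellCycle": 2, "Signaling": 1},
    {"Apoptosis": 1, "CellCycle": 0, "Signaling": 1},
]

columns, matrix = membership_matrix(patients, clusters_per_pfg)
for row in matrix:
    print(row, "memberships:", sum(row))
```

Every row sums to the number of PFG (3 in this toy case). The same bookkeeping gives the Adult AML figures quoted above: 154 columns, with 31 memberships and 123 non-memberships per patient. The second round of clustering would then operate on the rows of such a matrix.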

How was the data analyzed, and what is the MetaGalaxy method?

Across the top of the home page are 6 tabs.

 

HOME: Allows you to select one of the other 5 tabs, or to select one of the circles to go to one of the available databases. This will open in a new window.

GLOBAL NETWORKS: Takes you to the Global MetaGalaxy analysis for one of the available databases. See below.

FUNCTIONAL NETWORKS: Takes you to the individual Protein Functional Group analyses for one of the available databases. See below.

CODE: Takes you to links to the R programming code used for this analysis, along with a tutorial and a set of sample data. Users can load their own dataset and perform the specified analysis.

PUBLICATIONS: A list of our publications from these datasets is presented here. Please read them all and cite us often.

ABOUT US | FAQS: Some basic information about the site and those whose efforts generated the data and the site. Information on how to contact us is here as well.

How do I use the LPA?

Once a user has selected a Global Network dataset (Adult AML, Pedi AML or Pedi ALL), the webpage will show three separate figures:

  1. The MetaGalaxy clustering plot shows the signature numbers across the top and the constellation numbers along the left side. Positive membership is shown by a blue pixel, non-membership by a yellow pixel. The more blue pixels a rectangle contains, the more often the PFG in that signature were found to co-associate. For example, in the Adult AML dataset constellation #2 (yellow) is near universal in signatures 9, 10, 11 and 12, while constellations 9 and 10 (orange and aqua) are almost never seen in signature 11 (aqua).

  2. Survival Curves are shown for the MetaGalaxy analysis. Users can select overall survival or remission duration, and they can select all patients or those in specific cytogenetic subgroups (favorable, intermediate or unfavorable), as well as those with selected mutations (DNMT3a, IDH1 or 2, and a combination of NPM1 and FLT3 in the Adult AML dataset).

  3. Signature Based Drug Targets. This plot shows, for each signature, all the proteins whose expression is significantly lower (cool colors) or higher (warm colors) than that of the normal CD34+ controls.

What are the Global Networks?

Once a user has selected a dataset (Adult AML, Pedi AML or Pedi ALL) under Functional Networks, there is a choice of which Protein Functional Group (PFG) is desired.

  1. The PFG clustering plot shows the optimal number of clusters for a given PFG. The annotation across the top (pink and red) shows the sample source (fresh peripheral blood or bone marrow). The individual protein names are listed along the right side. The legend for the color scale is also shown. The number of clusters observed is shown in the “cluster” annotation above the heatmap. Some clusters only appeared in cryopreserved cells and were therefore excluded from the remainder of the presented analysis.

  2. Survival Curves are shown for the MetaGalaxy analysis. Users can select overall survival or remission duration, and they can select all patients or those in specific cytogenetic subgroups (favorable, intermediate or unfavorable), as well as those with selected mutations (DNMT3a, IDH1 or 2, and a combination of NPM1 and FLT3 in the Adult AML dataset).

  3. Functional patterns show the median expression of each member of a PFG in each cluster. Each protein is in the same orientation in each figure.

  4. Large Network Plot: This shows the main members of a PFG as the larger nodes and all significantly associated proteins as the small nodes. Connecting lines are dotted if the association comes from the literature, dashed if it comes from this dataset, or solid if it comes from both. Node color reflects the median expression of that protein in that cluster. The selector at the bottom allows you to toggle between the different clusters. The color scale is the same as that used in the heatmap above.
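The line-style convention of the Large Network Plot can be written as a small lookup. The sketch below is in Python and is purely illustrative: the function name and its arguments are hypothetical and are not part of the site's code.

```python
def edge_style(in_literature, in_dataset):
    """Return the line style for a protein-protein connection.

    Follows the legend described for the Large Network Plot:
    dotted = literature only, dashed = this dataset only, solid = both.
    Returns None when there is no supporting evidence (no line drawn).
    """
    if in_literature and in_dataset:
        return "solid"
    if in_literature:
        return "dotted"
    if in_dataset:
        return "dashed"
    return None

print(edge_style(True, False))   # dotted
print(edge_style(False, True))   # dashed
print(edge_style(True, True))    # solid
```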

What are the Functional Protein Networks?
