Calculating Similarities: Introduction
Now we need to make sense of these data. Recall that our original question was how tumor formation affects gene expression patterns.
A common analysis method starts by calculating similarities between the expression patterns of individual genes.
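There are many ways of calculating similarities. One popular method is the Pearson correlation coefficient, which measures how closely two variables (in this case, the expression levels of two genes) rise and fall together. It ranges from −1 (the two genes change in exactly opposite directions) through 0 (no linear relationship) to +1 (the two genes change in exactly the same direction).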
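For reference, the standard definition of the Pearson coefficient can be written as follows. The symbols are introduced here for illustration and are not from the original text: $x_i$ and $y_i$ stand for the expression values of two genes in sample $i$, $\bar{x}$ and $\bar{y}$ for their means across the $n$ samples.

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

The numerator is positive when both genes tend to be above (or below) their own averages in the same samples, and negative when one tends to be high where the other is low.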
| | Gene A | Gene B | Gene C | Gene D |
---|---|---|---|---|
Sample 1 | 0.602 | 0 | −0.481 | 0 |
Sample 2 | 0.301 | −0.0969 | 0 | 0.114 |
Sample 3 | 0.544 | 0.301 | −0.602 | 0.477 |
Sample 4 | 0.176 | −0.301 | −0.602 | 0 |
Sample 5 | −0.0969 | 0 | 0.0792 | −0.0969 |
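As a rough illustration, the Pearson coefficient for each pair of genes in the table above could be computed like this. This is a minimal sketch in plain Python, not necessarily the procedure used in this tutorial; the names `expression` and `pearson` are chosen here for illustration.

```python
from itertools import combinations
from math import sqrt

# Expression values copied from the table above, one list per gene,
# ordered Sample 1 through Sample 5.
expression = {
    "Gene A": [0.602, 0.301, 0.544, 0.176, -0.0969],
    "Gene B": [0.0, -0.0969, 0.301, -0.301, 0.0],
    "Gene C": [-0.481, 0.0, -0.602, -0.602, 0.0792],
    "Gene D": [0.0, 0.114, 0.477, 0.0, -0.0969],
}


def pearson(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(x) != len(y):
        raise ValueError("both genes need the same number of samples")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Deviations of each sample from the gene's own mean.
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    covariation = sum(a * b for a, b in zip(dx, dy))
    spread = sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy))
    if spread == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return covariation / spread


# Print the coefficient for every pair of genes.
for gene_1, gene_2 in combinations(expression, 2):
    r = pearson(expression[gene_1], expression[gene_2])
    print(f"{gene_1} vs {gene_2}: r = {r:+.3f}")
```

A positive value means the two genes tend to go up and down in the same samples; a negative value means one tends to go up where the other goes down. With only five samples, these coefficients are best read as a rough guide.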