Samples visualization chart
To show how similar samples are in terms of gene expression, PandaOmics plots them on a chart. Each sample is a 20,000-dimensional object (if all 20,000 protein-coding genes are taken into account), so an additional computational step, dimensionality reduction, is needed to project the samples onto a two-dimensional space (the chart). PandaOmics provides three machine learning methods for this visualization:
UMAP − Uniform Manifold Approximation and Projection for Dimension Reduction;
t-SNE − t-distributed Stochastic Neighbor Embedding;
PCA − Principal Component Analysis.
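To make the idea of projecting high-dimensional samples onto two dimensions concrete, here is a minimal sketch of PCA, the simplest of the three methods, in plain Python. It is not the PandaOmics implementation: the expression values are synthetic toy data with 4 genes instead of 20,000, the sample names are invented, and the helper names (center, covariance, top_component, pca_2d) are hypothetical. The principal components are approximated by power iteration, which is one of several ways to compute them.

```python
import random


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def center(rows):
    """Subtract the per-gene mean so every gene has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Gene-by-gene sample covariance matrix of centered data."""
    n = len(centered)
    d = len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def top_component(cov, iterations=500, seed=0):
    """Approximate the leading eigenvector of cov by power iteration."""
    rng = random.Random(seed)
    d = len(cov)
    v = [rng.random() for _ in range(d)]
    for _ in range(iterations):
        w = [dot(cov[i], v) for i in range(d)]
        norm = dot(w, w) ** 0.5
        if norm == 0:
            break
        v = [x / norm for x in w]
    eigenvalue = dot(v, [dot(cov[i], v) for i in range(d)])
    return v, eigenvalue


def pca_2d(rows):
    """Project each sample onto the first two principal components."""
    centered = center(rows)
    cov = covariance(centered)
    pc1, ev1 = top_component(cov, seed=0)
    d = len(cov)
    # Deflation: remove the variance explained by PC1, then repeat.
    deflated = [
        [cov[i][j] - ev1 * pc1[i] * pc1[j] for j in range(d)]
        for i in range(d)
    ]
    pc2, _ = top_component(deflated, seed=1)
    return [(dot(r, pc1), dot(r, pc2)) for r in centered]


# Synthetic toy data (assumption): 6 samples x 4 genes, two groups.
samples = {
    "group_a_1": [9.1, 8.7, 1.2, 1.0],
    "group_a_2": [8.8, 9.0, 0.9, 1.3],
    "group_a_3": [9.3, 8.9, 1.1, 0.8],
    "group_b_1": [1.0, 1.2, 8.9, 9.2],
    "group_b_2": [1.3, 0.9, 9.1, 8.8],
    "group_b_3": [0.8, 1.1, 9.0, 9.1],
}

coords = pca_2d(list(samples.values()))
for name, (x, y) in zip(samples, coords):
    print(f"{name}: x={x:.2f}, y={y:.2f}")
```

Each sample ends up as an (x, y) pair that can be drawn as a point on the chart; samples with similar expression profiles land close to each other. Building a full gene-by-gene covariance matrix like this would not scale to 20,000 genes, so the sketch is only meant to illustrate the principle.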
UMAP is the default visualization method, but you can switch to another one at any time. You can also overlay metadata (sample annotations) on the chart to explore how characteristics are distributed across the samples and to compare them with the clustering suggestions.