Prioritization Scores
PandaOmics employs a sophisticated scoring system to prioritize therapeutic targets based on diverse data sources, including multimodal omics datasets, text-based evidence, financial metrics, and expert insights. High score values indicate a stronger association between a gene and the disease of interest, greater druggability, or higher commercial potential, depending on the specific score. Users can view the ranked gene list in the Target ID interface, presented as a dashboard with a heatmap and customizable filters.
Score Categories
PandaOmics organizes its prioritization scores into distinct categories, each leveraging different data types and/or methodologies to evaluate potential therapeutic targets:
  • Omics scores assess the biological relevance of a gene to a specific disease using multi-omics data, such as gene expression, proteomics, and methylation profiles. These scores are derived from bioinformatics analyses and ML models that integrate data from public datasets and user-uploaded datasets (the latter accessible only to the uploader).
  • Text-based scores evaluate the relevance of a gene based on its mentions in scientific literature, patents, grants, and clinical trials. These scores leverage AI-based text mining and natural language processing to quantify attention and trends.
  • Financial scores estimate the commercial viability of a target based on funding trends extracted from grant applications.
  • KOL (Key Opinion Leaders) scores gauge a gene's prominence in high-impact research based on expert publications and journal influence.
  • LLM (Large Language Models) scores leverage AI to analyze diverse biomedical text, including scientific literature, patents, and clinical trial reports, to evaluate a target's disease relevance, market potential, and druggability.
Configuring and Using Scores
By default, PandaOmics ranks genes by averaging a set of five core scores optimized for exploring well-studied targets with robust evidence. For researchers investigating less-studied or novel targets, enabling additional scores beyond the core set allows for more refined gene rankings and deeper insights.
Users can also select predefined presets like "Trending Genes" or "Novel Targets (Small Molecules)" to toggle specific combinations of scores and druggability filters tailored to their research focus.
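The default ranking described above, an average over a set of enabled scores, can be sketched as follows. This is a minimal illustration and not the PandaOmics implementation: the score names, gene names, and values are hypothetical, and skipping scores that are missing for a gene is an assumption.

```python
from statistics import mean

# Hypothetical score names; the real core scores are not listed in this section.
CORE_SCORES = ["score_a", "score_b", "score_c", "score_d", "score_e"]


def rank_genes(scores_by_gene, enabled_scores):
    """Rank genes by the mean of their enabled scores, highest first.

    Scores missing for a gene are skipped (an assumption made for this sketch).
    """
    ranked = []
    for gene, scores in scores_by_gene.items():
        values = [scores[name] for name in enabled_scores if name in scores]
        if values:
            ranked.append((gene, mean(values)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical genes and score values in the 0..1 range.
scores_by_gene = {
    "GENE1": {"score_a": 0.9, "score_b": 0.8, "score_c": 0.7, "score_d": 0.6, "score_e": 0.5},
    "GENE2": {"score_a": 0.2, "score_b": 0.4, "score_c": 0.6, "score_d": 0.8, "score_e": 1.0},
    "GENE3": {"score_a": 0.1, "score_b": 0.1, "score_c": 0.1, "score_d": 0.1, "score_e": 0.1},
}

ranking = rank_genes(scores_by_gene, CORE_SCORES)
for gene, avg in ranking:
    print(f"{gene}: {avg:.2f}")
```

Passing a different list as `enabled_scores` mimics toggling scores on or off: the same genes are re-ranked using only the selected columns.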

Practical Tips

  • For initial exploration, use the default core scores to identify well-validated targets with strong evidence across categories.
  • For drug repurposing, explore the Indication Prioritization tab to identify diseases where a target may have high clinical success potential.
  • Include scores from the Omics category for emerging or less-studied targets.
  • Apply druggability filters to align gene rankings with project goals.
  • Use the Knowledge Graph and ChatPandaGPT to dive deeper into the evidence behind top-ranked targets, ensuring robust hypothesis validation.
OMICs
Omics score values represent the probability that a given evidence group indicates an association between a given gene and a given disease. All scores range from 0 to 1, where 0 indicates no evidence and 1 indicates the highest degree of evidence.
The Network Neighbors score applies several graph-based methods to the protein-protein interaction network enriched with differentially expressed or differentially methylated genes. A target scores higher when more of its network neighbors show significant differences in expression or methylation levels.
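To make the intuition behind the Network Neighbors score concrete, the sketch below computes one simple neighbor-based quantity: the fraction of a gene's direct interaction partners that are significantly changed. This is a stand-in chosen for illustration only. The actual score combines several graph-based methods that are not specified here, and the network, gene names, and significant set are hypothetical.

```python
def neighbor_fraction(ppi, gene, significant):
    """Return the fraction of a gene's direct neighbors found in `significant`.

    ppi: dict mapping a gene to the set of its interaction partners.
    significant: set of genes with significant expression or methylation changes.
    Genes with no known neighbors get 0.0 (an assumption made for this sketch).
    """
    neighbors = ppi.get(gene, set())
    if not neighbors:
        return 0.0
    return len(neighbors & significant) / len(neighbors)


# Hypothetical protein-protein interaction network.
ppi = {
    "TARGET1": {"A", "B", "C", "D"},
    "TARGET2": {"A", "E"},
    "TARGET3": set(),
}
# Hypothetical set of differentially expressed or methylated genes.
significant = {"A", "B", "C", "X"}

for target in ppi:
    print(target, neighbor_fraction(ppi, target, significant))
```

The result stays in the 0 to 1 range used by the Omics scores, and a target surrounded by more significantly changed neighbors receives a higher value.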
Text-based (NLP)
This group of scores is based on the analysis of text sources, including scientific publications, grants, patents, and clinical trials. Insilico monitors biomedical text data and uses advanced NLP and AI-based technology to analyze contextualized mentions of entities such as genes, diseases, drugs, and KOLs across a variety of data sources. The scores are calculated for each gene both in a disease-agnostic manner (total mentions of the gene) and in the context of the disease areas of interest. Attention spikes are calculated for target-disease associations only; they have no disease-agnostic counterpart.
The Attention score measures the overall attention a target has received over time. PandaOmics calculates the total number of mentions of a gene in various texts across all time periods, counting both disease-agnostic and disease-specific mentions. The text corpus used for analysis includes scientific publications, grants, patents, and clinical trials.
Financial scores
This score reflects the total grant funding for investigating the given gene across all time periods. Note that the distribution of total funding is right-skewed: the mean is approximately $8.9 million, while the median is only $1.7 million, and 5% of entries have funding above $30 million.
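The gap between the mean and the median is typical of right-skewed data, where a few very large values pull the mean upward. The sketch below shows the effect on a small set of hypothetical funding amounts (not PandaOmics data). It also includes a percentile rank as one possible way to compare values on a skewed scale. The percentile rank is an assumption added for illustration and is not described in the source.

```python
from statistics import mean, median

# Hypothetical funding totals in millions of dollars; one large outlier.
funding = [0.5, 1.0, 1.5, 2.0, 40.0]


def percentile_rank(values, x):
    """Fraction of values that are less than or equal to x (0..1)."""
    return sum(1 for v in values if v <= x) / len(values)


funding_mean = mean(funding)      # pulled upward by the 40.0 outlier
funding_median = median(funding)  # unaffected by how large the outlier is

print(f"mean = {funding_mean}, median = {funding_median}")
print(f"percentile rank of 2.0: {percentile_rank(funding, 2.0)}")
```

Here a gene with 2.0 million in funding sits far below the mean yet ranks above most entries, which is why raw funding totals are best read against the median or a rank rather than the mean.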
Key Opinion Leaders (KOL)
This score measures the average Hirsch index (h-index) of the researchers who have published scientific articles on the given gene-disease association.
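For reference, a researcher's h-index is the largest number h such that h of their papers have at least h citations each. The sketch below computes it from per-paper citation counts and then averages it across researchers. The researcher names and citation counts are hypothetical, and how PandaOmics selects researchers or scales the resulting average is not covered here.

```python
from statistics import mean


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    h = 0
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical researchers with per-paper citation counts.
researchers = {
    "researcher_1": [10, 8, 5, 4, 3],
    "researcher_2": [25, 8, 5, 3, 3],
}

h_values = {name: h_index(cites) for name, cites in researchers.items()}
average_h = mean(h_values.values())

print(h_values)
print(f"average h-index: {average_h}")
```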
LLM
This score measures the evidence supporting the target's role in disease mechanisms, considering experimental validation and robust scientific data. Higher scores reflect more solid validation and fewer uncertainties.