PandaClaw

Welcome to the official user manual for PandaClaw, your autonomous AI partner in biological science.

Built on an advanced agent harness framework, PandaClaw is designed to streamline and scale complex multi-omics analyses by acting as an intelligent orchestration layer.
This guide will walk you through the platform’s capabilities, how to navigate the research workflow, and how to leverage our expansive ecosystem of tools and skills.

Introduction to PandaClaw

PandaClaw bridges native data environments with highly specialized bioinformatic tools to execute end-to-end biological data analyses in real time. Rather than relying on static, manual pipelines, PandaClaw dynamically interprets your prompts to plan, execute, critique, and report on multi-step workflows autonomously.

Core Advantages

Seamless Data Interoperability

PandaClaw eliminates data silos through comprehensive integration. It has native access to the PandaOmics platform, internal data warehouses, external biological databases, and proprietary user data, so the agent can autonomously aggregate and cross-reference multi-omics datasets and clinical information for high-fidelity analysis.

Autonomous Execution via LLM Agent Architecture

Built on an advanced agent harness framework, PandaClaw moves beyond traditional, static bioinformatics pipelines. Given a research objective, the agent formulates a multi-step analytical workflow covering intelligent routing, planning, execution, critique, and report generation. It then selects the most appropriate standardized bioinformatics tools and runs the analysis in real time.
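
The plan, execute, critique, and report sequence described above can be pictured with a minimal Python sketch. This is an illustration only, not PandaClaw's implementation: the names `plan`, `critique`, and `run` are hypothetical, the planner returns a fixed toy plan, and the routing stage is left out.

```python
from typing import Callable

# Hypothetical type: a step is a (name, function) pair. The function receives
# the running context dict and returns a value stored under the step's name.
Step = tuple[str, Callable[[dict], object]]


def plan(objective: str) -> list[Step]:
    """Toy planner: returns the same two-step plan for any objective."""
    return [
        ("load", lambda ctx: [2.0, 4.0, 6.0]),
        ("summarize", lambda ctx: sum(ctx["load"]) / len(ctx["load"])),
    ]


def critique(name: str, output: object) -> bool:
    """Toy critic: accept any output that is neither None nor an empty list."""
    return output is not None and output != []


def run(objective: str) -> dict:
    """Plan the work, execute each step, critique it, and assemble a report."""
    context: dict = {}
    log: list[str] = []
    for name, func in plan(objective):
        output = func(context)
        if not critique(name, output):
            log.append(f"{name}: rejected")
            break  # stop at the first step the critic rejects
        context[name] = output
        log.append(f"{name}: ok")
    return {"objective": objective, "results": context, "log": log}


report = run("Summarize example values")
print(report["log"])
```

The point of the structure is that each step's output is checked before the next step depends on it, so a bad intermediate result is caught early rather than surfacing in the final report.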

Expansive Bioinformatics Toolkit and Skill Ecosystem

To execute highly specialized workflows, PandaClaw leverages a vast, integrated repository of computational resources:
- Deep PandaOmics Integration: PandaClaw leverages PandaOmics’ tools and proprietary data to support target discovery and indication expansion. By enabling real-time querying of in-house datasets, it facilitates high-fidelity, data-driven drug target analysis within a single, unified ecosystem.
- Comprehensive Tool Library: The system features multiple curated native tools (including direct integrations with PandaOmics, public databases, and internal data warehouses) and provides seamless access to ToolUniverse (https://github.com/mims-harvard/ToolUniverse). This extends the platform’s capabilities to over 1,000 bioinformatics tools, ensuring researchers have the right instrument and the optimal resources for every stage of analysis.
- Advanced Skill Architecture: Powered by over 140 specialized scientific skills, this architecture enables the execution of complex biological workflows tailored for drug discovery. One key feature is TargetClaw, which integrates the latest PandaOmics TargetPro model for target identification. Additionally, analytical modules such as Gene Signatures, Gene Ontology (GO) and pathway enrichment, Gene Set Enrichment Analysis (GSEA), and Multiple Group Comparisons operate as in-house, production-ready capabilities that scale to support robust multi-omics analysis.

Intelligent Error-Handling and Context Management

The agent is designed for computational resilience. If a specific bioinformatics tool encounters formatting issues or anomalous data, PandaClaw autonomously diagnoses the error, adjusts parameters, and self-corrects without requiring manual intervention. Furthermore, it retains session memory, enabling researchers to conduct iterative, multi-turn analyses within a continuous contextual workspace.
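
As a rough illustration of the diagnose, adjust, and retry idea, the sketch below parses a delimited table and falls back to a different delimiter when the first attempt fails to split the columns. It uses only the Python standard library; the function names and the fallback order are assumptions made for this example and do not describe how PandaClaw handles errors internally.

```python
import csv
import io


def parse_table(text: str, delimiter: str) -> list[dict]:
    """Parse delimited text; raise if the rows did not split into columns."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
    if not rows or len(rows[0]) < 2:
        raise ValueError(f"could not split columns with {delimiter!r}")
    return rows


def parse_with_recovery(text: str, delimiters=(",", "\t", ";")):
    """Try each delimiter in turn, recording every failed attempt."""
    attempts = []
    for delimiter in delimiters:
        try:
            return parse_table(text, delimiter), attempts
        except ValueError as err:
            attempts.append(str(err))  # keep the diagnosis for later auditing
    raise ValueError("all delimiters failed: " + "; ".join(attempts))


# Made-up tab-separated input that a comma-based parser would misread.
tsv_text = "gene\tlogFC\nEGFR\t1.8\nVEGFA\t-0.6\n"
rows, attempts = parse_with_recovery(tsv_text)
print(rows[0]["gene"], len(attempts))
```

Keeping the list of failed attempts alongside the result means the recovery is visible afterwards instead of being silently absorbed.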

Data Provenance and Reproducible Reporting

To ensure the highest standards of transparency and security, all analytical workflows are performed within an isolated local execution sandbox. PandaClaw automatically synthesizes findings into comprehensive, figure-rich reports while maintaining strict data provenance. The system systematically archives all raw outputs, intermediate datasets, and generated visuals in a structured, navigable file system for immediate auditing and retrieval.

PandaClaw Skills

In PandaClaw, a "Skill" is a specialized analytical workflow designed to execute specific biological analyses. Each skill dictates the required data inputs, the computational tools utilized, and the format of the final outputs and report.

Skill;Description;Required data;How to use;Example prompts
TargetClaw (Beta);Synergize PandaOmics and TargetPro models for target discovery.;–;Enter a specific disease to generate an AI-prioritized portfolio of therapeutic targets, including prominent clinical-stage candidates, high-confidence validated targets, and novel targets.;Find therapeutic targets for idiopathic pulmonary fibrosis.
Target Evaluation;Assess druggability metrics and the competitive pharmacological landscape.;–;Enter a target gene or protein to assess its druggability metrics and competitive pharmacological landscape.;Perform a comprehensive target evaluation for EGFR, focusing on its druggability profile and competitive landscape.
Target-Disease Evaluation;Evaluate disease-specific targets and elucidate Mechanisms of Action.;–;Enter a target gene or protein alongside a specific disease to evaluate its therapeutic potential and elucidate its Mechanisms of Action.;Evaluate APOE as a therapeutic target for Alzheimer's disease and elucidate its Mechanism of Action.
Longevity Lobster;Identify dual-purpose anti-aging therapeutic targets.;–;Enter a specific disease to identify dual-purpose therapeutic targets exhibiting both disease-specific efficacy and anti-aging properties.;Find dual-purpose anti-aging targets for obesity. Evaluate GLP1R as a dual-purpose anti-aging target for obesity.
Gene Signatures Analysis;Find gene expression patterns across diseases.;Differentially Expressed Genes (DEG) data;Input a target gene to analyze its expression patterns across multiple diseases. Upload custom DEG data to inform the analysis and generate an automated, comprehensive report and box-plot visualizations.;Analyze EGFR expression patterns across the 10 cancers exhibiting the greatest absolute median logFC in case-control comparisons.
Multi-Group Comparison;Compare gene expression across multiple experimental cohorts.;expression.csv and metadata.csv;Specify a target gene, the desired clinical or phenotypic metadata variable for stratification, and the specific cohorts to execute a comparative multi-group expression analysis.;Compare VEGFA expression stratified by clinical cancer stage across multiple experimental cohorts.
Correlation Analysis;Evaluate correlations between discrete variables in stratified groups.;expression.csv and metadata.csv (optional for gene-to-gene correlations, but required when analyzing correlations between a gene and a metadata variable);Specify two target genes, or a gene and a phenotypic metadata variable, to execute a bivariate correlation analysis. Optionally, designate a stratification variable and specify if Pearson correlation is preferred over the default Spearman method.;Show the Pearson correlation between EGFR and VEGFA. Evaluate the correlation between SPP1 expression and patient age within stratified cancer cohorts.
Gene Ontology (GO) and Pathway Analysis;Identify enriched biological processes and signaling pathways.;–;Enter a target gene panel or specific query to evaluate Gene Ontology (GO) enrichment, biological networks, and complex pathway interactions.;Identify enriched biological processes and signaling pathways for this gene panel: TP53, BRCA1, BRCA2, KRAS, and EGFR.
;;;Upload DEG data to execute GO and pathway enrichment queries.;Conduct GO analysis for genes based on uploaded DEG data.
Gene Set Enrichment Analysis (GSEA);Identify significant pathway enrichment from DEG datasets.;DEG data;Define your GSEA parameters, or provide a direct prompt to initiate the pathway enrichment evaluation.;Execute GSEA on the provided DEG dataset to identify significant pathway enrichment.
Single Cell Signature;Plot gene expression across cell types between cases and controls at single-cell level.;;Specify disease name or dataset identifier and target genes to generate a comparative split-column dot plot displaying expression profiles between case and control cohorts across discrete cell populations.;Generate a single-cell dot plot depicting expression patterns of S100A8, S100A9, IL1B, TNFAIP3, JAK3, and IRF1 in rheumatoid arthritis.
CADD Structural Review;Review a target's structural and ligand data to guide in silico drug design.;;Enter a single target gene or protein for an in-depth CADD (Computer-Aided Drug Design) structural review: crystal-structure landscape, druggable-pocket analysis versus paralogs, pocket druggability with 3D and 2D visualizations, and a top-3 structure-based drug design (SBDD) strategy.;CADD structural review of NSD2.
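
The Correlation Analysis skill defaults to Spearman correlation and offers Pearson as an option. The standard-library sketch below shows why the choice matters: Spearman is Pearson correlation computed on ranks, so it measures monotonic rather than strictly linear association. The expression values are made up for illustration, and the code is not PandaClaw's implementation.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """Average ranks (1-based), so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Made-up values for two genes across five samples (monotonic, not linear).
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [1.0, 4.0, 9.0, 16.0, 25.0]
rho = spearman(gene_a, gene_b)
r = pearson(gene_a, gene_b)
print(round(rho, 3), round(r, 3))
```

Here the relationship is perfectly monotonic, so Spearman reports 1.0 while Pearson is slightly lower because the relationship is curved.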

Your Research Workflow

The entire analytical process is driven through a single chat interface.

Step-by-Step Guide

Select a Skill (Optional but Recommended)

Before typing your query, select a specific "Skill" from the interface. A skill focuses PandaClaw on a specific domain (e.g., Target Evaluation, Gene Signatures).
Note: If you do not select a skill, PandaClaw’s intelligent routing will automatically assign the most appropriate one based on your natural language prompt.

Upload Data (If Required)

Certain analytical skills require you to provide data files. Use the attachment tool in the chat box to upload your specific multi-omics datasets or other data modalities, such as clinical files.

Ask Your Research Question

Type your research objective directly into the chat box.

Let PandaClaw Work

Click Send. PandaClaw will autonomously formulate a multi-step analytical workflow, select the required standardized tools, run the analysis, and generate your final report.

Your research doesn't have to stop once the final report is generated. With PandaClaw’s contextual memory, you can continue the conversation directly in the chat box. Feel free to ask follow-up questions, adjust analysis parameters, request deeper insights, or pivot your research strategy based on the initial findings.

Natural Language Prompts

Simple conversational commands instructing the agent on what you want to achieve.

Biological Entities

Text inputs such as a specific target protein name, a gene list, or a specific disease context.

Data Files

Raw or normalized multi-omics data and other data modalities. Supported formats include, but are not limited to, CSV, TSV, Excel, JSON, BED, VCF, GFF/GTF, PDF, and plain text. These cover expression matrices, clinical metadata, patient cohort information, genomic annotations, and more.
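
Before uploading an expression matrix together with clinical metadata, it can help to confirm that both files describe the same samples. The sketch below assumes a layout this manual does not specify: an expression CSV with genes as rows and sample IDs as the header columns after the first, and a metadata CSV with one row per sample and a `sample_id` column. Adjust both assumptions to match your own files.

```python
import csv
import io


def sample_ids_from_expression(text: str) -> list[str]:
    """Assumed layout: sample IDs are the header columns after the first."""
    header = next(csv.reader(io.StringIO(text)))
    return header[1:]


def sample_ids_from_metadata(text: str, id_column: str = "sample_id") -> list[str]:
    """Assumed layout: one row per sample, IDs stored in `id_column`."""
    return [row[id_column] for row in csv.DictReader(io.StringIO(text))]


def compare_samples(expression_text: str, metadata_text: str) -> dict:
    """Report which sample IDs are shared and which appear in only one file."""
    expr = set(sample_ids_from_expression(expression_text))
    meta = set(sample_ids_from_metadata(metadata_text))
    return {
        "shared": sorted(expr & meta),
        "missing_metadata": sorted(expr - meta),
        "missing_expression": sorted(meta - expr),
    }


# Made-up file contents for illustration.
expression_csv = "gene,S1,S2,S3\nVEGFA,5.1,6.3,4.8\n"
metadata_csv = "sample_id,stage\nS1,I\nS2,II\nS4,III\n"
check = compare_samples(expression_csv, metadata_csv)
print(check)
```

In this example S3 has expression values but no metadata and S4 has metadata but no expression values, so only S1 and S2 could be used in a stratified comparison.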

Dynamic Planning

The agent evaluates your prompt and your uploaded data, mapping out the precise sequence of statistical and biological steps needed to answer your question.

Skill Execution

PandaClaw securely routes your data through our library of external and in-house skills and tools, which comprises over 140 specialized scientific skills and more than 1,000 bioinformatics tools.

Autonomous Quality Control

As the data is processed within an isolated, secure execution sandbox, the agent autonomously standardizes formats, diagnoses data anomalies, and adjusts algorithmic parameters to ensure the analysis completes successfully.

Comprehensive Synthesized Reports

Highly contextualized, figure-rich summaries generated directly in the user interface, linking the analysis results to concrete hypotheses and literature-backed evidence.

Publication-Ready Figures

Figures include box plots, volcano plots, heatmaps, functional enrichment network graphs, and structural protein maps.

Structured Datasets & File Archives

Intermediate results and organized data tables (e.g., statistical variances, gene lists) are automatically saved to a structured, easily searchable file system.

Auditing and Retrieval

You can navigate the file system to verify the underlying data, audit the agent's exact workflow steps, and download files for external presentations or further local review.
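
Once files have been downloaded for local review, a checksum manifest is one simple way to record exactly which outputs were audited. The sketch below is a generic standard-library example and not a PandaClaw feature; the folder and file names are hypothetical placeholders, since this manual does not describe the archive layout.

```python
import hashlib
import tempfile
from pathlib import Path


def build_manifest(root: str) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    base = Path(root)
    manifest = {}
    for path in sorted(base.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(base).as_posix()] = digest
    return manifest


# Demonstration on a temporary folder with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "figures").mkdir()
    (Path(tmp) / "figures" / "boxplot.txt").write_text("placeholder")
    (Path(tmp) / "report.txt").write_text("summary")
    manifest = build_manifest(tmp)

print(sorted(manifest))
```

Re-running the function later and comparing digests shows whether any downloaded file has changed since the audit.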

Security & Data Privacy

Bioinformatics research involves highly sensitive, proprietary, and often unpublished data. PandaClaw is built from the ground up to ensure your intellectual property remains secure at all times.
