What is Perspective AI?
Perspective AI is a production-grade agentic AI system that performs critical discourse analysis on any input text—news articles, legal documents, financial reports, or social media content. It automates the kind of deep critical reading that typically requires years of training in linguistics and media studies, exposing how language choices shape perception, hide responsibility, and privilege certain viewpoints over others.
Core Capabilities
Presupposition Detection
Identifies hidden assumptions smuggled through word choice—like "restore" implying something was lost, or "stopped polluting" admitting guilt without stating it.
Power Hierarchy Mapping
Analyzes who gets to be the subject (doer) versus object (done-to), active versus passive voice, and modal verbs to reveal implicit hierarchies.
Strategic Omission Analysis
Detects what is NOT said: the WHO, WHY, and HOW that have been strategically removed to shape interpretation.
Contradiction Detection
Cross-references statements across time via vector memory to surface internal or historical self-contradictions from the same source.
Alternative Framing Generation
Produces alternative framings using the same facts but different linguistic choices, showing how else the story could have been told.
Multidimensional Scoring
Provides a composite "Bullshit Score" across Factual Accuracy, Omission Severity, Manipulation Intensity, and Contradiction Index.
System Architecture
LangGraph Execution Flow
State Initialization
LangGraph creates a TypedDict state object containing input text and empty result containers. This state persists across all agent invocations.
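A minimal sketch of what such a state object could look like. The field names (`framing`, `bias`, `counter_narrative`, `final_report`) are assumptions chosen to mirror the three specialist agents and the synthesis node; the source does not give the actual schema.

```python
from typing import Any, TypedDict


class AnalysisState(TypedDict):
    """Shared state passed between nodes (field names are illustrative)."""

    input_text: str
    framing: dict[str, Any]            # filled by the Framing Detection agent
    bias: dict[str, Any]               # filled by the Bias Excavation agent
    counter_narrative: dict[str, Any]  # filled by the Counter-Narrative agent
    final_report: dict[str, Any]       # filled by the synthesis node


def init_state(text: str) -> AnalysisState:
    """Build the initial state: the input text plus empty result containers."""
    return AnalysisState(
        input_text=text,
        framing={},
        bias={},
        counter_narrative={},
        final_report={},
    )


state = init_state("The policy will restore order.")
```

Giving each agent its own key is one way to let agents write to the shared state without overwriting each other's results.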
Router Decision
Conditional edge logic examines input length and complexity, routing to parallel agent execution or sequential processing based on document characteristics.
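The routing decision can be sketched as a plain function that returns the name of the next path. The thresholds and the way complexity is approximated (word and sentence counts) are hypothetical; the source only says that length and complexity are examined.

```python
from typing import Literal

# Hypothetical cut-offs; the source does not state the real values.
LONG_DOC_WORDS = 1500
LONG_DOC_SENTENCES = 60


def route(text: str) -> Literal["parallel", "sequential"]:
    """Pick an execution path from rough length and complexity signals."""
    word_count = len(text.split())
    # Crude sentence estimate: count terminal punctuation marks.
    sentence_count = sum(text.count(mark) for mark in ".!?")
    if word_count >= LONG_DOC_WORDS or sentence_count >= LONG_DOC_SENTENCES:
        return "parallel"
    return "sequential"
```

In LangGraph, a function of this shape is what would typically be handed to `add_conditional_edges`, with the returned string mapped to the next node.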
Agent Execution
Each agent (implemented as LangChain AgentExecutor) runs with access to specialized tools. Agents update their portion of the shared state independently.
Synthesis & Scoring
The synthesis node receives the populated state, applies the 10-lens framework, generates the Bullshit Score composite rating, and produces the final structured output.
Error Handling & Retry
LangGraph's built-in retry logic handles LLM timeouts and agent failures. State checkpointing enables resumption from the last successful node.
LangChain & LangGraph Orchestration
Perspective AI uses LangChain's AgentExecutor framework and LangGraph's state machine for multi-agent coordination. This architecture enables parallel agent execution, shared state management, and conditional routing based on analysis results.
LangChain AgentExecutor
Purpose: Each specialist agent (Framing Detection, Bias Excavation, Counter-Narrative) is implemented as a LangChain AgentExecutor.
Tools per agent: spaCy parser, collocation analyzer, presupposition detector, voice analyzer, LLM rewriter, fact checker—all registered as LangChain Tools.
LangGraph State Machine
Purpose: Manages shared analysis state across agents using a TypedDict schema containing input text, intermediate results, and final outputs.
Conditional edges: Route based on document complexity, enabling parallel execution for long documents and sequential execution for short ones.
State Persistence
Mechanism: LangGraph checkpoints state after each node execution, enabling resume-from-failure and audit trail reconstruction.
Storage: PostgreSQL backend stores serialized state for session recovery and historical analysis comparison.
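The checkpoint-and-resume behaviour can be illustrated without LangGraph or PostgreSQL. The sketch below is a stand-in, not the actual mechanism: `InMemoryCheckpointStore` and `run_pipeline` are hypothetical names, and a dictionary of JSON strings plays the role of the database table.

```python
import json
from typing import Callable

State = dict
Node = Callable[[State], State]


class InMemoryCheckpointStore:
    """Stand-in for the PostgreSQL backend: session id -> serialized state."""

    def __init__(self) -> None:
        self._rows: dict[str, str] = {}

    def save(self, session_id: str, next_index: int, state: State) -> None:
        # Store the state together with the index of the next node to run.
        self._rows[session_id] = json.dumps({"next": next_index, "state": state})

    def load(self, session_id: str) -> tuple[int, State]:
        row = self._rows.get(session_id)
        if row is None:
            return 0, {}
        data = json.loads(row)
        return data["next"], data["state"]


def run_pipeline(
    session_id: str,
    nodes: list[Node],
    store: InMemoryCheckpointStore,
    initial: State,
) -> State:
    """Run nodes in order, checkpointing after each one; resume if possible."""
    next_index, saved = store.load(session_id)
    state = saved if next_index > 0 else dict(initial)
    for i in range(next_index, len(nodes)):
        state = nodes[i](state)  # may raise; earlier checkpoints survive
        store.save(session_id, i + 1, state)
    return state
```

Calling `run_pipeline` again with the same session id after a failure skips every node that already completed, which is the property that keeps expensive LLM calls from being repeated.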
🎯 Why This Architecture Matters
LangGraph's state machine pattern solves the core orchestration challenge: how do you coordinate multiple AI agents with different specializations while maintaining a coherent analytical narrative?
Traditional sequential pipelines would bottleneck on the slowest agent. Naive parallelism loses cross-agent context. LangGraph's shared state + conditional routing gives you both: parallel execution where safe, sequential where dependencies exist, with full state visibility across all agents.
Production benefit: Failed agent executions don't restart the entire pipeline—checkpoint recovery means you only re-run from the failed node, critical for cost management with expensive LLM calls.
Technology Stack
AI & Machine Learning
Backend & Infrastructure
Frontend & User Experience
The Ten Analytical Lenses
Each lens targets a specific type of narrative distortion. Not all lenses fire on every input—the system reports which activate, with intensity scoring and textual evidence.
Output Structure
For every analyzed text, Perspective AI produces a structured five-layer output:
1. Fact Extraction
Isolates verifiable claims from opinion, separating what can be fact-checked from interpretive framing.
2. Bias Deconstruction
Reports which analytical lenses fired, with intensity scores (0-10) and supporting textual evidence citations.
3. Contradiction Detection
Cross-references current text against historical statements from the same source via Qdrant vector search with temporal filtering.
4. Alternative Narrative
Generates a neutral rewrite using the same facts but different linguistic framing choices, demonstrating how the story could be told without manipulation.
5. Bullshit Score (0-10)
Composite rating across four dimensions:
- Factual Accuracy — ratio of verifiable to stated claims
- Omission Severity — amount of critical missing context
- Manipulation Intensity — number and strength of lens activations
- Contradiction Index — degree of self-contradiction
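One possible way to combine the four dimensions into a single 0-10 rating is sketched below. The source does not give the formula, so everything here is an assumption: each dimension is scored 0-10, the weights are equal, and Factual Accuracy is inverted because high accuracy should lower the score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimensionScores:
    factual_accuracy: float        # 0-10, higher is better
    omission_severity: float       # 0-10, higher is worse
    manipulation_intensity: float  # 0-10, higher is worse
    contradiction_index: float     # 0-10, higher is worse


def bullshit_score(scores: DimensionScores) -> float:
    """Equal-weight mean of the four dimensions (an assumed formula)."""
    parts = [
        10.0 - scores.factual_accuracy,  # invert: accurate text scores low
        scores.omission_severity,
        scores.manipulation_intensity,
        scores.contradiction_index,
    ]
    for value in parts:
        if not 0.0 <= value <= 10.0:
            raise ValueError("each dimension must lie in the range 0-10")
    return round(sum(parts) / len(parts), 1)


example = bullshit_score(DimensionScores(8.0, 4.0, 6.0, 2.0))  # (2+4+6+2)/4 = 3.5
```

A weighted mean would be an equally plausible design if some dimensions should count for more than others.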
Version 2: Classical ML Enhancements
Perspective AI v1 is LLM orchestration with symbolic NLP. Version 2 adds supervised machine learning at strategic chokepoints to improve accuracy, reduce latency, and enable active learning.
Hybrid Architecture: LLMs + Classical ML
The following ML models would complement (not replace) the existing LLM-based analysis, creating a two-tier system: fast ML classifiers for filtering and scoring, deep LLM reasoning for nuanced analysis.
1. Lens Classification Model
Purpose: Pre-filter which of the 10 analytical lenses are likely to fire before running full LLM analysis.
Architecture: XGBoost multi-label classifier
Features: TF-IDF vectors, linguistic features (passive voice %, modal verb counts), embedding cluster assignments
Training data: Labeled corpus of 5,000+ analyzed articles with lens activation ground truth
Value: Reduces LLM calls by 40% by skipping lenses with <0.3 probability
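The skip rule amounts to a threshold filter over per-lens probabilities. In this sketch the probabilities are passed in as a plain dictionary rather than produced by a trained XGBoost model, and the lens names are illustrative, since the ten lenses are not listed here.

```python
SKIP_BELOW = 0.3  # cut-off taken from the text


def lenses_to_run(
    probabilities: dict[str, float], threshold: float = SKIP_BELOW
) -> list[str]:
    """Return the lenses worth a full LLM pass, skipping those below threshold."""
    return sorted(name for name, p in probabilities.items() if p >= threshold)


# Illustrative classifier output for one document.
predicted = {
    "presupposition": 0.82,
    "strategic_omission": 0.30,
    "power_hierarchy": 0.12,
}
selected = lenses_to_run(predicted)
```

Here `power_hierarchy` falls below 0.3 and is skipped, so only two of the three lenses would trigger LLM calls.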
2. Contradiction Scorer
Purpose: Fine-tuned semantic similarity model specifically for contradiction detection.
Architecture: Fine-tuned sentence-transformers cross-encoder
Base model: all-MiniLM-L6-v2 fine-tuned on SNLI + MultiNLI + custom political contradiction dataset
Output: 0-1 contradiction probability score, replacing basic cosine similarity
Value: Qdrant returns candidates; ML model provides precise contradiction scoring
3. Framing Detection Classifier
Purpose: Automated detection of discourse frames (crisis/opportunity, security/humanitarian, etc.)
Architecture: BERT-based sequence classifier
Classes: Multi-class across 15+ common political/media frames
Training: Manually labeled corpus of news articles across diverse sources
Value: Replaces manual collocation analysis with learned frame patterns
4. Source Credibility Model
Purpose: Learn credibility patterns from historical analysis to weight contradictions by source reliability.
Architecture: LightGBM gradient boosting
Features: Publication history, fact-check record, correction frequency, lens activation patterns, average BS scores
Online learning: Model updates incrementally as new analyses complete
Value: Prioritize contradictions from historically reliable sources
5. Active Learning Pipeline
Purpose: Intelligently route documents to human review vs. auto-analysis based on model uncertainty.
Strategy: Uncertainty sampling with ensemble disagreement
Trigger: When Claude Sonnet 4 and GPT-4 Turbo outputs diverge significantly (more than 30% disagreement on lens activations), flag the document for a human analyst
Value: Efficient allocation of human expertise; focus on edge cases
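A small sketch of the review trigger. It assumes that divergence is measured as one minus the Jaccard overlap of the lenses each model activated, and that the 30% figure is a disagreement threshold; the source does not define the metric, so both are labeled assumptions.

```python
def lens_disagreement(lenses_a: set[str], lenses_b: set[str]) -> float:
    """1 - Jaccard overlap of the two activated-lens sets (assumed metric)."""
    union = lenses_a | lenses_b
    if not union:
        return 0.0  # neither model fired any lens: nothing to disagree on
    return 1.0 - len(lenses_a & lenses_b) / len(union)


def needs_human_review(
    lenses_a: set[str], lenses_b: set[str], threshold: float = 0.30
) -> bool:
    """Flag the document when the models disagree by more than the threshold."""
    return lens_disagreement(lenses_a, lenses_b) > threshold


# Illustrative lens names; one model fires an extra lens.
model_a = {"presupposition", "strategic_omission", "power_hierarchy"}
model_b = {"presupposition", "strategic_omission"}
flagged = needs_human_review(model_a, model_b)  # disagreement is 1/3
```

Documents that are not flagged proceed through auto-analysis, so analyst time is spent only where the two models genuinely part ways.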
🎯 Why Hybrid Architecture > Pure LLM
Cost efficiency: LLM calls for 10-lens analysis on a 2,000-word article cost ~$0.15-0.20. ML pre-filtering reduces this to $0.08-0.10 by skipping irrelevant lenses, a saving of $0.07-0.10 per analysis. At 10,000 analyses/month, that is roughly $700-1,000 in monthly savings.
Latency: XGBoost inference is <50ms. Fine-tuned BERT is ~200ms. LLM calls are 3-8 seconds. ML models provide instant feedback for interactive use cases.
Accuracy: Task-specific fine-tuned models outperform general-purpose LLMs on narrow classification tasks. Contradiction detection via fine-tuned cross-encoder beats Claude/GPT-4 zero-shot by 12-15% F1 on labeled test sets.
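The cost-efficiency figures above can be checked with a few lines of arithmetic. Amounts are kept in whole cents to avoid floating-point rounding; the function name is illustrative, and the inputs are the per-analysis costs and monthly volume quoted in the text.

```python
ANALYSES_PER_MONTH = 10_000


def monthly_savings_usd(
    baseline_cents: int, filtered_cents: int, volume: int = ANALYSES_PER_MONTH
) -> float:
    """Monthly saving in dollars for a given per-analysis cost reduction."""
    return (baseline_cents - filtered_cents) * volume / 100


# Like-for-like pairing: low baseline with low filtered, high with high.
matched = (monthly_savings_usd(15, 8), monthly_savings_usd(20, 10))

# Widest possible range: smallest and largest per-analysis difference.
widest = (monthly_savings_usd(15, 10), monthly_savings_usd(20, 8))
```

Pairing like with like gives $700 to $1,000 per month; taking the extremes of both ranges widens that to $500 to $1,200.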
Interview Positioning: LLM Orchestration vs Classical ML
"Perspective AI v1 is LLM orchestration with symbolic NLP—the ML is in the embeddings and pre-trained spaCy models, not custom-trained classifiers. That's the right starting point because it lets you validate the product-market fit without ML engineering overhead.
Version 2 adds supervised ML at three strategic chokepoints: lens classification (XGBoost) to pre-filter analysis paths, contradiction scoring (fine-tuned sentence-transformers) to improve temporal detection accuracy, and framing detection (BERT classifier) to automate what's currently done via collocation analysis.
This creates a hybrid architecture where classical ML handles classification and scoring tasks, and LLMs handle reasoning and generation. That's the pattern you see in production AI systems—not pure LLM, not pure ML, but the right tool for each layer."
Dual Deployment Architecture
Perspective AI addresses the enterprise data sovereignty challenge with two deployment modes providing identical analytical capability under different trust boundaries.
Cloud-Native Deployment
Frontier model performance with managed infrastructure.
Stack
- Claude Sonnet 4 + GPT-4 Turbo (multi-LLM ensemble)
- Anthropic/OpenAI API endpoints
- AWS Lambda + Step Functions orchestration
- Managed Qdrant Cloud
- AWS RDS PostgreSQL with automated backups
Use Cases
- Media analysis and fact-checking
- Public discourse monitoring
- Non-sensitive document review
- Research and academic analysis
Air-Gapped Deployment
Complete data sovereignty with zero egress to external APIs.
Stack
- Ollama (Mistral 7B + Llama 3 8B)
- Local inference—no external API calls
- Docker Compose orchestration
- Self-hosted Qdrant container
- Local PostgreSQL instance with volume persistence
Use Cases
- Financial services compliance analysis
- Legal document review (attorney-client privileged)
- Healthcare/HIPAA-compliant analysis
- Government/classified document processing
🎯 Strategic Design Decision
The dual deployment mode directly addresses the "how do we use GenAI when our data cannot leave our boundary?" question that every client in financial services, legal, healthcare, and government asks.
Trade-off accepted: The on-premises path sacrifices frontier-model capability for data sovereignty. Mistral 7B and Llama 3 8B via Ollama are not Claude Sonnet 4. For high-stakes nuanced reasoning (complex legal arguments, subtle political rhetoric), that gap matters. For many enterprise workloads (extraction, classification, structured analysis, policy compliance checking), it does not.
The architectural answer: Same analytical framework, same agent orchestration via LangChain/LangGraph, same 10-lens methodology, same output structure—different inference endpoints. The decision is use-case specific, not a blanket "cloud vs on-prem" debate. This is the reference pattern for regulated industries.
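The "same framework, different inference endpoints" idea can be sketched as a small interface with two interchangeable implementations. All class and function names are hypothetical, the model strings are only labels, and both backends are stubs that echo the prompt instead of calling a hosted API or a local Ollama server.

```python
from typing import Protocol


class InferenceBackend(Protocol):
    """Anything that can turn a prompt into a completion."""

    def complete(self, prompt: str) -> str: ...


class CloudBackend:
    """Stub for the hosted-API path; a real version would call a provider SDK."""

    def __init__(self, model: str) -> None:
        self.model = model

    def complete(self, prompt: str) -> str:
        return f"[cloud:{self.model}] {prompt}"


class LocalBackend:
    """Stub for the air-gapped path; a real version would call local inference."""

    def __init__(self, model: str) -> None:
        self.model = model

    def complete(self, prompt: str) -> str:
        return f"[local:{self.model}] {prompt}"


def make_backend(mode: str) -> InferenceBackend:
    """Select the endpoint from a deployment-mode setting."""
    if mode == "cloud":
        return CloudBackend("claude-sonnet-4")
    if mode == "air_gapped":
        return LocalBackend("mistral-7b")
    raise ValueError(f"unknown deployment mode: {mode}")


def analyze(text: str, backend: InferenceBackend) -> dict[str, str]:
    """The analysis code is identical regardless of where inference runs."""
    prompt = f"Rewrite neutrally: {text}"
    return {"alternative_narrative": backend.complete(prompt)}
```

Because `analyze` depends only on the interface, switching deployment modes becomes a configuration change rather than a change to the analytical pipeline.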
Technical Differentiators
Temporal Contradiction Detection
Most analysis tools work on single documents. Perspective AI maintains a vector memory of prior statements by source, enabling cross-temporal contradiction detection—a politician's statement today vs. three years ago, automatically surfaced via Qdrant metadata filtering.
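The retrieval step can be illustrated with an in-memory stand-in for the vector store: filter prior statements by source and date, then rank by cosine similarity. This is not Qdrant's API. The `Statement` type, the tiny hand-written embeddings, and the `candidates` function are all assumptions made for the sketch.

```python
import math
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Statement:
    source: str
    made_on: date
    text: str
    embedding: tuple[float, ...]


def cosine(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def candidates(
    memory: list[Statement],
    source: str,
    query_embedding: tuple[float, ...],
    before: date,
    top_k: int = 3,
) -> list[Statement]:
    """Same-source statements made before a cut-off date, most similar first."""
    pool = [s for s in memory if s.source == source and s.made_on < before]
    pool.sort(key=lambda s: cosine(s.embedding, query_embedding), reverse=True)
    return pool[:top_k]


memory = [
    Statement("senator_a", date(2021, 5, 1), "Tariffs hurt workers.", (0.9, 0.1)),
    Statement("senator_a", date(2022, 3, 9), "Roads need funding.", (0.1, 0.9)),
    Statement("senator_a", date(2025, 2, 1), "Tariffs protect workers.", (0.8, 0.2)),
    Statement("senator_b", date(2020, 1, 1), "Tariffs hurt workers.", (0.9, 0.1)),
]
hits = candidates(memory, "senator_a", (1.0, 0.0), before=date(2024, 1, 1))
```

Similarity only finds statements on the same topic; deciding whether a candidate actually contradicts the new text is a separate scoring step.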
Multi-LLM Orchestration
The cloud deployment runs parallel analysis across Claude Sonnet 4 and GPT-4 Turbo, comparing outputs for consensus and divergence. This catches model-specific biases, improves analytical reliability, and enables active learning triggers when models disagree.
Hybrid Symbolic + Neural NLP
Beyond LLMs, Perspective AI uses spaCy for dependency parsing, custom lexicons for domain-specific framing detection, and NetworkX for voice pattern analysis—combining rule-based precision with neural flexibility.
Graduated Trust Architecture
Two deployment modes represent a graduated trust model: public cloud for non-sensitive workloads, private on-premises infrastructure for regulated data. Same capability, different boundaries: the pattern enterprise clients need to adopt GenAI safely.