Agent-first competitive intelligence demo powered by saved article analysis, MCP tools, and an optional live CocoIndex pipeline. The sample demo runs without credentials; the live mode uses Tavily, LLM extraction, CocoIndex, and Postgres.
This pipeline automatically:
- Exposes MCP tools so agents can search events, find trends, and create briefs
- Runs locally without credentials for reliable analyst demos using saved article JSON
- Generates briefings and dashboards in Markdown, JSON, CSV, and static HTML
- Searches the web using Tavily AI (AI-native search engine optimized for agents)
- Extracts competitive intelligence events using LLM analysis (GPT-4o-mini via OpenRouter):
  - Product launches and feature releases
  - Partnerships and collaborations
  - Funding rounds and financial news
  - Key executive hires/departures
  - Acquisitions and mergers
- Indexes both raw articles and extracted events in PostgreSQL
- Enables queries like:
  - "What has OpenAI been doing recently?"
  - "Which competitors are making the most news?"
  - "Find all partnership announcements"
  - "What are the most significant competitive moves this week?"
- Python 3.11+
- For sample mode: no credentials required
- For live CocoIndex mode:
  - PostgreSQL database
  - Tavily API key from tavily.com (free tier: 1,000 searches/month)
  - OpenRouter API key for LLM extraction via GPT-4o-mini (cost-effective: ~$0.15 per 1M input tokens, ~$0.60 per 1M output tokens)
You can now run a no-credential analyst workflow before setting up Tavily, OpenRouter, CocoIndex, or PostgreSQL:
python3 competitive_intel.py sample --dashboard --slug demo
This reads:
- watchlist.json - editable competitors, aliases, event categories, and scoring terms
- data/sample_articles.json - saved article records to analyze
It writes a run bundle to reports/:
- brief-*.md - analyst-readable intelligence brief
- intel-events-*.json - structured event export
- intel-events-*.csv - spreadsheet-friendly export
- dashboard-*.html - static dashboard with search and significance filters
Use your own article file by matching the sample JSON schema:
python3 competitive_intel.py sample \
  --config watchlist.json \
  --input my_articles.json \
  --output-dir reports \
  --dashboard
For a live one-off run, pass competitors at runtime instead of editing .env:
python3 competitive_intel.py live \
  --competitors "Apple,Microsoft" \
  --focus "product launch, partnership" \
  --max-results 2 \
  --slug apple-microsoft-live
.env holds credentials and fallback defaults. The current competitor set
should usually come from CLI args or MCP tool args.
Start the MCP server so other agents can call the tools:
python3 competitive_intel.py mcp
Or configure an MCP-capable client from mcp-config.example.json. Replace the
placeholder paths with this repo's absolute path and point the command at your
.venv/bin/python.
Available tools:
- analyze_saved_articles
- search_events
- get_trending_competitors
- create_brief
- create_dashboard
- run_cocoindex_update
Example agent prompts:
- "Analyze the saved articles and create a dashboard."
- "Search events about enterprise customers."
- "Which competitors are trending?"
- "Create a board-ready competitive intelligence brief."
Run the deterministic agent demo transcript:
python3 agent_demo.py --slug demo-agent
This creates reports/demo-agent-transcript.md along with briefing and
dashboard artifacts.
Run the local and MCP smoke tests:
python3 -m unittest test_local_intel.py
See DEMO.md for the full agent-first demo script.
Choose Option A (Local) or Option B (Cloud):
# Install PostgreSQL (macOS)
brew install postgresql@15
brew services start postgresql@15
# Create database
createdb competitive_intel
# Your connection string:
# postgresql://username:password@localhost:5432/competitive_intel
Google Cloud SQL Example:
- Create PostgreSQL instance in Google Cloud Console
- Note the Public IP address (e.g., 34.71.19.121)
- Create database: postgres (or custom name)
- Set password for postgres user
- Allow your IP in Cloud SQL connections
Connection string format:
postgresql://postgres:YOUR_PASSWORD@PUBLIC_IP:5432/postgres
💡 Special characters in password? URL-encode them:
@ → %40, # → %23, & → %26
Example: Password Lucas@123 becomes Lucas%40123
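Encoding by hand is easy to get wrong, so a small helper can do it. The sketch below uses only the Python standard library; the function name build_postgres_url is hypothetical and is not part of this repo.

```python
from urllib.parse import quote


def build_postgres_url(user: str, password: str, host: str,
                       database: str, port: int = 5432) -> str:
    """Build a PostgreSQL connection string with a percent-encoded password.

    Hypothetical helper for illustration; it is not shipped with this project.
    """
    # safe="" makes quote() encode every reserved character, including "/" and "@".
    encoded_password = quote(password, safe="")
    return f"postgresql://{user}:{encoded_password}@{host}:{port}/{database}"


if __name__ == "__main__":
    print(build_postgres_url("postgres", "Lucas@123", "34.71.19.121", "postgres"))
```

Paste the printed string into DATABASE_URL and COCOINDEX_DATABASE_URL.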
AWS RDS / Azure: Same format, just use your cloud database endpoint instead of public IP.
pip install -e .
Copy the example environment file and add your credentials:
cp .env.example .env
Edit .env and set:
- DATABASE_URL - Your PostgreSQL connection string (from Step 1)
- COCOINDEX_DATABASE_URL - Same as DATABASE_URL (required by CocoIndex)
- OPENAI_API_KEY - OpenRouter API key from openrouter.ai
- TAVILY_API_KEY - Tavily API key from tavily.com
- COMPETITORS - Comma-separated list of companies to track
- SEARCH_DAYS_BACK - How many days back to search (default: 7)
- MAX_RESULTS_PER_COMPETITOR - Articles fetched per competitor
- EVENT_QUERY - Search terms used for event discovery
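To see how these variables fit together, here is a minimal sketch of reading them into a typed settings object. It is not the loader used by main.py: the Settings class and load_settings function are hypothetical, and the default of 10 for MAX_RESULTS_PER_COMPETITOR is an assumption borrowed from the interactive CLI default.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the .env values described above."""
    database_url: str
    competitors: list[str]
    search_days_back: int
    max_results_per_competitor: int
    event_query: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping (defaults to the process environment)."""
    database_url = env.get("DATABASE_URL", "")
    if not database_url:
        raise ValueError("DATABASE_URL is required for live mode")
    # COMPETITORS is comma-separated; drop empty entries and stray whitespace.
    competitors = [c.strip() for c in env.get("COMPETITORS", "").split(",") if c.strip()]
    return Settings(
        database_url=database_url,
        competitors=competitors,
        search_days_back=int(env.get("SEARCH_DAYS_BACK", "7")),
        # Assumed default; the README only documents 10 as the interactive default.
        max_results_per_competitor=int(env.get("MAX_RESULTS_PER_COMPETITOR", "10")),
        event_query=env.get("EVENT_QUERY", ""),
    )
```

Passing a plain dict instead of os.environ makes the function easy to exercise in tests without touching real credentials.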
Example (Local PostgreSQL):
DATABASE_URL=postgresql://user:password@localhost:5432/competitive_intel
COCOINDEX_DATABASE_URL=postgresql://user:password@localhost:5432/competitive_intel
OPENAI_API_KEY=sk-or-v1-...
TAVILY_API_KEY=tvly-...
COMPETITORS=OpenAI,Anthropic,Google AI,Meta AI,Mistral AI
REFRESH_INTERVAL_SECONDS=3600
SEARCH_DAYS_BACK=7
MAX_RESULTS_PER_COMPETITOR=10
EVENT_QUERY=(funding OR partnership OR product launch OR acquisition OR executive hire)
Example (Google Cloud SQL):
DATABASE_URL=postgresql://postgres:Lucas%40123@34.71.19.121:5432/postgres
COCOINDEX_DATABASE_URL=postgresql://postgres:Lucas%40123@34.71.19.121:5432/postgres
OPENAI_API_KEY=sk-or-v1-...
TAVILY_API_KEY=tvly-...
COMPETITORS=Apple,Google,Microsoft,Amazon,Meta
REFRESH_INTERVAL_SECONDS=3600
SEARCH_DAYS_BACK=7
MAX_RESULTS_PER_COMPETITOR=10
EVENT_QUERY=(funding OR partnership OR product launch OR acquisition OR executive hire)
Option A: Interactive Mode (Recommended for first-time users)
Run the interactive CLI that prompts you for what to monitor:
python3 run_interactive.py
This will ask you:
- Which companies to track
- What types of events to focus on (product launches, partnerships, funding, etc.)
- Time range to search (default: 7 days)
- How many articles per company (default: 10)
- One-time sync or continuous monitoring
See INTERACTIVE_DEMO.md for example sessions and use cases.
Option B: Direct Mode (For automated/scheduled runs)
Initial sync:
cocoindex update main -f
Continuous monitoring (live mode):
cocoindex update -L main.py
Option C: Agent Tool Mode
Run the MCP server and ask an MCP-capable agent to use local mode or live CocoIndex mode:
python3 mcp_server.py
Example live calls:
- run_cocoindex_update(live=true, competitors="Apple,Microsoft", max_results=2, event_query="(product launch OR partnership)")
- search_events(mode="cocoindex", competitor="Apple")
- get_trending_competitors(mode="cocoindex", days=7)
- create_brief(mode="cocoindex")
- create_dashboard(mode="cocoindex")
COMPETITORS in .env is only the default. MCP callers can override it per
run with competitors, either as a comma-separated string or a JSON array, so
agents can switch from AI labs to any market category without editing files.
Human-friendly equivalent:
python3 competitive_intel.py live \
  --competitors "Perplexity,Glean" \
  --focus "funding, partnerships, product launches" \
  --max-results 2 \
  --slug perplexity-glean-live
Run the test script to verify data extraction:
python3 test_results.py
Save extracted intelligence to a text file:
python3 generate_report.py
This creates intelligence_report_YYYY-MM-DD_HH-MM-SS.txt with:
- Summary statistics
- Event type distribution
- Competitor rankings
- Detailed intelligence by company
See USAGE_GUIDE.md for more commands and TESTING.md for comprehensive testing.
Once the pipeline is running, you can query your competitive intelligence:
"What has Anthropic been doing recently?"
→ Uses: search_by_competitor(competitor="Anthropic")
"Find funding news about OpenAI"
→ Uses: search_by_competitor(competitor="OpenAI", event_type="funding")
"What are the most significant competitive moves this week?"
→ Uses: get_high_significance_events(days=7)
"Which AI companies are making the most news?"
→ Uses: get_trending_competitors(days=7)
"What partnerships has Google AI announced?"
→ Uses: search_partnerships(partner="Google AI")
Stores raw articles from news sources and blogs:
- id - Article URL (primary key)
- title - Article headline
- content - Article text/summary
- url - Source URL
- source - Publisher name
- published_at - Publication timestamp
Stores extracted competitive intelligence events:
- article_id - Reference to source article
- event_type - Category: product_launch, partnership, funding, key_hire, acquisition
- competitor - Primary company involved
- description - Event summary
- significance - Impact rating: high, medium, low
- related_companies - Other companies mentioned (partners, investors, etc.)
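The two schemas relate through article_id, which is easiest to see with a join. The following sketch recreates both tables in an in-memory SQLite database so it runs without PostgreSQL. The table names articles and events, the column types, and the single placeholder row are assumptions for illustration; they are not the DDL CocoIndex generates, and the row is not real data.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create in-memory stand-ins for the articles and events tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE articles (
            id TEXT PRIMARY KEY,      -- article URL
            title TEXT, content TEXT, url TEXT, source TEXT, published_at TEXT
        );
        CREATE TABLE events (
            article_id TEXT REFERENCES articles(id),
            event_type TEXT, competitor TEXT, description TEXT,
            significance TEXT, related_companies TEXT
        );
        """
    )
    # Placeholder row, invented purely so the query below returns something.
    conn.execute(
        "INSERT INTO articles VALUES (?, ?, ?, ?, ?, ?)",
        ("https://example.com/a1", "ExampleCo partners with SampleCorp",
         "Placeholder body text.", "https://example.com/a1", "Example News",
         "2024-01-01T00:00:00Z"),
    )
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)",
        ("https://example.com/a1", "partnership", "ExampleCo",
         "Placeholder partnership event.", "high", "SampleCorp"),
    )
    return conn


def events_for(conn: sqlite3.Connection, competitor: str) -> list[tuple]:
    """Return each event for a competitor together with its source article."""
    return conn.execute(
        """
        SELECT e.event_type, e.significance, a.title, a.url
        FROM events AS e
        JOIN articles AS a ON a.id = e.article_id
        WHERE e.competitor = ?
        ORDER BY a.published_at DESC
        """,
        (competitor,),
    ).fetchall()
```

The same SELECT should carry over to PostgreSQL with the parameter placeholder changed to match your driver.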
Edit main.py TavilySearchSource configuration:
flow.add_source(
    TavilySearchSource(
        api_key=tavily_api_key,
        competitor=competitor.strip(),
        days_back=7,       # Adjust lookback period
        max_results=20,    # Increase results per competitor
    ),
    refresh_interval_seconds=1800,  # Check every 30 minutes
)
Set EVENT_QUERY in .env:
EVENT_QUERY=(funding OR partnership OR product launch OR acquisition OR executive hire OR regulatory)
The interactive CLI also writes this value when you choose an event focus.
Edit .env to track different companies:
COMPETITORS=Company1,Company2,Company3
Edit the CompetitiveEvent model in main.py to track different event categories.
Adjust REFRESH_INTERVAL_SECONDS in .env:
- 3600 = hourly (default)
- 1800 = every 30 minutes
- 86400 = daily
CocoIndex provides CocoInsight (free beta) for visualizing data lineage and debugging:
- See how data flows through the pipeline
- Inspect LLM extraction results
- Troubleshoot indexing issues
Visit the CocoIndex documentation for CocoInsight setup.
┌─────────────────────────────────────────────────────────────────────┐
│                  COMPETITIVE INTELLIGENCE MONITOR                   │
└─────────────────────────────────────────────────────────────────────┘
┌──────────────┐       ┌──────────────┐       ┌──────────────┐
│  Tavily AI   │──────▶│  CocoIndex   │──────▶│  PostgreSQL  │
│    Search    │       │   Pipeline   │       │   Database   │
└──────────────┘       └──────────────┘       └──────────────┘
       │                      │                      │
       │                      │                      │
       ▼                      ▼                      ▼
    Articles              Extraction            Intelligence
   (web data)           (GPT-4o-mini)           (structured)
- Data Ingestion (Tavily AI Search)
  - Searches web for competitor mentions
  - Filters by time range (configurable: 1-30 days)
  - Returns clean, full article content
  - Output: Raw articles with metadata
- LLM Extraction (GPT-4o-mini via OpenRouter)
  - Processes article content through LLM
  - Extracts structured CompetitiveEvent objects
  - Classifies: product launches, partnerships, funding, hires, acquisitions
  - Assigns significance: high, medium, low
  - Output: Structured intelligence events
- Dual Indexing (CocoIndex + PostgreSQL)
  - Articles Table: Raw content, URLs, sources, timestamps
  - Events Table: Extracted intelligence with relationships
  - Incremental updates (only new data processed)
  - Output: Queryable database
- Query Layer (SQL + Python)
  - Search by competitor
  - Filter by event type
  - Rank by significance
  - Trend analysis
  - Output: Intelligence reports
- Incremental Processing: CocoIndex tracks processed articles, avoiding duplicate work
- Dual Indexing: Both raw content and extracted entities for maximum flexibility
- Weighted Scoring: High-significance events = 3 points, medium = 2, low = 1
- Relational Queries: Join articles with events for full context
- Real-time Monitoring: Continuous mode refreshes every hour (configurable)
- Local Analyst Mode: Run saved articles through a lightweight scorer with no external services
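The weighted scoring rule is simple enough to sketch in a few lines. The point values (3, 2, 1) come from the list above; the function name rank_competitors, the dict-shaped events, and the alphabetical tie-break are assumptions for illustration and may differ from the project's own query code.

```python
from collections import Counter

# Point values as stated in this README.
SIGNIFICANCE_POINTS = {"high": 3, "medium": 2, "low": 1}


def rank_competitors(events: list[dict]) -> list[tuple[str, int]]:
    """Sum significance points per competitor and sort highest first.

    Hypothetical helper: each event is a dict with "competitor" and
    "significance" keys. Unknown significance values score 0.
    """
    scores: Counter[str] = Counter()
    for event in events:
        scores[event["competitor"]] += SIGNIFICANCE_POINTS.get(event["significance"], 0)
    # Sort by descending score, then by name so ties are deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Under this rule a competitor with one high and one low event (4 points) ties with one that has two medium events (4 points), which is why an explicit tie-break is worth having.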
Tavily is an AI-native search engine designed specifically for AI agents and LLMs:
- Clean content extraction - Returns full article text, not just snippets
- Relevance scoring - Built-in ranking for competitive intelligence
- No scraping needed - Handles content extraction and cleaning
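- Free tier - 1,000 searches/month (about 33 per day; at one search per competitor per refresh, that covers 5 competitors refreshed roughly every 4 hours, while hourly refreshes of 5-10 competitors would need 3,600-7,200 searches)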
- Advanced search - Deeper crawling for comprehensive results
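Whether a given schedule fits the free tier depends on how many searches each refresh issues. The sketch below is a rough budget check that assumes exactly one Tavily search per competitor per refresh and a 30-day month; both function names are hypothetical, and real usage may differ if the pipeline issues several queries per competitor.

```python
import math

SECONDS_PER_DAY = 86_400


def monthly_searches(competitors: int, refresh_interval_seconds: int, days: int = 30) -> int:
    """Estimate searches per month, assuming one search per competitor per refresh."""
    refreshes_per_day = SECONDS_PER_DAY // refresh_interval_seconds
    return competitors * refreshes_per_day * days


def min_interval_seconds(competitors: int, monthly_quota: int = 1000, days: int = 30) -> int:
    """Smallest refresh interval (seconds) that keeps usage within the quota."""
    # Whole refreshes per day that the quota allows for this many competitors.
    refreshes_per_day = monthly_quota // (competitors * days)
    if refreshes_per_day == 0:
        raise ValueError("quota is too small for even one refresh per day")
    return math.ceil(SECONDS_PER_DAY / refreshes_per_day)


if __name__ == "__main__":
    print(monthly_searches(5, 3600))   # hourly refresh, 5 competitors
    print(min_interval_seconds(5))     # interval that fits 1,000 searches/month
```

Set REFRESH_INTERVAL_SECONDS to at least the value returned by min_interval_seconds to stay within the quota under these assumptions.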
- Refine search queries - Add industry-specific keywords or event types
- Add custom event types - Track regulation changes, PR crises, etc.
- Sentiment analysis - Classify news as positive/negative/neutral
- Alert system - Get notified of high-significance events via email/Slack
- Dashboard - Build a web UI for exploring competitive intelligence
- Export reports - Generate weekly/monthly competitor summary reports
competitive-intelligence/
├── main.py # Core pipeline definition
├── competitive_intel.py # User-friendly sample/live/MCP CLI
├── local_intel.py # Local analyst workflow, reports, and dashboard
├── agent_demo.py # Deterministic agent transcript demo
├── providers.py # Local and CocoIndex-backed data providers
├── mcp_server.py # Agent-facing MCP tool server
├── mcp-config.example.json # MCP client configuration template
├── docker-compose.yml # Local Postgres for live CocoIndex demos
├── live_demo_check.py # Credential-gated live CocoIndex smoke check
├── DEMO.md # Agent-first CocoIndex demo script
├── watchlist.json # Editable local watchlist
├── watchlist.example.json # Local watchlist and scoring configuration
├── data/sample_articles.json # Demo input records for local analysis
├── run_interactive.py # Interactive CLI for easy setup
├── test_results.py # Validation and testing script
├── test_local_intel.py # Local analyst workflow tests
├── generate_report.py # Report generation tool
├── clear_and_run.py # Fresh data testing utility
├── pyproject.toml # Project dependencies
├── .env.example # Environment template
├── .env # Your credentials (git-ignored)
│
├── README.md # This file
├── QUICKSTART.md # 3-minute setup guide
├── USAGE_GUIDE.md # Complete command reference
├── TESTING.md # Testing procedures
├── INTERACTIVE_DEMO.md # Interactive mode examples
├── CLAUDE.md # Developer guidance
├── CONTRIBUTING.md # Contribution guidelines
└── LICENSE # MIT License
We welcome contributions! See CONTRIBUTING.md for guidelines.
- Report bugs via GitHub Issues
- Submit feature requests
- Improve documentation
- Add new data sources
- Create new query handlers
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with CocoIndex - Modern data pipeline framework
- Powered by Tavily AI Search - AI-native search engine
- LLM extraction via OpenRouter - Multi-model API gateway
- Documentation: Full docs | Quick Start | Usage Guide
- Issues: Report bugs or request features via GitHub Issues
- CocoIndex: cocoindex.io
- Examples: github.com/cocoindex-io/cocoindex
Built with ❤️ using CocoIndex | Track your competitors automatically