31 changes: 31 additions & 0 deletions agents/observability-engineer.agent.md
@@ -0,0 +1,31 @@
---
name: 'observability-engineer'
description: 'Observability specialist for structured logging, metrics, distributed tracing, and alerting - designing the three pillars so production issues are diagnosable from telemetry'
tools: ['codebase', 'edit/editFiles', 'search', 'runCommands', 'terminalCommand']
---

# Observability Engineer

You are an observability engineer. You make systems explain themselves: when something breaks in production, the telemetry should already contain the answer.

## Core Expertise

- **Structured logging**: JSON logs, level discipline, correlation IDs, redaction of secrets/PII
- **Metrics**: RED (rate, errors, duration) for services, USE (utilization, saturation, errors) for resources, Prometheus naming conventions, label-cardinality control
- **Distributed tracing**: OpenTelemetry instrumentation, context propagation across HTTP/queues, sampling strategies
- **Alerting**: symptom-based alerts on SLOs rather than cause-based noise; every alert links to a runbook
- **Stacks**: Prometheus/Grafana, ELK, Grafana Loki/Tempo, CloudWatch, Application Insights, Datadog

## Working Method

1. Start from the question "what would we need to see to debug last month's worst incident?" and work backwards.
2. Start with auto-instrumentation (OTel instrumentation libraries or agents); add manual spans/metrics only for business-critical paths.
3. Enforce cardinality budgets: no unbounded label values (user IDs, URLs with IDs) in metrics - those belong in traces and logs.
4. Connect the pillars: logs carry `trace_id`, metric exemplars link to traces, dashboards link to log queries.
5. For every dashboard panel or alert proposed, state the action a human would take when it fires; delete it if there is none.
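
The cardinality budget in step 3 can be sketched as a small helper that collapses ID-like path segments before a path is used as a metric label. This is a minimal illustration, not a prescribed implementation: the `route_label` name, the `:id` placeholder, and the two patterns (numeric IDs and UUIDs) are assumptions made for the example.

```python
import re

# Hypothetical patterns: numeric IDs and UUIDs are treated as unbounded values.
_NUMERIC = re.compile(r"^\d+$")
_UUID = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def route_label(path: str) -> str:
    """Collapse ID-like path segments so the label has a bounded value set."""
    # Drop the query string; it is unbounded by nature.
    path_only = path.split("?", 1)[0]
    segments = []
    for segment in path_only.split("/"):
        if _NUMERIC.match(segment) or _UUID.match(segment):
            segments.append(":id")
        else:
            segments.append(segment)
    return "/".join(segments)
```

With this sketch, `route_label("/users/42/orders/7")` yields `/users/:id/orders/:id`. The raw path is still worth recording, but on the trace span or log line rather than on the metric.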

## Response Style

- Deliver concrete artifacts: instrumentation code, scrape/collector configs, dashboard JSON, alert rules.
- Keep explanations to one line per decision; the configs are the deliverable.
- Flag telemetry that leaks secrets or PII as a blocking issue.
1 change: 1 addition & 0 deletions docs/README.agents.md
@@ -149,6 +149,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-agents) for guidelines on how to
| [Neon Performance Analyzer](../agents/neon-optimization-analyzer.agent.md)<br />[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fneon-optimization-analyzer.agent.md)<br />[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fneon-optimization-analyzer.agent.md) | Identify and fix slow Postgres queries automatically using Neon's branching workflow. Analyzes execution plans, tests optimizations in isolated database branches, and provides clear before/after performance metrics with actionable code fixes. | |
| [New Relic Incident Response Agent](../agents/new-relic-incident-response.agent.md)<br />[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fnew-relic-incident-response.agent.md)<br />[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fnew-relic-incident-response.agent.md) | Identify and fix production issues by correlating New Relic observability data with code changes. Analyze alerts, transaction traces, error analytics, and deployments to find root causes and suggest code fixes. | |
| [Next.js Expert](../agents/expert-nextjs-developer.agent.md)<br />[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fexpert-nextjs-developer.agent.md)<br />[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fexpert-nextjs-developer.agent.md) | Expert Next.js 16 developer specializing in App Router, Server Components, Cache Components, Turbopack, and modern React patterns with TypeScript | |
| [Observability Engineer](../agents/observability-engineer.agent.md)<br />[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fobservability-engineer.agent.md)<br />[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fobservability-engineer.agent.md) | Observability specialist for structured logging, metrics, distributed tracing, and alerting - designing the three pillars so production issues are diagnosable from telemetry | |
| [Octopus Release Notes With Mcp](../agents/octopus-deploy-release-notes-mcp.agent.md)<br />[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Foctopus-deploy-release-notes-mcp.agent.md)<br />[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Foctopus-deploy-release-notes-mcp.agent.md) | Generate release notes for a release in Octopus Deploy. The tools for this MCP server provide access to the Octopus Deploy APIs. | octopus<br />[![Install MCP](https://img.shields.io/badge/Install-VS_Code-0098FF?style=flat-square)](https://aka.ms/awesome-copilot/install/mcp-vscode?name=octopus&config=%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%2540octopusdeploy%252Fmcp-server%22%5D%2C%22env%22%3A%7B%7D%7D)<br />[![Install MCP](https://img.shields.io/badge/Install-VS_Code_Insiders-24bfa5?style=flat-square)](https://aka.ms/awesome-copilot/install/mcp-vscodeinsiders?name=octopus&config=%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%2540octopusdeploy%252Fmcp-server%22%5D%2C%22env%22%3A%7B%7D%7D)<br />[![Install MCP](https://img.shields.io/badge/Install-Visual_Studio-C16FDE?style=flat-square)](https://aka.ms/awesome-copilot/install/mcp-visualstudio/mcp-install?%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%2540octopusdeploy%252Fmcp-server%22%5D%2C%22env%22%3A%7B%7D%7D) |
| [One Shot Feature Issue Planner](../agents/one-shot-feature-issue-planner.agent.md)<br />[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fone-shot-feature-issue-planner.agent.md)<br />[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fone-shot-feature-issue-planner.agent.md) | Cloud Agent to Turn a single new-feature request into a complete, issue-ready implementation plan without follow-up questions. | |
| [OpenAPI to Application Generator](../agents/openapi-to-application.agent.md)<br />[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fopenapi-to-application.agent.md)<br />[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fopenapi-to-application.agent.md) | Expert assistant for generating working applications from OpenAPI specifications | |
1 change: 1 addition & 0 deletions docs/README.skills.md
@@ -359,6 +359,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-skills) for guidelines on how to
| [structured-autonomy-generate](../skills/structured-autonomy-generate/SKILL.md)<br />`gh skills install github/awesome-copilot structured-autonomy-generate` | Structured Autonomy Implementation Generator Prompt | None |
| [structured-autonomy-implement](../skills/structured-autonomy-implement/SKILL.md)<br />`gh skills install github/awesome-copilot structured-autonomy-implement` | Structured Autonomy Implementation Prompt | None |
| [structured-autonomy-plan](../skills/structured-autonomy-plan/SKILL.md)<br />`gh skills install github/awesome-copilot structured-autonomy-plan` | Structured Autonomy Planning Prompt | None |
| [structured-logging-adopter](../skills/structured-logging-adopter/SKILL.md)<br />`gh skills install github/awesome-copilot structured-logging-adopter` | Convert ad-hoc print/console logging to structured JSON logging with consistent fields, levels, and correlation IDs. Use when the user wants to adopt structured logging, replace console.log or print statements, prepare logs for aggregation in ELK, Loki, or CloudWatch, or add request correlation to logs. | None |
| [suggest-awesome-github-copilot-agents](../skills/suggest-awesome-github-copilot-agents/SKILL.md)<br />`gh skills install github/awesome-copilot suggest-awesome-github-copilot-agents` | Suggest relevant GitHub Copilot Custom Agents files from the awesome-copilot repository based on current repository context and chat history, avoiding duplicates with existing custom agents in this repository, and identifying outdated agents that need updates. | None |
| [suggest-awesome-github-copilot-instructions](../skills/suggest-awesome-github-copilot-instructions/SKILL.md)<br />`gh skills install github/awesome-copilot suggest-awesome-github-copilot-instructions` | Suggest relevant GitHub Copilot instruction files from the awesome-copilot repository based on current repository context and chat history, avoiding duplicates with existing instructions in this repository, and identifying outdated instructions that need updates. | None |
| [suggest-awesome-github-copilot-skills](../skills/suggest-awesome-github-copilot-skills/SKILL.md)<br />`gh skills install github/awesome-copilot suggest-awesome-github-copilot-skills` | Suggest relevant GitHub Copilot skills from the awesome-copilot repository based on current repository context and chat history, avoiding duplicates with existing skills in this repository, and identifying outdated skills that need updates. | None |
85 changes: 85 additions & 0 deletions skills/structured-logging-adopter/SKILL.md
@@ -0,0 +1,85 @@
---
name: structured-logging-adopter
description: 'Convert ad-hoc print/console logging to structured JSON logging with consistent fields, levels, and correlation IDs. Use when the user wants to adopt structured logging, replace console.log or print statements, prepare logs for aggregation in ELK, Loki, or CloudWatch, or add request correlation to logs.'
license: MIT
---

# Structured Logging Adopter

Migrate a codebase from unstructured `print`/`console.log` statements to structured, machine-parseable logging that log aggregators can index.

## When to Use This Skill

Use this skill when you need to:
- Replace `console.log`/`print`/`System.out.println` scattered through a codebase
- Emit JSON logs consumable by ELK, Grafana Loki, CloudWatch, or Datadog
- Standardize log levels and event fields across services
- Add correlation/request IDs so one request can be traced across log lines

## Migration Workflow

1. **Inventory**: find all logging call sites (`console.*`, `print`, `fmt.Println`, `System.out`) and classify each as debug noise, operational event, or error.
2. **Pick the idiomatic library**: pino (Node), structlog or stdlib `logging` + JSON formatter (Python), slog (Go), Serilog (.NET), Logback + logstash-logback-encoder (Java).
3. **Define the field contract**: `timestamp`, `level`, `message`, `service`, `env`, plus domain fields (`order_id`, `user_id`). Never interpolate values into the message string - put them in fields.
4. **Convert call sites**, deleting debug noise instead of porting it.
5. **Add correlation**: middleware generates/propagates a request ID and injects it into every log line in that request's scope.
6. **Configure output**: JSON to stdout in production (12-factor), pretty-print only in local dev.
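
For the stdlib `logging` + JSON formatter option in step 2, the field contract from step 3 might be wired up roughly as follows. This is a sketch under stated assumptions: `JsonFormatter`, `build_logger`, and the convention of passing domain fields through `extra={"fields": {...}}` are illustrative choices, not part of any library.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object carrying the field contract."""

    def __init__(self, service: str, env: str) -> None:
        super().__init__()
        self.service = service
        self.env = env

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname.lower(),
            "message": record.getMessage(),
            "service": self.service,
            "env": self.env,
        }
        # Assumed convention: domain fields arrive via extra={"fields": {...}}.
        payload.update(getattr(record, "fields", {}))
        if record.exc_info:
            # Keep the stack trace inside the JSON object, not as raw lines.
            payload["error"] = self.formatException(record.exc_info)
        return json.dumps(payload, default=str)


def build_logger(service: str, env: str, stream=sys.stdout) -> logging.Logger:
    """Return a logger that writes one JSON object per line to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service, env))
    logger = logging.getLogger(service)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call site then looks like `build_logger("orders-api", "prod").info("order placed", extra={"fields": {"order_id": 7}})`: the message stays constant and the values travel as fields.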

## Usage Examples

### Example 1: Node.js with pino

```javascript
// Before
console.log("User " + userId + " placed order " + orderId);

// After
import pino from "pino";
const logger = pino({ base: { service: "orders-api" } });
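// Values go in fields (snake_case, matching the field contract), not in the message.
logger.info({ user_id: userId, order_id: orderId }, "order placed");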
```

### Example 2: Python with structlog

```python
# Before
print(f"payment failed for order {order_id}: {err}")

# After
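import structlog
# Render JSON explicitly; structlog's default output is human-readable.
structlog.configure(processors=[
    structlog.processors.add_log_level,
    structlog.processors.TimeStamper(fmt="iso"),
    structlog.processors.JSONRenderer(),
])
log = structlog.get_logger()
log.error("payment failed", order_id=order_id, error=str(err))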
```

### Example 3: Request correlation middleware (Express)

```javascript
import { randomUUID } from "crypto";

app.use((req, res, next) => {
  // Reuse the caller's ID when present; otherwise generate one for this request.
  const requestId = req.get("x-request-id") ?? randomUUID();
  req.log = logger.child({ request_id: requestId });
  next();
});
// handlers use req.log.info(...) so every line carries request_id
```
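
A comparable pattern in Python, sketched with the standard library only: a `contextvars` variable holds the request ID and a logging filter copies it onto every record. The names `request_id_var`, `RequestIdFilter`, and `handle_request` are hypothetical, and the sketch assumes header names have already been lower-cased by the framework.

```python
import contextvars
import logging
import uuid

# Holds the current request's ID; contextvars isolates it per thread and async task.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every record that reaches the handler."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        return True


def handle_request(headers: dict, handler):
    """Hypothetical framework-agnostic middleware: reuse the inbound ID or mint one."""
    token = request_id_var.set(headers.get("x-request-id") or str(uuid.uuid4()))
    try:
        return handler()
    finally:
        # Restore the previous value so IDs never leak between requests.
        request_id_var.reset(token)
```

Attaching `RequestIdFilter` to the handler (rather than to a single logger) means records from any logger in the process pick up the ID, and a formatter can then read `record.request_id`.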

## Level Conventions

| Level | Use for |
|---|---|
| `error` | Failed operations needing attention; always include the error and context fields |
| `warn` | Degraded but handled: retries, fallbacks, deprecations |
| `info` | Business events: request completed, order created, job finished |
| `debug` | Developer diagnostics; disabled in production by default |

## Guidelines

1. **Log events, not sentences** - `"order placed"` + fields beats `"User 42 placed order 7 successfully!!"`.
2. **Never log secrets or PII** - redact tokens, passwords, card numbers; hash user identifiers when policy requires.
3. **One log per event** - avoid multi-line logs; stack traces go in an `error` field, not raw output.
4. **snake_case or camelCase, not both** - pick the aggregator-friendly convention and enforce it.
5. **Errors logged at the boundary** - log where the error is handled, not at every rethrow (avoids duplicates).
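
Guideline 2 could be enforced with a redaction step applied to fields before they reach the logger. The following is a minimal sketch: the `redact` function and both key sets are hypothetical, and a real deny-list would come from the team's data-handling policy.

```python
import hashlib

# Hypothetical key sets for illustration only.
SENSITIVE_KEYS = {"password", "token", "authorization", "card_number"}
HASHED_KEYS = {"user_id"}


def redact(fields: dict) -> dict:
    """Return a copy of `fields` with secrets masked and identifiers hashed."""
    safe = {}
    for key, value in fields.items():
        lowered = key.lower()
        if lowered in SENSITIVE_KEYS:
            safe[key] = "[REDACTED]"
        elif lowered in HASHED_KEYS:
            digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
            safe[key] = digest[:16]
        elif isinstance(value, dict):
            # Recurse so nested payloads are covered too.
            safe[key] = redact(value)
        else:
            safe[key] = value
    return safe
```

The truncated SHA-256 here is only a placeholder: low-entropy identifiers such as sequential user IDs can be recovered from an unsalted hash by brute force, so a keyed hash may be needed where policy demands real pseudonymization.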

## Limitations

- Log-based metrics are a stopgap; suggest real metrics/tracing (OpenTelemetry) when the user needs latency percentiles.
- Retrofitting correlation IDs across async boundaries (queues, cron) requires message-level propagation, which the skill must implement separately for each broker.
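
Message-level propagation can be illustrated with an envelope that carries the correlation ID next to the payload. In this sketch an in-process `queue.Queue` stands in for a real broker, and the `publish`/`consume` helpers and the envelope shape are assumptions for the example; brokers such as Kafka and AMQP offer native message headers that would normally carry the ID instead.

```python
import json
import queue
import uuid


def publish(q: queue.Queue, body: dict, request_id=None) -> None:
    """Wrap the payload in an envelope that carries the correlation ID."""
    envelope = {
        "headers": {"request_id": request_id or str(uuid.uuid4())},
        "body": body,
    }
    q.put(json.dumps(envelope))


def consume(q: queue.Queue):
    """Return (request_id, body) so the worker can bind the ID before logging."""
    envelope = json.loads(q.get())
    return envelope["headers"]["request_id"], envelope["body"]
```

The consumer binds the returned ID to its logger before doing any work, so log lines on both sides of the queue share one `request_id`.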