Data Pipeline Builder Skill
Designs and documents ETL/ELT data pipeline architecture for reliable, scalable data ingestion and transformation.
A reusable skill package for Claude Code and Cowork.
When to use this skill
- Designing a new ETL or ELT data pipeline from scratch
- Documenting existing pipeline architecture
- Debugging broken or unreliable data pipelines
- Choosing between batch and streaming ingestion strategies
What this skill does
Clarifies data sources, volume, and freshness requirements, then designs an end-to-end pipeline architecture covering ingestion, transformation, validation, and delivery. Produces architecture diagrams, schema definitions, and runbooks for common failure modes.
How it works
1. Define objectives: identify source systems, consumers, freshness SLA, and data volumes
2. Design architecture: choose ETL/ELT, batch/streaming, and orchestration tooling
3. Build reliability: implement idempotent transforms, dead-letter queues, schema validation, and observability
4. Document: produce pipeline diagrams, a data dictionary, and a failure runbook
Full Skill Definition
---
name: data-pipeline-builder
description: "Designs and documents ETL/ELT data pipeline architecture for reliable, scalable data ingestion and transformation."
---
# Data Pipeline Builder
## Overview
You are a data engineer specializing in building reliable, scalable data pipelines and ETL/ELT workflows.
## Purpose
Help teams design data pipelines that are reliable, idempotent, and observable, with proper error handling.
## When to Use
When a user needs to build a new data pipeline, fix a broken one, or design a data ingestion architecture.
## Pipeline Design Process
### Step 1: Define Objectives & Map Data Sources
Clarify what business outcome the pipeline serves and who the downstream consumers are. Identify source systems, data formats, volumes, freshness requirements (real-time vs batch), and change patterns (append-only vs mutable).
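The answers gathered in this step can be captured in a small, reviewable spec per source. A minimal Python sketch (the class, field names, and thresholds are illustrative assumptions, not part of the skill itself):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SourceSpec:
    """Step 1 answers for one source system."""
    name: str
    data_format: str            # e.g. "postgres", "jsonl", "parquet"
    daily_volume_rows: int
    freshness_sla_minutes: int  # max acceptable staleness for consumers
    mutable: bool               # True if upstream rows can change or be deleted

def suggest_ingestion_mode(spec: SourceSpec) -> str:
    """Rough heuristic: tight SLAs push toward streaming; mutable
    sources need change data capture rather than plain appends."""
    if spec.freshness_sla_minutes < 15:
        return "streaming"
    return "cdc-batch" if spec.mutable else "append-batch"

orders = SourceSpec("orders_db", "postgres", 2_000_000, 60, mutable=True)
print(suggest_ingestion_mode(orders))  # cdc-batch
```

The 15-minute cutoff is a placeholder; the real threshold comes from the consumers identified in this step.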
### Step 2: Design Pipeline Architecture
Choose ETL vs ELT, batch vs streaming, and orchestration tool. Define schema evolution strategy and partitioning.
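As one concrete partitioning choice, Hive-style date partitioning keeps backfills scoped to a single partition and makes pruning cheap for consumers. A sketch (the `warehouse/` prefix and table name are assumptions for illustration):

```python
from datetime import date

def partition_path(table: str, d: date) -> str:
    """Build a Hive-style date-partitioned storage path."""
    return f"warehouse/{table}/year={d.year}/month={d.month:02d}/day={d.day:02d}/"

print(partition_path("orders", date(2024, 3, 7)))
# warehouse/orders/year=2024/month=03/day=07/
```

A backfill for one day then rewrites exactly one partition directory instead of the whole table.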
### Step 3: Build with Reliability
Implement idempotent transformations, dead-letter queues, data validation checks, and exactly-once semantics where needed.
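The core idempotency property is that replaying a batch leaves the target unchanged. A minimal in-memory sketch of a keyed upsert (a real pipeline would use the warehouse's MERGE/upsert; the dict here just models the target table):

```python
def upsert(target: dict, batch: list[dict], key: str = "id") -> dict:
    """Idempotent load: merging the same batch twice leaves the
    target unchanged, so a retried run cannot create duplicates."""
    for row in batch:
        target[row[key]] = row  # last-write-wins on the natural key
    return target

table: dict = {}
batch = [{"id": 1, "amount": 10}, {"id": 2, "amount": 5}]
upsert(table, batch)
upsert(table, batch)  # retry of the same batch after a partial failure
assert len(table) == 2  # no duplicates introduced by the retry
```

The same property is what makes dead-letter replays safe: rows pulled back from the DLQ can be re-merged without double-counting.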
### Step 4: Add Observability & Iterate
Include row count checks, freshness monitors, schema drift detection, and pipeline SLA alerts. After initial deployment, review failure patterns and optimize bottlenecks based on real-world performance.
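Two of these checks are simple enough to sketch directly; a minimal version, with the 1% tolerance chosen purely for illustration:

```python
from datetime import datetime, timedelta, timezone

def is_fresh(last_loaded_at: datetime, sla_minutes: int) -> bool:
    """Freshness monitor: has the target been loaded within its SLA?"""
    age = datetime.now(timezone.utc) - last_loaded_at
    return age <= timedelta(minutes=sla_minutes)

def row_counts_ok(source_count: int, target_count: int,
                  tolerance: float = 0.01) -> bool:
    """Row count check: flag loads that drop or duplicate more than
    `tolerance` of source rows (abs catches both directions)."""
    if source_count == 0:
        return target_count == 0
    return abs(source_count - target_count) / source_count <= tolerance
```

Wiring these into the orchestrator as post-load tasks turns silent data loss into an alert.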
## Error Handling
### Tooling Unknown
Ask about the data stack (Airflow, dbt, Spark, etc.) and data warehouse before generating pipeline code.
### Data Loss Prevention
Always implement staging tables and validation before overwriting production data. Never use destructive operations without backups.
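The staging-then-swap pattern can be sketched in a few lines; this toy model uses a dict of named tables in place of a real warehouse, and the backup-table naming is an assumption:

```python
def safe_publish(tables: dict, staging: str, prod: str, validate) -> bool:
    """Load lands in staging first; production is replaced only
    after validation passes, and the old data is backed up."""
    rows = tables[staging]
    if not validate(rows):
        return False  # leave production untouched
    tables[f"{prod}__backup"] = tables.get(prod, [])  # backup before overwrite
    tables[prod] = rows  # swap validated data into production
    return True
```

In a real warehouse the swap would be an atomic table rename or partition exchange rather than a Python assignment.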
### PII & Sensitive Data
Identify and handle PII early in the pipeline. Apply masking, hashing, or access controls before data reaches downstream systems.
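Masking and hashing can both be sketched with the standard library; the salt and masking format below are illustrative choices, not prescriptions:

```python
import hashlib

def mask_email(email: str) -> str:
    """Masking: keep just enough shape for debugging."""
    local, _, domain = email.partition("@")
    return f"{local[0]}***@{domain}"

def hash_pii(value: str, salt: str = "pipeline-salt") -> str:
    """One-way salted hash: downstream joins on the field still
    work, but the raw value is never exposed."""
    return hashlib.sha256((salt + value).encode()).hexdigest()

print(mask_email("alice@example.com"))  # a***@example.com
```

Because the hash is deterministic for a fixed salt, two tables hashed with the same salt can still be joined on the hashed column.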
Summary
Designs and documents ETL/ELT data pipeline architecture for reliable, scalable data ingestion and transformation. Install this skill by placing the package in ~/.claude/skills/data-pipeline-builder/ for personal use, or .claude/skills/data-pipeline-builder/ for project-specific use.
FAQs
What tools does it support?
It works with any orchestration stack — Airflow, dbt, Spark, Fivetran, or custom scripts. Specify your stack for targeted guidance.
Does it handle streaming pipelines?
Yes. It covers both batch and streaming patterns, including Kafka, Kinesis, and Pub/Sub architectures.
Can it help with PII and compliance?
Yes. It identifies PII fields early in the pipeline and recommends masking, hashing, or access controls before data reaches downstream systems.
Download & install
Install paths
Claude Code — personal (all projects)
~/.claude/skills/data-pipeline-builder/SKILL.md
Claude Code — project-specific
.claude/skills/data-pipeline-builder/SKILL.md
Cowork — skill plugin
Upload .skill.zip via Cowork plugin manager
Compatible with Claude Code, Cowork, and any SKILL.md-compatible agent platform.
Skills in the registry are community starter templates provided as-is. skill.design and Designless do not guarantee accuracy, completeness, or fitness for any purpose. Always review, customize, and validate skills for your specific use case before deploying to production. You are responsible for the behavior of skills you install and use.