vt-c-markdown-diagram-processor¶

Extracts Mermaid diagram code blocks from markdown files, manages the diagram-to-image conversion workflow, and rebuilds markdown with properly scaled image references. Applies smart scaling based on image dimensions (for example, images taller than 700 px get 60% width and images taller than 600 px get 70%). Use when processing markdown documentation whose Mermaid code blocks must be converted to embedded images for PDF/Word output.

Plugin: core-standards
Category: Documentation Pipeline
Command: /vt-c-markdown-diagram-processor

Markdown Diagram Processor¶

Overview¶

This skill extracts Mermaid diagrams from markdown files and rebuilds the markdown with image references in their place. It applies smart scaling logic so that diagrams display properly in the final documents.

Core Functions¶

1. Extract Diagrams from Markdown¶

python scripts/extract_diagrams.py \
  --input docs/markdown/ \
  --output docs/markdown_mermaid/mermaid/ \
  --manifest extraction_manifest.json

What it does:
- Scans markdown files for ```mermaid code blocks
- Extracts each diagram to separate .mmd file
- Creates manifest with metadata
- Preserves diagram order and source references

Output:
- .mmd files: {source}_{NN}_{type}.mmd
- Manifest JSON with extraction details
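The extraction step described above can be sketched as follows. This is a minimal illustration, not the code in scripts/extract_diagrams.py: the function name `extract_diagrams`, the `TYPE_LABELS` mapping, and the fallback label `"diagram"` are assumptions made for the example. The fence string is built programmatically only so the example can sit inside a fenced block.

```python
import re

FENCE = "`" * 3  # three backticks, built here so the example nests cleanly

# Opening fence tagged "mermaid", lazy body, closing fence on its own line.
MERMAID_BLOCK = re.compile(
    r"^" + FENCE + r"mermaid[ \t]*\n(.*?)^" + FENCE + r"[ \t]*$",
    re.MULTILINE | re.DOTALL,
)

# Hypothetical mapping from a diagram's first keyword to a short type label.
TYPE_LABELS = {
    "flowchart": "flowchart",
    "graph": "flowchart",
    "sequenceDiagram": "sequence",
    "classDiagram": "class",
}


def extract_diagrams(markdown_text):
    """Return one record per mermaid block, in document order."""
    records = []
    for index, match in enumerate(MERMAID_BLOCK.finditer(markdown_text), start=1):
        code = match.group(1)
        stripped = code.strip()
        first_line = stripped.splitlines()[0].strip() if stripped else ""
        keyword = first_line.split()[0] if first_line else ""
        records.append({
            "index": index,
            "type": TYPE_LABELS.get(keyword, "diagram"),
            "first_line": first_line,
            # 1-based line numbers of the opening and closing fences.
            "line_start": markdown_text.count("\n", 0, match.start()) + 1,
            "line_end": markdown_text.count("\n", 0, match.end()) + 1,
            "code": code,
        })
    return records
```

Each record carries the fields that appear in the manifest (index, type, line range, first line), so writing the .mmd files and the manifest becomes a matter of iterating over the returned list.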

2. Rebuild Markdown with Images¶

python scripts/rebuild_with_images.py \
  --input docs/markdown/ \
  --output docs/markdown_mermaid/ \
  --images-dir images/

What it does:
- Replaces ```mermaid blocks with image references
- Applies smart scaling attributes
- Maintains all other markdown content
- Validates no mermaid blocks remain

Output:
- Modified markdown files with ![](images/diagram.png){width=70%} references
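The replacement step can be sketched roughly as below. This is an illustrative sketch, not the contents of scripts/rebuild_with_images.py: the function `rebuild_markdown`, its parameters, and the assumption that images are supplied in the same order as the blocks are all hypothetical.

```python
import re

FENCE = "`" * 3  # three backticks, built here so the example nests cleanly

MERMAID_BLOCK = re.compile(
    r"^" + FENCE + r"mermaid[ \t]*\n(.*?)^" + FENCE + r"[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def rebuild_markdown(markdown_text, image_names, images_dir="images",
                     scaling="{width=70%}"):
    """Replace the n-th mermaid block with a reference to the n-th image."""
    names = iter(image_names)

    def to_image_ref(match):
        try:
            name = next(names)
        except StopIteration:
            raise ValueError("more mermaid blocks than images") from None
        return f"![]({images_dir}/{name}){scaling}"

    rebuilt = MERMAID_BLOCK.sub(to_image_ref, markdown_text)
    # A leftover opening fence means a block was malformed and not replaced.
    if FENCE + "mermaid" in rebuilt:
        raise ValueError("mermaid blocks remain after rebuild")
    return rebuilt
```

Everything outside the matched blocks passes through untouched, which is what keeps the rest of the markdown content intact.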

Smart Scaling Logic¶

Automatic Dimension-Based Scaling¶

The skill analyzes generated PNG dimensions and applies appropriate scaling:

if height > 700:
    scaling = "{width=60%}"
elif height > 600:
    scaling = "{width=70%}"
elif width > 600:
    scaling = "{width=70%}"
else:
    scaling = "{width=80%}"

Why this works:
- Tall diagrams (height > 700 px): reduced to 60% width so they fit on the page
- Medium-height diagrams (height between 601 and 700 px): 70% width for readability
- Wide diagrams (width > 600 px, height at most 600 px): 70% width to prevent overflow
- Small diagrams (everything else): 80% width for clarity
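To make the rules concrete, here is a small self-contained sketch that reads the width and height from a PNG header and picks the scaling attribute. The thresholds come from the rules above; the function names `png_dimensions` and `scaling_for` are assumptions for illustration and may not match what scripts/analyze_dimensions.py actually does.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data):
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte "IHDR",
    # then width and height as big-endian unsigned 32-bit integers.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def scaling_for(width, height):
    """Apply the dimension-based rules; height is checked before width."""
    if height > 700:
        return "{width=60%}"
    if height > 600:
        return "{width=70%}"
    if width > 600:
        return "{width=70%}"
    return "{width=80%}"
```

Reading only the first 24 bytes of each file is enough for this, so no imaging library is needed for the dimension check.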

Manual Scaling Override¶

python scripts/rebuild_with_images.py \
  --scaling custom \
  --scaling-config scaling-rules.json

Custom scaling rules:

{
  "flowchart": {"width": "75%"},
  "sequence": {"width": "85%"},
  "class": {"width": "65%"},
  "default": {"width": "70%"}
}
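A rules file of this shape could be applied per diagram type along the following lines. This is a sketch under assumptions: the helper names are hypothetical, and falling back to the "default" entry for unlisted types is an assumed behavior, not something the skill documents.

```python
import json


def load_scaling_rules(config_text):
    """Parse the rules JSON and require a "default" entry."""
    rules = json.loads(config_text)
    if "default" not in rules:
        raise ValueError('scaling rules need a "default" entry')
    return rules


def custom_scaling(rules, diagram_type):
    """Return the attribute string for a type, falling back to "default"."""
    rule = rules.get(diagram_type, rules["default"])
    return "{width=" + rule["width"] + "}"
```

For example, with the rules shown above, a flowchart would get `{width=75%}` and a type that is not listed would get `{width=70%}`.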

Workflow Integration¶

Complete Pipeline¶

# Step 1: Extract diagrams
python scripts/extract_diagrams.py \
  --input docs/markdown/ \
  --output docs/markdown_mermaid/mermaid/

# Step 2: (External) Generate images with mermaid-to-images skill

# Step 3: Rebuild markdown with image references
python scripts/rebuild_with_images.py \
  --input docs/markdown/ \
  --output docs/markdown_mermaid/ \
  --images-dir images/

# Step 4: Validate
python scripts/validate_rebuild.py \
  --input docs/markdown_mermaid/

Validation¶

Mandatory Validation Checks¶

python scripts/validate_rebuild.py --input docs/markdown_mermaid/

Checks:
- ✓ No ```mermaid blocks remain
- ✓ All image references valid
- ✓ All images have scaling attributes
- ✓ Image files exist
- ✓ No broken links

Output:

{
  "status": "pass",
  "files_checked": 11,
  "mermaid_blocks_found": 0,
  "images_without_scaling": 0,
  "broken_image_links": 0,
  "issues": []
}
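A rough sketch of how such a report could be produced is shown below. It is not the code in scripts/validate_rebuild.py: the function name, the image-reference regex, the non-recursive `*.md` scan, and the `"fail"` status value are assumptions (the source only shows a passing report).

```python
import re
from pathlib import Path

FENCE = "`" * 3  # three backticks, built here so the example nests cleanly

# Image reference, optionally followed by a Pandoc-style attribute block.
IMAGE_REF = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)(\{[^}]*\})?")


def validate_rebuild(directory):
    """Check rebuilt markdown files and return a report dictionary."""
    report = {
        "status": "pass",
        "files_checked": 0,
        "mermaid_blocks_found": 0,
        "images_without_scaling": 0,
        "broken_image_links": 0,
        "issues": [],
    }
    for md_file in sorted(Path(directory).glob("*.md")):
        report["files_checked"] += 1
        text = md_file.read_text(encoding="utf-8")
        blocks = text.count(FENCE + "mermaid")
        if blocks:
            report["mermaid_blocks_found"] += blocks
            report["issues"].append(f"{md_file.name}: {blocks} mermaid block(s) remain")
        for target, attrs in IMAGE_REF.findall(text):
            if "width=" not in attrs:
                report["images_without_scaling"] += 1
                report["issues"].append(f"{md_file.name}: {target} has no scaling")
            # Image paths are resolved relative to the markdown file.
            if not (md_file.parent / target).is_file():
                report["broken_image_links"] += 1
                report["issues"].append(f"{md_file.name}: {target} not found")
    if report["issues"]:
        report["status"] = "fail"
    return report
```

Resolving each image path against the markdown file's own directory mirrors how the references are interpreted when the document is later converted.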

File Naming Convention¶

Extracted Diagram Files¶

Pattern: {source}_{index:02d}_{type}.mmd

Examples:
- architecture_01_flowchart.mmd
- architecture_02_sequence.mmd
- api-guide_01_flowchart.mmd

Image Files¶

Same naming as .mmd but with .png extension:
- architecture_01_flowchart.png
- architecture_02_sequence.png
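The naming pattern can be expressed as two small helpers. The names `mmd_name` and `png_name` are hypothetical and used only for illustration; the format string itself follows the pattern given above.

```python
from pathlib import Path


def mmd_name(source_file, index, diagram_type):
    """Build {source}_{index:02d}_{type}.mmd from a markdown file name."""
    stem = Path(source_file).stem  # drops the directory and the .md suffix
    return f"{stem}_{index:02d}_{diagram_type}.mmd"


def png_name(mmd_file):
    """Return the matching image file name for an extracted .mmd file."""
    return Path(mmd_file).with_suffix(".png").name
```

The zero-padded index keeps files in diagram order when a directory listing is sorted by name, at least for up to 99 diagrams per source file.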

Manifest Format¶

Extraction Manifest¶

{
  "extraction_timestamp": "2025-10-23T15:30:00Z",
  "source_directory": "docs/markdown",
  "output_directory": "docs/markdown_mermaid/mermaid",
  "total_files_processed": 11,
  "total_diagrams_extracted": 35,
  "files": [
    {
      "source_file": "architecture.md",
      "diagrams_found": 3,
      "diagrams": [
        {
          "index": 1,
          "type": "flowchart",
          "output_file": "mermaid/architecture_01_flowchart.mmd",
          "line_start": 45,
          "line_end": 67,
          "first_line": "flowchart TD"
        }
      ]
    }
  ]
}
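As a sketch of how the summary fields relate to the per-file entries, a manifest of this shape could be assembled as follows. The function `build_manifest` and its parameters are assumptions for illustration; in particular, deriving the totals from the listed files is an assumed convention.

```python
from datetime import datetime, timezone


def build_manifest(source_dir, output_dir, diagrams_by_file):
    """Assemble a manifest dict from {source_file: [diagram records]}."""
    files = [
        {
            "source_file": source_file,
            "diagrams_found": len(diagrams),
            "diagrams": diagrams,
        }
        for source_file, diagrams in diagrams_by_file.items()
    ]
    return {
        # UTC timestamp in the same format as the example manifest.
        "extraction_timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "source_directory": source_dir,
        "output_directory": output_dir,
        "total_files_processed": len(files),
        "total_diagrams_extracted": sum(f["diagrams_found"] for f in files),
        "files": files,
    }
```

The result is plain dictionaries and lists, so it can be written out with `json.dump` without further conversion.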

Integration Points¶

Works With¶

mermaid-diagrams-branded: Can apply branding to the extracted diagrams
mermaid-to-images: Receives the extracted .mmd files and converts them to images
document-converter-branded: Receives the rebuilt markdown with embedded images
docs-pipeline-orchestrator: Runs this skill as a core component of the full pipeline

Input¶

Markdown files with ```mermaid code blocks
Located in specified input directory

Output¶

Extracted .mmd files in mermaid/ subdirectory
Rebuilt markdown files with image references
Manifest JSON files for traceability

Error Handling¶

Common Issues¶

Issue: Malformed Mermaid Blocks

<!-- Wrong -->
```mermaid
flowchart TD
    Missing closing backticks

<!-- Correct -->
```mermaid
flowchart TD
    Start --> End
```

Solution:

python scripts/validate_syntax.py --input docs/markdown/problematic.md

Issue: UTF-8 Encoding Problems

# Auto-sanitize files
python scripts/sanitize_utf8.py --input docs/markdown/ --fix

Issue: Image Path Mismatch

Ensure the images/ directory sits inside the markdown output directory, so that the relative image references resolve:

docs/markdown_mermaid/
├── architecture.md         # Contains: ![](images/arch_01.png)
└── images/
    └── arch_01.png        # Must be in this location!

Best Practices¶

✅ Do¶

Extract to clean directory structure
Validate after each step
Use manifest files for traceability
Apply smart scaling (don't hardcode)
Check UTF-8 encoding before processing

❌ Don't¶

Mix mermaid blocks and image references in same file
Skip validation (causes downstream issues)
Hardcode absolute image paths
Modify .mmd files after extraction (use mermaid-diagrams-branded)
Process without backing up original markdown

File Structure¶

~/.claude/skills/vt-c-markdown-diagram-processor/
├── SKILL.md (this file)
└── scripts/
    ├── extract_diagrams.py         # Extract mermaid blocks
    ├── rebuild_with_images.py      # Replace with image refs
    ├── validate_rebuild.py         # Validation checks
    ├── sanitize_utf8.py            # Encoding fixes
    └── analyze_dimensions.py       # Smart scaling logic

Version History¶

v1.0 (2025-10-23): Initial release
- Smart scaling based on dimensions
- UTF-8 sanitization
- Comprehensive validation

References¶

Smart scaling algorithm: Based on subagents implementation
Markdown image syntax: CommonMark + Pandoc attributes
UTF-8 handling: Python codecs module