vt-c-markdown-diagram-processor

Extracts Mermaid diagram code blocks from markdown files, manages the diagram-to-image conversion workflow, and rebuilds the markdown with properly scaled image references. Applies smart scaling based on image dimensions (height > 700px gets 60% width; height > 600px or width > 600px gets 70%; everything else gets 80%). Use when processing markdown documentation whose Mermaid code blocks need to be converted to embedded images for PDF/Word output.

Plugin: core-standards
Category: Documentation Pipeline
Command: /vt-c-markdown-diagram-processor


Markdown Diagram Processor

Overview

This skill handles the extraction of Mermaid diagrams from markdown files and rebuilding markdown with image references. It implements smart scaling logic to ensure diagrams display properly in final documents.

Core Functions

1. Extract Diagrams from Markdown

python scripts/extract_diagrams.py \
  --input docs/markdown/ \
  --output docs/markdown_mermaid/mermaid/ \
  --manifest extraction_manifest.json

What it does:

  • Scans markdown files for ```mermaid code blocks
  • Extracts each diagram to a separate .mmd file
  • Creates a manifest with metadata
  • Preserves diagram order and source references

Output:

  • .mmd files: {source}_{NN}_{type}.mmd
  • Manifest JSON with extraction details
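The extraction step can be pictured roughly as follows. This is a minimal sketch, not the code in scripts/extract_diagrams.py: the function name `extract_mermaid_blocks`, the regular expression, and the rule that the diagram type is the first word of the block's first line are all assumptions made for illustration. The naming examples in this document use normalized types such as `sequence`, which this sketch does not attempt to reproduce.

```python
import re

FENCE = "`" * 3  # three backticks
# Assumed block shape: an opening mermaid fence line, a body, a closing fence line.
MERMAID_BLOCK = re.compile(
    rf"^{FENCE}mermaid[ \t]*\n(.*?)^{FENCE}[ \t]*$", re.DOTALL | re.MULTILINE
)


def extract_mermaid_blocks(markdown: str, source: str) -> list[dict]:
    """Return one record per mermaid block, in document order."""
    records = []
    for index, match in enumerate(MERMAID_BLOCK.finditer(markdown), start=1):
        body = match.group(1)
        lines = body.strip().splitlines()
        first_line = lines[0].strip() if lines else ""
        # Assumption: the type is the first word of the first line,
        # e.g. "flowchart TD" -> "flowchart".
        diagram_type = first_line.split()[0] if first_line else "unknown"
        records.append({
            "index": index,
            "type": diagram_type,
            "output_file": f"{source}_{index:02d}_{diagram_type}.mmd",
            "first_line": first_line,
            "code": body,
        })
    return records
```

Each record carries the fields a manifest entry would need; writing `code` to `output_file` and collecting the records per source file is left out to keep the sketch short.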

2. Rebuild Markdown with Images

python scripts/rebuild_with_images.py \
  --input docs/markdown/ \
  --output docs/markdown_mermaid/ \
  --images-dir images/

What it does:

  • Replaces ```mermaid blocks with image references
  • Applies smart scaling attributes
  • Maintains all other markdown content
  • Validates that no mermaid blocks remain

Output:

  • Modified markdown files with ![](images/diagram.png){width=70%} references
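A minimal sketch of the replacement step, assuming the image file names are already known in document order. The function `rebuild_markdown` and its parameters are hypothetical and only illustrate the idea; scripts/rebuild_with_images.py may work differently, and here a single scaling attribute is applied to every image for brevity.

```python
import re

FENCE = "`" * 3  # three backticks
MERMAID_BLOCK = re.compile(
    rf"^{FENCE}mermaid[ \t]*\n.*?^{FENCE}[ \t]*$", re.DOTALL | re.MULTILINE
)


def rebuild_markdown(markdown: str, image_names: list[str],
                     images_dir: str = "images",
                     scaling: str = "{width=70%}") -> str:
    """Replace the N-th mermaid block with a reference to the N-th image."""
    names = iter(image_names)

    def replace(match: re.Match) -> str:
        try:
            name = next(names)
        except StopIteration:
            raise ValueError("more mermaid blocks than images") from None
        return f"![]({images_dir}/{name}){scaling}"

    rebuilt = MERMAID_BLOCK.sub(replace, markdown)
    # An unclosed block is not matched by the pattern, so check for leftovers.
    if FENCE + "mermaid" in rebuilt:
        raise ValueError("mermaid block remains after rebuild")
    return rebuilt
```

Everything outside the matched blocks is passed through untouched, which is what keeps the rest of the markdown intact.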

Smart Scaling Logic

Automatic Dimension-Based Scaling

The skill analyzes generated PNG dimensions and applies appropriate scaling:

if height > 700:
    scaling = "{width=60%}"
elif height > 600:
    scaling = "{width=70%}"
elif width > 600:
    scaling = "{width=70%}"
else:
    scaling = "{width=80%}"

Why this works:

  • Tall diagrams (>700px height): Reduced to 60% width to fit pages
  • Medium diagrams (>600px height): 70% width for readability
  • Wide diagrams (>600px width): 70% width to prevent overflow
  • Small diagrams: 80% width for clarity
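As a self-contained sketch, the thresholds can be wrapped in a function and fed with dimensions read straight from the PNG header. The helper names `png_size` and `scaling_attribute` are hypothetical; reading the IHDR chunk with the standard library is just one way to get the dimensions and is not necessarily what scripts/analyze_dimensions.py does.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple[int, int]:
    """Read (width, height) from the IHDR chunk at the start of a PNG file."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR stores width and height as big-endian 32-bit unsigned integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def scaling_attribute(width: int, height: int) -> str:
    """Apply the dimension thresholds in the order given in this section."""
    if height > 700:
        return "{width=60%}"
    if height > 600:
        return "{width=70%}"
    if width > 600:
        return "{width=70%}"
    return "{width=80%}"
```

Under these rules an 800x650 image gets {width=70%} from the height check, while a 600x600 image falls through to {width=80%} because every comparison is strict.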

Manual Scaling Override

python scripts/rebuild_with_images.py \
  --scaling custom \
  --scaling-config scaling-rules.json

Custom scaling rules:

{
  "flowchart": {"width": "75%"},
  "sequence": {"width": "85%"},
  "class": {"width": "65%"},
  "default": {"width": "70%"}
}
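One plausible reading of this configuration is a per-type lookup that falls back to the "default" entry. The sketch below assumes exactly that; `load_scaling_rules` and `custom_scaling` are hypothetical names, and the fallback behavior is inferred from the key name rather than confirmed by the script.

```python
import json
from pathlib import Path


def load_scaling_rules(path: str) -> dict:
    """Load a rules file such as scaling-rules.json."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def custom_scaling(rules: dict, diagram_type: str) -> str:
    """Return the attribute for a diagram type, falling back to "default"."""
    rule = rules.get(diagram_type) or rules.get("default") or {}
    width = rule.get("width")
    # An empty string means no rule applied and no attribute is emitted.
    return f"{{width={width}}}" if width else ""
```

With the rules shown above, a flowchart would get {width=75%} and a type without its own entry, such as a Gantt chart, would get {width=70%}.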

Workflow Integration

Complete Pipeline

# Step 1: Extract diagrams
python scripts/extract_diagrams.py \
  --input docs/markdown/ \
  --output docs/markdown_mermaid/mermaid/

# Step 2: (External) Generate images with mermaid-to-images skill

# Step 3: Rebuild markdown with image references
python scripts/rebuild_with_images.py \
  --input docs/markdown/ \
  --output docs/markdown_mermaid/ \
  --images-dir images/

# Step 4: Validate
python scripts/validate_rebuild.py \
  --input docs/markdown_mermaid/

Validation

Mandatory Validation Checks

python scripts/validate_rebuild.py --input docs/markdown_mermaid/

Checks:

  • ✓ No ```mermaid blocks remain
  • ✓ All image references valid
  • ✓ All images have scaling attributes
  • ✓ Image files exist
  • ✓ No broken links

Output:

{
  "status": "pass",
  "files_checked": 11,
  "mermaid_blocks_found": 0,
  "images_without_scaling": 0,
  "broken_image_links": 0,
  "issues": []
}
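The checks behind such a report could look roughly like the sketch below, which validates a single markdown string and so omits files_checked. It is an illustration under stated assumptions, not the logic of scripts/validate_rebuild.py: the function name `validate_markdown`, the image-reference pattern, and the rule that every image is a local path relative to the markdown file are all hypothetical.

```python
import re
from pathlib import Path

FENCE = "`" * 3  # three backticks
# Image reference, optionally followed by a Pandoc-style attribute block.
IMAGE_REF = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)(\{[^}]*\})?")


def validate_markdown(text: str, base_dir: Path) -> dict:
    """Check one rebuilt markdown document and return a small report."""
    issues = []
    mermaid_blocks = sum(
        1 for line in text.splitlines()
        if line.lstrip().startswith(FENCE + "mermaid")
    )
    if mermaid_blocks:
        issues.append(f"{mermaid_blocks} mermaid block(s) remain")

    without_scaling = 0
    broken_links = 0
    for match in IMAGE_REF.finditer(text):
        target, attributes = match.group(1), match.group(2) or ""
        if "width=" not in attributes:
            without_scaling += 1
            issues.append(f"no scaling attribute: {target}")
        # Assumption: every image is a local path relative to base_dir.
        if not (base_dir / target).is_file():
            broken_links += 1
            issues.append(f"missing image file: {target}")

    return {
        "status": "fail" if issues else "pass",
        "mermaid_blocks_found": mermaid_blocks,
        "images_without_scaling": without_scaling,
        "broken_image_links": broken_links,
        "issues": issues,
    }
```

Running this over every file in the output directory and summing the counters would give a report of the same shape as the one above.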

File Naming Convention

Extracted Diagram Files

Pattern: {source}_{index:02d}_{type}.mmd

Examples:

  • architecture_01_flowchart.mmd
  • architecture_02_sequence.mmd
  • api-guide_01_flowchart.mmd

Image Files

Same naming as .mmd but with .png extension:

  • architecture_01_flowchart.png
  • architecture_02_sequence.png
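Because the two conventions differ only in the extension, a name can be split back into its parts and mapped from .mmd to .png mechanically. The helpers below are a hypothetical sketch; the pattern assumes the type contains only letters and the index has at least two digits.

```python
import re
from pathlib import Path

# {source}_{index:02d}_{type}, matched against the file name without extension.
NAME_PATTERN = re.compile(r"^(?P<source>.+)_(?P<index>\d{2,})_(?P<type>[A-Za-z]+)$")


def parse_diagram_name(filename: str) -> dict:
    """Split a diagram file name into source, index, and type."""
    match = NAME_PATTERN.match(Path(filename).stem)
    if match is None:
        raise ValueError(f"unexpected diagram file name: {filename}")
    return {
        "source": match["source"],
        "index": int(match["index"]),
        "type": match["type"],
    }


def image_name_for(mmd_name: str) -> str:
    """Map an .mmd file name (with or without a directory) to its .png name."""
    return Path(mmd_name).with_suffix(".png").name
```

Sources that contain hyphens, such as api-guide, parse cleanly because only the last two underscore-separated fields are treated as index and type.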

Manifest Format

Extraction Manifest

{
  "extraction_timestamp": "2025-10-23T15:30:00Z",
  "source_directory": "docs/markdown",
  "output_directory": "docs/markdown_mermaid/mermaid",
  "total_files_processed": 11,
  "total_diagrams_extracted": 35,
  "files": [
    {
      "source_file": "architecture.md",
      "diagrams_found": 3,
      "diagrams": [
        {
          "index": 1,
          "type": "flowchart",
          "output_file": "mermaid/architecture_01_flowchart.mmd",
          "line_start": 45,
          "line_end": 67,
          "first_line": "flowchart TD"
        }
      ]
    }
  ]
}
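A manifest in this shape lends itself to a quick consistency check before the next pipeline step. The sketch below is an assumption about how the counters relate to the lists (totals equal the number of listed entries); the function `manifest_problems` is hypothetical. The example above is abbreviated to one file and one diagram, so it would not pass this check as printed.

```python
def manifest_problems(manifest: dict) -> list[str]:
    """Return human-readable inconsistencies found in an extraction manifest."""
    problems = []
    files = manifest.get("files", [])
    # Assumption: every processed file has an entry in "files".
    if manifest.get("total_files_processed") != len(files):
        problems.append("total_files_processed does not match the files list")

    listed = 0
    for entry in files:
        diagrams = entry.get("diagrams", [])
        listed += len(diagrams)
        if entry.get("diagrams_found") != len(diagrams):
            problems.append(
                f"{entry.get('source_file')}: diagrams_found does not match the diagrams list"
            )

    if manifest.get("total_diagrams_extracted") != listed:
        problems.append("total_diagrams_extracted does not match the listed diagrams")
    return problems
```

An empty list means the counters and the listed entries agree.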

Integration Points

Works With

  • mermaid-diagrams-branded: Can apply branding to the extracted diagrams
  • mermaid-to-images: Receives the extracted .mmd files for conversion to images
  • document-converter-branded: Receives the rebuilt markdown with embedded images
  • docs-pipeline-orchestrator: Runs this skill as a core component of the full pipeline

Input

  • Markdown files with ```mermaid code blocks
  • Located in specified input directory

Output

  • Extracted .mmd files in mermaid/ subdirectory
  • Rebuilt markdown files with image references
  • Manifest JSON files for traceability

Error Handling

Common Issues

Issue: Malformed Mermaid Blocks

<!-- Wrong -->
```mermaid
flowchart TD
    Missing closing backticks

<!-- Correct -->
```mermaid
flowchart TD
    Start --> End
```

Solution:

python scripts/validate_syntax.py --input docs/markdown/problematic.md

Issue: UTF-8 Encoding Problems

# Auto-sanitize files
python scripts/sanitize_utf8.py --input docs/markdown/ --fix
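Before running the sanitizer it can help to see which files are affected. The check below is a minimal sketch using only strict decoding from the standard library; `utf8_error` is a hypothetical helper and says nothing about how scripts/sanitize_utf8.py repairs files.

```python
from typing import Optional


def utf8_error(data: bytes) -> Optional[str]:
    """Describe the first invalid byte sequence, or return None if data is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"invalid UTF-8 at byte {exc.start}: {exc.reason}"
    return None
```

Applied to the bytes of each markdown file (for example via `Path.read_bytes()`), this reports the offset of the first problem without modifying anything.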

Issue: Image Path Mismatch

Ensure images/ directory is relative to markdown output location:

docs/markdown_mermaid/
├── architecture.md         # Contains: ![](images/arch_01.png)
└── images/
    └── arch_01.png        # Must be in this location!

Best Practices

✅ Do

  • Extract to clean directory structure
  • Validate after each step
  • Use manifest files for traceability
  • Apply smart scaling (don't hardcode)
  • Check UTF-8 encoding before processing

❌ Don't

  • Mix mermaid blocks and image references in same file
  • Skip validation (causes downstream issues)
  • Hardcode absolute image paths
  • Modify .mmd files after extraction (use mermaid-diagrams-branded)
  • Process without backing up original markdown

File Structure

~/.claude/skills/vt-c-markdown-diagram-processor/
├── SKILL.md (this file)
└── scripts/
    ├── extract_diagrams.py         # Extract mermaid blocks
    ├── rebuild_with_images.py      # Replace with image refs
    ├── validate_rebuild.py         # Validation checks
    ├── sanitize_utf8.py            # Encoding fixes
    └── analyze_dimensions.py       # Smart scaling logic

Version History

  • v1.0 (2025-10-23): Initial release
      • Smart scaling based on dimensions
      • UTF-8 sanitization
      • Comprehensive validation

References

  • Smart scaling algorithm: Based on subagents implementation
  • Markdown image syntax: CommonMark + Pandoc attributes
  • UTF-8 handling: Python codecs module