
Lesson 2: Dual Embeddings

Module 19: Design Space | Time: 8 min

Why Two Embeddings?

When you describe a hero section as "dark navy background, centered white heading, coral CTA button," that description could match hundreds of designs, and they all look different: different fonts, different spacing, different imagery, different moods.

Text (semantic) embeddings capture meaning — what the design is about. Visual embeddings capture appearance — what the design looks like.

Together they find patterns that either alone would miss.

Semantic Embeddings (1536d)

Generated through OpenRouter using OpenAI's text-embedding-3-small model. It takes the text description and produces a 1536-dimensional vector that represents its meaning.

What it catches:

Conceptual similarity: "trust section with testimonials" matches "social proof area with client quotes"
Design principles: "breathing room between sections" matches "generous whitespace for visual calm"
Pattern descriptions: "card grid with hover effect" matches "interactive card layout with motion"

What it misses:

Visual style: two "minimalist hero sections" could look completely different
Aesthetic quality: a well-designed card and a poorly-designed card might have identical descriptions
Color harmony: "navy and coral" is semantically similar to "navy and red" but aesthetically different
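
To make "matches" concrete, here is a minimal sketch of how two embedding vectors are typically compared. The lesson does not name the similarity metric, so cosine similarity is an assumption here, and the 3-dimensional vectors are toy stand-ins for real 1536-dimensional ones.

```typescript
// Cosine similarity between two embedding vectors of equal length.
// Assumption: the lesson does not name the metric; cosine is a common choice.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // a zero vector has no direction, so treat it as unrelated
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors (made up for illustration, not real embeddings).
const trustSection = [0.9, 0.1, 0.3];
const socialProof = [0.8, 0.2, 0.4];
const pricingTable = [0.1, 0.9, 0.2];

// The first pair points in nearly the same direction, so it scores higher.
console.log(cosineSimilarity(trustSection, socialProof));
console.log(cosineSimilarity(trustSection, pricingTable));
```

A score near 1 means the two descriptions point in almost the same direction in embedding space. That is also why two "minimalist hero sections" can score high here while looking nothing alike.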

When to Use Semantic Search

search_space({
  query: "mobile navigation for service sites with 4-6 actions"
})

Use when you're looking for conceptual patterns — approaches, solutions, principles.

Visual Embeddings (1024d)

Generated by Voyage AI using voyage-multimodal-3. It takes a screenshot and produces a 1024-dimensional vector that represents its visual appearance.

What it catches:

Layout similarity: two designs with the same grid structure match even if described differently
Color harmony: designs with similar palettes cluster together
Typography feel: designs with similar heading weights and sizes match
Compositional patterns: similar visual hierarchy, similar white space distribution

What it misses:

Intent and reasoning: why the design was made this way
Context: which project, which persona, which business goal
Transferability: whether the pattern works in other contexts

When to Use Visual Search

search_visual_similarity({
  image_base64: "[screenshot of your design]"
})

Use when you're looking for aesthetic matches — designs that look like what you're making.

Dual Search in Practice

Site Analysis

During site analysis, every section gets both embeddings:

Semantic: "Hero section with full-width navy background, centered Rubik Light heading at 48px, coral CTA with generous padding, confident calm tone"
Visual: Screenshot of the actual hero
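
As a rough sketch of what "both embeddings" could look like in storage, the record below pairs one description with its two vectors and checks the dimensions stated in this lesson. The type name, field names, and validation helper are hypothetical and are not the project's actual schema.

```typescript
// Dimensions stated in this lesson.
const SEMANTIC_DIMS = 1536;
const VISUAL_DIMS = 1024;

// Hypothetical record shape; field names are illustrative only.
interface SectionCapture {
  sectionId: string;
  description: string;      // text that feeds the semantic embedding
  semanticVector: number[]; // expected length: 1536
  visualVector?: number[];  // expected length: 1024; absent when no screenshot exists
}

// Returns a list of problems; an empty list means the record looks well-formed.
function validateCapture(capture: SectionCapture): string[] {
  const problems: string[] = [];
  if (capture.description.trim().length === 0) {
    problems.push("description is empty");
  }
  if (capture.semanticVector.length !== SEMANTIC_DIMS) {
    problems.push(`semantic vector has ${capture.semanticVector.length} dims, expected ${SEMANTIC_DIMS}`);
  }
  if (capture.visualVector !== undefined && capture.visualVector.length !== VISUAL_DIMS) {
    problems.push(`visual vector has ${capture.visualVector.length} dims, expected ${VISUAL_DIMS}`);
  }
  return problems;
}

// Helper that builds a placeholder vector of a given length.
function filledVector(length: number, value: number): number[] {
  const out: number[] = [];
  for (let i = 0; i < length; i++) {
    out.push(value);
  }
  return out;
}

const heroCapture: SectionCapture = {
  sectionId: "hero",
  description: "Hero section with full-width navy background, centered heading, coral CTA",
  semanticVector: filledVector(SEMANTIC_DIMS, 0),
  visualVector: filledVector(VISUAL_DIMS, 0),
};

console.log(validateCapture(heroCapture)); // no problems for a well-formed record
```

Keeping the visual vector optional reflects the rule that both embeddings are captured only when a screenshot is available.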

Later, an agent designing a new hero can search both ways:

"What hero patterns work for professional service sites?" (semantic)
"Find designs that look like this screenshot" (visual)

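When an agent runs both searches, the two result lists have to be read together somehow. The sketch below is one possible way to merge them, assuming each search returns a design ID and a similarity score. The hit shape, the function name, and the "found by both comes first" ordering are all assumptions for illustration, not how the search tools actually respond.

```typescript
// Hypothetical hit shape; not the tools' real response format.
interface SearchHit {
  designId: string;
  score: number; // similarity, higher is closer
}

interface MergedHit {
  designId: string;
  semanticScore?: number;
  visualScore?: number;
}

function mergeHits(semanticHits: SearchHit[], visualHits: SearchHit[]): MergedHit[] {
  // Prototype-free object so IDs like "constructor" cannot collide with built-ins.
  const byId: Record<string, MergedHit> = Object.create(null);
  for (const hit of semanticHits) {
    byId[hit.designId] = { designId: hit.designId, semanticScore: hit.score };
  }
  for (const hit of visualHits) {
    const existing = byId[hit.designId];
    if (existing !== undefined) {
      existing.visualScore = hit.score;
    } else {
      byId[hit.designId] = { designId: hit.designId, visualScore: hit.score };
    }
  }
  const foundByBoth = (h: MergedHit): number =>
    h.semanticScore !== undefined && h.visualScore !== undefined ? 1 : 0;
  const bestScore = (h: MergedHit): number =>
    Math.max(h.semanticScore ?? -Infinity, h.visualScore ?? -Infinity);
  // Designs found by both searches come first, then higher best score.
  return Object.keys(byId)
    .map((id) => byId[id])
    .sort((x, y) => foundByBoth(y) - foundByBoth(x) || bestScore(y) - bestScore(x));
}

// Made-up scores for illustration.
const semanticResults: SearchHit[] = [
  { designId: "hero-a", score: 0.91 },
  { designId: "hero-b", score: 0.84 },
];
const visualResults: SearchHit[] = [
  { designId: "hero-b", score: 0.88 },
  { designId: "hero-c", score: 0.95 },
];

console.log(mergeHits(semanticResults, visualResults).map((h) => h.designId));
// hero-b is listed first because both searches found it
```

Keeping the two scores separate, rather than averaging them, preserves the information about which kind of match each design is.
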
Feedback Loop

When the designer improves a design, both the before and after states get dual embeddings. This means the proactive improvement check works two ways:

"This design description sounds like something we improved before" (semantic)
"This design looks like something we improved before" (visual)

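The two-way check can be sketched as a pair of threshold tests against the "before" state of a past improvement. The thresholds and names below are hypothetical. The lesson does not specify cutoffs, so treat the numbers as placeholders to tune against real data.

```typescript
// Hypothetical cutoffs; the lesson does not give real values.
const SEMANTIC_THRESHOLD = 0.8;
const VISUAL_THRESHOLD = 0.85;

type ImprovementSignal = "semantic" | "visual";

// Scores are similarities between the current design and the "before" state
// of a past improvement. visualScore is undefined when no screenshot exists.
function improvementSignals(semanticScore: number, visualScore?: number): ImprovementSignal[] {
  const signals: ImprovementSignal[] = [];
  if (semanticScore >= SEMANTIC_THRESHOLD) {
    signals.push("semantic"); // "sounds like something we improved before"
  }
  if (visualScore !== undefined && visualScore >= VISUAL_THRESHOLD) {
    signals.push("visual"); // "looks like something we improved before"
  }
  return signals;
}

console.log(improvementSignals(0.9, 0.5)); // only the semantic signal fires
console.log(improvementSignals(0.4, 0.9)); // only the visual signal fires
```

Either signal alone is enough to surface the earlier improvement, which is the point of storing both embeddings for the before and after states.
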
The Rate Limit Reality

Visual embeddings via Voyage AI have rate limits:

Free tier (no payment method): 3 requests per minute
Free tier (with payment method): Standard rate limits with 200M free tokens

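At 3 requests per minute, the free tier allows at most one request every 20 seconds. In practice, this means waiting 25 seconds between visual captures to stay safely under the limit. This constraint actually helps: the forced pause creates time for writing more thoughtful descriptions.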
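
The pacing arithmetic can be sketched as a small helper that says how long to wait before the next capture. The function and constant names are hypothetical, and reading the gap between 20 and 25 seconds as safety headroom is an assumption.

```typescript
// Free tier without a payment method, as stated in this lesson.
const REQUESTS_PER_MINUTE = 3;

// Tightest spacing the limit allows: 60,000 ms / 3 = 20,000 ms.
const MIN_INTERVAL_MS = 60000 / REQUESTS_PER_MINUTE;

// The lesson's 25-second pause. Treating the extra 5 seconds as headroom is an assumption.
const CAPTURE_INTERVAL_MS = 25000;

// How long to wait before sending the next visual capture.
function msUntilNextCapture(lastRequestAtMs: number | null, nowMs: number): number {
  if (lastRequestAtMs === null) {
    return 0; // nothing sent yet, so the first capture can go immediately
  }
  const elapsedMs = nowMs - lastRequestAtMs;
  return Math.max(0, CAPTURE_INTERVAL_MS - elapsedMs);
}

// Rough waiting time for a batch: the first capture is immediate and each
// later one waits a full interval. Request latency is ignored.
function batchDurationMs(captureCount: number): number {
  return Math.max(0, captureCount - 1) * CAPTURE_INTERVAL_MS;
}

console.log(msUntilNextCapture(0, 10000)); // 15000
console.log(batchDurationMs(12)); // 275000
```

A 12-section site therefore needs a little over four and a half minutes of waiting, which is the time the lesson suggests spending on descriptions.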

Even on a paid tier, don't batch-capture without writing good descriptions. The semantic embedding is only as good as the text you give it.

Key Takeaway

Semantic search finds designs that mean the same thing. Visual search finds designs that look the same. Together they catch patterns that either alone would miss. Always capture both when screenshots are available.

