Init
This commit is contained in:
287
AGENTS.md
Normal file
287
AGENTS.md
Normal file
@@ -0,0 +1,287 @@
|
|||||||
|
# AGENTS.md
|
||||||
|
|
||||||
|
## Purpose
|
||||||
|
This agent is used for **website QA/testing** of real pages in a browser environment.
|
||||||
|
|
||||||
|
The testing must be performed with **Playwright**.
|
||||||
|
Visual page inspection must be performed with the **playwright-screenshot-inspector** skill.
|
||||||
|
The agent must check the **real rendered page**, not just raw HTML.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Required tools and skills
|
||||||
|
- **Playwright** — for browser automation, page opening, interaction, form submission, and response inspection.
|
||||||
|
- **playwright-screenshot-inspector** — for screenshot-based visual inspection of the rendered page.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## General testing rules
|
||||||
|
1. Always test the **real live page URL** provided by the user.
|
||||||
|
2. Always test both:
|
||||||
|
- **Desktop / PC viewport**
|
||||||
|
- **Mobile viewport**
|
||||||
|
3. Always capture screenshots for both desktop and mobile.
|
||||||
|
4. Compare the actual rendered page against the screen/reference provided by the user.
|
||||||
|
5. **All interactive UI elements must be tested manually via interaction.**
|
||||||
|
6. Report all findings clearly and classify issues by severity:
|
||||||
|
- **Critical**
|
||||||
|
- **Major**
|
||||||
|
- **Minor**
|
||||||
|
7. Do not stop after the first issue. Complete the full checklist.
|
||||||
|
8. At the end of testing, create a **summary HTML file** with all findings.
|
||||||
|
9. All results and summaries sould be in Russian language.
|
||||||
|
10. Split form submission for 2 parts
|
||||||
|
First is forms submition summary results and accessability in table form
|
||||||
|
Second is details in hidden layout
|
||||||
|
|
||||||
|
### Configuration-driven page selection
|
||||||
|
- Use `pages.json` in the workspace root as the default source of primary pages and visual references.
|
||||||
|
- If the user asks to **just check the site** without specifying a single page, run the full list from `pages.json`.
|
||||||
|
- If the user provides a **specific page URL**, first prepare to test that exact URL.
|
||||||
|
- In that case, ask one short question: whether to also test the primary pages from `pages.json`.
|
||||||
|
- If the user answers **yes**, test:
|
||||||
|
- the explicit URL from the user request
|
||||||
|
- the pages from `pages.json`
|
||||||
|
- Deduplicate identical URLs before running checks.
|
||||||
|
- If the user answers **no**, test only the explicit URL from the request.
|
||||||
|
- For visual comparison, use the reference data from `pages.json` when present:
|
||||||
|
- `desktop_screenshot`
|
||||||
|
- `mobile_screenshot`
|
||||||
|
- `figma_url`
|
||||||
|
- If a page in `pages.json` has no reference assets yet, still test the real page and clearly note that visual comparison is limited by missing reference materials.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Required order of checks
|
||||||
|
|
||||||
|
### 1. Visual page and UI interaction check
|
||||||
|
The user will provide:
|
||||||
|
- a page URL
|
||||||
|
- a reference screen/mockup/screenshot
|
||||||
|
|
||||||
|
#### 1.1 Open and inspect the real page
|
||||||
|
- Open the provided page in Playwright.
|
||||||
|
- Wait until the page is fully loaded and visually stable.
|
||||||
|
- Test in:
|
||||||
|
- Desktop viewport
|
||||||
|
- Mobile viewport
|
||||||
|
|
||||||
|
#### 1.2 Capture screenshots
|
||||||
|
- Take full-page screenshots for both desktop and mobile.
|
||||||
|
- Use **playwright-screenshot-inspector** to inspect the screenshots.
|
||||||
|
|
||||||
|
#### 1.3 Compare with the provided reference
|
||||||
|
Check for differences between:
|
||||||
|
- the real page
|
||||||
|
- the provided screen/reference
|
||||||
|
|
||||||
|
Look for:
|
||||||
|
- incorrect layout structure
|
||||||
|
- broken blocks
|
||||||
|
- missing elements
|
||||||
|
- different text/content placement
|
||||||
|
- wrong fonts, sizes, spacing, alignment
|
||||||
|
- overlap or clipping
|
||||||
|
- responsive/mobile issues
|
||||||
|
- hidden or misplaced elements
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 1.4 UI interaction testing (MANDATORY)
|
||||||
|
If any interactive UI elements are present, they **must be tested**.
|
||||||
|
|
||||||
|
This includes (but is not limited to):
|
||||||
|
- buttons
|
||||||
|
- links (especially JS-triggered)
|
||||||
|
- popup / modal windows
|
||||||
|
- forms opened via buttons
|
||||||
|
- dropdowns
|
||||||
|
- accordions
|
||||||
|
- tabs
|
||||||
|
- sliders/carousels
|
||||||
|
- burger menus (mobile)
|
||||||
|
- tooltips
|
||||||
|
- filters/sort controls
|
||||||
|
|
||||||
|
For each UI element:
|
||||||
|
- trigger the interaction (click, hover, input, etc.)
|
||||||
|
- verify it works as expected
|
||||||
|
- verify correct content is shown
|
||||||
|
- verify no visual breakage occurs
|
||||||
|
- verify no JS errors appear
|
||||||
|
- verify behavior on both desktop and mobile
|
||||||
|
|
||||||
|
Check for:
|
||||||
|
- element not clickable
|
||||||
|
- no reaction on click
|
||||||
|
- wrong content shown
|
||||||
|
- broken animation or layout
|
||||||
|
- popup not opening/closing
|
||||||
|
- overlay issues (scroll lock, z-index bugs)
|
||||||
|
- elements hidden or cut off on mobile
|
||||||
|
|
||||||
|
Any broken interaction must be reported.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 1.5 Produce visual + UI summary
|
||||||
|
Create an **HTML summary file** containing:
|
||||||
|
- page URL
|
||||||
|
- test date/time
|
||||||
|
- desktop screenshot
|
||||||
|
- mobile screenshot
|
||||||
|
- list of visual differences
|
||||||
|
- list of UI interaction issues
|
||||||
|
- severity for each issue
|
||||||
|
- short conclusion
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 2. Server response and robots/indexing checks
|
||||||
|
|
||||||
|
#### 2.1 Check HTTP response
|
||||||
|
Inspect:
|
||||||
|
- HTTP status code
|
||||||
|
- redirects
|
||||||
|
- final URL
|
||||||
|
- response headers
|
||||||
|
|
||||||
|
#### 2.2 Check for critical indexing blockers
|
||||||
|
These are **Critical** if found unexpectedly:
|
||||||
|
- `X-Robots-Tag: noindex`
|
||||||
|
- `X-Robots-Tag: nofollow`
|
||||||
|
- `meta name="robots" content="noindex"`
|
||||||
|
- `meta name="robots" content="nofollow"`
|
||||||
|
- restrictive `googlebot` meta
|
||||||
|
- incorrect canonical
|
||||||
|
- unexpected redirects
|
||||||
|
|
||||||
|
Any directive blocking indexing must be marked **Critical**.
|
||||||
|
|
||||||
|
#### 2.3 Check meta and SEO basics
|
||||||
|
Inspect:
|
||||||
|
- `<title>`
|
||||||
|
- meta description
|
||||||
|
- robots meta tag
|
||||||
|
- canonical
|
||||||
|
- viewport
|
||||||
|
- hreflang (if present)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 3. Form testing
|
||||||
|
|
||||||
|
#### 3.1 Detect forms
|
||||||
|
Find all forms including:
|
||||||
|
- visible forms
|
||||||
|
- popup forms
|
||||||
|
- dynamically opened forms
|
||||||
|
|
||||||
|
#### 3.2 Test submission
|
||||||
|
For each form:
|
||||||
|
- fill required fields with valid data
|
||||||
|
- submit form
|
||||||
|
- observe frontend validation
|
||||||
|
- inspect server/network response
|
||||||
|
|
||||||
|
#### 3.3 Validate behavior
|
||||||
|
Check:
|
||||||
|
- submit works
|
||||||
|
- validation messages appear correctly
|
||||||
|
- incorrect input shows errors
|
||||||
|
- success message is shown
|
||||||
|
- no JS errors
|
||||||
|
- no duplicate requests
|
||||||
|
- backend response is correct
|
||||||
|
|
||||||
|
#### 3.4 Report results
|
||||||
|
For each form include:
|
||||||
|
- selector/name
|
||||||
|
- fields
|
||||||
|
- result (success / error / broken)
|
||||||
|
- server response
|
||||||
|
- severity
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Output format
|
||||||
|
|
||||||
|
### 1. Text summary
|
||||||
|
- overall result
|
||||||
|
- issue count by severity
|
||||||
|
- visual correctness
|
||||||
|
- UI interaction status
|
||||||
|
- indexing status
|
||||||
|
- form functionality status
|
||||||
|
|
||||||
|
### 2. HTML report
|
||||||
|
Must include:
|
||||||
|
- URL
|
||||||
|
- screenshots (desktop + mobile)
|
||||||
|
- visual comparison
|
||||||
|
- UI interaction results
|
||||||
|
- SEO/indexing checks
|
||||||
|
- form testing results
|
||||||
|
- issue list with severity
|
||||||
|
- conclusion
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Severity rules
|
||||||
|
|
||||||
|
### Critical
|
||||||
|
- noindex / nofollow / X-Robots-Tag blocking indexing
|
||||||
|
- page inaccessible
|
||||||
|
- wrong redirect
|
||||||
|
- core UI not working
|
||||||
|
- key buttons or popups broken
|
||||||
|
- key form not submitting
|
||||||
|
|
||||||
|
### Major
|
||||||
|
- broken UI interactions
|
||||||
|
- major layout mismatch
|
||||||
|
- mobile UX broken
|
||||||
|
- validation issues
|
||||||
|
- backend form errors
|
||||||
|
|
||||||
|
### Minor
|
||||||
|
- small visual differences
|
||||||
|
- spacing/alignment issues
|
||||||
|
- minor UI glitches
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Example workflow
|
||||||
|
1. Open page (desktop)
|
||||||
|
2. Screenshot
|
||||||
|
3. Open page (mobile)
|
||||||
|
4. Screenshot
|
||||||
|
5. Inspect with playwright-screenshot-inspector
|
||||||
|
6. Compare with reference
|
||||||
|
7. Test ALL UI interactions (buttons, popups, tabs, etc.)
|
||||||
|
8. Check response headers and meta tags
|
||||||
|
9. Detect and test all forms
|
||||||
|
10. Generate HTML report
|
||||||
|
11. Return summary
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Important constraints
|
||||||
|
- Do not skip UI interaction testing
|
||||||
|
- Do not skip mobile
|
||||||
|
- Do not skip screenshots
|
||||||
|
- Do not skip form submissions
|
||||||
|
- Always treat indexing blockers as **Critical**
|
||||||
|
- Always generate HTML report
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Example instruction for the agent
|
||||||
|
Test the provided page using Playwright.
|
||||||
|
Use playwright-screenshot-inspector for visual validation.
|
||||||
|
|
||||||
|
Follow this order:
|
||||||
|
1. Compare real page vs reference (desktop + mobile) and record differences.
|
||||||
|
2. Test ALL UI elements (buttons, popups, accordions, tabs, etc.).
|
||||||
|
3. Check server response, meta tags, and indexing blockers (Critical if present).
|
||||||
|
4. Find and submit all forms, validate responses and errors.
|
||||||
68
pages.json
Normal file
68
pages.json
Normal file
@@ -0,0 +1,68 @@
|
|||||||
|
{
|
||||||
|
"version": 1,
|
||||||
|
"description": "Primary pages and visual references for default website QA runs.",
|
||||||
|
"sites": [
|
||||||
|
{
|
||||||
|
"site_id": "prombez-good-production",
|
||||||
|
"site_name": "Prombez Good Production",
|
||||||
|
"base_url": "https://prombez.cp.good-production.xyz",
|
||||||
|
"pages": [
|
||||||
|
{
|
||||||
|
"id": "home",
|
||||||
|
"title": "Главная",
|
||||||
|
"url": "https://prombez.cp.good-production.xyz/",
|
||||||
|
"reference": {
|
||||||
|
"desktop_screenshot": null,
|
||||||
|
"mobile_screenshot": null,
|
||||||
|
"figma_url": null,
|
||||||
|
"notes": "Fill in approved reference assets for homepage."
|
||||||
|
}
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "test-search-landing",
|
||||||
|
"title": "Лендинг test-search",
|
||||||
|
"url": "https://prombez.cp.good-production.xyz/test-search/",
|
||||||
|
"reference": {
|
||||||
|
"desktop_screenshot": null,
|
||||||
|
"mobile_screenshot": null,
|
||||||
|
"figma_url": null,
|
||||||
|
"notes": "Use the approved desktop/mobile references for this landing page when available."
|
||||||
|
}
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "products",
|
||||||
|
"title": "Каталог продуктов",
|
||||||
|
"url": "https://prombez.cp.good-production.xyz/prod/",
|
||||||
|
"reference": {
|
||||||
|
"desktop_screenshot": null,
|
||||||
|
"mobile_screenshot": null,
|
||||||
|
"figma_url": null,
|
||||||
|
"notes": "Fill in approved reference assets for the products page."
|
||||||
|
}
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "history",
|
||||||
|
"title": "Истории успеха",
|
||||||
|
"url": "https://prombez.cp.good-production.xyz/history/",
|
||||||
|
"reference": {
|
||||||
|
"desktop_screenshot": null,
|
||||||
|
"mobile_screenshot": null,
|
||||||
|
"figma_url": null,
|
||||||
|
"notes": "Fill in approved reference assets for the success stories page."
|
||||||
|
}
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "contacts",
|
||||||
|
"title": "Контакты",
|
||||||
|
"url": "https://prombez.cp.good-production.xyz/company/contacts/",
|
||||||
|
"reference": {
|
||||||
|
"desktop_screenshot": null,
|
||||||
|
"mobile_screenshot": null,
|
||||||
|
"figma_url": null,
|
||||||
|
"notes": "Fill in approved reference assets for the contacts page."
|
||||||
|
}
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
434
skills/playwright-screenshot-inspector/SKILL.md
Normal file
434
skills/playwright-screenshot-inspector/SKILL.md
Normal file
@@ -0,0 +1,434 @@
|
|||||||
|
---
|
||||||
|
name: playwright-screenshot-inspector
|
||||||
|
description: LLM-powered visual testing expert for automated screenshot capture, analysis, and UI verification using Playwright with multimodal AI inspection.
|
||||||
|
metadata:
|
||||||
|
category: Testing
|
||||||
|
tags:
|
||||||
|
- playwright
|
||||||
|
- visual-testing
|
||||||
|
- screenshots
|
||||||
|
- ui-verification
|
||||||
|
- automation
|
||||||
|
pairs-with:
|
||||||
|
- skill: playwright-e2e-tester
|
||||||
|
reason: Visual regression testing extends E2E test suites with screenshot comparison
|
||||||
|
- skill: webapp-testing
|
||||||
|
reason: Screenshot inspection automates the visual verification that interactive testing does manually
|
||||||
|
- skill: color-contrast-auditor
|
||||||
|
reason: Automated screenshot analysis can detect contrast violations across UI states
|
||||||
|
---
|
||||||
|
|
||||||
|
# Playwright Screenshot Inspector
|
||||||
|
|
||||||
|
LLM-powered visual testing expert for automated screenshot capture, analysis, and UI verification using Playwright with multimodal AI inspection.
|
||||||
|
|
||||||
|
## Activation Triggers
|
||||||
|
|
||||||
|
**Activate on:**
|
||||||
|
- "screenshot test", "visual test", "screenshot inspection"
|
||||||
|
- "playwright headless", "playwright screenshot"
|
||||||
|
- "UI verification", "visual regression"
|
||||||
|
- "theme compliance test", "dark mode test", "light mode test"
|
||||||
|
- "automated screenshot", "capture and analyze"
|
||||||
|
- "compare screenshots", "visual diff"
|
||||||
|
|
||||||
|
**NOT for:**
|
||||||
|
- Simple one-off screenshots (use browser DevTools)
|
||||||
|
- Pixel-perfect comparison without AI (use native Playwright `toHaveScreenshot`)
|
||||||
|
- Non-web UI testing (use platform-specific tools)
|
||||||
|
- Performance testing (use Lighthouse/WebPageTest)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Core Philosophy
|
||||||
|
|
||||||
|
Traditional visual testing compares pixels. **LLM-powered visual testing understands semantics.**
|
||||||
|
|
||||||
|
Instead of "these 50 pixels changed", LLM inspection answers:
|
||||||
|
- "Is the content actually rendered?"
|
||||||
|
- "Does the theme switch correctly?"
|
||||||
|
- "Are interactive elements visible and properly styled?"
|
||||||
|
- "What's broken vs. what's just different?"
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## The Screenshot Inspection Loop
|
||||||
|
|
||||||
|
```
|
||||||
|
┌─────────────────────────────────────────────────────────────┐
|
||||||
|
│ LLM SCREENSHOT INSPECTION │
|
||||||
|
├─────────────────────────────────────────────────────────────┤
|
||||||
|
│ │
|
||||||
|
│ 1. CAPTURE (Playwright) │
|
||||||
|
│ └─► Wait for React hydration, not just network │
|
||||||
|
│ │
|
||||||
|
│ 2. READ (Claude vision) │
|
||||||
|
│ └─► Pass screenshot to LLM with specific questions │
|
||||||
|
│ │
|
||||||
|
│ 3. ANALYZE (Structured response) │
|
||||||
|
│ └─► Extract: content present? theme correct? errors? │
|
||||||
|
│ │
|
||||||
|
│ 4. ACT (Conditional logic) │
|
||||||
|
│ └─► Pass/fail based on semantic understanding │
|
||||||
|
│ │
|
||||||
|
└─────────────────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Critical: Waiting for React Content
|
||||||
|
|
||||||
|
**The #1 failure mode**: Taking screenshots before React hydrates.
|
||||||
|
|
||||||
|
### Anti-Pattern: Network Idle Alone
|
||||||
|
```python
|
||||||
|
# ❌ WRONG - React may not have rendered yet
|
||||||
|
page.goto(url)
|
||||||
|
page.wait_for_load_state('networkidle')
|
||||||
|
page.screenshot(path='broken.png') # Often blank!
|
||||||
|
```
|
||||||
|
|
||||||
|
### Correct Pattern: Wait for Actual Content
|
||||||
|
```python
|
||||||
|
# ✅ CORRECT - Wait for React to mount
|
||||||
|
page.goto(url, wait_until='domcontentloaded')
|
||||||
|
page.wait_for_load_state('networkidle')
|
||||||
|
|
||||||
|
# Give React time to hydrate
|
||||||
|
import time
|
||||||
|
time.sleep(0.5)
|
||||||
|
|
||||||
|
# Wait for actual content selector
|
||||||
|
page.wait_for_selector('.main-content, h1, [data-testid="app"]',
|
||||||
|
state='visible',
|
||||||
|
timeout=10000)
|
||||||
|
|
||||||
|
# Verify content exists
|
||||||
|
body_text = page.locator('body').inner_text()
|
||||||
|
if len(body_text) < 50:
|
||||||
|
time.sleep(2) # Extra wait for slow hydration
|
||||||
|
|
||||||
|
page.screenshot(path='good.png', full_page=True)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Content Verification Function
|
||||||
|
```python
|
||||||
|
def wait_for_react_content(page, selectors, timeout=10000):
|
||||||
|
"""Wait for React to hydrate by checking for actual content."""
|
||||||
|
page.wait_for_load_state('domcontentloaded')
|
||||||
|
page.wait_for_load_state('networkidle')
|
||||||
|
time.sleep(0.5) # React hydration buffer
|
||||||
|
|
||||||
|
for selector in selectors.split(','):
|
||||||
|
try:
|
||||||
|
locator = page.locator(selector.strip())
|
||||||
|
if locator.count() > 0:
|
||||||
|
locator.first.wait_for(state='visible', timeout=timeout)
|
||||||
|
return True
|
||||||
|
except:
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Fallback: wait for substantial body content
|
||||||
|
try:
|
||||||
|
page.wait_for_function(
|
||||||
|
'document.body.innerText.length > 100',
|
||||||
|
timeout=timeout
|
||||||
|
)
|
||||||
|
return True
|
||||||
|
except:
|
||||||
|
return False
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Headless Mode: Preventing Window Spam
|
||||||
|
|
||||||
|
**Always use `headless=True`** to prevent browser windows from spawning:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from playwright.sync_api import sync_playwright
|
||||||
|
|
||||||
|
with sync_playwright() as p:
|
||||||
|
# CRITICAL: headless=True prevents visible browser windows
|
||||||
|
browser = p.chromium.launch(headless=True)
|
||||||
|
|
||||||
|
context = browser.new_context(
|
||||||
|
viewport={'width': 1280, 'height': 800},
|
||||||
|
color_scheme='dark' # Initial theme
|
||||||
|
)
|
||||||
|
page = context.new_page()
|
||||||
|
|
||||||
|
# ... your test logic ...
|
||||||
|
|
||||||
|
browser.close() # Always clean up
|
||||||
|
```
|
||||||
|
|
||||||
|
### Theme Testing Pattern
|
||||||
|
```python
|
||||||
|
# Dark mode screenshot
|
||||||
|
page.emulate_media(color_scheme='dark') # Note: on PAGE, not context
|
||||||
|
page.goto(url)
|
||||||
|
wait_for_react_content(page, '.app-container, main, h1')
|
||||||
|
page.screenshot(path='dark.png', full_page=True)
|
||||||
|
|
||||||
|
# Light mode screenshot
|
||||||
|
page.emulate_media(color_scheme='light')
|
||||||
|
page.reload()
|
||||||
|
wait_for_react_content(page, '.app-container, main, h1')
|
||||||
|
page.screenshot(path='light.png', full_page=True)
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## LLM Screenshot Analysis Patterns
|
||||||
|
|
||||||
|
### Pattern 1: Content Verification
|
||||||
|
```
|
||||||
|
Prompt: "Analyze this screenshot. Answer:
|
||||||
|
1. Is the main content rendered (not blank/loading)?
|
||||||
|
2. What major UI elements are visible?
|
||||||
|
3. Are there any error states or broken layouts?
|
||||||
|
4. Rate content completeness: FULL / PARTIAL / EMPTY"
|
||||||
|
```
|
||||||
|
|
||||||
|
### Pattern 2: Theme Compliance
|
||||||
|
```
|
||||||
|
Prompt: "This is a {dark/light} mode screenshot. Verify:
|
||||||
|
1. Background color matches expected theme (dark bg for dark mode)
|
||||||
|
2. Text has sufficient contrast against background
|
||||||
|
3. Interactive elements are visible and styled correctly
|
||||||
|
4. No theme leakage (dark elements on light bg or vice versa)"
|
||||||
|
```
|
||||||
|
|
||||||
|
### Pattern 3: Comparison Analysis
|
||||||
|
```
|
||||||
|
Prompt: "Compare these two screenshots (before/after). Identify:
|
||||||
|
1. What changed between them?
|
||||||
|
2. Are changes intentional (theme switch) or bugs?
|
||||||
|
3. Is any content missing in the 'after' version?
|
||||||
|
4. Rate similarity: IDENTICAL / MINOR_DIFF / MAJOR_DIFF / BROKEN"
|
||||||
|
```
|
||||||
|
|
||||||
|
### Pattern 4: Accessibility Check
|
||||||
|
```
|
||||||
|
Prompt: "Evaluate this screenshot for visual accessibility:
|
||||||
|
1. Is text readable (sufficient size and contrast)?
|
||||||
|
2. Are interactive elements clearly identifiable?
|
||||||
|
3. Is there visual hierarchy (headings, sections)?
|
||||||
|
4. Any elements that would fail WCAG contrast requirements?"
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Complete Test Script Template
|
||||||
|
|
||||||
|
```python
|
||||||
|
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
LLM-Powered Screenshot Test Suite
|
||||||
|
Captures screenshots and uses Claude vision for semantic analysis.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from playwright.sync_api import sync_playwright
|
||||||
|
import os
|
||||||
|
import time
|
||||||
|
|
||||||
|
PAGES_TO_TEST = [
|
||||||
|
# (path, name, content_selectors)
|
||||||
|
('/', 'Home', '.hero, main, h1'),
|
||||||
|
('/about', 'About', '.about-content, main, h1'),
|
||||||
|
('/dashboard', 'Dashboard', '.dashboard, .stats, h1'),
|
||||||
|
]
|
||||||
|
|
||||||
|
BASE_URL = 'http://localhost:5173'
|
||||||
|
SCREENSHOT_DIR = '/tmp/visual-tests'
|
||||||
|
|
||||||
|
|
||||||
|
def wait_for_content(page, selectors, timeout=10000):
|
||||||
|
"""Wait for React/Vue/Svelte to hydrate."""
|
||||||
|
page.wait_for_load_state('domcontentloaded')
|
||||||
|
page.wait_for_load_state('networkidle')
|
||||||
|
time.sleep(0.5)
|
||||||
|
|
||||||
|
for selector in selectors.split(','):
|
||||||
|
try:
|
||||||
|
loc = page.locator(selector.strip())
|
||||||
|
if loc.count() > 0:
|
||||||
|
loc.first.wait_for(state='visible', timeout=timeout)
|
||||||
|
return True
|
||||||
|
except:
|
||||||
|
continue
|
||||||
|
|
||||||
|
try:
|
||||||
|
page.wait_for_function('document.body.innerText.length > 100', timeout=timeout)
|
||||||
|
return True
|
||||||
|
except:
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
|
def capture_themed_screenshots(page, url, name, selectors):
|
||||||
|
"""Capture both dark and light mode screenshots."""
|
||||||
|
safe_name = name.lower().replace(' ', '-')
|
||||||
|
results = {'name': name, 'url': url}
|
||||||
|
|
||||||
|
for theme in ['dark', 'light']:
|
||||||
|
page.emulate_media(color_scheme=theme)
|
||||||
|
|
||||||
|
if theme == 'dark':
|
||||||
|
page.goto(url, wait_until='domcontentloaded')
|
||||||
|
else:
|
||||||
|
page.reload(wait_until='domcontentloaded')
|
||||||
|
|
||||||
|
content_loaded = wait_for_content(page, selectors)
|
||||||
|
|
||||||
|
if not content_loaded:
|
||||||
|
print(f" ⚠️ {theme} mode: Content slow to load, waiting...")
|
||||||
|
time.sleep(2)
|
||||||
|
|
||||||
|
screenshot_path = f'{SCREENSHOT_DIR}/{safe_name}-{theme}.png'
|
||||||
|
page.screenshot(path=screenshot_path, full_page=True)
|
||||||
|
|
||||||
|
# Check content length
|
||||||
|
body_text = page.locator('body').inner_text().strip()
|
||||||
|
results[f'{theme}_screenshot'] = screenshot_path
|
||||||
|
results[f'{theme}_content_length'] = len(body_text)
|
||||||
|
results[f'{theme}_has_content'] = len(body_text) > 50
|
||||||
|
|
||||||
|
print(f" {theme}: {'✅' if results[f'{theme}_has_content'] else '❌'} ({len(body_text)} chars)")
|
||||||
|
|
||||||
|
return results
|
||||||
|
|
||||||
|
|
||||||
|
def run_tests():
|
||||||
|
"""Run visual tests on all pages."""
|
||||||
|
os.makedirs(SCREENSHOT_DIR, exist_ok=True)
|
||||||
|
|
||||||
|
with sync_playwright() as p:
|
||||||
|
browser = p.chromium.launch(headless=True)
|
||||||
|
context = browser.new_context(
|
||||||
|
viewport={'width': 1280, 'height': 800},
|
||||||
|
color_scheme='dark'
|
||||||
|
)
|
||||||
|
page = context.new_page()
|
||||||
|
|
||||||
|
# Capture console errors
|
||||||
|
errors = []
|
||||||
|
page.on('console', lambda m: errors.append(m.text) if m.type == 'error' else None)
|
||||||
|
|
||||||
|
results = []
|
||||||
|
|
||||||
|
for path, name, selectors in PAGES_TO_TEST:
|
||||||
|
print(f"Testing {name}...")
|
||||||
|
url = f'{BASE_URL}{path}'
|
||||||
|
result = capture_themed_screenshots(page, url, name, selectors)
|
||||||
|
result['errors'] = list(errors)
|
||||||
|
errors.clear()
|
||||||
|
results.append(result)
|
||||||
|
|
||||||
|
browser.close()
|
||||||
|
|
||||||
|
# Summary
|
||||||
|
print("\n" + "=" * 50)
|
||||||
|
print("VISUAL TEST SUMMARY")
|
||||||
|
print("=" * 50)
|
||||||
|
|
||||||
|
passed = sum(1 for r in results
|
||||||
|
if r.get('dark_has_content') and r.get('light_has_content'))
|
||||||
|
print(f"\nPassed: {passed}/{len(results)}")
|
||||||
|
print(f"Screenshots: {SCREENSHOT_DIR}")
|
||||||
|
|
||||||
|
return results
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
run_tests()
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## MCP vs Native Playwright Decision Tree
|
||||||
|
|
||||||
|
```
|
||||||
|
What are you doing?
|
||||||
|
│
|
||||||
|
├─ Interactive debugging / exploring
|
||||||
|
│ └─► Playwright MCP (see live browser)
|
||||||
|
│
|
||||||
|
├─ Automated test suite
|
||||||
|
│ └─► Native Python Playwright (headless)
|
||||||
|
│
|
||||||
|
├─ CI/CD pipeline
|
||||||
|
│ └─► Native Python Playwright (headless)
|
||||||
|
│
|
||||||
|
├─ Screenshot capture for LLM analysis
|
||||||
|
│ └─► Native Python Playwright (headless)
|
||||||
|
│
|
||||||
|
└─ One-off inspection
|
||||||
|
└─► Either works, MCP is convenient
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Common Failures and Fixes
|
||||||
|
|
||||||
|
### Failure: Blank Screenshots
|
||||||
|
**Cause**: Screenshot taken before React hydrates
|
||||||
|
**Fix**: Wait for content selectors, add hydration buffer
|
||||||
|
|
||||||
|
### Failure: "Reconnecting..." Badge Visible
|
||||||
|
**Cause**: HMR/WebSocket not connected (cosmetic in tests)
|
||||||
|
**Fix**: This is often fine - focus on actual content
|
||||||
|
|
||||||
|
### Failure: Theme Not Applied
|
||||||
|
**Cause**: `emulate_media` called on context instead of page
|
||||||
|
**Fix**: Use `page.emulate_media(color_scheme='dark')`
|
||||||
|
|
||||||
|
### Failure: Browser Windows Spawning
|
||||||
|
**Cause**: `headless=False` or using MCP instead of native
|
||||||
|
**Fix**: Use `p.chromium.launch(headless=True)`
|
||||||
|
|
||||||
|
### Failure: Timeout on Content
|
||||||
|
**Cause**: Wrong selectors or page actually broken
|
||||||
|
**Fix**: Verify selectors exist, check console errors
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Integration with Claude Code
|
||||||
|
|
||||||
|
When Claude reads screenshots captured by this pattern:
|
||||||
|
|
||||||
|
1. **Request specific analysis**: Don't just show screenshot - ask targeted questions
|
||||||
|
2. **Provide context**: "This should be dark mode" or "This is the login page"
|
||||||
|
3. **Compare systematically**: Before/after, dark/light, desktop/mobile
|
||||||
|
4. **Trust semantic analysis**: LLM can tell "blank page" from "content loaded"
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## References
|
||||||
|
|
||||||
|
### Research Papers
|
||||||
|
- [Using Vision LLMs For UI Testing](https://courses.cs.washington.edu/courses/cse503/25wi/final-reports/Using%20Vision%20LLMs%20For%20UI%20Testing.pdf) - University of Washington
|
||||||
|
- [Vision-driven Automated Mobile GUI Testing](https://arxiv.org/html/2407.03037v1) - Multimodal LLM approach
|
||||||
|
- [ScreenLLM: Stateful Screen Schema](https://arxiv.org/html/2503.20978v1) - UI understanding framework
|
||||||
|
|
||||||
|
### Tools & Integrations
|
||||||
|
- [Building an AI QA Engineer with Claude + Playwright](https://alexop.dev/posts/building_ai_qa_engineer_claude_code_playwright/)
|
||||||
|
- [AI-Powered Visual Testing in Playwright](https://testrig.medium.com/ai-powered-visual-testing-in-playwright-from-pixels-to-perception-dd3ee49911d5)
|
||||||
|
- [Playwright Visual Regression Testing Guide](https://testgrid.io/blog/playwright-visual-regression-testing/)
|
||||||
|
|
||||||
|
### Official Documentation
|
||||||
|
- [Playwright Visual Comparisons](https://playwright.dev/docs/test-snapshots)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Version History
|
||||||
|
|
||||||
|
- **2026-01-23**: Initial skill creation
|
||||||
|
- Researched multimodal LLM screenshot analysis best practices
|
||||||
|
- Documented React hydration waiting patterns
|
||||||
|
- Added headless mode requirements
|
||||||
|
- Created complete test script template
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
**Core Insight**: The difference between useless and useful screenshot tests is waiting for content, not just network. LLMs can analyze semantics, but only if there's actually content to analyze.
|
||||||
Reference in New Issue
Block a user