Init
287
AGENTS.md
Normal file
@@ -0,0 +1,287 @@
|
||||
# AGENTS.md
|
||||
|
||||
## Purpose
|
||||
This agent is used for **website QA/testing** of real pages in a browser environment.
|
||||
|
||||
The testing must be performed with **Playwright**.
|
||||
Visual page inspection must be performed with the **playwright-screenshot-inspector** skill.
|
||||
The agent must check the **real rendered page**, not just raw HTML.
|
||||
|
||||
---
|
||||
|
||||
## Required tools and skills
|
||||
- **Playwright** — for browser automation, page opening, interaction, form submission, and response inspection.
|
||||
- **playwright-screenshot-inspector** — for screenshot-based visual inspection of the rendered page.
|
||||
|
||||
---
|
||||
|
||||
## General testing rules
|
||||
1. Always test the **real live page URL** provided by the user.
|
||||
2. Always test both:
|
||||
- **Desktop / PC viewport**
|
||||
- **Mobile viewport**
|
||||
3. Always capture screenshots for both desktop and mobile.
|
||||
4. Compare the actual rendered page against the screen/reference provided by the user.
|
||||
5. **All interactive UI elements must be tested by actually interacting with them** (click, hover, input), not by inspecting markup alone.
|
||||
6. Report all findings clearly and classify issues by severity:
|
||||
- **Critical**
|
||||
- **Major**
|
||||
- **Minor**
|
||||
7. Do not stop after the first issue. Complete the full checklist.
|
||||
8. At the end of testing, create a **summary HTML file** with all findings.
|
||||
9. All results and summaries should be written in Russian.
|
||||
10. Split the form submission report into two parts:
    - first, a summary table of form submission results and accessibility
    - second, the details, in a block that is hidden (collapsed) by default
|
||||
|
||||
### Configuration-driven page selection
|
||||
- Use `pages.json` in the workspace root as the default source of primary pages and visual references.
|
||||
- If the user asks to **just check the site** without specifying a single page, run the full list from `pages.json`.
|
||||
- If the user provides a **specific page URL**, first prepare to test that exact URL.
|
||||
- In that case, ask one short question: whether to also test the primary pages from `pages.json`.
|
||||
- If the user answers **yes**, test:
|
||||
- the explicit URL from the user request
|
||||
- the pages from `pages.json`
|
||||
- Deduplicate identical URLs before running checks.
|
||||
- If the user answers **no**, test only the explicit URL from the request.
|
||||
- For visual comparison, use the reference data from `pages.json` when present:
|
||||
- `desktop_screenshot`
|
||||
- `mobile_screenshot`
|
||||
- `figma_url`
|
||||
- If a page in `pages.json` has no reference assets yet, still test the real page and clearly note that visual comparison is limited by missing reference materials.
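
The selection rules above can be sketched as a small helper. This is only an illustration, not part of the repository: the names `primary_urls`, `select_urls`, and `include_primary` are hypothetical, and the code assumes the `pages.json` layout shown in this commit (`sites[].pages[].url`).

```python
import json
from pathlib import Path
from typing import List, Optional


def load_config(path: str = "pages.json") -> dict:
    """Read pages.json from the workspace root (path is an assumption)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def primary_urls(config: dict) -> List[str]:
    """Collect every page URL from a parsed pages.json document."""
    return [
        page["url"]
        for site in config.get("sites", [])
        for page in site.get("pages", [])
    ]


def select_urls(config: dict,
                explicit_url: Optional[str] = None,
                include_primary: bool = True) -> List[str]:
    """Return the URLs to test, explicit URL first, without duplicates.

    - no explicit URL: the full list from pages.json
    - explicit URL and include_primary: that URL, then pages.json
    - explicit URL only: just that URL
    """
    candidates = []
    if explicit_url:
        candidates.append(explicit_url)
    if explicit_url is None or include_primary:
        candidates.extend(primary_urls(config))
    # dict.fromkeys keeps first-seen order while dropping exact duplicates
    return list(dict.fromkeys(candidates))
```

The `include_primary` flag stands for the user's yes/no answer to the follow-up question. Deduplication here is by exact string match, so URLs that differ only by a trailing slash would still be tested twice.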
|
||||
|
||||
---
|
||||
|
||||
## Required order of checks
|
||||
|
||||
### 1. Visual page and UI interaction check
|
||||
The user will provide:
|
||||
- a page URL
|
||||
- a reference screen/mockup/screenshot
|
||||
|
||||
#### 1.1 Open and inspect the real page
|
||||
- Open the provided page in Playwright.
|
||||
- Wait until the page is fully loaded and visually stable.
|
||||
- Test in:
|
||||
- Desktop viewport
|
||||
- Mobile viewport
|
||||
|
||||
#### 1.2 Capture screenshots
|
||||
- Take full-page screenshots for both desktop and mobile.
|
||||
- Use **playwright-screenshot-inspector** to inspect the screenshots.
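
A minimal sketch of steps 1.1 and 1.2 with the Playwright Python API follows. The viewport sizes, the `PROFILES` mapping, the `capture_both` name, and the output directory are assumptions chosen for illustration; this document does not prescribe them.

```python
# Context options per profile; the sizes below are assumed, not mandated.
DESKTOP = {"viewport": {"width": 1440, "height": 900}}
MOBILE = {
    "viewport": {"width": 390, "height": 844},
    "device_scale_factor": 3,
    "is_mobile": True,
    "has_touch": True,
}
PROFILES = {"desktop": DESKTOP, "mobile": MOBILE}


def capture_both(url, out_dir="screenshots"):
    """Open `url` once per profile and save a full-page screenshot for each."""
    # Imported lazily so the profile definitions can be reused
    # in environments where Playwright is not installed.
    from pathlib import Path
    from playwright.sync_api import sync_playwright

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    saved = {}
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            for name, options in PROFILES.items():
                context = browser.new_context(**options)
                page = context.new_page()
                page.goto(url, wait_until="load")
                path = str(Path(out_dir) / f"{name}.png")
                page.screenshot(path=path, full_page=True)
                saved[name] = path
                context.close()
        finally:
            browser.close()
    return saved
```

A separate browser context per profile keeps cookies and emulation settings from leaking between the desktop and mobile runs. Waiting only for `load` may be too early for script-rendered pages, so a content-based wait may be needed before the screenshot.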
|
||||
|
||||
#### 1.3 Compare with the provided reference
|
||||
Check for differences between:
|
||||
- the real page
|
||||
- the provided screen/reference
|
||||
|
||||
Look for:
|
||||
- incorrect layout structure
|
||||
- broken blocks
|
||||
- missing elements
|
||||
- different text/content placement
|
||||
- wrong fonts, sizes, spacing, alignment
|
||||
- overlap or clipping
|
||||
- responsive/mobile issues
|
||||
- hidden or misplaced elements
|
||||
|
||||
---
|
||||
|
||||
#### 1.4 UI interaction testing (MANDATORY)
|
||||
If any interactive UI elements are present, they **must be tested**.
|
||||
|
||||
This includes (but is not limited to):
|
||||
- buttons
|
||||
- links (especially JS-triggered)
|
||||
- popup / modal windows
|
||||
- forms opened via buttons
|
||||
- dropdowns
|
||||
- accordions
|
||||
- tabs
|
||||
- sliders/carousels
|
||||
- burger menus (mobile)
|
||||
- tooltips
|
||||
- filters/sort controls
|
||||
|
||||
For each UI element:
|
||||
- trigger the interaction (click, hover, input, etc.)
|
||||
- verify it works as expected
|
||||
- verify correct content is shown
|
||||
- verify no visual breakage occurs
|
||||
- verify no JS errors appear
|
||||
- verify behavior on both desktop and mobile
|
||||
|
||||
Check for:
|
||||
- element not clickable
|
||||
- no reaction on click
|
||||
- wrong content shown
|
||||
- broken animation or layout
|
||||
- popup not opening/closing
|
||||
- overlay issues (scroll lock, z-index bugs)
|
||||
- elements hidden or cut off on mobile
|
||||
|
||||
Any broken interaction must be reported.
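
One way to verify that "no JS errors appear" after each interaction is to collect console errors and uncaught page exceptions while the test runs. The `ErrorCollector` class below is a hypothetical helper, sketched on the assumption that it is attached to a Playwright `page` through the `console` and `pageerror` events.

```python
class ErrorCollector:
    """Accumulates console errors and uncaught page exceptions."""

    def __init__(self):
        self.messages = []

    def on_console(self, message):
        # Playwright's ConsoleMessage exposes .type and .text
        if message.type == "error":
            self.messages.append(f"console: {message.text}")

    def on_page_error(self, error):
        # "pageerror" delivers the uncaught exception raised in the page
        self.messages.append(f"pageerror: {error}")

    def attach(self, page):
        """Subscribe to the page events; returns self for chaining."""
        page.on("console", self.on_console)
        page.on("pageerror", self.on_page_error)
        return self

    def drain(self):
        """Return the errors seen since the last call and reset the buffer."""
        seen, self.messages = self.messages, []
        return seen
```

A typical use would be `collector = ErrorCollector().attach(page)`, then calling `collector.drain()` right after each click or hover so that every error is attributed to the interaction that caused it.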
|
||||
|
||||
---
|
||||
|
||||
#### 1.5 Produce visual + UI summary
|
||||
Create an **HTML summary file** containing:
|
||||
- page URL
|
||||
- test date/time
|
||||
- desktop screenshot
|
||||
- mobile screenshot
|
||||
- list of visual differences
|
||||
- list of UI interaction issues
|
||||
- severity for each issue
|
||||
- short conclusion
|
||||
|
||||
---
|
||||
|
||||
### 2. Server response and robots/indexing checks
|
||||
|
||||
#### 2.1 Check HTTP response
|
||||
Inspect:
|
||||
- HTTP status code
|
||||
- redirects
|
||||
- final URL
|
||||
- response headers
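
The four items above can be read from the response object that Playwright's `page.goto()` returns for the final, non-redirect response. The helper names `redirect_chain` and `summarize_response` are hypothetical; the sketch relies only on the `status`, `url`, `headers`, and `request` attributes of the response and on `redirected_from` of the request.

```python
def redirect_chain(response):
    """List every URL requested during navigation, oldest first."""
    hops = []
    request = response.request
    while request is not None:
        hops.append(request.url)
        # redirected_from is None for the first request in the chain
        request = request.redirected_from
    return list(reversed(hops))


def summarize_response(response):
    """Collect status, redirects, final URL, and headers in one dict."""
    chain = redirect_chain(response)
    return {
        "status": response.status,
        "final_url": response.url,
        "redirects": chain[:-1],  # every URL that answered with a redirect
        "headers": {k.lower(): v for k, v in response.headers.items()},
    }
```

For example, `summarize_response(page.goto(url))` yields an empty `redirects` list when the page was served directly, which makes unexpected redirects easy to flag.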
|
||||
|
||||
#### 2.2 Check for critical indexing blockers
|
||||
These are **Critical** if found unexpectedly:
|
||||
- `X-Robots-Tag: noindex`
|
||||
- `X-Robots-Tag: nofollow`
|
||||
- `meta name="robots" content="noindex"`
|
||||
- `meta name="robots" content="nofollow"`
|
||||
- restrictive `googlebot` meta
|
||||
- incorrect canonical
|
||||
- unexpected redirects
|
||||
|
||||
Any directive blocking indexing must be marked **Critical**.
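
The header and meta checks above could be automated roughly as follows. This is a sketch using only the Python standard library; `find_indexing_blockers` is a hypothetical name, and treating `none` as a blocker is an addition based on its meaning as shorthand for `noindex, nofollow`. Canonical and redirect checks are not covered here.

```python
from html.parser import HTMLParser

# "none" is shorthand for "noindex, nofollow"
BLOCKING = {"noindex", "nofollow", "none"}


class _RobotsMeta(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []  # (meta name, directive) pairs

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k: (v or "") for k, v in attrs}
        name = attr.get("name", "").lower()
        if name in ("robots", "googlebot"):
            for part in attr.get("content", "").split(","):
                self.directives.append((name, part.strip().lower()))


def find_indexing_blockers(headers, html):
    """Return readable descriptions of blockers found in headers and HTML."""
    found = []
    for key, value in headers.items():
        if key.lower() != "x-robots-tag":
            continue
        for part in value.split(","):
            # Drop an optional user-agent prefix such as "googlebot: noindex"
            token = part.split(":")[-1].strip().lower()
            if token in BLOCKING:
                found.append(f"X-Robots-Tag: {token}")
    parser = _RobotsMeta()
    parser.feed(html)
    for name, directive in parser.directives:
        if directive in BLOCKING:
            found.append(f'meta name="{name}": {directive}')
    return found
```

Every entry returned by this helper would map to a **Critical** issue under the rule above. The HTML should be the rendered document (for example `page.content()` in Playwright), since scripts can inject robots meta tags after load.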
|
||||
|
||||
#### 2.3 Check meta and SEO basics
|
||||
Inspect:
|
||||
- `<title>`
|
||||
- meta description
|
||||
- robots meta tag
|
||||
- canonical
|
||||
- viewport
|
||||
- hreflang (if present)
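
A possible way to extract these tags from the rendered HTML is shown below, using the standard-library `html.parser`. The `SeoBasics` class and `inspect_seo` function are hypothetical names; missing tags are reported as `None` so the report can flag them.

```python
from html.parser import HTMLParser


class SeoBasics(HTMLParser):
    """Collects <title>, selected <meta> tags, canonical, and hreflang links."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.meta = {}       # description / robots / viewport
        self.canonical = None
        self.hreflang = {}   # language code -> URL
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = {k: (v or "") for k, v in attrs}
        if tag == "title" and self.title is None:
            # Only the first <title> counts (inline SVG may contain its own)
            self._in_title = True
            self.title = ""
        elif tag == "meta" and attr.get("name", "").lower() in ("description", "robots", "viewport"):
            self.meta[attr["name"].lower()] = attr.get("content", "")
        elif tag == "link":
            rel = attr.get("rel", "").lower().split()
            if "canonical" in rel:
                self.canonical = attr.get("href")
            elif "alternate" in rel and "hreflang" in attr:
                self.hreflang[attr["hreflang"]] = attr.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def inspect_seo(html):
    """Return the SEO basics found in an HTML document."""
    parser = SeoBasics()
    parser.feed(html)
    return {
        "title": parser.title,
        "description": parser.meta.get("description"),
        "robots": parser.meta.get("robots"),
        "viewport": parser.meta.get("viewport"),
        "canonical": parser.canonical,
        "hreflang": parser.hreflang,
    }
```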
|
||||
|
||||
---
|
||||
|
||||
### 3. Form testing
|
||||
|
||||
#### 3.1 Detect forms
|
||||
Find all forms including:
|
||||
- visible forms
|
||||
- popup forms
|
||||
- dynamically opened forms
|
||||
|
||||
#### 3.2 Test submission
|
||||
For each form:
|
||||
- fill required fields with valid data
|
||||
- submit form
|
||||
- observe frontend validation
|
||||
- inspect server/network response
|
||||
|
||||
#### 3.3 Validate behavior
|
||||
Check:
|
||||
- submit works
|
||||
- validation messages appear correctly
|
||||
- incorrect input shows errors
|
||||
- success message is shown
|
||||
- no JS errors
|
||||
- no duplicate requests
|
||||
- backend response is correct
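
The "no duplicate requests" check can be made concrete by logging requests during submission and counting repeats. The sketch below assumes the log is a list of `(method, url)` pairs, for instance gathered with `page.on("request", lambda r: log.append((r.method, r.url)))`; the function name `duplicate_submissions` is hypothetical.

```python
from collections import Counter


def duplicate_submissions(requests, methods=("POST", "PUT", "PATCH")):
    """Return {(method, url): count} for write requests sent more than once."""
    counts = Counter(
        (method.upper(), url)
        for method, url in requests
        if method.upper() in methods
    )
    return {key: n for key, n in counts.items() if n > 1}
```

The log should be cleared before each form submission, otherwise two legitimate submissions to the same endpoint would be reported as a duplicate.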
|
||||
|
||||
#### 3.4 Report results
|
||||
For each form include:
|
||||
- selector/name
|
||||
- fields
|
||||
- result (success / error / broken)
|
||||
- server response
|
||||
- severity
|
||||
|
||||
---
|
||||
|
||||
## Output format
|
||||
|
||||
### 1. Text summary
|
||||
- overall result
|
||||
- issue count by severity
|
||||
- visual correctness
|
||||
- UI interaction status
|
||||
- indexing status
|
||||
- form functionality status
|
||||
|
||||
### 2. HTML report
|
||||
Must include:
|
||||
- URL
|
||||
- screenshots (desktop + mobile)
|
||||
- visual comparison
|
||||
- UI interaction results
|
||||
- SEO/indexing checks
|
||||
- form testing results
|
||||
- issue list with severity
|
||||
- conclusion
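
A minimal sketch of how such a report might be rendered is given below. The `render_report` function and its argument shapes are assumptions made for illustration; screenshots and the visual comparison section are left out for brevity. Labels are in Russian because the general rules require Russian output, and per-form details are placed in `<details>` elements so they stay collapsed until opened.

```python
from html import escape


def render_report(url, tested_at, issues, forms):
    """Build a standalone HTML report.

    issues: list of (severity, description) tuples
    forms:  list of dicts with keys "name", "result", "details"
    """
    issue_items = "\n".join(
        f"<li><strong>{escape(sev)}</strong>: {escape(text)}</li>"
        for sev, text in issues
    )
    form_rows = "\n".join(
        f"<tr><td>{escape(f['name'])}</td><td>{escape(f['result'])}</td></tr>"
        for f in forms
    )
    # <details> keeps the per-form details hidden until the reader opens them
    form_details = "\n".join(
        f"<details><summary>{escape(f['name'])}</summary>"
        f"<pre>{escape(f['details'])}</pre></details>"
        for f in forms
    )
    return f"""<!DOCTYPE html>
<html lang="ru">
<head><meta charset="utf-8"><title>Отчёт QA: {escape(url)}</title></head>
<body>
<h1>Отчёт QA</h1>
<p>URL: {escape(url)}<br>Дата: {escape(tested_at)}</p>
<h2>Проблемы</h2>
<ul>
{issue_items}
</ul>
<h2>Формы</h2>
<table border="1">
<tr><th>Форма</th><th>Результат</th></tr>
{form_rows}
</table>
{form_details}
</body>
</html>"""
```

All dynamic values pass through `html.escape`, so text captured from the tested page (error messages, server responses) cannot break the report markup.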
|
||||
|
||||
---
|
||||
|
||||
## Severity rules
|
||||
|
||||
### Critical
|
||||
- noindex / nofollow / X-Robots-Tag blocking indexing
|
||||
- page inaccessible
|
||||
- wrong redirect
|
||||
- core UI not working
|
||||
- key buttons or popups broken
|
||||
- key form not submitting
|
||||
|
||||
### Major
|
||||
- broken UI interactions
|
||||
- major layout mismatch
|
||||
- mobile UX broken
|
||||
- validation issues
|
||||
- backend form errors
|
||||
|
||||
### Minor
|
||||
- small visual differences
|
||||
- spacing/alignment issues
|
||||
- minor UI glitches
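
The "issue count by severity" line of the text summary can be derived from the three levels above. In the sketch below, `summarize_by_severity` and `overall_result` are hypothetical helpers, and the pass/fail rule (any Critical or Major issue fails the page) is an assumption, not something this document defines.

```python
from collections import Counter

SEVERITY_ORDER = ("Critical", "Major", "Minor")


def summarize_by_severity(issues):
    """issues: iterable of (severity, description). Returns counts in report order."""
    counts = Counter(sev for sev, _ in issues)
    unknown = set(counts) - set(SEVERITY_ORDER)
    if unknown:
        raise ValueError(f"unknown severity: {sorted(unknown)}")
    return {sev: counts.get(sev, 0) for sev in SEVERITY_ORDER}


def overall_result(counts):
    """Assumed rule: any Critical or Major issue fails the page."""
    return "FAIL" if counts["Critical"] or counts["Major"] else "PASS"
```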
|
||||
|
||||
---
|
||||
|
||||
## Example workflow
|
||||
1. Open page (desktop)
|
||||
2. Screenshot
|
||||
3. Open page (mobile)
|
||||
4. Screenshot
|
||||
5. Inspect with playwright-screenshot-inspector
|
||||
6. Compare with reference
|
||||
7. Test ALL UI interactions (buttons, popups, tabs, etc.)
|
||||
8. Check response headers and meta tags
|
||||
9. Detect and test all forms
|
||||
10. Generate HTML report
|
||||
11. Return summary
|
||||
|
||||
---
|
||||
|
||||
## Important constraints
|
||||
- Do not skip UI interaction testing
|
||||
- Do not skip mobile
|
||||
- Do not skip screenshots
|
||||
- Do not skip form submissions
|
||||
- Always treat indexing blockers as **Critical**
|
||||
- Always generate HTML report
|
||||
|
||||
---
|
||||
|
||||
## Example instruction for the agent
|
||||
Test the provided page using Playwright.
|
||||
Use playwright-screenshot-inspector for visual validation.
|
||||
|
||||
Follow this order:
|
||||
1. Compare real page vs reference (desktop + mobile) and record differences.
|
||||
2. Test ALL UI elements (buttons, popups, accordions, tabs, etc.).
|
||||
3. Check server response, meta tags, and indexing blockers (Critical if present).
|
||||
4. Find and submit all forms, validate responses and errors.
|
||||
68
pages.json
Normal file
@@ -0,0 +1,68 @@
|
||||
{
  "version": 1,
  "description": "Primary pages and visual references for default website QA runs.",
  "sites": [
    {
      "site_id": "prombez-good-production",
      "site_name": "Prombez Good Production",
      "base_url": "https://prombez.cp.good-production.xyz",
      "pages": [
        {
          "id": "home",
          "title": "Главная",
          "url": "https://prombez.cp.good-production.xyz/",
          "reference": {
            "desktop_screenshot": null,
            "mobile_screenshot": null,
            "figma_url": null,
            "notes": "Fill in approved reference assets for homepage."
          }
        },
        {
          "id": "test-search-landing",
          "title": "Лендинг test-search",
          "url": "https://prombez.cp.good-production.xyz/test-search/",
          "reference": {
            "desktop_screenshot": null,
            "mobile_screenshot": null,
            "figma_url": null,
            "notes": "Use the approved desktop/mobile references for this landing page when available."
          }
        },
        {
          "id": "products",
          "title": "Каталог продуктов",
          "url": "https://prombez.cp.good-production.xyz/prod/",
          "reference": {
            "desktop_screenshot": null,
            "mobile_screenshot": null,
            "figma_url": null,
            "notes": "Fill in approved reference assets for the products page."
          }
        },
        {
          "id": "history",
          "title": "Истории успеха",
          "url": "https://prombez.cp.good-production.xyz/history/",
          "reference": {
            "desktop_screenshot": null,
            "mobile_screenshot": null,
            "figma_url": null,
            "notes": "Fill in approved reference assets for the success stories page."
          }
        },
        {
          "id": "contacts",
          "title": "Контакты",
          "url": "https://prombez.cp.good-production.xyz/company/contacts/",
          "reference": {
            "desktop_screenshot": null,
            "mobile_screenshot": null,
            "figma_url": null,
            "notes": "Fill in approved reference assets for the contacts page."
          }
        }
      ]
    }
  ]
}
|
||||
434
skills/playwright-screenshot-inspector/SKILL.md
Normal file
@@ -0,0 +1,434 @@
|
||||
---
name: playwright-screenshot-inspector
description: LLM-powered visual testing expert for automated screenshot capture, analysis, and UI verification using Playwright with multimodal AI inspection.
metadata:
  category: Testing
  tags:
    - playwright
    - visual-testing
    - screenshots
    - ui-verification
    - automation
  pairs-with:
    - skill: playwright-e2e-tester
      reason: Visual regression testing extends E2E test suites with screenshot comparison
    - skill: webapp-testing
      reason: Screenshot inspection automates the visual verification that interactive testing does manually
    - skill: color-contrast-auditor
      reason: Automated screenshot analysis can detect contrast violations across UI states
---
|
||||
|
||||
# Playwright Screenshot Inspector
|
||||
|
||||
LLM-powered visual testing expert for automated screenshot capture, analysis, and UI verification using Playwright with multimodal AI inspection.
|
||||
|
||||
## Activation Triggers
|
||||
|
||||
**Activate on:**
|
||||
- "screenshot test", "visual test", "screenshot inspection"
|
||||
- "playwright headless", "playwright screenshot"
|
||||
- "UI verification", "visual regression"
|
||||
- "theme compliance test", "dark mode test", "light mode test"
|
||||
- "automated screenshot", "capture and analyze"
|
||||
- "compare screenshots", "visual diff"
|
||||
|
||||
**NOT for:**
|
||||
- Simple one-off screenshots (use browser DevTools)
|
||||
- Pixel-perfect comparison without AI (use native Playwright `toHaveScreenshot`)
|
||||
- Non-web UI testing (use platform-specific tools)
|
||||
- Performance testing (use Lighthouse/WebPageTest)
|
||||
|
||||
---
|
||||
|
||||
## Core Philosophy
|
||||
|
||||
Traditional visual testing compares pixels. **LLM-powered visual testing understands semantics.**
|
||||
|
||||
Instead of "these 50 pixels changed", LLM inspection answers:
|
||||
- "Is the content actually rendered?"
|
||||
- "Does the theme switch correctly?"
|
||||
- "Are interactive elements visible and properly styled?"
|
||||
- "What's broken vs. what's just different?"
|
||||
|
||||
---
|
||||
|
||||
## The Screenshot Inspection Loop
|
||||
|
||||
```
┌─────────────────────────────────────────────────────────────┐
│                  LLM SCREENSHOT INSPECTION                  │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. CAPTURE (Playwright)                                    │
│     └─► Wait for React hydration, not just network          │
│                                                             │
│  2. READ (Claude vision)                                    │
│     └─► Pass screenshot to LLM with specific questions      │
│                                                             │
│  3. ANALYZE (Structured response)                           │
│     └─► Extract: content present? theme correct? errors?    │
│                                                             │
│  4. ACT (Conditional logic)                                 │
│     └─► Pass/fail based on semantic understanding           │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
|
||||
|
||||
---
|
||||
|
||||
## Critical: Waiting for React Content
|
||||
|
||||
**The #1 failure mode**: Taking screenshots before React hydrates.
|
||||
|
||||
### Anti-Pattern: Network Idle Alone
|
||||
```python
# ❌ WRONG - React may not have rendered yet
page.goto(url)
page.wait_for_load_state('networkidle')
page.screenshot(path='broken.png')  # Often blank!
```
|
||||
|
||||
### Correct Pattern: Wait for Actual Content
|
||||
```python
# ✅ CORRECT - Wait for React to mount
import time

page.goto(url, wait_until='domcontentloaded')
page.wait_for_load_state('networkidle')

# Give React time to hydrate
time.sleep(0.5)

# Wait for actual content selector
page.wait_for_selector('.main-content, h1, [data-testid="app"]',
                       state='visible',
                       timeout=10000)

# Verify content exists
body_text = page.locator('body').inner_text()
if len(body_text) < 50:
    time.sleep(2)  # Extra wait for slow hydration

page.screenshot(path='good.png', full_page=True)
```
|
||||
|
||||
### Content Verification Function
|
||||
```python
import time


def wait_for_react_content(page, selectors, timeout=10000):
    """Wait for React to hydrate by checking for actual content."""
    page.wait_for_load_state('domcontentloaded')
    page.wait_for_load_state('networkidle')
    time.sleep(0.5)  # React hydration buffer

    # Try each comma-separated selector; succeed on the first visible match
    for selector in selectors.split(','):
        try:
            locator = page.locator(selector.strip())
            if locator.count() > 0:
                locator.first.wait_for(state='visible', timeout=timeout)
                return True
        except Exception:  # e.g. Playwright timeout; try the next selector
            continue

    # Fallback: wait for substantial body content
    try:
        page.wait_for_function(
            'document.body.innerText.length > 100',
            timeout=timeout
        )
        return True
    except Exception:
        return False
```
|
||||
|
||||
---
|
||||
|
||||
## Headless Mode: Preventing Window Spam
|
||||
|
||||
**Always use `headless=True`** to prevent browser windows from spawning:
|
||||
|
||||
```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # CRITICAL: headless=True prevents visible browser windows
    browser = p.chromium.launch(headless=True)

    context = browser.new_context(
        viewport={'width': 1280, 'height': 800},
        color_scheme='dark'  # Initial theme
    )
    page = context.new_page()

    # ... your test logic ...

    browser.close()  # Always clean up
```
|
||||
|
||||
### Theme Testing Pattern
|
||||
```python
# Dark mode screenshot
page.emulate_media(color_scheme='dark')  # Note: on PAGE, not context
page.goto(url)
wait_for_react_content(page, '.app-container, main, h1')
page.screenshot(path='dark.png', full_page=True)

# Light mode screenshot
page.emulate_media(color_scheme='light')
page.reload()
wait_for_react_content(page, '.app-container, main, h1')
page.screenshot(path='light.png', full_page=True)
```
|
||||
|
||||
---
|
||||
|
||||
## LLM Screenshot Analysis Patterns
|
||||
|
||||
### Pattern 1: Content Verification
|
||||
```
Prompt: "Analyze this screenshot. Answer:
1. Is the main content rendered (not blank/loading)?
2. What major UI elements are visible?
3. Are there any error states or broken layouts?
4. Rate content completeness: FULL / PARTIAL / EMPTY"
```
|
||||
|
||||
### Pattern 2: Theme Compliance
|
||||
```
Prompt: "This is a {dark/light} mode screenshot. Verify:
1. Background color matches expected theme (dark bg for dark mode)
2. Text has sufficient contrast against background
3. Interactive elements are visible and styled correctly
4. No theme leakage (dark elements on light bg or vice versa)"
```
|
||||
|
||||
### Pattern 3: Comparison Analysis
|
||||
```
Prompt: "Compare these two screenshots (before/after). Identify:
1. What changed between them?
2. Are changes intentional (theme switch) or bugs?
3. Is any content missing in the 'after' version?
4. Rate similarity: IDENTICAL / MINOR_DIFF / MAJOR_DIFF / BROKEN"
```
|
||||
|
||||
### Pattern 4: Accessibility Check
|
||||
```
Prompt: "Evaluate this screenshot for visual accessibility:
1. Is text readable (sufficient size and contrast)?
2. Are interactive elements clearly identifiable?
3. Is there visual hierarchy (headings, sections)?
4. Any elements that would fail WCAG contrast requirements?"
```
|
||||
|
||||
---
|
||||
|
||||
## Complete Test Script Template
|
||||
|
||||
```python
#!/usr/bin/env python3
"""
LLM-Powered Screenshot Test Suite
Captures screenshots and uses Claude vision for semantic analysis.
"""

from playwright.sync_api import sync_playwright
import os
import time

PAGES_TO_TEST = [
    # (path, name, content_selectors)
    ('/', 'Home', '.hero, main, h1'),
    ('/about', 'About', '.about-content, main, h1'),
    ('/dashboard', 'Dashboard', '.dashboard, .stats, h1'),
]

BASE_URL = 'http://localhost:5173'
SCREENSHOT_DIR = '/tmp/visual-tests'


def wait_for_content(page, selectors, timeout=10000):
    """Wait for React/Vue/Svelte to hydrate."""
    page.wait_for_load_state('domcontentloaded')
    page.wait_for_load_state('networkidle')
    time.sleep(0.5)

    # Succeed on the first selector that becomes visible
    for selector in selectors.split(','):
        try:
            loc = page.locator(selector.strip())
            if loc.count() > 0:
                loc.first.wait_for(state='visible', timeout=timeout)
                return True
        except Exception:  # e.g. Playwright timeout; try the next selector
            continue

    # Fallback: wait for substantial body text
    try:
        page.wait_for_function('document.body.innerText.length > 100', timeout=timeout)
        return True
    except Exception:
        return False


def capture_themed_screenshots(page, url, name, selectors):
    """Capture both dark and light mode screenshots."""
    safe_name = name.lower().replace(' ', '-')
    results = {'name': name, 'url': url}

    for theme in ['dark', 'light']:
        page.emulate_media(color_scheme=theme)

        # Navigate once, then reload to apply the second theme
        if theme == 'dark':
            page.goto(url, wait_until='domcontentloaded')
        else:
            page.reload(wait_until='domcontentloaded')

        content_loaded = wait_for_content(page, selectors)

        if not content_loaded:
            print(f" ⚠️ {theme} mode: Content slow to load, waiting...")
            time.sleep(2)

        screenshot_path = f'{SCREENSHOT_DIR}/{safe_name}-{theme}.png'
        page.screenshot(path=screenshot_path, full_page=True)

        # Check content length
        body_text = page.locator('body').inner_text().strip()
        has_content = len(body_text) > 50
        results[f'{theme}_screenshot'] = screenshot_path
        results[f'{theme}_content_length'] = len(body_text)
        results[f'{theme}_has_content'] = has_content

        print(f" {theme}: {'✅' if has_content else '❌'} ({len(body_text)} chars)")

    return results


def run_tests():
    """Run visual tests on all pages."""
    os.makedirs(SCREENSHOT_DIR, exist_ok=True)

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context(
            viewport={'width': 1280, 'height': 800},
            color_scheme='dark'
        )
        page = context.new_page()

        # Capture console errors
        errors = []
        page.on('console', lambda m: errors.append(m.text) if m.type == 'error' else None)

        results = []

        for path, name, selectors in PAGES_TO_TEST:
            print(f"Testing {name}...")
            url = f'{BASE_URL}{path}'
            result = capture_themed_screenshots(page, url, name, selectors)
            result['errors'] = list(errors)  # errors seen while testing this page
            errors.clear()
            results.append(result)

        browser.close()

    # Summary
    print("\n" + "=" * 50)
    print("VISUAL TEST SUMMARY")
    print("=" * 50)

    passed = sum(1 for r in results
                 if r.get('dark_has_content') and r.get('light_has_content'))
    print(f"\nPassed: {passed}/{len(results)}")
    print(f"Screenshots: {SCREENSHOT_DIR}")

    return results


if __name__ == '__main__':
    run_tests()
```
|
||||
|
||||
---
|
||||
|
||||
## MCP vs Native Playwright Decision Tree
|
||||
|
||||
```
What are you doing?
│
├─ Interactive debugging / exploring
│   └─► Playwright MCP (see live browser)
│
├─ Automated test suite
│   └─► Native Python Playwright (headless)
│
├─ CI/CD pipeline
│   └─► Native Python Playwright (headless)
│
├─ Screenshot capture for LLM analysis
│   └─► Native Python Playwright (headless)
│
└─ One-off inspection
    └─► Either works, MCP is convenient
```
|
||||
|
||||
---
|
||||
|
||||
## Common Failures and Fixes
|
||||
|
||||
### Failure: Blank Screenshots
|
||||
**Cause**: Screenshot taken before React hydrates
|
||||
**Fix**: Wait for content selectors, add hydration buffer
|
||||
|
||||
### Failure: "Reconnecting..." Badge Visible
|
||||
**Cause**: HMR/WebSocket not connected (cosmetic in tests)
|
||||
**Fix**: This is often fine - focus on actual content
|
||||
|
||||
### Failure: Theme Not Applied
|
||||
**Cause**: `emulate_media` called on context instead of page
|
||||
**Fix**: Use `page.emulate_media(color_scheme='dark')`
|
||||
|
||||
### Failure: Browser Windows Spawning
|
||||
**Cause**: `headless=False` or using MCP instead of native
|
||||
**Fix**: Use `p.chromium.launch(headless=True)`
|
||||
|
||||
### Failure: Timeout on Content
|
||||
**Cause**: Wrong selectors or page actually broken
|
||||
**Fix**: Verify selectors exist, check console errors
|
||||
|
||||
---
|
||||
|
||||
## Integration with Claude Code
|
||||
|
||||
When Claude reads screenshots captured by this pattern:
|
||||
|
||||
1. **Request specific analysis**: Don't just show screenshot - ask targeted questions
|
||||
2. **Provide context**: "This should be dark mode" or "This is the login page"
|
||||
3. **Compare systematically**: Before/after, dark/light, desktop/mobile
|
||||
4. **Trust semantic analysis**: LLM can tell "blank page" from "content loaded"
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
### Research Papers
|
||||
- [Using Vision LLMs For UI Testing](https://courses.cs.washington.edu/courses/cse503/25wi/final-reports/Using%20Vision%20LLMs%20For%20UI%20Testing.pdf) - University of Washington
|
||||
- [Vision-driven Automated Mobile GUI Testing](https://arxiv.org/html/2407.03037v1) - Multimodal LLM approach
|
||||
- [ScreenLLM: Stateful Screen Schema](https://arxiv.org/html/2503.20978v1) - UI understanding framework
|
||||
|
||||
### Tools & Integrations
|
||||
- [Building an AI QA Engineer with Claude + Playwright](https://alexop.dev/posts/building_ai_qa_engineer_claude_code_playwright/)
|
||||
- [AI-Powered Visual Testing in Playwright](https://testrig.medium.com/ai-powered-visual-testing-in-playwright-from-pixels-to-perception-dd3ee49911d5)
|
||||
- [Playwright Visual Regression Testing Guide](https://testgrid.io/blog/playwright-visual-regression-testing/)
|
||||
|
||||
### Official Documentation
|
||||
- [Playwright Visual Comparisons](https://playwright.dev/docs/test-snapshots)
|
||||
|
||||
---
|
||||
|
||||
## Version History
|
||||
|
||||
- **2026-01-23**: Initial skill creation
|
||||
- Researched multimodal LLM screenshot analysis best practices
|
||||
- Documented React hydration waiting patterns
|
||||
- Added headless mode requirements
|
||||
- Created complete test script template
|
||||
|
||||
---
|
||||
|
||||
**Core Insight**: The difference between useless and useful screenshot tests is waiting for content, not just network. LLMs can analyze semantics, but only if there's actually content to analyze.
|
||||