commit afb826e85f4a35d4e8cc2f5d5593ee59acad024f
Author: Andrei
Date:   Thu Apr 23 18:13:42 2026 +0300

    Init

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..d6107f7
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,287 @@
+# AGENTS.md
+
+## Purpose
+This agent is used for **website QA/testing** of real pages in a browser environment.
+
+The testing must be performed with **Playwright**.
+Visual page inspection must be performed with the **playwright-screenshot-inspector** skill.
+The agent must check the **real rendered page**, not just raw HTML.
+
+---
+
+## Required tools and skills
+- **Playwright**: for browser automation, page opening, interaction, form submission, and response inspection.
+- **playwright-screenshot-inspector**: for screenshot-based visual inspection of the rendered page.
+
+---
+
+## General testing rules
+1. Always test the **real live page URL** provided by the user.
+2. Always test both:
+   - **Desktop / PC viewport**
+   - **Mobile viewport**
+3. Always capture screenshots for both desktop and mobile.
+4. Compare the actual rendered page against the screen/reference provided by the user.
+5. **All interactive UI elements must be tested manually via interaction.**
+6. Report all findings clearly and classify issues by severity:
+   - **Critical**
+   - **Major**
+   - **Minor**
+7. Do not stop after the first issue. Complete the full checklist.
+8. At the end of testing, create a **summary HTML file** with all findings.
+9. All results and summaries should be in Russian.
+10. Split the form submission results into two parts:
+    - First: a summary table of form submission results and accessibility.
+    - Second: the details, in a hidden (collapsed) layout.
+
+### Configuration-driven page selection
+- Use `pages.json` in the workspace root as the default source of primary pages and visual references.
+- If the user asks to **just check the site** without specifying a single page, run the full list from `pages.json`.
+- If the user provides a **specific page URL**, first prepare to test that exact URL.
+- In that case, ask one short question: whether to also test the primary pages from `pages.json`.
+- If the user answers **yes**, test:
+  - the explicit URL from the user request
+  - the pages from `pages.json`
+- Deduplicate identical URLs before running checks.
+- If the user answers **no**, test only the explicit URL from the request.
+- For visual comparison, use the reference data from `pages.json` when present:
+  - `desktop_screenshot`
+  - `mobile_screenshot`
+  - `figma_url`
+- If a page in `pages.json` has no reference assets yet, still test the real page and clearly note that visual comparison is limited by missing reference materials.
+
+---
+
+## Required order of checks
+
+### 1. Visual page and UI interaction check
+The user will provide:
+- a page URL
+- a reference screen/mockup/screenshot
+
+#### 1.1 Open and inspect the real page
+- Open the provided page in Playwright.
+- Wait until the page is fully loaded and visually stable.
+- Test in:
+  - Desktop viewport
+  - Mobile viewport
+
+#### 1.2 Capture screenshots
+- Take full-page screenshots for both desktop and mobile.
+- Use **playwright-screenshot-inspector** to inspect the screenshots.
+
+#### 1.3 Compare with the provided reference
+Check for differences between:
+- the real page
+- the provided screen/reference
+
+Look for:
+- incorrect layout structure
+- broken blocks
+- missing elements
+- different text/content placement
+- wrong fonts, sizes, spacing, alignment
+- overlap or clipping
+- responsive/mobile issues
+- hidden or misplaced elements
+
+---
+
+#### 1.4 UI interaction testing (MANDATORY)
+If any interactive UI elements are present, they **must be tested**.
+
+This includes (but is not limited to):
+- buttons
+- links (especially JS-triggered)
+- popup / modal windows
+- forms opened via buttons
+- dropdowns
+- accordions
+- tabs
+- sliders/carousels
+- burger menus (mobile)
+- tooltips
+- filters/sort controls
+
+For each UI element:
+- trigger the interaction (click, hover, input, etc.)
+- verify it works as expected
+- verify correct content is shown
+- verify no visual breakage occurs
+- verify no JS errors appear
+- verify behavior on both desktop and mobile
+
+Check for:
+- element not clickable
+- no reaction on click
+- wrong content shown
+- broken animation or layout
+- popup not opening/closing
+- overlay issues (scroll lock, z-index bugs)
+- elements hidden or cut off on mobile
+
+Any broken interaction must be reported.
+
+---
+
+#### 1.5 Produce visual + UI summary
+Create an **HTML summary file** containing:
+- page URL
+- test date/time
+- desktop screenshot
+- mobile screenshot
+- list of visual differences
+- list of UI interaction issues
+- severity for each issue
+- short conclusion
+
+---
+
+### 2. Server response and robots/indexing checks
+
+#### 2.1 Check HTTP response
+Inspect:
+- HTTP status code
+- redirects
+- final URL
+- response headers
+
+#### 2.2 Check for critical indexing blockers
+These are **Critical** if found unexpectedly:
+- `X-Robots-Tag: noindex`
+- `X-Robots-Tag: nofollow`
+- `meta name="robots" content="noindex"`
+- `meta name="robots" content="nofollow"`
+- restrictive `googlebot` meta
+- incorrect canonical
+- unexpected redirects
+
+Any directive blocking indexing must be marked **Critical**.
+
+#### 2.3 Check meta and SEO basics
+Inspect:
+- `<title>`
+- meta description
+- robots meta tag
+- canonical
+- viewport
+- hreflang (if present)
+
+---
+
+### 3. Form testing
+
+#### 3.1 Detect forms
+Find all forms including:
+- visible forms
+- popup forms
+- dynamically opened forms
+
+#### 3.2 Test submission
+For each form:
+- fill required fields with valid data
+- submit form
+- observe frontend validation
+- inspect server/network response
+
+#### 3.3 Validate behavior
+Check:
+- submit works
+- validation messages appear correctly
+- incorrect input shows errors
+- success message is shown
+- no JS errors
+- no duplicate requests
+- backend response is correct
+
+#### 3.4 Report results
+For each form include:
+- selector/name
+- fields
+- result (success / error / broken)
+- server response
+- severity
+
+---
+
+## Output format
+
+### 1. Text summary
+- overall result
+- issue count by severity
+- visual correctness
+- UI interaction status
+- indexing status
+- form functionality status
+
+### 2. HTML report
+Must include:
+- URL
+- screenshots (desktop + mobile)
+- visual comparison
+- UI interaction results
+- SEO/indexing checks
+- form testing results
+- issue list with severity
+- conclusion
+
+---
+
+## Severity rules
+
+### Critical
+- noindex / nofollow / X-Robots-Tag blocking indexing
+- page inaccessible
+- wrong redirect
+- core UI not working
+- key buttons or popups broken
+- key form not submitting
+
+### Major
+- broken UI interactions
+- major layout mismatch
+- mobile UX broken
+- validation issues
+- backend form errors
+
+### Minor
+- small visual differences
+- spacing/alignment issues
+- minor UI glitches
+
+---
+
+## Example workflow
+1. Open page (desktop)
+2. Screenshot
+3. Open page (mobile)
+4. Screenshot
+5. Inspect with playwright-screenshot-inspector
+6. Compare with reference
+7. Test ALL UI interactions (buttons, popups, tabs, etc.)
+8. Check response headers and meta tags
+9. Detect and test all forms
+10. Generate HTML report
+11. Return summary
+
+---
+
+## Important constraints
+- Do not skip UI interaction testing
+- Do not skip mobile
+- Do not skip screenshots
+- Do not skip form submissions
+- Always treat indexing blockers as **Critical**
+- Always generate HTML report
+
+---
+
+## Example instruction for the agent
+Test the provided page using Playwright.
+Use playwright-screenshot-inspector for visual validation.
+
+Follow this order:
+1. Compare real page vs reference (desktop + mobile) and record differences.
+2. Test ALL UI elements (buttons, popups, accordions, tabs, etc.).
+3. Check server response, meta tags, and indexing blockers (Critical if present).
+4. Find and submit all forms, validate responses and errors.
diff --git a/pages.json b/pages.json
new file mode 100644
index 0000000..4549359
--- /dev/null
+++ b/pages.json
@@ -0,0 +1,68 @@
+{
+  "version": 1,
+  "description": "Primary pages and visual references for default website QA runs.",
+  "sites": [
+    {
+      "site_id": "prombez-good-production",
+      "site_name": "Prombez Good Production",
+      "base_url": "https://prombez.cp.good-production.xyz",
+      "pages": [
+        {
+          "id": "home",
+          "title": "Главная",
+          "url": "https://prombez.cp.good-production.xyz/",
+          "reference": {
+            "desktop_screenshot": null,
+            "mobile_screenshot": null,
+            "figma_url": null,
+            "notes": "Fill in approved reference assets for homepage."
+          }
+        },
+        {
+          "id": "test-search-landing",
+          "title": "Лендинг test-search",
+          "url": "https://prombez.cp.good-production.xyz/test-search/",
+          "reference": {
+            "desktop_screenshot": null,
+            "mobile_screenshot": null,
+            "figma_url": null,
+            "notes": "Use the approved desktop/mobile references for this landing page when available."
+          }
+        },
+        {
+          "id": "products",
+          "title": "Каталог продуктов",
+          "url": "https://prombez.cp.good-production.xyz/prod/",
+          "reference": {
+            "desktop_screenshot": null,
+            "mobile_screenshot": null,
+            "figma_url": null,
+            "notes": "Fill in approved reference assets for the products page."
+          }
+        },
+        {
+          "id": "history",
+          "title": "Истории успеха",
+          "url": "https://prombez.cp.good-production.xyz/history/",
+          "reference": {
+            "desktop_screenshot": null,
+            "mobile_screenshot": null,
+            "figma_url": null,
+            "notes": "Fill in approved reference assets for the success stories page."
+          }
+        },
+        {
+          "id": "contacts",
+          "title": "Контакты",
+          "url": "https://prombez.cp.good-production.xyz/company/contacts/",
+          "reference": {
+            "desktop_screenshot": null,
+            "mobile_screenshot": null,
+            "figma_url": null,
+            "notes": "Fill in approved reference assets for the contacts page."
+          }
+        }
+      ]
+    }
+  ]
+}
diff --git a/skills/playwright-screenshot-inspector/SKILL.md b/skills/playwright-screenshot-inspector/SKILL.md
new file mode 100644
index 0000000..aaeb0e0
--- /dev/null
+++ b/skills/playwright-screenshot-inspector/SKILL.md
@@ -0,0 +1,434 @@
+---
+name: playwright-screenshot-inspector
+description: LLM-powered visual testing expert for automated screenshot capture, analysis, and UI verification using Playwright with multimodal AI inspection.
+metadata:
+  category: Testing
+  tags:
+    - playwright
+    - visual-testing
+    - screenshots
+    - ui-verification
+    - automation
+  pairs-with:
+    - skill: playwright-e2e-tester
+      reason: Visual regression testing extends E2E test suites with screenshot comparison
+    - skill: webapp-testing
+      reason: Screenshot inspection automates the visual verification that interactive testing does manually
+    - skill: color-contrast-auditor
+      reason: Automated screenshot analysis can detect contrast violations across UI states
+---
+
+# Playwright Screenshot Inspector
+
+LLM-powered visual testing expert for automated screenshot capture, analysis, and UI verification using Playwright with multimodal AI inspection.
+
+## Activation Triggers
+
+**Activate on:**
+- "screenshot test", "visual test", "screenshot inspection"
+- "playwright headless", "playwright screenshot"
+- "UI verification", "visual regression"
+- "theme compliance test", "dark mode test", "light mode test"
+- "automated screenshot", "capture and analyze"
+- "compare screenshots", "visual diff"
+
+**NOT for:**
+- Simple one-off screenshots (use browser DevTools)
+- Pixel-perfect comparison without AI (use native Playwright `toHaveScreenshot`)
+- Non-web UI testing (use platform-specific tools)
+- Performance testing (use Lighthouse/WebPageTest)
+
+---
+
+## Core Philosophy
+
+Traditional visual testing compares pixels. **LLM-powered visual testing understands semantics.**
+
+Instead of "these 50 pixels changed", LLM inspection answers:
+- "Is the content actually rendered?"
+- "Does the theme switch correctly?"
+- "Are interactive elements visible and properly styled?"
+- "What's broken vs. what's just different?"
+
+---
+
+## The Screenshot Inspection Loop
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│                  LLM SCREENSHOT INSPECTION                  │
+├─────────────────────────────────────────────────────────────┤
+│                                                             │
+│  1. CAPTURE (Playwright)                                    │
+│     └─► Wait for React hydration, not just network          │
+│                                                             │
+│  2. READ (Claude vision)                                    │
+│     └─► Pass screenshot to LLM with specific questions      │
+│                                                             │
+│  3. ANALYZE (Structured response)                           │
+│     └─► Extract: content present? theme correct? errors?    │
+│                                                             │
+│  4. ACT (Conditional logic)                                 │
+│     └─► Pass/fail based on semantic understanding           │
+│                                                             │
+└─────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Critical: Waiting for React Content
+
+**The #1 failure mode**: Taking screenshots before React hydrates.
+
+### Anti-Pattern: Network Idle Alone
+```python
+# ❌ WRONG - React may not have rendered yet
+page.goto(url)
+page.wait_for_load_state('networkidle')
+page.screenshot(path='broken.png')  # Often blank!
+```
+
+### Correct Pattern: Wait for Actual Content
+```python
+# ✅ CORRECT - Wait for React to mount
+page.goto(url, wait_until='domcontentloaded')
+page.wait_for_load_state('networkidle')
+
+# Give React time to hydrate
+import time
+time.sleep(0.5)
+
+# Wait for actual content selector
+page.wait_for_selector('.main-content, h1, [data-testid="app"]',
+                       state='visible',
+                       timeout=10000)
+
+# Verify content exists
+body_text = page.locator('body').inner_text()
+if len(body_text) < 50:
+    time.sleep(2)  # Extra wait for slow hydration
+
+page.screenshot(path='good.png', full_page=True)
+```
+
+### Content Verification Function
+```python
+def wait_for_react_content(page, selectors, timeout=10000):
+    """Wait for React to hydrate by checking for actual content."""
+    page.wait_for_load_state('domcontentloaded')
+    page.wait_for_load_state('networkidle')
+    time.sleep(0.5)  # React hydration buffer
+
+    for selector in selectors.split(','):
+        try:
+            locator = page.locator(selector.strip())
+            if locator.count() > 0:
+                locator.first.wait_for(state='visible', timeout=timeout)
+                return True
+        except Exception:  # e.g. Playwright timeout; try the next selector
+            continue
+
+    # Fallback: wait for substantial body content
+    try:
+        page.wait_for_function(
+            'document.body.innerText.length > 100',
+            timeout=timeout
+        )
+        return True
+    except Exception:  # timed out waiting for body text
+        return False
+```
+
+---
+
+## Headless Mode: Preventing Window Spam
+
+**Always use `headless=True`** to prevent browser windows from spawning:
+
+```python
+from playwright.sync_api import sync_playwright
+
+with sync_playwright() as p:
+    # CRITICAL: headless=True prevents visible browser windows
+    browser = p.chromium.launch(headless=True)
+
+    context = browser.new_context(
+        viewport={'width': 1280, 'height': 800},
+        color_scheme='dark'  # Initial theme
+    )
+    page = context.new_page()
+
+    # ... your test logic ...
+
+    browser.close()  # Always clean up
+```
+
+### Theme Testing Pattern
+```python
+# Dark mode screenshot
+page.emulate_media(color_scheme='dark')  # Note: on PAGE, not context
+page.goto(url)
+wait_for_react_content(page, '.app-container, main, h1')
+page.screenshot(path='dark.png', full_page=True)
+
+# Light mode screenshot
+page.emulate_media(color_scheme='light')
+page.reload()
+wait_for_react_content(page, '.app-container, main, h1')
+page.screenshot(path='light.png', full_page=True)
+```
+
+---
+
+## LLM Screenshot Analysis Patterns
+
+### Pattern 1: Content Verification
+```
+Prompt: "Analyze this screenshot. Answer:
+1. Is the main content rendered (not blank/loading)?
+2. What major UI elements are visible?
+3. Are there any error states or broken layouts?
+4. Rate content completeness: FULL / PARTIAL / EMPTY"
+```
+
+### Pattern 2: Theme Compliance
+```
+Prompt: "This is a {dark/light} mode screenshot. Verify:
+1. Background color matches expected theme (dark bg for dark mode)
+2. Text has sufficient contrast against background
+3. Interactive elements are visible and styled correctly
+4. No theme leakage (dark elements on light bg or vice versa)"
+```
+
+### Pattern 3: Comparison Analysis
+```
+Prompt: "Compare these two screenshots (before/after). Identify:
+1. What changed between them?
+2. Are changes intentional (theme switch) or bugs?
+3. Is any content missing in the 'after' version?
+4. Rate similarity: IDENTICAL / MINOR_DIFF / MAJOR_DIFF / BROKEN"
+```
+
+### Pattern 4: Accessibility Check
+```
+Prompt: "Evaluate this screenshot for visual accessibility:
+1. Is text readable (sufficient size and contrast)?
+2. Are interactive elements clearly identifiable?
+3. Is there visual hierarchy (headings, sections)?
+4. Any elements that would fail WCAG contrast requirements?"
+```
+
+---
+
+## Complete Test Script Template
+
+```python
+#!/usr/bin/env python3
+"""
+LLM-Powered Screenshot Test Suite
+Captures screenshots and uses Claude vision for semantic analysis.
+"""
+
+from playwright.sync_api import sync_playwright
+import os
+import time
+
+PAGES_TO_TEST = [
+    # (path, name, content_selectors)
+    ('/', 'Home', '.hero, main, h1'),
+    ('/about', 'About', '.about-content, main, h1'),
+    ('/dashboard', 'Dashboard', '.dashboard, .stats, h1'),
+]
+
+BASE_URL = 'http://localhost:5173'
+SCREENSHOT_DIR = '/tmp/visual-tests'
+
+
+def wait_for_content(page, selectors, timeout=10000):
+    """Wait for React/Vue/Svelte to hydrate."""
+    page.wait_for_load_state('domcontentloaded')
+    page.wait_for_load_state('networkidle')
+    time.sleep(0.5)
+
+    for selector in selectors.split(','):
+        try:
+            loc = page.locator(selector.strip())
+            if loc.count() > 0:
+                loc.first.wait_for(state='visible', timeout=timeout)
+                return True
+        except Exception:  # e.g. Playwright timeout; try the next selector
+            continue
+
+    try:
+        page.wait_for_function('document.body.innerText.length > 100', timeout=timeout)
+        return True
+    except Exception:  # timed out waiting for body text
+        return False
+
+
+def capture_themed_screenshots(page, url, name, selectors):
+    """Capture both dark and light mode screenshots."""
+    safe_name = name.lower().replace(' ', '-')
+    results = {'name': name, 'url': url}
+
+    for theme in ['dark', 'light']:
+        page.emulate_media(color_scheme=theme)
+
+        if theme == 'dark':
+            page.goto(url, wait_until='domcontentloaded')
+        else:
+            page.reload(wait_until='domcontentloaded')
+
+        content_loaded = wait_for_content(page, selectors)
+
+        if not content_loaded:
+            print(f" ⚠️ {theme} mode: Content slow to load, waiting...")
+            time.sleep(2)
+
+        screenshot_path = f'{SCREENSHOT_DIR}/{safe_name}-{theme}.png'
+        page.screenshot(path=screenshot_path, full_page=True)
+
+        # Check content length
+        body_text = page.locator('body').inner_text().strip()
+        results[f'{theme}_screenshot'] = screenshot_path
+        results[f'{theme}_content_length'] = len(body_text)
+        results[f'{theme}_has_content'] = len(body_text) > 50
+
+        print(f" {theme}: {'✅' if results[f'{theme}_has_content'] else '❌'} ({len(body_text)} chars)")
+
+    return results
+
+
+def run_tests():
+    """Run visual tests on all pages."""
+    os.makedirs(SCREENSHOT_DIR, exist_ok=True)
+
+    with sync_playwright() as p:
+        browser = p.chromium.launch(headless=True)
+        context = browser.new_context(
+            viewport={'width': 1280, 'height': 800},
+            color_scheme='dark'
+        )
+        page = context.new_page()
+
+        # Capture console errors
+        errors = []
+        page.on('console', lambda m: errors.append(m.text) if m.type == 'error' else None)
+
+        results = []
+
+        for path, name, selectors in PAGES_TO_TEST:
+            print(f"Testing {name}...")
+            url = f'{BASE_URL}{path}'
+            result = capture_themed_screenshots(page, url, name, selectors)
+            result['errors'] = list(errors)
+            errors.clear()
+            results.append(result)
+
+        browser.close()
+
+    # Summary
+    print("\n" + "=" * 50)
+    print("VISUAL TEST SUMMARY")
+    print("=" * 50)
+
+    passed = sum(1 for r in results
+                 if r.get('dark_has_content') and r.get('light_has_content'))
+    print(f"\nPassed: {passed}/{len(results)}")
+    print(f"Screenshots: {SCREENSHOT_DIR}")
+
+    return results
+
+
+if __name__ == '__main__':
+    run_tests()
+```
+
+---
+
+## MCP vs Native Playwright Decision Tree
+
+```
+What are you doing?
+│
+├─ Interactive debugging / exploring
+│  └─► Playwright MCP (see live browser)
+│
+├─ Automated test suite
+│  └─► Native Python Playwright (headless)
+│
+├─ CI/CD pipeline
+│  └─► Native Python Playwright (headless)
+│
+├─ Screenshot capture for LLM analysis
+│  └─► Native Python Playwright (headless)
+│
+└─ One-off inspection
+   └─► Either works, MCP is convenient
+```
+
+---
+
+## Common Failures and Fixes
+
+### Failure: Blank Screenshots
+**Cause**: Screenshot taken before React hydrates
+**Fix**: Wait for content selectors, add hydration buffer
+
+### Failure: "Reconnecting..." Badge Visible
+**Cause**: HMR/WebSocket not connected (cosmetic in tests)
+**Fix**: This is often fine - focus on actual content
+
+### Failure: Theme Not Applied
+**Cause**: `emulate_media` called on context instead of page
+**Fix**: Use `page.emulate_media(color_scheme='dark')`
+
+### Failure: Browser Windows Spawning
+**Cause**: `headless=False` or using MCP instead of native
+**Fix**: Use `p.chromium.launch(headless=True)`
+
+### Failure: Timeout on Content
+**Cause**: Wrong selectors or page actually broken
+**Fix**: Verify selectors exist, check console errors
+
+---
+
+## Integration with Claude Code
+
+When Claude reads screenshots captured by this pattern:
+
+1. **Request specific analysis**: Don't just show screenshot - ask targeted questions
+2. **Provide context**: "This should be dark mode" or "This is the login page"
+3. **Compare systematically**: Before/after, dark/light, desktop/mobile
+4. **Trust semantic analysis**: LLM can tell "blank page" from "content loaded"
+
+---
+
+## References
+
+### Research Papers
+- [Using Vision LLMs For UI Testing](https://courses.cs.washington.edu/courses/cse503/25wi/final-reports/Using%20Vision%20LLMs%20For%20UI%20Testing.pdf) - University of Washington
+- [Vision-driven Automated Mobile GUI Testing](https://arxiv.org/html/2407.03037v1) - Multimodal LLM approach
+- [ScreenLLM: Stateful Screen Schema](https://arxiv.org/html/2503.20978v1) - UI understanding framework
+
+### Tools & Integrations
+- [Building an AI QA Engineer with Claude + Playwright](https://alexop.dev/posts/building_ai_qa_engineer_claude_code_playwright/)
+- [AI-Powered Visual Testing in Playwright](https://testrig.medium.com/ai-powered-visual-testing-in-playwright-from-pixels-to-perception-dd3ee49911d5)
+- [Playwright Visual Regression Testing Guide](https://testgrid.io/blog/playwright-visual-regression-testing/)
+
+### Official Documentation
+- [Playwright Visual Comparisons](https://playwright.dev/docs/test-snapshots)
+
+---
+
+## Version History
+
+- **2026-01-23**: Initial skill creation
+  - Researched multimodal LLM screenshot analysis best practices
+  - Documented React hydration waiting patterns
+  - Added headless mode requirements
+  - Created complete test script template
+
+---
+
+**Core Insight**: The difference between useless and useful screenshot tests is waiting for content, not just network. LLMs can analyze semantics, but only if there's actually content to analyze.
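
Review note (not part of the commit): the ANALYZE and ACT steps of the inspection loop in SKILL.md are described but never shown in code. The block below is a minimal sketch of one way to turn a model reply into a verdict, assuming the reply contains exactly one of the rating labels from the prompt patterns (FULL / PARTIAL / EMPTY, IDENTICAL / MINOR_DIFF / MAJOR_DIFF / BROKEN). The names `RATING_VERDICTS` and `verdict_from_response`, and the label-to-verdict mapping itself, are hypothetical and would need to be agreed on for a real run.

```python
import re

# Hypothetical mapping from the rating labels used in the SKILL.md prompts
# to a verdict; the thresholds here are an assumption, not the skill's rule.
RATING_VERDICTS = {
    "FULL": "pass",
    "IDENTICAL": "pass",
    "MINOR_DIFF": "pass",
    "PARTIAL": "review",
    "MAJOR_DIFF": "review",
    "EMPTY": "fail",
    "BROKEN": "fail",
}

# Match only whole, upper-case labels so that words like "fully" are ignored.
_RATING_RE = re.compile(
    r"\b(FULL|PARTIAL|EMPTY|IDENTICAL|MINOR_DIFF|MAJOR_DIFF|BROKEN)\b"
)


def verdict_from_response(response_text):
    """Return (rating, verdict) for a model reply.

    If the reply names no label, or names several different labels
    (for example by echoing the prompt), the result is (None, "review")
    so that a human looks at the screenshot instead of trusting a guess.
    """
    labels = set(_RATING_RE.findall(response_text))
    if len(labels) != 1:
        return None, "review"
    rating = labels.pop()
    return rating, RATING_VERDICTS[rating]
```

Ambiguous replies fall back to "review" rather than "pass", which keeps the check conservative: a screenshot is only accepted automatically when the reply commits to a single passing label.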