
Smarter Testing with GenAI: From Exploratory Ideas to Automated Scripts

Exploratory testing has always been one of the most creative and human-driven parts of software quality assurance. It's the stage where testers go beyond checklists and truly investigate how users might interact with the application. Instead of following predefined test cases, testers explore the software by asking questions, experimenting with inputs, and thinking critically about how things might go wrong.

With the rise of Generative AI in software testing, exploratory testing is becoming more efficient and more powerful. AI does not replace the tester’s instinct or experience. Instead, it works alongside testers to enhance their ability to uncover hidden bugs, identify usability issues, and think outside the box.

For example, generative AI can:

  • Analyze user stories and suggest edge cases

  • Review UI screenshots to highlight confusing layouts or misaligned components

  • Propose high-risk input combinations that might be overlooked early in development

One of the biggest advantages of using GenAI in testing is the ability to quickly turn test ideas into automated test scripts. You can write a Gherkin scenario, and AI can generate the corresponding Python or JavaScript test code. Testers still need to review and refine the generated code, but they spend far less time writing repetitive scripts. This allows them to focus more on the quality of test logic and on finding real-world issues.
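
For illustration, take a simple scenario: "Given I am on the login page, when I submit the form with an empty password, then I see a validation error." Below is a minimal sketch of the kind of Python test an assistant might generate from it, here using Selenium; the URL and element IDs are hypothetical, and a tester would still review and adapt the result.

```python
# Sketch of AI-generated test code for the Gherkin scenario above.
# The URL and the element IDs (username, login, error) are hypothetical.
from selenium import webdriver
from selenium.webdriver.common.by import By

def test_empty_password_shows_validation_error():
    driver = webdriver.Chrome()
    try:
        # Given I am on the login page
        driver.get("https://example.com/login")

        # When I submit the form with an empty password
        driver.find_element(By.ID, "username").send_keys("ada@example.com")
        driver.find_element(By.ID, "login").click()

        # Then I see a validation error
        error = driver.find_element(By.ID, "error")
        assert "password" in error.text.lower()
    finally:
        driver.quit()
```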

AI Is a Collaborator, Not a Decision-Maker

Visual testing is another area where AI adds significant value. Traditional visual checks can be time-consuming to automate. Generative AI tools can scan screenshots and point out visual inconsistencies, missing accessibility labels, or unintentional UI shifts. These issues might not break functionality, but they can degrade the user experience. AI models bring both visual recognition and context-based reasoning into the process, helping testers spot things that simple assertions would miss.

Despite these advances, AI isn’t perfect, and using it requires caution. Generative models are not always accurate: they can make incorrect assumptions or overlook critical edge cases. That’s why human oversight remains essential. Testers must validate AI-generated suggestions carefully and treat AI as a collaborator, not a replacement.

When applied thoughtfully, AI speeds up the setup of test environments, increases test coverage, and supports a more creative approach to QA. It helps teams work faster while maintaining a high standard of quality.

Practical Prompts for Using AI in QA Testing

You don’t need advanced tools to benefit from AI. A few well-phrased prompts can help generate useful insights. Try asking:

  • What are some possible edge cases for this login functionality?

  • Which UI elements in this screenshot might be confusing, misaligned, or missing?

  • Based on this bug report, what areas of the app should be retested?

  • Simulate what might go wrong if network latency is high on this page.

  • Compare this version of the UI to the previous one. What changed visually?

  • Identify parts of this UI that might be difficult to use on mobile screens.

These types of prompts allow AI to support your thinking by offering fresh perspectives and helping uncover blind spots.

What AI Can Do in Exploratory and Visual Testing

AI-powered visual testing tools can detect subtle issues that are easy to miss during manual testing, using screenshot comparison, visual diffing, and layout analysis, particularly when a known baseline version is available for reference. More broadly, AI can dependably support exploratory and visual testing by:

  • Analyzing user stories and generating edge case suggestions

  • Inspecting UI screenshots to flag layout inconsistencies or confusing elements

  • Simulating test scenarios with high-risk input combinations

  • Summarizing bug reports into actionable test cases

  • Suggesting Gherkin scenarios and converting them into base-level test code

Introducing Real-World Use Cases

To illustrate how generative AI performs in exploratory and visual testing, let’s look at two high-stakes industries where UI accuracy and usability are critical: industrial systems and medical devices.

Industrial / Manufacturing Systems

AI detects layout and rendering issues, but cannot validate system logic or safety relevance.

  • Misaligned data displays on dashboards (e.g., temperature or pressure readings not in correct grid)

  • Disappearing UI elements after a mode switch (e.g., engineer vs. operator view)

  • Chart rendering errors (e.g., axis labels shifted or cropped)

  • Layout breaks in multilingual interfaces (e.g., German text overflowing buttons)

Medical Devices

AI can flag these issues, but testers must still validate whether a change is clinically meaningful.

  • Layout inconsistencies in vital monitoring screens (e.g., ECG labels misaligned)

  • Low-contrast text that fails accessibility thresholds (e.g., gray on white)

  • Broken or missing buttons (e.g., confirmation buttons not appearing in a treatment workflow)

  • Font and color inconsistencies across screens in EMR or diagnostic tools


But Here’s the Catch: You Still Need Specialized GUI Testing Tools 

While AI boosts productivity and creativity, it does not replace GUI automation tools, especially for complex or domain-specific applications built with Qt, Windows, Android, Java, or embedded systems.

Visual Testing: Use AI with the Right Tools

AI-powered visual diff tools are great at spotting UI regressions, color mismatches, spacing issues, and misaligned components. But in safety-critical industries like automotive, aerospace, and medical devices, visual detection alone isn’t sufficient. You need:

  • Baseline comparisons

  • Domain-specific validations

  • Manual review and human judgment
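
To make "baseline comparisons" concrete, here is a minimal classical pixel-diff sketch of the kind of check visual testing tools build on, assuming Pillow and hypothetical screenshot paths:

```python
# Minimal baseline screenshot comparison; the file paths are hypothetical.
from PIL import Image, ImageChops

baseline = Image.open("baseline/login.png").convert("RGB")
current = Image.open("current/login.png").convert("RGB")

diff = ImageChops.difference(baseline, current)
bbox = diff.getbbox()  # None when the two images are pixel-identical

if bbox is None:
    print("No visual change against the baseline")
else:
    print(f"Visual difference detected in region {bbox}")
    diff.crop(bbox).save("login_diff.png")  # artifact for manual review
```

In safety-critical domains, a hit from a check like this triggers the human review listed above; it is not a verdict on its own.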

Real-World Workflow

As mentioned earlier, specialized GUI test automation tools bring structured visual verification and accessibility checks into a framework that can be tested, repeated, and audited, which is something AI alone can't guarantee.

For example, with the Squish GUI test automation tool, you can implement the following workflow to harness AI assistants like Cursor or Claude to efficiently generate, refine, and run Squish GUI test scripts while maintaining full control and reliability:

  1. Define Style‑and‑Best‑Practice Guides
    Provide your AI assistant with a Squish Rules document that outlines preferred conventions, maintainability standards, and best practices. This prompts the model to output high-quality, team-aligned scripts.

  2. Feed the AI Up‑to‑Date Context
    Index key references, such as Squish developer documentation, your product specification (e.g. the AUT’s UX flow or UI metadata), and a recorded stub script plus object map, into the AI. This retrieval-augmented context ensures the assistant generates valid Squish API calls and correctly references UI components.

  3. Use Squish Model Context Protocol (MCP)
    Leverage the Squish MCP server setup so the AI assistant can directly invoke Squish tests (via squishrunner CLI), receive result logs, and iteratively refine and rerun scripts within the same interaction.

  4. Generate & Execute Test Scripts Iteratively
    Prompt the AI to “create a script for adding a contact” or “run suite_x and fix the failure,” and it generates the code, pushes it through Squish via MCP, analyzes errors, proposes edits, and retries, creating a continuous feedback loop (a sketch of such a generated script follows this list).

  5. Maintain Flexibility with AI Assistants
    Because the approach isn’t tied to a single LLM provider, you can swap assistants (e.g. Cursor, Claude, Copilot) or adopt open-source models without disrupting your setup.

  6. Preserve and Reuse Test Assets
    Squish scripts remain your property. You can generate, maintain, or port them across frameworks—there’s no vendor lock-in, and your test suites stay usable for the long term.
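
As an illustration of step 4, here is a minimal sketch of the kind of Squish Python script such a loop might produce for adding a contact. The "addressbook" AUT and the names.* object-map entries are placeholders, and a real script should follow your team's Squish Rules document.

```python
# Hypothetical AI-generated Squish script (Python) for adding a contact.
# "addressbook" and the names.* entries are illustrative object-map placeholders.
import names

def main():
    # Launch the AUT registered with squishserver
    startApplication("addressbook")

    # Open the "New Contact" dialog from the toolbar
    clickButton(waitForObject(names.address_book_new_QToolButton))

    # Fill in the contact details and confirm
    type(waitForObject(names.forename_QLineEdit), "Ada")
    type(waitForObject(names.surname_QLineEdit), "Lovelace")
    clickButton(waitForObject(names.ok_QPushButton))

    # Verify the dialog closed and the main window is visible again
    test.verify(object.exists(names.address_book_MainWindow),
                "Main window visible after saving the contact")
```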

This hybrid approach combines AI's agility with Squish’s depth and precision.

GenAI Is Powerful—But Works Best with QA Tools and Guardrails

GenAI can speed up test creation and automation, but it needs structure to deliver reliable results.

1. Validate against real system context
LLMs can misinterpret app behavior or generate incorrect test steps. Connecting GenAI with test tools like Squish and domain data (e.g. logs, UI maps) keeps outputs grounded and relevant.

2. Ensure repeatability
AI responses may vary over time. Use fixed prompts, versioned models, and human review to maintain traceability and reproducibility.
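
A common way to achieve this, sketched below with the OpenAI Python SDK as one illustrative provider, is to pin a model snapshot, set the temperature to 0, and keep the prompt versioned in your repository:

```python
# Sketch: reproducible test-idea generation with a pinned model and versioned prompt.
# The provider, model snapshot, and prompt contents are illustrative choices.
from openai import OpenAI

PROMPT_ID = "login-edge-cases@v3"  # prompt text lives, versioned, in the test repo
PROMPT = "List edge cases for this login form:\n{spec}"

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-2024-08-06",  # pinned snapshot, not a moving alias
    temperature=0,              # reduces run-to-run variation
    messages=[{"role": "user",
               "content": PROMPT.format(spec="email + password, 5 attempts max")}],
)
print(PROMPT_ID)
print(response.choices[0].message.content)
```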

3. Use RAG to boost relevance
Feeding GenAI your test assets (requirements, changelogs, scripts) via retrieval-augmented generation (RAG) makes test suggestions more accurate and aligned with your system.
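
A minimal sketch of the retrieval step, assuming sentence-transformers for embeddings and a tiny in-memory store; the documents and query are illustrative:

```python
# Sketch: retrieval step of RAG over QA assets; data and library are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Indexed assets: requirements, changelog entries, notes on existing scripts
documents = [
    "REQ-12: The login form must lock after five failed attempts.",
    "CHANGELOG v2.3: the Save button moved into the toolbar.",
    "Note: test_login_lockout covers REQ-12 on desktop only.",
]
doc_vecs = model.encode(documents, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k assets most similar to the query (cosine similarity)."""
    q = model.encode([query], normalize_embeddings=True)[0]
    order = np.argsort(doc_vecs @ q)[::-1]
    return [documents[i] for i in order[:k]]

# Prepend the retrieved context to the generation prompt
context = "\n".join(retrieve("suggest regression tests for the login form"))
prompt = f"Context:\n{context}\n\nTask: propose regression tests for login."
```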

Bottom line: GenAI is most effective when embedded in your QA ecosystem, not when used in isolation.

Final Thoughts: The Future of Software Testing Is Collaborative

Generative AI is a powerful addition to the software testing toolbox. But it’s not a replacement for structured, tool-based test automation. When used correctly, AI enhances the creative and analytical strengths of human testers, while GUI testing tools like Squish provide the control, accuracy, and platform integration needed for serious test automation.

Whether you're focused on GUI test automation, Qt application testing, or exploratory test workflows, the most effective strategy is to combine AI-powered insights with reliable tools like Squish.

With Squish, you get what AI alone cannot offer:

  • Object-level UI interaction: Squish doesn’t just look at pixels. It interacts with UI components via their internal properties.

  • Support for Qt, QML, and embedded UIs: Squish is purpose-built for platforms where traditional automation tools or AI alone fall short.

  • Stable test execution: Squish test cases are repeatable and maintainable, even across UI changes.

  • Seamless Gherkin integration: Squish supports behavior-driven testing (BDD), making it easy to convert AI-generated Gherkin into structured automation.
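
For instance, an AI-drafted scenario and its step implementation could look like the sketch below; the scenario wording and object names are illustrative, and the decorators with |word| placeholders follow Squish's BDD step syntax:

```python
# steps/steps.py in a Squish BDD test suite (Python).
# AI-drafted feature file (illustrative):
#   Scenario: Add a new contact
#     When I add a contact named "Ada Lovelace"
#     Then the contact list is shown
import names

@When("I add a contact named \"|word| |word|\"")
def step(context, forename, surname):
    clickButton(waitForObject(names.address_book_new_QToolButton))
    type(waitForObject(names.forename_QLineEdit), forename)
    type(waitForObject(names.surname_QLineEdit), surname)
    clickButton(waitForObject(names.ok_QPushButton))

@Then("the contact list is shown")
def step(context):
    test.verify(object.exists(names.address_book_TableView),
                "Contact table is present after adding a contact")
```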

What's Next

Read:

  • A Practical Guide to Generating Squish Test Scripts with AI Assistants (and watch a tutorial)

  • Quality Assurance for Medical Software Solutions

  • Quality Assurance for Industrial Vehicles

Go to All Industries

Panel SQF: Maximize the Potential of AI in Quality Assurance

Watch the Panel Discussion: Maximize the Potential of AI in Quality Assurance, featuring insights from AI practitioners:

  • Peter Schneider, Principal, Product Management at Qt Group

  • Maaret Pyhäjärvi, Director of Consulting at CGI

  • Felix Kortmann, CTO at Ignite by FORVIA HELLA

