How Functionize Uses Data

This document reflects our current data practices as of the last update in August 2025. For questions about our data usage practices or to discuss specific data handling requirements, please contact your Functionize customer success representative.

Overview

Functionize takes data privacy seriously. We maintain a strict separation between customer data and model training activities. We use only anonymized application metadata, such as element coordinates, structural positions, and visual styles, to train our proprietary internal AI and ML models, ensuring that no confidential information from applications under test is ever used for model optimization.

Customer data is secured using best-in-class techniques, including AES-256 encryption at rest, TLS 1.3 for all data transfers, and logically partitioned storage in secure Google Cloud environments with strict access controls. We recommend that customers use scrubbed or dummy data in testing environments to eliminate any risk of personal data exposure. Application metadata used for training is automatically deleted after a maximum of 180 days, with only anonymized insights remaining embedded in our trained models.

This document outlines how we collect, use, and protect data within our platform ecosystem.

Data Categories and Usage

Customer Data

Definition: Customer Data comprises information that customers authorize Functionize to ingest into our platform, including data from customer websites, applications, networks, platforms, or environments. This explicitly excludes Usage Data (defined below). Customers should ensure that Customer Data does not contain personally identifiable, confidential, or regulated information.

Usage: Customer Data is processed solely to provide testing services to the specific customer and is never used for training internal AI models or shared across customer boundaries.

Usage Data 

Definition: Usage Data encompasses all operational data generated through the use of Functionize AI Agents and platform operations, including:

  • Test case and workflow statistical metadata (timing data, execution states)
  • Technical metadata (path information, visibility states, scrolling data)
  • Structural metadata (element locations, relationships, CSS properties)
  • Administrative metadata and application characteristics
  • Screenshots and visual state information
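
For illustration only, an anonymized metadata record of the kind listed above might resemble the following sketch (the field names are hypothetical and do not reflect Functionize's actual internal schema):

    from dataclasses import dataclass

    @dataclass
    class ElementMetadata:
        """Hypothetical anonymized structural/technical metadata record (illustrative only)."""
        selector_shape: str    # e.g. "div > button.primary" -- structure only, no text content
        bounding_box: tuple    # (x, y, width, height) element coordinates
        visibility_state: str  # e.g. "visible", "hidden", "scrolled-out"
        execution_ms: int      # timing data for the associated test step
        # Note: no customer identifiers, page text, or form values appear in such a record.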

AI Model Training Applications: We use aggregated and anonymized Usage Data to train and improve our internal-only AI models in several ways:

  • Element Identification Models: Usage Data helps train our proprietary neural networks to achieve 99.97% element identification accuracy across diverse application types
  • Self-Healing Capabilities: Application state changes and test adaptation patterns inform our self-healing test maintenance technology
  • Pattern Recognition: Structural and behavioral patterns improve our agents' ability to understand application workflows
  • Test Generation: Anonymized workflow patterns enhance our generative AI capabilities for autonomous test creation

Data Retention: Application metadata used for training or model optimization is stored for a maximum of 180 days and then deleted. Residual insights remain embedded in trained models as learning patterns.
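
As a minimal sketch of how a 180-day retention window can be enforced (hypothetical; this is not Functionize's actual retention pipeline), a scheduled cleanup job might look like:

    from datetime import datetime, timedelta, timezone

    RETENTION_DAYS = 180  # maximum retention for training metadata, per this policy

    def purge_expired_metadata(records):
        """Keep only records still inside the retention window; older records are dropped."""
        cutoff = datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)
        return [r for r in records if r["ingested_at"] >= cutoff]

After expired metadata is deleted, only the patterns already learned by the trained models persist, as described above.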

Authorized User Data

Definition: Personal information used solely by authorized users to access customer accounts (login credentials, user email addresses).

Usage: This data is used exclusively for authentication and access control purposes and is never used for AI model training.

AI Model Architecture and External API Usage

Proprietary AI Models

Functionize operates multiple internally-developed AI models, including:

  • 500M parameter neural network for deterministic element identification
  • 40B to 100B parameter Create agent for zero-shot test generation
  • Specialized models for computer vision, intent parsing, analytics, and context memory
  • Multimodal model stack incorporating vision, data, and language processing

External LLM Calls and Data Sharing

While our core testing capabilities rely on proprietary models, we may make selective calls to external language models in the following specific scenarios. In all cases, data remains within Functionize's Google Cloud environment:

When External LLM Calls Occur:

  • Natural language test description processing for complex user intents
  • Advanced error narrative generation for diagnostic reporting
  • Supplementary language understanding for non-standard application interfaces

Data Passed in External LLM Calls:

  • Anonymized application structure information (DOM elements, without customer data)
  • Generic test step descriptions (without specific customer workflows)
  • Error patterns and technical metadata (sanitized of customer identifiers)
  • Never included: Customer Data, personally identifiable information, proprietary business logic, or sensitive application content

Safeguards: All external API calls are filtered through our governance layer to ensure data privacy compliance and prevent exposure of sensitive information. All data is retained within Functionize's GCP tenant and is not used by the model providers or other third parties for training.
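
As a hedged illustration of this filtering (the actual governance layer is proprietary and more extensive), an outbound sanitization step could resemble:

    import re

    EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")  # simple identifier pattern, for illustration

    def sanitize_for_external_llm(payload: dict) -> dict:
        """Illustrative pre-flight filter: keep whitelisted structural fields, redact identifier-like strings."""
        allowed_fields = {"dom_structure", "step_description", "error_pattern"}  # hypothetical whitelist
        cleaned = {k: v for k, v in payload.items() if k in allowed_fields}
        return {k: EMAIL_RE.sub("[REDACTED]", v) if isinstance(v, str) else v
                for k, v in cleaned.items()}

A production governance layer would combine such allow-listing with broader pattern detection and audit logging; the sketch only conveys the principle of filtering before any external call is made.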

Data Security and Privacy Protections

Encryption and Storage

  • All data encrypted at rest using AES-256 encryption
  • All data transfer uses TLS 1.3 protocols
  • Data stored in secure Google Cloud buckets with customer-centric logical partitioning
  • Access controls strictly enforced, with access limited to authorized personnel only
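
To make the at-rest protection concrete, here is a minimal AES-256-GCM sketch using the Python cryptography library (illustrative only; in practice the platform relies on Google Cloud's managed encryption and Functionize's own key management, and the path layout below is hypothetical):

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def encrypt_at_rest(plaintext: bytes, key: bytes) -> bytes:
        """AES-256-GCM example: key is 32 bytes; output is nonce + ciphertext."""
        nonce = os.urandom(12)                     # unique nonce per object
        return nonce + AESGCM(key).encrypt(nonce, plaintext, None)

    key = AESGCM.generate_key(bit_length=256)      # 256-bit key, matching the policy above
    blob = encrypt_at_rest(b"test artifact bytes", key)
    # Objects would live under per-customer prefixes, e.g. (hypothetical layout):
    # gs://<bucket>/customers/<customer-id>/artifacts/<object-id>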

Privacy by Design

  • No Personal Data Collection: Functionize does not consume or store personal information from applications under test
  • Production Data Recommendations: We recommend customers use scrubbed or dummy data in testing environments to eliminate personal data exposure
  • Screenshot Safeguards: When screenshots are captured for computer vision or for displaying test results, we recommend using non-production environments to prevent inadvertent inclusion of personal data

Compliance and Governance

  • SOC 2 Type II annual audits
  • ISO 27001 and COBIT framework alignment
  • NIST Cybersecurity Framework compliance
  • Comprehensive governance layer enforcing versioning, security, and audit controls across all AI agent boundaries

Data Minimization and Purpose Limitation

Collection Principles

We collect only the metadata necessary to:

  • Understand application state changes for accurate test execution
  • Enable self-healing test capabilities
  • Provide autonomous test generation and maintenance
  • Deliver comprehensive test coverage and reporting

Usage Limitations

  • Usage Data for model training is aggregated and anonymized
  • No cross-customer data sharing or model training
  • Customer Data remains isolated to specific customer environments
  • Anonymized training data automatically expires after 180 days maximum retention
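
As a simplified sketch of what "aggregated and anonymized" can mean in practice (hypothetical, for illustration only), per-run timing records might be collapsed into cross-run statistics that retain no customer or run identifiers:

    from statistics import mean

    def aggregate_step_timings(records):
        """Collapse individual timing records into summary statistics; no identifiers are carried forward."""
        timings = [r["execution_ms"] for r in records]
        return {"count": len(timings), "mean_ms": mean(timings), "max_ms": max(timings)}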

Transparency and Control

Customer Rights

  • Complete visibility into data collection practices through this documentation
  • Ability to request data handling specifics through customer success teams
  • Option to implement extended data retention periods to meet compliance requirements

Continuous Improvement

Our AI models continuously improve through responsible use of anonymized Usage Data, enabling:

  • Enhanced element identification accuracy
  • Reduced test maintenance overhead (80% average reduction)
  • Improved autonomous test generation capabilities
  • Better self-healing and adaptation to application changes

Tests and their associated data are stored per the Data Retention policy.