How Functionize Uses Data

This document reflects our current data practices as of the last update in August 2025. For questions about our data usage practices or to discuss specific data handling requirements, please contact your Functionize customer success representative.

Overview

Functionize takes data privacy seriously. We maintain a strict separation between customer data and model training activities. We use only anonymized application metadata, such as element coordinates, structural positions, and visual styles, to train our proprietary internal AI and ML models, ensuring that no confidential information from applications under test is ever used for model optimization.

Customer data is secured using best-in-class techniques, including AES-256 encryption at rest, TLS 1.3 for all data transfers, and logically partitioned storage in secure Google Cloud environments with strict access controls. We recommend that customers use scrubbed or dummy data in testing environments to eliminate any risk of personal data exposure. Application metadata used for training is automatically deleted after a maximum of 180 days, with only anonymized insights remaining embedded in our trained models.

This document outlines how we collect, use, and protect data within our platform ecosystem.

Data Categories and Usage

Customer Data

Definition: Customer Data comprises information that customers authorize Functionize to ingest into our platform, including data from customer websites, applications, networks, platforms, or environments. This explicitly excludes Usage Data (defined below). Customers should ensure that Customer Data does not contain personally identifiable, confidential, or regulated information.

Usage: Customer Data is processed solely to provide testing services to the specific customer and is never used for training internal AI models or shared across customer boundaries.

Usage Data 

Definition: Usage Data encompasses all operational data generated through the use of Functionize AI Agents and platform operations, including:

  • Test case and workflow statistical metadata (timing data, execution states)
  • Technical metadata (path information, visibility states, scrolling data)
  • Structural metadata (element locations, relationships, CSS properties)
  • Administrative metadata and application characteristics
  • Screenshots and visual state information
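
For illustration only, an anonymized metadata record of the kind listed above might resemble the following sketch (the field names are hypothetical and do not reflect Functionize's actual internal schema):

    from dataclasses import dataclass

    @dataclass
    class ElementMetadata:
        """Hypothetical anonymized structural/technical metadata record (illustrative only)."""
        selector_shape: str    # e.g. "div > button.primary" -- structure only, no text content
        bounding_box: tuple    # (x, y, width, height) element coordinates
        visibility_state: str  # e.g. "visible", "hidden", "scrolled-out"
        execution_ms: int      # timing data for the associated test step
        # Note: no customer identifiers, page text, or form values appear in such a record.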

AI Model Training Applications: We use aggregated and anonymized Usage Data to train and improve our internal-only AI models in several ways:

  • Element Identification Models: Usage Data helps train our proprietary neural networks to achieve 99.97% element identification accuracy across diverse application types
  • Self-Healing Capabilities: Application state changes and test adaptation patterns inform our self-healing test maintenance technology
  • Pattern Recognition: Structural and behavioral patterns improve our agents' ability to understand application workflows
  • Test Generation: Anonymized workflow patterns enhance our generative AI capabilities for autonomous test creation

Data Retention: Application metadata used for training or model optimization is stored for a maximum of 180 days and then deleted. Residual insights remain embedded in trained models as learning patterns.
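
As a minimal sketch of how a 180-day retention window can be enforced (hypothetical; this is not Functionize's actual retention pipeline), a scheduled cleanup job might look like:

    from datetime import datetime, timedelta, timezone

    RETENTION_DAYS = 180  # maximum retention for training metadata, per this policy

    def purge_expired_metadata(records):
        """Keep only records still inside the retention window; older records are dropped."""
        cutoff = datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)
        return [r for r in records if r["ingested_at"] >= cutoff]

After expired metadata is deleted, only the patterns already learned by the trained models persist, as described above.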

Authorized User Data

Definition: Personal information used solely by authorized users to access customer accounts (login credentials, user email addresses).

Usage: This data is used exclusively for authentication and access control purposes and is never used for AI model training.

AI Model Architecture and External API Usage

Proprietary AI Models

Functionize operates multiple internally-developed AI models, including:

  • 500M parameter neural network for deterministic element identification
  • 40B to 100B parameter Create agent for zero-shot test generation
  • Specialized models for computer vision, intent parsing, analytics, and context memory
  • Multimodal model stack incorporating vision, data, and language processing

External LLM Calls and Data Sharing

While our core testing capabilities rely on proprietary models, we may make selective calls to external language models in the following specific scenarios. In all cases, data remains within Functionize's Google Cloud environment:

When External LLM Calls Occur:

  • Natural language test description processing for complex user intents
  • Advanced error narrative generation for diagnostic reporting
  • Supplementary language understanding for non-standard application interfaces

Data Passed in External LLM Calls:

  • Anonymized application structure information (DOM elements, without customer data)
  • Generic test step descriptions (without specific customer workflows)
  • Error patterns and technical metadata (sanitized of customer identifiers)
  • Never included: Customer Data, personally identifiable information, proprietary business logic, or sensitive application content

Safeguards: All external API calls are filtered through our governance layer to ensure data privacy compliance and prevent exposure of sensitive information. All data is retained within Functionize's GCP tenant and is not used by the model providers or other third parties for training.
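
As a hedged illustration of this filtering (the actual governance layer is proprietary and more extensive), an outbound sanitization step could resemble:

    import re

    EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")  # simple identifier pattern, for illustration

    def sanitize_for_external_llm(payload: dict) -> dict:
        """Illustrative pre-flight filter: keep whitelisted structural fields, redact identifier-like strings."""
        allowed_fields = {"dom_structure", "step_description", "error_pattern"}  # hypothetical whitelist
        cleaned = {k: v for k, v in payload.items() if k in allowed_fields}
        return {k: EMAIL_RE.sub("[REDACTED]", v) if isinstance(v, str) else v
                for k, v in cleaned.items()}

A production governance layer would combine such allow-listing with broader pattern detection and audit logging; the sketch only conveys the principle of filtering before any external call is made.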

Data Security and Privacy Protections

Encryption and Storage

  • All data encrypted at rest using AES-256 encryption
  • All data transfer uses TLS 1.3 protocols
  • Data stored in secure Google Cloud buckets with customer-centric logical partitioning
  • Access controls strictly enforced, with access limited to authorized personnel only
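
To make the at-rest protection concrete, here is a minimal AES-256-GCM sketch using the Python cryptography library (illustrative only; in practice the platform relies on Google Cloud's managed encryption and Functionize's own key management, and the path layout below is hypothetical):

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def encrypt_at_rest(plaintext: bytes, key: bytes) -> bytes:
        """AES-256-GCM example: key is 32 bytes; output is nonce + ciphertext."""
        nonce = os.urandom(12)                     # unique nonce per object
        return nonce + AESGCM(key).encrypt(nonce, plaintext, None)

    key = AESGCM.generate_key(bit_length=256)      # 256-bit key, matching the policy above
    blob = encrypt_at_rest(b"test artifact bytes", key)
    # Objects would live under per-customer prefixes, e.g. (hypothetical layout):
    # gs://<bucket>/customers/<customer-id>/artifacts/<object-id>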

Privacy by Design

  • No Personal Data Collection: Functionize does not consume or store personal information from applications under test
  • Production Data Recommendations: We recommend customers use scrubbed or dummy data in testing environments to eliminate personal data exposure
  • Screenshot Safeguards: When screenshots are captured for computer vision or for displaying test results, we recommend using non-production environments to prevent inadvertent inclusion of personal data

Compliance and Governance

  • SOC 2 Type II annual audits
  • ISO 27001 and COBIT framework alignment
  • NIST Cybersecurity Framework compliance
  • Comprehensive governance layer enforcing versioning, security, and audit controls across all AI agent boundaries

Data Minimization and Purpose Limitation

Collection Principles

We collect only the metadata necessary to:

  • Understand application state changes for accurate test execution
  • Enable self-healing test capabilities
  • Provide autonomous test generation and maintenance
  • Deliver comprehensive test coverage and reporting

Usage Limitations

  • Usage Data for model training is aggregated and anonymized
  • No cross-customer data sharing or model training
  • Customer Data remains isolated to specific customer environments
  • Anonymized training data automatically expires after 180 days maximum retention
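
As a simplified sketch of what "aggregated and anonymized" can mean in practice (hypothetical, for illustration only), per-run timing records might be collapsed into cross-run statistics that retain no customer or run identifiers:

    from statistics import mean

    def aggregate_step_timings(records):
        """Collapse individual timing records into summary statistics; no identifiers are carried forward."""
        timings = [r["execution_ms"] for r in records]
        return {"count": len(timings), "mean_ms": mean(timings), "max_ms": max(timings)}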

Transparency and Control

Customer Rights

  • Complete visibility into data collection practices through this documentation
  • Ability to request data handling specifics through customer success teams
  • Option to implement extended data retention periods to meet compliance requirements

Continuous Improvement

Our AI models continuously improve through responsible use of anonymized Usage Data, enabling:

  • Enhanced element identification accuracy
  • Reduced test maintenance overhead (80% average reduction)
  • Improved autonomous test generation capabilities
  • Better self-healing and adaptation to application changes

Tests and their associated data are stored per the Data Retention policy.