License Type | SaaS |
Feature | Analysis |
Main Product Category | Traceable UI |
Sub Category | PII |
Question
How can I minimize data sent to the Traceable UI?
Answer
Traceable Data Processing Flow
Customers use Traceable for three purposes: visibility into application context, API DNA, and data; detection of abnormal activity and attacks, including blocking attackers; and analysis of user activity, data activity, and data flows adjacent to abnormal activity.
The most detailed analytics and human-readable examples of API and data flows are available when full API requests and responses, assembled into distributed traces, are ingested into the Traceable cloud.
This is exactly what happens when Traceable is in Advanced processing mode. Most API requests and responses, including headers, query parameters, and request and response bodies, are sent to Traceable AI for further analytics, including session-based attack detection, user flow analytics, content anomaly detection, and data queries.
At the same time, we are aware that many of our customers handle sensitive or proprietary data, which cannot be handed off to a subprocessor. To allow for the safe handling of these data and to ensure our customers can stay within the bounds of their compliance requirements, we have facilities for defining which data should never leave the customer premises, as described in the Sensitive Data Handling section.
Most data are stored for 8 days. Data in traces where malicious activity was detected are stored for 90 days.
Sensitive Data Handling
Before any data leaves the Traceable Platform Agent (the agent installed on the customer premises), the content of each trace is processed through the Data Identification module. API endpoints protected by Traceable communicate over HTTP, which means that the communication includes queries, requests with key-value pairs in the headers and in the request body, and responses, which also have headers and bodies.
The Data Identification module analyzes all of the above fields for sensitive data. This module has default rules for identifying known sensitive data types, such as passwords, secrets, and tokens, and can be extended with customer-defined sensitive data rules. Customers use this facility to identify PII and other information that may be sensitive or confidential. A rule can specify the sensitive element by key, such as the name of a sensitive header or a sensitive response body attribute, or by value, such as a pattern that may indicate a credit card number.
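The key-based and value-based rules described above can be pictured with a minimal Python sketch. This is not Traceable's implementation or rule syntax: the SensitiveDataRule class, the RULES list, the identify function, and the regular expressions are all hypothetical and serve only to illustrate the difference between matching on a field name and matching on a field value.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class SensitiveDataRule:
    """Hypothetical rule: matches a field by its key, by its value, or both."""
    data_type: str
    key_pattern: Optional[str] = None    # regex applied to the whole field name
    value_pattern: Optional[str] = None  # regex searched for inside the value

    def matches(self, key: str, value: str) -> bool:
        if self.key_pattern and re.fullmatch(self.key_pattern, key, re.IGNORECASE):
            return True
        if self.value_pattern and re.search(self.value_pattern, value):
            return True
        return False


# Illustrative rules only; real default and custom rules are defined in the product.
RULES = [
    SensitiveDataRule("password", key_pattern=r"password|passwd|pwd"),
    SensitiveDataRule("auth_token", key_pattern=r"authorization|x-api-key"),
    # 13 to 16 digits, optionally separated by spaces or hyphens.
    SensitiveDataRule("credit_card", value_pattern=r"\b(?:\d[ -]?){13,16}\b"),
]


def identify(key: str, value: str) -> Optional[str]:
    """Return the data type of the first matching rule, or None if nothing matches."""
    for rule in RULES:
        if rule.matches(key, value):
            return rule.data_type
    return None
```

In this sketch, a field named Password is caught by key regardless of its content, while a free-text field is caught by value only if it happens to contain something shaped like a card number.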
Once sensitive data are identified, a data type is assigned for flow tracing, and a separate process ensures that the content of the field does not contain malicious patterns. (A common example of a possible malicious pattern is an attempted SQL injection within a user password submission.)
Sensitive data can be identified in the headers or bodies of requests and responses, or in URL query parameters. Identified sensitive data values can be collected without modification, obfuscated (replaced with a hash), or redacted (replaced with the “***” string). How the data of each type are treated is up to the customer.
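The three treatments can be sketched in Python as follows. The Treatment enum and apply_treatment function are hypothetical names, and SHA-256 is an assumption made for illustration: the article says only that obfuscated values are replaced with a hash, not which hash function is used.

```python
import hashlib
from enum import Enum


class Treatment(Enum):
    """Hypothetical names for the three handling options described above."""
    COLLECT = "collect"      # send the value unmodified
    OBFUSCATE = "obfuscate"  # replace the value with a hash
    REDACT = "redact"        # replace the value with "***"


def apply_treatment(value: str, treatment: Treatment) -> str:
    """Return the value as it would leave the agent under the given treatment."""
    if treatment is Treatment.COLLECT:
        return value
    if treatment is Treatment.OBFUSCATE:
        # Assumption: SHA-256 stands in for whatever hash the product uses.
        return hashlib.sha256(value.encode("utf-8")).hexdigest()
    return "***"
```

One practical difference between the two protective options: a hash is deterministic, so equal values can still be correlated across traces without exposing the original, whereas redaction removes that possibility entirely.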
Standard Operations Mode
Customers with a Team edition or Free edition license, or customers protecting a specific lower-risk service within their infrastructure, may opt for Standard Operations Mode.
In Standard Operations Mode, only a small sample of traces is sent to Traceable AI. All traces are still processed locally for signs of OWASP Top 10 attacks, using signature-based rules enhanced with Traceable ML.
Analytics features and many of the advanced processing features, including BOLA/IDOR detection, user session analytics, and bot pattern detection, are not available in this mode. However, users still get full API Intelligence functionality and protection capabilities consistent with what is available from most NG WAFs.
Full Privacy with Standard Operations Mode
For API endpoints and services protected in Standard Operations Mode, an additional Full Data Privacy setting is available.
In Settings > Operation Mode, there is an option to turn on Full Data Privacy (it is off by default). When Full Data Privacy is on, all sampled traces from local processing that are sent to Traceable AI include only the keys from headers, bodies, and URL queries. All values are redacted (replaced with ‘***’).
The only exceptions to these redaction rules are specific headers whose values identify the source or the destination of the API request itself. The exceptions include the IP address and URL patterns of the API, but never user-identifiable information.
Below is a specific list of the headers that are never redacted:
- "x-real-ip"
- "forwarded"
- "x-forwarded-for"
- "x-proxy-user-ip"
- ":authority"
- "grpc-status"
- ":status"
- ":path"
- "content-length"
- "content-type"
- "host"
- "user-agent"
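As a rough illustration of the Full Data Privacy behavior, the following Python sketch keeps every key, replaces every value with ‘***’, and leaves only the listed headers untouched. The function names and the flat dictionary representation of a trace are assumptions made for this example, and the case-insensitive header comparison is likewise an assumption; the header names themselves come from the list above.

```python
# Headers from the list above whose values are never redacted.
NEVER_REDACTED_HEADERS = frozenset({
    "x-real-ip", "forwarded", "x-forwarded-for", "x-proxy-user-ip",
    ":authority", "grpc-status", ":status", ":path",
    "content-length", "content-type", "host", "user-agent",
})

REDACTED = "***"


def redact_headers(headers: dict) -> dict:
    """Keep every header name; redact the value unless the header is exempt."""
    return {
        name: (value if name.lower() in NEVER_REDACTED_HEADERS else REDACTED)
        for name, value in headers.items()
    }


def redact_fields(fields: dict) -> dict:
    """Keep every key of a body or URL query; redact all values, no exceptions."""
    return {key: REDACTED for key in fields}


# Hypothetical sampled request, shown only to exercise the functions.
headers = {"Host": "api.example.com", "Authorization": "Bearer abc123"}
query = {"email": "user@example.com"}

print(redact_headers(headers))
print(redact_fields(query))
```

The keys survive in every case, which is what allows the API structure to remain visible even though no values other than the exempt headers leave the customer premises.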