Data Flow Discovery
Written by Nikhil Kukade
Updated this week

Privado gives companies complete visibility into how data flows across their infrastructure: out to third parties, into internal databases, and into logs as leakages. With this visibility, companies can protect personal data and keep data flows compliant with consumer expectations and privacy laws.

Data Flows

To discover data flows, Privado runs a code flow analysis on a code repository. It builds a privacy graph from code semantics (the abstract syntax tree, program dependencies, and control flow) and layers it with data elements (sources) and data destinations (sinks).

Sources

Privado tags all variables, classes, and objects that process personal data. These are called sources.
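
As a rough illustration of what tagging sources can look like, the sketch below matches identifier names against name patterns. The patterns, the data element labels, and the tag_sources helper are all hypothetical and are not Privado's actual rules or API.

```python
import re

# Hypothetical name patterns for two data elements (illustration only).
SOURCE_PATTERNS = {
    "Email Address": re.compile(r"e[-_]?mail", re.IGNORECASE),
    "Phone Number": re.compile(r"phone|mobile", re.IGNORECASE),
}


def tag_sources(identifiers):
    """Map each identifier that looks like personal data to a data element."""
    tagged = {}
    for name in identifiers:
        for data_element, pattern in SOURCE_PATTERNS.items():
            if pattern.search(name):
                tagged[name] = data_element
                break  # first matching data element wins
    return tagged


tags = tag_sources(["userEmail", "phone_number", "retry_count"])
print(tags)
```

Here userEmail and phone_number are tagged as sources, while retry_count is left untagged because it matches no pattern.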

Sinks

Privado tags all destinations of data (third parties, APIs, databases, log functions, and messaging queues) as sinks and attaches the processing tag to each one.

Data Flow Analysis

Once sources and sinks are tagged, Privado runs a code flow analysis to find the flows between them and shows them as data flow diagrams.
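
Conceptually, finding flows between tagged sources and sinks resembles a path search over a graph. The sketch below is a minimal breadth-first search over a hand-written edge list. The node names and the find_flows helper are assumptions made for illustration, and Privado's real analysis works on a much richer privacy graph.

```python
from collections import deque


def find_flows(edges, sources, sinks):
    """Return every cycle-free path that starts at a source and ends at a sink."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)

    flows = []
    for source in sources:
        queue = deque([[source]])
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                flows.append(path)  # reached a sink, record the flow
                continue
            for nxt in graph.get(node, []):
                if nxt not in path:  # skip nodes already on this path
                    queue.append(path + [nxt])
    return flows


# Hypothetical edges: "value at left flows into value or call at right".
edges = [
    ("phone", "profile"),
    ("profile", "db.save"),
    ("phone", "logger.info"),
    ("profile", "cache"),
]
flows = find_flows(edges, sources={"phone"}, sinks={"db.save", "logger.info"})
for flow in flows:
    print(" -> ".join(flow))
```

The path that ends at cache is not reported because cache is not tagged as a sink.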

Privado detects the following types of data flows:

  • Sharing: These are data flows to third parties via packages or APIs

  • Storage: These are data flows to datastores

  • Leakage: These are data flows to logs

  • Collection: These are data flows to and from REST API endpoints

  • Processing: These are operations that fall outside the categories above
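
One way to read the list above is that the flow type follows from the kind of sink a flow ends in, with Processing as the fallback. The sketch below encodes that reading. The sink kind names and the classify_flow helper are hypothetical, not Privado identifiers.

```python
# Hypothetical sink kinds mapped to the flow types listed above.
FLOW_TYPE_BY_SINK_KIND = {
    "third_party": "Sharing",
    "datastore": "Storage",
    "log": "Leakage",
    "api_endpoint": "Collection",
}


def classify_flow(sink_kind):
    """Return the flow type for a sink kind, defaulting to Processing."""
    return FLOW_TYPE_BY_SINK_KIND.get(sink_kind, "Processing")


for kind in ["third_party", "log", "encryption"]:
    print(kind, "->", classify_flow(kind))
```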

Here is a typical flow: signup is transformed into email, which is transformed into message, which is then sent to Slack.
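
In application code, a flow like this might look like the sketch below. The function names and the post callable are hypothetical stand-ins. No real Slack client is used, so the example only shows how the value moves from one variable to the next.

```python
def build_message(signup):
    """Derive the outgoing message from the signup payload."""
    email = signup["email"]           # source: personal data read from signup
    message = f"New signup: {email}"  # the derived value still carries the email
    return message


def send_to_slack(message, post=print):
    """Hand the message to a sink; `post` stands in for a Slack client call."""
    post(message)  # sink: data leaves the application here


signup = {"email": "jane@example.com", "plan": "free"}
send_to_slack(build_message(signup))
```

A flow analysis would connect signup to the post call through email and message, even though the email value is never passed to the sink directly.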

Inventory

Privado also creates an inventory view where you can see all data elements flowing to a specific third party or a database.
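
An inventory of this kind can be thought of as the discovered flows regrouped by destination. The sketch below assumes each flow is reduced to a hypothetical (data element, sink) pair. The build_inventory helper and the sample data are illustrative only.

```python
from collections import defaultdict


def build_inventory(flows):
    """Group data elements by the sink they flow to."""
    inventory = defaultdict(set)
    for data_element, sink in flows:
        inventory[sink].add(data_element)
    # Sort for a stable, readable listing.
    return {sink: sorted(elements) for sink, elements in inventory.items()}


flows = [
    ("Email Address", "Slack"),
    ("Phone Number", "Slack"),
    ("Email Address", "PostgreSQL"),
    ("Email Address", "Slack"),  # duplicate flows collapse into one entry
]
for sink, elements in build_inventory(flows).items():
    print(sink, ":", ", ".join(elements))
```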
