Data flow discovery
Written by Nikhil Kukade
Updated over a week ago

Privado gives companies complete visibility into how data flows across their infrastructure: out to third parties, into the databases where it is stored, and into logs where it may leak. With this visibility, companies can protect personal data and ensure that data flows comply with consumer expectations and privacy laws.

Privado builds these data flows from two capabilities: Data Discovery and Data Destination Discovery.

Data Flows

To detect data flows, Privado runs a code flow analysis on a code repository. It builds a privacy graph from code semantics (the abstract syntax tree, program dependencies, and control flow) and layers data elements (sources) and data destinations (sinks) on top of it. Data flow functionality is currently available for Java, with experimental support for Python, JavaScript, C, and Kotlin. Contact your account manager to have experimental support enabled on your account.


Privado tags all variables, classes, and objects that process personal data. These are called sources.


Privado tags all destinations of data (third parties, APIs, databases, log functions, and messaging queues) as sinks and attaches the processing tag to each of them.
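To make the idea of tagging concrete, here is a minimal sketch in Python. It is not Privado's implementation: the pattern tables, the sink kind labels, and the functions `tag_sources` and `tag_sinks` are hypothetical, and real rules are far richer than simple name matching.

```python
import re

# Hypothetical rule tables, for illustration only.
SOURCE_PATTERNS = {
    "Email Address": re.compile(r"e[-_]?mail", re.IGNORECASE),
    "Phone Number": re.compile(r"phone|mobile", re.IGNORECASE),
}
SINK_PATTERNS = {
    "third_party": re.compile(r"^slack\.", re.IGNORECASE),
    "logging": re.compile(r"^(logger|log)\.", re.IGNORECASE),
    "database": re.compile(r"\.(save|insert|execute)$", re.IGNORECASE),
}


def _tag(names, patterns):
    """Return {name: label} for every name matched by one of the patterns."""
    tags = {}
    for name in names:
        for label, pattern in patterns.items():
            if pattern.search(name):
                tags[name] = label
                break  # first matching rule wins
    return tags


def tag_sources(identifiers):
    """Tag identifiers that look like personal data with a data element."""
    return _tag(identifiers, SOURCE_PATTERNS)


def tag_sinks(calls):
    """Tag call names that look like data destinations with a sink kind."""
    return _tag(calls, SINK_PATTERNS)
```

In this sketch, an identifier such as `userEmail` would be tagged as a source for the data element "Email Address", and a call such as `logger.info` would be tagged as a logging sink.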

Data Flow Analysis

Once sources & sinks are tagged, Privado runs a code flow analysis to find flows between them and shows them on the dashboard.
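Conceptually, finding a flow means finding a path from a tagged source to a tagged sink in a graph of the code. The sketch below illustrates this with a breadth-first search over a plain dictionary. It is an assumption-laden simplification, not Privado's analysis: the graph shape, the node names, and the function `find_flows` are hypothetical.

```python
from collections import deque


def find_flows(graph, sources, sinks):
    """Return one shortest path from each source to each reachable sink.

    graph maps a node name to the nodes its data is passed on to.
    """
    flows = []
    for source in sources:
        # Breadth-first search, remembering how each node was reached.
        parents = {source: None}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph.get(node, ()):
                if nxt not in parents:
                    parents[nxt] = node
                    queue.append(nxt)
        # Rebuild the path for every sink the search reached.
        for sink in sinks:
            if sink in parents and sink != source:
                path = []
                node = sink
                while node is not None:
                    path.append(node)
                    node = parents[node]
                flows.append(path[::-1])
    return flows


# Hypothetical graph: "email" is copied into "payload", which is logged and saved.
example_graph = {
    "email": ["payload"],
    "payload": ["logger.info", "db.save"],
    "counter": ["metrics.send"],
}
example_flows = find_flows(example_graph, ["email"], ["logger.info", "metrics.send"])
```

Here `example_flows` contains a single path, from `email` through `payload` to `logger.info`. The `metrics.send` sink is not reported because no tagged source reaches it.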

Privado detects the following types of data flows:

  • Sharing: These are data flows to third parties via packages or APIs

  • Storage: These are data flows to datastores

  • Leakage: Data flows that leak into logs

  • Collection: Data flows to and from REST API endpoints

  • Processing: Operations that fall outside the categories above
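One way to picture these categories is as a lookup from the kind of sink a flow ends in to the flow type. The sketch below assumes hypothetical sink kind labels (`third_party`, `database`, `logging`, `api_endpoint`). Privado's own labels and classification rules may differ.

```python
# Hypothetical sink kinds mapped to the flow types listed above.
FLOW_TYPE_BY_SINK_KIND = {
    "third_party": "Sharing",
    "database": "Storage",
    "logging": "Leakage",
    "api_endpoint": "Collection",
}


def classify_flow(sink_kind):
    """Return the flow type for a sink kind, defaulting to Processing."""
    return FLOW_TYPE_BY_SINK_KIND.get(sink_kind, "Processing")
```

Any sink kind that is not in the table, such as an internal helper, falls back to Processing, mirroring the last category in the list.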

Here is a typical flow: signup is transformed into email, which is transformed into message, which is then sent to Slack.
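In application code, a flow like this might look like the following Python sketch. The function `notify_signup` and its `send_to_slack` parameter are hypothetical stand-ins. The sender is passed in as a plain callable so the example does not depend on any particular Slack client.

```python
def notify_signup(signup, send_to_slack):
    """Forward the email address of a new signup to a Slack sender."""
    # Source: the signup record carries personal data.
    email = signup["email"]
    # The personal data is propagated into a new variable.
    message = f"New signup: {email}"
    # Sink: the message leaves the application for a third party.
    send_to_slack(message)
    return message
```

A flow analysis would connect `signup` to `email`, `email` to `message`, and `message` to the call that sends it, which is a Sharing flow because the destination is a third party.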


Privado also creates an inventory view where you can see all data elements flowing to a specific third party or a database.
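The inventory view can be thought of as grouping detected flows by destination. The sketch below assumes each flow has been reduced to a hypothetical `(data_element, destination)` pair. The function `build_inventory` and the sample destinations are illustrative only.

```python
from collections import defaultdict


def build_inventory(flows):
    """Group data elements by the destination they flow to."""
    inventory = defaultdict(set)
    for data_element, destination in flows:
        inventory[destination].add(data_element)
    # Sort the elements so the result is stable and easy to read.
    return {dest: sorted(elements) for dest, elements in inventory.items()}


# Hypothetical flows, including one duplicate.
example_inventory = build_inventory([
    ("Email Address", "Slack"),
    ("Phone Number", "Slack"),
    ("Email Address", "PostgreSQL"),
    ("Email Address", "Slack"),
])
```

With these sample flows, the Slack entry lists both Email Address and Phone Number, while the PostgreSQL entry lists only Email Address.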
