Lessons learned from applying the right to be forgotten (RTBF) and consent across thousands of applications and databases
When I meet DPOs, CISOs and IT leadership teams, the main question that comes up is how to apply RTBF and consent in a large, complex landscape made up of thousands of business applications, big data environments and enterprise data warehouses.
In this post (and the following ones), I will explain the pros and cons of today's two common approaches:
- Data-centric approach
- Application-centric approach
According to the data-centric approach, which is heavily pushed by database vendors (Database Activity Monitoring (DAM) and data masking vendors), RTBF and consent should be handled at the database level.
That’s not very surprising: the evangelists of the data-centric approach latch on to the idea that data must be protected at the source, where the personal data resides. The problem with this approach is that the GDPR is far more concerned with “data flows and processes” than with the data source. Protection needs to ensure that access is granted on a need-to-know basis and is backed by consent or another legal basis.
This requires enforcing personal data access based on rich context, including the end-user identity and the request context. For instance, an email marketing campaign extract that requires consent, a customer profiling batch request that relies on a legal basis, and a third-party extract are each a criteria-based data set that is then forwarded through a process. The reason for the activity is not part of the selection criteria, so the criteria alone do not help the data store decide which rule to apply.
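To make the point concrete, here is a minimal Python sketch of a purpose-aware access decision. Everything in it is hypothetical and for illustration only: the `CONSENTS` registry, the `LEGAL_BASIS_PURPOSES` set, the `RequestContext` type and the purpose names are assumptions, not part of any product or of the GDPR text. It shows that the same candidate data set yields different results depending on why it is requested, which is information the selection criteria do not carry.

```python
from dataclasses import dataclass

# Hypothetical consent registry: customer id -> purposes the customer consented to.
CONSENTS = {
    "c1": {"email_marketing"},
    "c2": set(),
}

# Hypothetical purposes assumed to rely on a legal basis other than consent.
LEGAL_BASIS_PURPOSES = {"customer_profiling"}


@dataclass(frozen=True)
class RequestContext:
    end_user: str  # the person or job behind the request
    purpose: str   # why the data is being extracted


def allowed_customers(customer_ids, ctx):
    """Filter a criteria-based data set according to the request purpose."""
    if ctx.purpose in LEGAL_BASIS_PURPOSES:
        # No per-customer consent check in this simplified model.
        return list(customer_ids)
    # Otherwise keep only customers who consented to this specific purpose.
    return [c for c in customer_ids if ctx.purpose in CONSENTS.get(c, set())]


# The same selection criteria produce the same candidate set...
candidates = ["c1", "c2"]

# ...but the outcome depends on the purpose attached to the request.
marketing = allowed_customers(
    candidates, RequestContext("campaign_job", "email_marketing")
)
profiling = allowed_customers(
    candidates, RequestContext("batch_job", "customer_profiling")
)
```

In this toy model the marketing extract keeps only the customer who consented, while the profiling batch keeps both. The decision is only possible because the purpose travels with the request.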
Applying consent requires this context to be available at the enforcement point (the data store). The caveat is that central data stores, EDWs and big-data environments are blind to the application end user and to the request context and parameters. Database logins use application “service accounts”, and application-level caching, multiplexing and middleware such as Tuxedo all hide user identity and process context from the data layer. Overcoming this blindness would require changing application source code so that the context reaches the data layer and the required personal data access protection can be enforced. Changing source code in hundreds of applications is not feasible.
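The blindness described above can be illustrated with a small Python sketch. The `DataStore` and `Application` classes, the `svc_crm` account name and the user names are all hypothetical stand-ins, not a real database or middleware API. The application knows who is asking and why, but only the shared service account reaches the data layer.

```python
class DataStore:
    """Stand-in for a central data store that only sees the database login."""

    def __init__(self):
        self.audit_log = []

    def query(self, db_login, sql, params):
        # The only identity visible at this layer is the database login.
        self.audit_log.append((db_login, sql, params))
        return []


class Application:
    """Stand-in for a business application using one shared service account."""

    SERVICE_ACCOUNT = "svc_crm"  # hypothetical account name

    def __init__(self, store):
        self.store = store

    def customer_lookup(self, end_user, purpose, customer_id):
        # end_user and purpose are known here but are never forwarded.
        sql = "SELECT * FROM customers WHERE id = ?"
        return self.store.query(self.SERVICE_ACCOUNT, sql, (customer_id,))


store = DataStore()
app = Application(store)

# Two different people, two different purposes, the same customer record.
app.customer_lookup("alice", "support_call", 42)
app.customer_lookup("bob", "email_marketing", 42)

# From the data store's point of view the two requests are indistinguishable.
seen_logins = {login for login, _, _ in store.audit_log}
```

Both audit entries are identical, so a rule enforced in the data store has nothing to distinguish a support call from a marketing extract unless every application is changed to pass that context down.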
Another caveat lies in the complexity of the data sources: a central data store can interact with a maze of business applications and processes. Applying personal data access restrictions, data masking or anonymization at the data source can break the spaghetti of applications that consume it.
In addition, the data store has complexities of its own: layered column aliases and views in code, functional and column concatenation, transient logical views, aggregated snapshots and materialized views, and replicas and clones created at run time for various consumption processes.
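As a small illustration of how aliases and concatenation obscure personal data, the following Python sketch uses an in-memory SQLite database. The table, view and column names, the sample row and the `PERSONAL_COLUMNS` rule set are hypothetical, and the name-based rule is a deliberately naive assumption rather than how any particular masking product works.

```python
import sqlite3

# Hypothetical rule set: base-table columns tagged as personal data.
PERSONAL_COLUMNS = {"first_name", "last_name", "email"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (first_name TEXT, last_name TEXT, email TEXT)")
conn.execute(
    "INSERT INTO customers VALUES (?, ?, ?)",
    ("Ada", "Lovelace", "ada@example.com"),
)

# A view that concatenates and aliases the personal columns.
conn.execute(
    "CREATE VIEW contact_v AS "
    "SELECT first_name || ' ' || last_name AS label, email AS contact "
    "FROM customers"
)

cur = conn.execute("SELECT * FROM contact_v")
view_columns = [d[0] for d in cur.description]  # column names as consumers see them
row = cur.fetchone()

# A naive name-based rule finds nothing to protect in the view...
flagged = set(view_columns) & PERSONAL_COLUMNS

# ...even though the row still carries the full name and email address.
conn.close()
```

Here the consumer sees the columns `label` and `contact`, none of which match the tagged names, yet the returned row still contains the customer's name and email. Each further layer of views, snapshots or clones repeats this problem.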
This is why we invented the application-centric approach, which we will discuss in our next post.