Using Encryption to Make the Right to Be Forgotten Practical
Article 17 paragraph 1 of the GDPR states "The data subject shall have the right to obtain from the controller the erasure of personal data concerning him or her without undue delay and the controller shall have the obligation to erase personal data without undue delay where one of the following grounds applies..." https://gdpr-info.eu/art-17-gdpr/
Article 17 paragraph 2 of the GDPR states "Where the controller has made the personal data public and is obliged pursuant to paragraph 1 to erase the personal data, the controller, taking account of available technology and the cost of implementation, shall take reasonable steps, including technical measures, to inform controllers which are processing the personal data that the data subject has requested the erasure by such controllers of any links to, or copy or replication of, those personal data."
Pay particular attention to the phrase "taking account of available technology and the cost of implementation," because currently available technology and the cost of implementation prevent full faithfulness to the right to be forgotten when it comes to backups of information.
Consider the following scenario. An information system legitimately processes personal information for an individual in a cloud environment, giving the individual a reliable record store for some of their most personal information; cloud drives are one example. To ensure that this personal information stays available to the user, the information system owner makes backups and places them into cold storage. Let's say that some of that information is written to tape and kept in a secure offsite facility such as Iron Mountain.
The individual decides to sever the relationship with the information system owner and requests that their information be erased. Real-world logistics make it cost-prohibitive to remove the individual's records from backups in cold storage, yet it cannot fairly be said that they have been truly forgotten so long as those backups exist. The backups remain a privacy vulnerability for the individual.
What if we redefined erasure? What if erasure meant that there is no practical means of recovering the data? What if the information owner used encryption to make the cost of "erasing" data practical? The information owner could generate a unique encryption key for each individual and encrypt that individual's data with it. Then, when the individual asks to be forgotten, the only thing that needs to happen is the destruction of that individual's key. Every copy of the data, including the tapes sitting in cold storage, becomes unreadable ciphertext, so from a privacy perspective truly forgetting an individual becomes far easier and cheaper.
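To make the idea concrete, here is a minimal sketch of this key-per-individual pattern in Python. It is illustrative only: it assumes the third-party cryptography package, and the KeyVault class plus the store_record and read_record helpers are hypothetical stand-ins for a real key management service and record store.

```python
# Crypto-shredding sketch (assumes the `cryptography` package: pip install cryptography).
from cryptography.fernet import Fernet

class KeyVault:
    """One symmetric key per data subject; destroying a key 'erases' that subject's data."""
    def __init__(self):
        self._keys = {}                       # subject_id -> key bytes

    def create(self, subject_id: str) -> None:
        self._keys[subject_id] = Fernet.generate_key()

    def get(self, subject_id: str) -> bytes:
        return self._keys[subject_id]         # raises KeyError once the key is destroyed

    def destroy(self, subject_id: str) -> None:
        # The "erasure": without the key, every ciphertext encrypted under it,
        # including copies sitting on backup tapes, is practically unrecoverable.
        del self._keys[subject_id]

vault = KeyVault()
records = {}                                  # subject_id -> ciphertext (imagine copies of this on tape too)

def store_record(subject_id: str, plaintext: bytes) -> None:
    records[subject_id] = Fernet(vault.get(subject_id)).encrypt(plaintext)

def read_record(subject_id: str) -> bytes:
    return Fernet(vault.get(subject_id)).decrypt(records[subject_id])

vault.create("alice")
store_record("alice", b"alice@example.com, 1 Main St")
print(read_record("alice"))                   # readable while Alice's key exists

vault.destroy("alice")                        # Alice asks to be forgotten
# records["alice"] still exists (as would any backup copies), but without the key
# it is just ciphertext; read_record("alice") now fails with a KeyError.
```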
What about the case where an individual's record also contains information about other people associated with them? For example, a parent's gaming account holds information about their children. A child comes of age and, for whatever reason, requests to be forgotten. Again, encryption becomes the only practical way to implement this. But how would that work?
Well, there is a technique called Shamir's Secret Sharing. An encryption key is split into shares that are distributed to multiple owners, and the key can only be reconstructed by bringing a minimum number of those shares back together. In the case of a parent managing their child's account, the information system would generate a unique key whose shares are partially "owned" by the parent and partially "owned" by the child. If the portion of the parent's record that contains PII about the child is encrypted under that key, then forgetting the child without forgetting the parent only requires discarding the child's share. With that share gone, the key that encrypts the child's PII can no longer be reconstructed, so that part of the parent's record effectively becomes "erased" the moment the child's share is deleted.
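As a rough illustration of the mechanics, here is a textbook Shamir's Secret Sharing construction over a prime field, splitting a key into a parent share and a child share. This is a sketch, not production code: split_secret and recover_secret are names invented for this example, and a real system would use a vetted secret-sharing library.

```python
# Textbook Shamir's Secret Sharing over a prime field. Illustrative only.
import secrets

PRIME = 2**521 - 1  # a Mersenne prime comfortably larger than a 256-bit key


def split_secret(secret: int, threshold: int, num_shares: int):
    """Split `secret` into `num_shares` shares; any `threshold` of them recover it."""
    assert 0 <= secret < PRIME and 1 <= threshold <= num_shares
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    return [(x, sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME)
            for x in range(1, num_shares + 1)]


def recover_secret(shares):
    """Lagrange interpolation at x = 0 reconstructs the secret from enough shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * -xj % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# Key protecting the child-related PII inside the parent's record,
# split 2-of-2 between a "parent" share and a "child" share.
key = secrets.randbits(256)
parent_share, child_share = split_secret(key, threshold=2, num_shares=2)

assert recover_secret([parent_share, child_share]) == key   # both shares present: key recoverable
# Delete the child's share and the key can never be rebuilt; the parent's share
# alone reveals nothing, so the child's PII is effectively erased.
```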
The above is a high-level sketch of how it becomes technologically feasible to implement strong privacy without sacrificing many of the features we value today, such as high availability and cost effectiveness.
How can we make private data even more private? Shamir's Secret Sharing can also give the person about whom you are storing data a share of their own. If you use application-level encryption to store a person's data, you can split the key into three shares with a threshold of two: one share goes to a third-party recovery agent who has no access to the encrypted PII, one goes to the person the data is about, and one goes to the data custodian. When the person accesses their data, the custodian and the individual each provide their share, which is enough to reconstruct the decryption key and access the data. If either the data custodian or the individual loses their share, the data can still be recovered with the help of the third share held by the recovery agent.
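Assuming the split_secret and recover_secret helpers from the earlier sketch are in scope (again, purely illustrative), the two-of-three arrangement might look like this:

```python
# Two-of-three split: any two shares reconstruct the key, one alone reveals nothing.
key = secrets.randbits(256)
custodian_share, subject_share, recovery_share = split_secret(key, threshold=2, num_shares=3)

# Normal access: the custodian and the data subject combine their shares.
assert recover_secret([custodian_share, subject_share]) == key

# Recovery: if the subject (or the custodian) loses their share,
# the recovery agent's share fills the gap.
assert recover_secret([custodian_share, recovery_share]) == key
assert recover_secret([subject_share, recovery_share]) == key
```

Because any single share on its own reveals nothing about the key, the recovery agent stays blind to the PII even while acting as a backstop.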
Now, if the data is encrypted server-side, there has to be some level of trust that the data custodian is not retaining the individual's share of the key. The individual could always opt to encrypt and decrypt client-side instead, in which case the data custodian's share simply becomes a backup; the custodian would then have to collude with the recovery agent to compromise the individual's privacy.
Aside from the possibility that today's symmetric encryption may one day be broken, storing PII on a key-per-individual basis keeps compliance costs feasible, and privacy ultimately becomes stronger for the individuals being protected.