The system enables a dataset and a learning algorithm to be encrypted and uploaded to cloud storage. The training code can then access the dataset in a trusted environment that keeps both code and data confidential, producing a model that can be used without ever disclosing the original data. Finally, every access to the data and code is logged on the blockchain, so that future uses of the model can be traced back to the original data and code.
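The end-to-end flow could look like the minimal Python sketch below. It assumes a symmetric key that is released only to an attested trusted environment; the cloud object store and the blockchain ledger are stood in by a plain dict and list, and all names (cloud_storage, ledger, log_access) are illustrative, not part of the actual system.

```python
# Sketch of the workflow under the assumptions above. The ledger stand-in
# chains record hashes so that tampering with earlier entries is detectable.
import hashlib
import json
import time

from cryptography.fernet import Fernet

cloud_storage = {}   # hypothetical stand-in for a cloud object store
ledger = []          # hypothetical stand-in for a blockchain ledger

def log_access(event: str, payload: bytes) -> None:
    """Append a tamper-evident record: each entry chains the previous hash."""
    prev = ledger[-1]["hash"] if ledger else ""
    record = {"event": event, "sha256": hashlib.sha256(payload).hexdigest(),
              "prev": prev, "ts": time.time()}
    record["hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()).hexdigest()
    ledger.append(record)

# 1. Data owner encrypts the dataset and training code, uploads ciphertexts.
key = Fernet.generate_key()          # released only to an attested environment
f = Fernet(key)
cloud_storage["dataset"] = f.encrypt(b"feature,label\n1.0,0\n2.0,1\n")
cloud_storage["train_code"] = f.encrypt(b"def train(rows): return len(rows)")

# 2. Inside the trusted environment: decrypt, train, and log every access.
dataset = f.decrypt(cloud_storage["dataset"])
log_access("dataset-access", dataset)
code = f.decrypt(cloud_storage["train_code"])
log_access("code-access", code)

namespace = {}
exec(code.decode(), namespace)       # stands in for running the training code
model = namespace["train"](dataset.decode().splitlines()[1:])

# 3. Publish the model together with a ledger reference, so any future use
#    can be traced back to the exact data and code that produced it.
log_access("model-publication", str(model).encode())
print(model, ledger[-1]["hash"])
```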
Show that data can be used in a confidential and transparent fashion to train models.
Confidential data is used to train models, and the data used to generate each model is tracked. This tracking enables both provenance and data protection features, such as the right to be forgotten.
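As a hedged sketch of how such tracking could support the right to be forgotten, the records below link each model to the dataset and code that produced it, so the models derived from revoked data can be found and retrained or retired. TrainingRecord, models_derived_from, and all identifiers are hypothetical, not the system's actual schema.

```python
# Provenance records linking models to their training inputs (illustrative).
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingRecord:
    model_id: str
    dataset_id: str    # hash or ID of the training dataset
    code_id: str       # hash or ID of the training code

records = [
    TrainingRecord("model-a", "dataset-1", "code-x"),
    TrainingRecord("model-b", "dataset-2", "code-x"),
    TrainingRecord("model-c", "dataset-1", "code-y"),
]

def models_derived_from(dataset_id: str) -> list[str]:
    """Provenance query: which models must be handled if this data is revoked?"""
    return [r.model_id for r in records if r.dataset_id == dataset_id]

print(models_derived_from("dataset-1"))  # ['model-a', 'model-c']
```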
Sensitive data can be used to train models with transparency and confidentiality guarantees.
Previously there was no way to enforce this, because the integrity of the tracking process could be broken, a consequence of the lack of trusted environments.
Trusted execution environments (TEEs) make it possible to enforce the tracking and to keep the source data confidential.
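One way a TEE can provide this enforcement is attestation-gated key release: the data key is handed out only to code whose measurement is on an allow-list, and every release attempt is logged. In the sketch below a plain hash comparison stands in for real remote attestation; ALLOWED_MEASUREMENTS, release_key, and release_log are illustrative names, not a real TEE API.

```python
# Attestation-gated key release, simulated with a hash-based measurement check.
import hashlib

ALLOWED_MEASUREMENTS = {
    # hypothetical: hash of the approved training-enclave binary
    hashlib.sha256(b"approved-training-enclave-v1").hexdigest(),
}

release_log = []  # stand-in for the blockchain ledger

def release_key(enclave_binary: bytes, data_key: bytes) -> bytes | None:
    """Release the data key only to an approved enclave, and log the event."""
    measurement = hashlib.sha256(enclave_binary).hexdigest()
    if measurement not in ALLOWED_MEASUREMENTS:
        release_log.append({"measurement": measurement, "granted": False})
        return None  # unknown code never sees the key; data stays confidential
    release_log.append({"measurement": measurement, "granted": True})
    return data_key

print(release_key(b"approved-training-enclave-v1", b"secret-key"))  # granted
print(release_key(b"tampered-enclave", b"secret-key"))              # denied
```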
Application developer
Can use models (e.g., for classification) that were trained on sensitive data, with provenance and transparency guarantees.
Data scientist
Can protect the IP of his code, or gain access to sensitive data that would otherwise not be provided to him.
Application manager
Faces less pressure to secure his environment, since trusted execution environments reduce the need for the underlying infrastructure to be trusted.
System administrator
The same as the application manager.
Data owner
Has guarantees that his data will be kept confidential, that he can track its usage, and that he can revoke consent.
More info soon