The Problem

Up to 80% of Artificial Intelligence (AI), Machine Learning (ML), and Advanced Analytics workloads consist of data management tasks such as data preparation, feature engineering, model training, and evaluation. Emerging privacy legislation and heightened public awareness call for privacy protection, algorithmic transparency, and informed consent. Training and running ML and AI models on unrestricted personal data may be unethical or illegal. Companies need to be able to acquire user consent for a particular type of data processing, and to ensure that any processing performed on that data, whether by a human or a machine, complies with the consent obtained. Unless the platform provides tools for privacy management and workflow orchestration, data scientists must keep re-implementing them in their application logic or in a scripting language.
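To make the consent requirement concrete, here is a minimal Python sketch of the kind of check a data scientist would otherwise have to hand-code in application logic. It is illustrative only: the names Record, ConsentError, require_consent, and extract_features, as well as the purpose string "model_training", are hypothetical and are not part of any PHEMI API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    """A data record tagged with the processing purposes its owner consented to."""
    user_id: str
    payload: dict
    consented_purposes: frozenset = field(default_factory=frozenset)


class ConsentError(PermissionError):
    """Raised when a processing step is not covered by the recorded consent."""


def require_consent(purpose):
    """Decorator: run the wrapped step only if the record's consent covers `purpose`."""
    def decorator(step):
        def wrapper(record, *args, **kwargs):
            if purpose not in record.consented_purposes:
                raise ConsentError(
                    f"user {record.user_id} has not consented to '{purpose}'"
                )
            return step(record, *args, **kwargs)
        return wrapper
    return decorator


@require_consent("model_training")
def extract_features(record):
    # Hypothetical feature-engineering step: bucket age into decades.
    return {"age_bucket": record.payload["age"] // 10 * 10}
```

A record whose consent includes "model_training" passes through extract_features; any other record raises ConsentError before the step runs. Repeating this pattern in every pipeline step is the overhead that platform-level privacy tooling is meant to remove.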

The PHEMI Solution

PHEMI provides tools for data curation, de-identification in accordance with a given risk of re-identification or with specific privacy regulations and policies, feature engineering, and other data processing and logistics tasks. Workflow automation and orchestration tools simplify the development of complex processing pipelines, model deployment, evaluation, and lifecycle management, reducing development time and coding errors. Access controls, provenance, and audit logs help ensure that processing remains compliant with customer consent and privacy regulations. An extensible Policy Engine handles access control, metadata curation, data movement, lifecycle management, and more, greatly simplifying data governance.

Healthcare Opportunities

  • Improve internal processes and develop new products

Artificial Intelligence Case

  • PHEMI comes with an extensible library of de-identification functions that are applied consistently at derive time
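
As a rough illustration of what an extensible de-identification library applied consistently at derive time could look like, the Python sketch below registers functions by name and applies them through a single rule map whenever a dataset is derived. Everything in it is an assumption made for illustration: the registry DEID_FUNCTIONS, the decorator deid_function, the functions pseudonymize, year_only, and suppress, and the derive helper are hypothetical and do not describe PHEMI's actual library.

```python
import hashlib
import hmac

# Hypothetical registry: new de-identification functions are added by name.
DEID_FUNCTIONS = {}


def deid_function(name):
    """Register a de-identification function under `name`."""
    def register(fn):
        DEID_FUNCTIONS[name] = fn
        return fn
    return register


@deid_function("pseudonymize")
def pseudonymize(value, *, key):
    # Keyed hash: the same input and key always yield the same pseudonym.
    digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


@deid_function("year_only")
def year_only(value, *, key=None):
    # Generalize an ISO date string (YYYY-MM-DD) to its year.
    return str(value)[:4]


@deid_function("suppress")
def suppress(value, *, key=None):
    # Remove the value entirely.
    return None


def derive(rows, rules, *, key):
    """Return de-identified copies of `rows`, applying `rules` (column -> function name)."""
    derived = []
    for row in rows:
        out = dict(row)  # copy so the source rows are left untouched
        for column, fn_name in rules.items():
            if column in out:
                out[column] = DEID_FUNCTIONS[fn_name](out[column], key=key)
        derived.append(out)
    return derived
```

Because every derived dataset goes through the same rule map and the same key, a given patient identifier maps to the same pseudonym in every row, so records remain linkable without exposing the original value.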