PHEMI Health DataLab
A cloud-based system for
privacy, security & governance

Single Pane of Glass

PHEMI provides one central system for managing and processing data stored in many locations (cloud or hybrid cloud), while controlling access and governing that data all in one place.

  • One access control policy

  • One place to manage all data

  • Centrally manage data stored in the cloud or hybrid cloud (multiple locations)

  • Abstract away platform and service integration complexities


Conventional data protection products simply lock down your data. PHEMI goes further.

Unlike most data management systems, PHEMI Health DataLab is built on Privacy-by-Design principles rather than having privacy added on afterward. Privacy and data governance are built in from the ground up, giving you distinct advantages:

  • Lets analysts work with data without breaching privacy guidelines.
  • Includes a comprehensive, extensible library of de-identification algorithms to hide, mask, truncate, group, and anonymize data.
  • Creates dataset-specific or system-wide pseudonyms enabling linking and sharing of data without risking data leakage.
  • Collects audit logs concerning not only what changes were made to the PHEMI system, but also data access patterns.
  • Automatically generates human- and machine-readable de-identification reports to meet your enterprise governance, risk, and compliance guidelines.
  • Replaces a separate policy for each data access point with one central policy for all access patterns, whether Spark, ODBC, REST, export, or others.
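
To make the de-identification and pseudonym bullets above more concrete, here is a minimal Python sketch of the general techniques (mask, truncate, group, pseudonymize). It is not PHEMI's API: the function names, the HMAC-based pseudonym scheme, and the sample secret are all assumptions chosen for illustration.

```python
import hashlib
import hmac


def mask(value: str, visible: int = 4, fill: str = "*") -> str:
    """Hide every character except the last `visible` ones."""
    if visible <= 0:
        return fill * len(value)
    return fill * max(len(value) - visible, 0) + value[-visible:]


def truncate_postal_code(code: str, keep: int = 3) -> str:
    """Keep only the leading characters of a postal code."""
    return code.replace(" ", "")[:keep]


def group_age(age: int, width: int = 10) -> str:
    """Replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def pseudonym(identifier: str, secret: bytes, dataset: str = "") -> str:
    """Derive a stable pseudonym with HMAC-SHA256 (an assumed scheme).

    With an empty `dataset` the pseudonym is system-wide, so records can be
    linked across datasets. With a dataset name, a per-dataset key is derived
    first, so the same identifier maps to unrelated pseudonyms elsewhere.
    """
    key = secret
    if dataset:
        key = hmac.new(secret, dataset.encode(), hashlib.sha256).digest()
    return hmac.new(key, identifier.encode(), hashlib.sha256).hexdigest()[:16]


if __name__ == "__main__":
    secret = b"demo-secret-not-for-production"  # hypothetical key
    print(mask("123456789"))
    print(truncate_postal_code("V6B 1A1"))
    print(group_age(37))
    print(pseudonym("patient-001", secret, dataset="cardiology"))
```

The dataset-specific variant limits linkage to a single dataset, while the system-wide variant trades some of that isolation for the ability to join records across datasets.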

Metadata Governs Access Control

Data access is driven by PHEMI’s Attribute-Based Access Control policy engine. The policy processes metadata attributes of the data, in combination with attributes of the user and their environment, to determine access dynamically. The result is access control that is more contextual, more scalable, and simpler than traditional role-based access control.
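
As an illustration of how an attribute-based decision differs from a role lookup, the following Python sketch evaluates one request from three attribute sets: user, data, and environment. The attribute names (`clearance`, `sensitivity`, `identifiable`, `purpose`, `on_trusted_network`) and the rules themselves are hypothetical and do not describe PHEMI's actual policy language.

```python
from dataclasses import dataclass, field

# Ordered from least to most sensitive (labels taken from the page's examples).
LEVELS = ["public", "confidential", "secret", "top secret"]


@dataclass
class Request:
    user: dict = field(default_factory=dict)         # e.g. clearance, purpose
    data: dict = field(default_factory=dict)         # e.g. sensitivity, identifiable
    environment: dict = field(default_factory=dict)  # e.g. on_trusted_network


def is_allowed(req: Request) -> bool:
    """Return True only if every (hypothetical) rule passes."""
    clearance = LEVELS.index(req.user["clearance"])
    sensitivity = LEVELS.index(req.data["sensitivity"])

    # Rule 1: the user's clearance must cover the data's sensitivity.
    if clearance < sensitivity:
        return False

    # Rule 2: identifiable data is released only for a care purpose.
    if req.data.get("identifiable", False) and req.user.get("purpose") != "care":
        return False

    # Rule 3: secret and above requires a trusted network.
    needs_trusted = sensitivity >= LEVELS.index("secret")
    if needs_trusted and not req.environment.get("on_trusted_network", False):
        return False

    return True


if __name__ == "__main__":
    analyst = {"clearance": "secret", "purpose": "research"}
    dataset = {"sensitivity": "secret", "identifiable": False}
    print(is_allowed(Request(analyst, dataset, {"on_trusted_network": True})))
    print(is_allowed(Request(analyst, dataset, {"on_trusted_network": False})))
```

The same user and the same data can yield different decisions as the environment changes, which is the contextual behavior a static role assignment cannot express.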

Metadata curation begins on ingest so that data access is controlled immediately: data is tagged with 42 out-of-the-box attributes, such as public, confidential, secret, and top secret. Metadata, and security labels in particular, follow the data as it is transformed and combined with other datasets. Once data enters the pipeline and new datasets are created, they are curated with more granular metadata at the column, row, and even cell level to provide progressively finer-grained access.
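
One way to picture labels following the data is sketched below in Python. It assumes, purely for illustration, that each column carries one of the four labels named above and that a combined dataset keeps the most restrictive label for any shared column. The function names and the "most restrictive wins" rule are assumptions, not a description of PHEMI's implementation.

```python
# Ordered from least to most restrictive.
LEVELS = ["public", "confidential", "secret", "top secret"]


def most_restrictive(*labels: str) -> str:
    """Return the label that sits highest in LEVELS."""
    return max(labels, key=LEVELS.index)


def combine(left: dict, right: dict) -> dict:
    """Merge two column -> label maps, keeping the stricter label on overlap."""
    merged = dict(left)
    for column, label in right.items():
        merged[column] = most_restrictive(label, merged.get(column, label))
    return merged


def dataset_label(columns: dict) -> str:
    """A derived dataset is at least as sensitive as its strictest column."""
    return most_restrictive(*columns.values())


def visible_columns(columns: dict, clearance: str) -> list:
    """Column-level filtering: list the columns a given clearance may read."""
    limit = LEVELS.index(clearance)
    return [name for name, label in columns.items() if LEVELS.index(label) <= limit]


if __name__ == "__main__":
    visits = {"patient_id": "secret", "age": "confidential"}
    regions = {"age": "public", "region": "public"}
    merged = combine(visits, regions)
    print(merged)
    print(dataset_label(merged))
    print(visible_columns(merged, "confidential"))
```

Row-level and cell-level labels would follow the same pattern, with the label map keyed by row or by (row, column) instead of by column alone.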

PHEMI’s Architectural Components


The PHEMI Health DataLab is a vital security asset, available in the cloud, that combines best-of-breed open-source and custom software modules, all fully integrated using Privacy-by-Design principles.



Spark SQL

Azure Table Storage

Apache Airflow

Apache NiFi

Secure Landing Zone

Discovery Zone

Consumption Zone

More Information

For more information about what PHEMI can do for you, see the data sheets and the additional information on the website.

