A large academic medical center deployed a Hadoop cluster to offload its data warehouse. The organization’s Chief Data Officer wanted to get more value from the resulting data lake by creating an easy-to-use, self-serve library of curated healthcare data for cohort analysis and research proposals.

Researchers needed to be able to work freely with de-identified data without the risk of re-identification, and then apply to an institutional review board for access to fully identified patient information.

  • Turn your data lake into an organized library of digital assets
  • Tag, de-identify, tokenize and link your sensitive information
  • Create intuitive point-and-click dashboards for self-serve access to information
  • Grant access to sensitive data based on IRB approval

PHEMI worked with the customer to install PHEMI Central on their existing Hadoop system. Personal health information was inventoried, linked and de-identified. Access policies defined who could see which information.

A clean user interface allowed researchers and analysts to simply point and click to build cohorts and freely export the information. PHEMI Central automatically measured and controlled the risk of re-identification.
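One common way to quantify re-identification risk before an export is a k-anonymity check: count how many records share each combination of quasi-identifying fields, and block the export if any group is smaller than a chosen threshold. The sketch below illustrates that general idea only. It is not PHEMI Central's method or API; the function names, field names (`age_band`, `sex`, `zip3`) and the threshold `k_min` are all hypothetical.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that share
    the same values for every quasi-identifying field."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values()) if groups else 0


def safe_to_export(records, quasi_identifiers, k_min=5):
    """Allow export only if every record is indistinguishable from at
    least k_min - 1 others on the quasi-identifiers (k_min is an
    illustrative threshold, not a recommended value)."""
    return smallest_group_size(records, quasi_identifiers) >= k_min


# Hypothetical de-identified cohort rows.
cohort = [
    {"age_band": "40-49", "sex": "F", "zip3": "981"},
    {"age_band": "40-49", "sex": "F", "zip3": "981"},
    {"age_band": "50-59", "sex": "M", "zip3": "981"},
]

fields = ["age_band", "sex", "zip3"]
k = smallest_group_size(cohort, fields)
exportable = safe_to_export(cohort, fields, k_min=2)
```

In this toy cohort the single 50-59 male record forms a group of one, so the check refuses the export until that field is generalized or the record is suppressed.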

Having built a customized and well-curated cohort, analysts could apply to their institutional review board for permission to view fully identified information.

With PHEMI Central, the academic medical center was able to build accurate cohorts in minutes, not months, and quickly prepare research proposals while controlling access to sensitive data.

Build accurate cohorts in minutes, not months, with controlled access to sensitive information.