From ee258177148eab134cec4e368411021e4bf0f9d3 Mon Sep 17 00:00:00 2001
From: Ruiyi Chen <95742792+RuiyiC@users.noreply.github.com>
Date: Thu, 8 Dec 2022 13:53:54 +0800
Subject: [PATCH] Update readme adding the dicom part (#258)

* Update Deployment.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update Deploy-DicomToDatalake.md

* Update README.md
---
 FhirToDataLake/docs/Deploy-DicomToDatalake.md |  4 ++--
 README.md                                     | 16 +++++++++++++---
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/FhirToDataLake/docs/Deploy-DicomToDatalake.md b/FhirToDataLake/docs/Deploy-DicomToDatalake.md
index d55b882c..3e9d7e78 100644
--- a/FhirToDataLake/docs/Deploy-DicomToDatalake.md
+++ b/FhirToDataLake/docs/Deploy-DicomToDatalake.md
@@ -1,10 +1,10 @@
 # DICOM to Synapse Sync Agent
 
-DICOM to Synapse Sync Agent enables you to perform Analytics and Machine Learning on DICOM metadata by moving DICOM metadata to Azure Data Lake in near real time and making it available to a Synapse workspace.
+DICOM to Synapse Sync Agent enables you to directly query and perform analytics on DICOM metadata by moving it to Azure Data Lake in near real time and making it available to a Synapse workspace.
 
 It is an [Azure Container App](https://learn.microsoft.com/en-us/azure/container-apps/?ocid=AID3042118) that extracts data from a DICOM server using DICOM [Change Feed](https://learn.microsoft.com/en-us/azure/healthcare-apis/dicom/dicom-change-feed-overview) APIs, converts it to hierarchical Parquet files, and writes it to Azure Data Lake in near real time. This solution also contains a script to create External Table in Synapse Serverless SQL pool pointing to the DICOM metadata Parquet files. For more information about DICOM External Tables, see [Data mapping from DICOM to Synapse](./DICOM-Data-Mapping.md).
 
-This solution enables you to query against the entire DICOM metadata with tools such as Synapse Studio, SSMS, and Power BI. You can also access the Parquet files directly from a Synapse Spark pool. You should consider this solution if you want to access all of your DICOM metadata in near real time, and want to defer custom transformation to downstream systems.
+This solution enables you to query the entire DICOM metadata in SQL with tools such as Synapse Studio. You can also access the Parquet files directly from a Synapse Spark pool.
 
 **Note**: An API usage charge will be incurred in the DICOM server if you use this tool to copy data from the DICOM server to Azure Data Lake.
 
diff --git a/README.md b/README.md
index 2aaa3c27..36d42d0a 100644
--- a/README.md
+++ b/README.md
@@ -1,12 +1,22 @@
-# FHIR Analytics Pipelines
+# Health Data Analytics Pipelines
 
-FHIR Analytics Pipelines is an open source project with the goal to help build components and pipelines for rectangularizing and moving FHIR data from Azure FHIR servers namely [Azure Healthcare APIs FHIR Server](https://docs.microsoft.com/en-us/azure/healthcare-apis/), [Azure API for FHIR](https://azure.microsoft.com/en-us/services/azure-api-for-fhir/), and [FHIR server for Azure](https://github.com/microsoft/fhir-server) to [Azure Data Lake](https://azure.microsoft.com/en-us/solutions/data-lake/) and thereby make it available for analytics with [Azure Synapse](https://azure.microsoft.com/en-us/services/synapse-analytics/), [Power BI](https://powerbi.microsoft.com/en-us/), and [Azure Machine Learning](https://azure.microsoft.com/en-us/services/machine-learning/).
+Health Data Analytics Pipelines is an open source project with the goal of helping build components and pipelines for transforming and moving FHIR and DICOM data from FHIR and DICOM servers to [Azure Data Lake](https://azure.microsoft.com/en-us/solutions/data-lake/), thereby making it available for analytics with [Azure Synapse](https://azure.microsoft.com/en-us/services/synapse-analytics/), [Power BI](https://powerbi.microsoft.com/en-us/), and [Azure Machine Learning](https://azure.microsoft.com/en-us/services/machine-learning/).
 
-This OSS project currently has the following two solutions:
+This OSS project currently has the following solutions:
 
 1. [FHIR to Synapse sync agent](FhirToDataLake/docs/Deploy-FhirToDatalake.md): This is an [Azure Container App](https://learn.microsoft.com/en-us/azure/container-apps/?ocid=AID3042118) that extracts data from a FHIR server using FHIR Resource APIs, converts it to hierarchial Parquet files, and writes it to Azure Data Lake in near real time. This also contains a [script](FhirToDataLake/scripts/Set-SynapseEnvironment.ps1) to create external tables and views in Synapse Serverless SQL pool pointing to the Parquet files. This solution enables you to query against the entire FHIR data with tools such as Synapse Studio, SSMS, and Power BI. You can also access the Parquet files directly from a Synapse Spark pool. You should consider this solution if you want to access all of your FHIR data in near real time, and want to defer custom transformation to downstream systems.
+
+    Supported FHIR servers:
+    [FHIR Service in Azure Health Data Services](https://learn.microsoft.com/en-us/azure/healthcare-apis/fhir/), [Azure API for FHIR](https://learn.microsoft.com/en-us/azure/healthcare-apis/azure-api-for-fhir/), [FHIR server for Azure](https://github.com/microsoft/fhir-server)
+
+1. [DICOM to Synapse sync agent](FhirToDataLake/docs/Deploy-DicomToDatalake.md): This is an [Azure Container App](https://learn.microsoft.com/en-us/azure/container-apps/?ocid=AID3042118) that extracts DICOM metadata from a DICOM server using DICOM [Change Feed](https://learn.microsoft.com/en-us/azure/healthcare-apis/dicom/dicom-change-feed-overview) APIs, converts it to hierarchical Parquet files, and writes it to Azure Data Lake in near real time. This solution also contains a script to create an External Table in Synapse Serverless SQL pool pointing to the DICOM metadata Parquet files. For more information about DICOM External Tables, see [Data mapping from DICOM to Synapse](FhirToDataLake/docs/DICOM-Data-Mapping.md).
+
+    This solution enables you to query the entire DICOM metadata in SQL with Synapse. You can also access the Parquet files directly from a Synapse Spark pool.
+
+    Supported DICOM servers:
+    [DICOM service in Azure Health Data Services](https://learn.microsoft.com/en-us/azure/healthcare-apis/dicom/), [DICOM server](https://github.com/microsoft/dicom-server)
 
 1. [FHIR to CDM Pipeline Generator](FhirToCdm/docs/fhir-to-cdm.md): It is a tool to generate an ADF pipeline for moving a snapshot of data from a FHIR server using $export API to a [CDM folder](https://docs.microsoft.com/en-us/common-data-model/data-lake) in Azure Data Lake Storage Gen 2 in csv format. The tools requires a user-created configuration file containing instructions to project and flatten FHIR Resources and fields into tables. You can also follow the [instructions](FhirToCdm/docs/cdm-to-synapse.md) for creating a downstream pipeline in Synapse workspace to move data from CDM folder to Synapse dedicated SQL pool.
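To make the SQL query path described above concrete, here is a minimal sketch of querying an external table in a Synapse Serverless SQL pool that points to the sync agent's Parquet output. The workspace name, database, credentials, schema, and table name are hypothetical placeholders, not values defined by this patch or the setup scripts.

```python
# Illustrative sketch only: query a hypothetical external table in a Synapse
# Serverless SQL pool over the sync agent's Parquet files. All names below
# (workspace, database, schema, table, credentials) are placeholders.
import pyodbc

connection = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=myworkspace-ondemand.sql.azuresynapse.net;"  # serverless SQL endpoint (placeholder)
    "DATABASE=fhirdb;"                                   # placeholder database name
    "UID=sqluser;PWD=<password>;"                        # or switch to Azure AD authentication
    "Encrypt=yes;"
)

cursor = connection.cursor()
# Hypothetical table name; actual external table and view names depend on your setup.
cursor.execute("SELECT TOP 10 * FROM [fhir].[Patient]")
for row in cursor.fetchall():
    print(row)
connection.close()
```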
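Similarly, for the "access the Parquet files directly from a Synapse Spark pool" path mentioned in both documents, a minimal PySpark sketch could look like the following; the storage account, container, and folder layout are assumptions, not values taken from the patch.

```python
# Illustrative sketch only: read the sync agent's Parquet output directly
# from a Synapse Spark pool. The storage account, container, and folder
# names are hypothetical placeholders for your own Data Lake layout.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical ADLS Gen2 path where the agent writes Patient resources.
patient_df = spark.read.parquet(
    "abfss://fhir@mydatalake.dfs.core.windows.net/result/Patient/"
)

patient_df.printSchema()
print(patient_df.count())
```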