Getting started with JupyterHealthClient
First, you’ll want to create a JupyterHealthClient.
In a managed deployment, credentials are typically loaded from the $JHE_TOKEN and $JHE_URL environment variables.
from jupyterhealth_client import Code, JupyterHealthClient
# use anonymize=True to allow output in documentation
jh_client = JupyterHealthClient(anonymize=True)
# or jh_client = JupyterHealthClient(url=url, token=token)
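If you would rather pass credentials explicitly, the environment lookup can be sketched roughly as follows. The helper name credentials_from_env is hypothetical and not part of jupyterhealth_client; only the JHE_URL and JHE_TOKEN variable names and the url/token keyword arguments come from the text above.

```python
import os


def credentials_from_env(environ=None):
    """Return {'url': ..., 'token': ...} read from JHE_URL and JHE_TOKEN.

    Hypothetical helper for illustration; not part of jupyterhealth_client.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in ("JHE_URL", "JHE_TOKEN") if not environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variable(s): {', '.join(missing)}")
    return {"url": environ["JHE_URL"], "token": environ["JHE_TOKEN"]}


# Example with a made-up environment (placeholder values, not real credentials)
example_env = {"JHE_URL": "https://jhe.example.org", "JHE_TOKEN": "not-a-real-token"}
creds = credentials_from_env(example_env)
print(sorted(creds))
```

The resulting dictionary could then be passed along as JupyterHealthClient(**creds), matching the url= and token= form shown above.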
Retrieving information
Getting the current user
First, we can see who we are logged in as:
jh_client.get_user()
Getting study information
We can list all the studies we currently have access to, including the organization each one is associated with.
study_id will be useful for retrieving observations later.
print("All my studies:")
for study in jh_client.list_studies():
    print(f" - [{study['id']}] {study['name']} org:{study['organization']['name']}")
And we can get a single study by id:
jh_client.get_study(study["id"])
Getting patient information
We can list the patients we have access to with list_patients()
and see which studies they have shared data with using get_patient_consents().
The list_* methods all return generators and handle pagination automatically when there are many results.
# show all the patients with study data I have access to:
print("Patients with data I have access to:")
for patient in jh_client.list_patients():
    consents = jh_client.get_patient_consents(patient["id"])
    if not consents["studies"] and not consents["studiesPendingConsent"]:
        continue
    print(
        f"[{patient['id']}] {patient['nameFamily']}, {patient['nameGiven']} ({patient['telecomEmail']})"
    )
    for study in consents["studies"]:
        for scope in study["scopeConsents"]:
            if scope["consented"]:
                # remember which patients have which data for later in the demo
                if scope["code"]["codingCode"] == Code.BLOOD_GLUCOSE.value:
                    cgm_patient_id = patient["id"]
                    cgm_study_id = study["id"]
                if scope["code"]["codingCode"] == Code.BLOOD_PRESSURE.value:
                    bp_patient_id = patient["id"]
                    bp_study_id = study["id"]
                print(f" - [{study['id']}] {study['name']} ({scope['code']['text']})")
    for study in consents["studiesPendingConsent"]:
        print(f" - (not consented) [{study['id']}] {study['name']}")
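The list methods used above return generators that page through results. To make that behavior concrete, here is a minimal sketch of how a paginating generator can work. It is not the jupyterhealth_client implementation: iter_pages, fake_fetch, and the (items, has_more) page shape are all assumptions made for illustration, with an in-memory dictionary standing in for the server.

```python
from itertools import islice


def iter_pages(fetch_page):
    """Yield items one at a time, requesting the next page only when needed.

    `fetch_page(page_number)` is a hypothetical callable returning
    (items, has_more); it stands in for an HTTP request.
    """
    page = 1
    while True:
        items, has_more = fetch_page(page)
        yield from items
        if not has_more:
            return
        page += 1


# Fake three-page backend so the sketch runs without a server
PAGES = {1: ["a", "b"], 2: ["c", "d"], 3: ["e"]}
requested = []


def fake_fetch(page):
    requested.append(page)  # record which pages were actually requested
    return PAGES[page], page < len(PAGES)


# Taking only the first three items never touches page 3
first_three = list(islice(iter_pages(fake_fetch), 3))
print(first_three, requested)
```

Because the generator is lazy, stopping early avoids fetching the remaining pages, whereas wrapping it in list() retrieves everything.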
Retrieving Observations
list_observations_df retrieves all observations into a pandas DataFrame; list_observations takes the same filters and yields observations one at a time.
You can filter by:
- study_id - fetch data authorized to a single study
- patient_id - fetch data for a single patient
- code - a Code filter to select only a single measurement type (e.g. Code.BLOOD_PRESSURE)
At least one of study_id or patient_id must be specified.
code is always optional.
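The filter rule above can be illustrated with a small sketch. The function observation_filters is hypothetical and the client may validate its arguments differently; the ids and the code string are placeholders, not real values.

```python
def observation_filters(study_id=None, patient_id=None, code=None):
    """Build a dict of filters, enforcing the rule described above.

    Hypothetical helper for illustration; not part of jupyterhealth_client.
    """
    if study_id is None and patient_id is None:
        raise ValueError("Specify at least one of study_id or patient_id")
    filters = {"study_id": study_id, "patient_id": patient_id, "code": code}
    # Drop filters that were not set
    return {key: value for key, value in filters.items() if value is not None}


# Placeholder ids and code string, for illustration only
by_study = observation_filters(study_id=1, code="example-code")
by_patient = observation_filters(patient_id=2)
print(by_study, by_patient)
```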
To get all blood pressure data for a single study:
bp_iter = jh_client.list_observations(study_id=bp_study_id, code=Code.BLOOD_PRESSURE)
bp_iter
observation = next(iter(bp_iter))
observation
The interesting data is in valueAttachment, which is a base64-encoded JSON blob. We can extract it:
import base64
import json
json.loads(base64.decodebytes(observation["valueAttachment"]["data"].encode()).decode())
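To see the decoding step in isolation, here is a self-contained round trip that needs no server. The payload is made up for illustration and is much smaller than a real attachment; only the valueAttachment/data nesting is taken from the example above.

```python
import base64
import json

# Made-up payload for illustration; real observations carry more fields
payload = {"systolic_blood_pressure": {"value": 120, "unit": "mmHg"}}

# Encode the way the attachment is described above: JSON text, then base64
encoded = base64.b64encode(json.dumps(payload).encode()).decode()
sample_observation = {"valueAttachment": {"data": encoded}}

# Decode: base64 text -> bytes -> JSON text -> Python dict
raw = base64.b64decode(sample_observation["valueAttachment"]["data"])
decoded = json.loads(raw.decode())
print(decoded)
```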
Or we can use tidy_observation to turn the nested structure of an Observation into one more suitable for DataFrames.
tidy_observation takes nested fields and turns them into a single flat dictionary, so
{"a": {"b": 5}}
becomes
{"a_b": 5}
tidy_observation also understands the structure of the valueAttachment, so it handles the base64/json bit, too:
from jupyterhealth_client import tidy_observation
tidy_observation(observation)
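The flattening idea can be sketched in a few lines. This is only an illustration of joining nested keys with underscores; it is not the tidy_observation implementation, which also decodes the attachment, and the flatten name is an assumption.

```python
def flatten(nested, prefix=""):
    """Flatten nested dicts, joining keys with underscores."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested dicts, carrying the joined key as prefix
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


print(flatten({"a": {"b": 5}}))
# {'a_b': 5}
print(flatten({"subject": {"reference": "Patient/1"}, "id": 7}))
# {'subject_reference': 'Patient/1', 'id': 7}
```

Flat keys like these map directly onto DataFrame column names, which is why the flattened form is convenient for tabular analysis.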
Loading observations into a DataFrame
list_observations_df takes the same arguments as list_observations, but returns a DataFrame instead of a generator.
The observations are passed through tidy_observation, so the keys above are the columns of the DataFrame.
The same data:
# get all blood pressure data
full_bp = jh_client.list_observations_df(study_id=bp_study_id, code=Code.BLOOD_PRESSURE)
full_bp.columns
The data frame preserves all fields recorded by JHE, which is a lot. You can thin this out by selecting columns to make things more manageable.
Generally the most informative columns are:
- code - the code identifying the data type for the row (useful when no code filter was given; otherwise it always matches the input code)
- subject_reference - the Patient/$id identifier (useful when you have retrieved data for multiple patients)
- effective_time_frame_date_time - the effective time of the Observation in UTC. Also available as effective_time_frame_date_time_local if the local time-of-day at the time and place of measurement is useful.
- *_value columns - the actual measurements, e.g. systolic_blood_pressure_value, blood_glucose_value, etc.
Now we can select those columns and use groupby("subject_reference") to plot each patient separately, in case we have more than one patient.
bp = full_bp[
    [
        "subject_reference",
        "effective_time_frame_date_time",
        "systolic_blood_pressure_value",
        "diastolic_blood_pressure_value",
    ]
]
bp
bp.groupby("subject_reference").plot(
    x="effective_time_frame_date_time",
    y=["systolic_blood_pressure_value", "diastolic_blood_pressure_value"],
    style="o",
)
Continuous Glucose Monitor (CGM) data for a single patient
We can do the same with CGM data.
This time, we use patient_id and code to retrieve CGM data for a single patient.
# get all cgm data
full_cgm = jh_client.list_observations_df(
    patient_id=cgm_patient_id, code=Code.BLOOD_GLUCOSE
)
full_cgm.columns
We can transform the data to have the columns expected by cgmquantify and plot it:
import cgmquantify
cgm = full_cgm.loc[:, ["effective_time_frame_date_time_local", "blood_glucose_value"]]
# define columns cgmquantify expects
cgm["Time"] = cgm.effective_time_frame_date_time_local
cgm["Glucose"] = cgm.blood_glucose_value
cgm["Day"] = cgm["Time"].dt.date
cgmquantify.plotglucosebounds(cgm)