Open BEC HDF5 Files with h5py

Overview

Open a BEC scan file with h5py, inspect its metadata and recorded datasets, and locate linked external files when present.

Prerequisites

  • h5py is installed in the Python environment you are using.
  • You know the path to the BEC HDF5 file you want to inspect.
  • The file is readable from your current machine.

Start with the BEC master file, usually named like S01234_master.h5. That file contains the standard BEC structure and can reference additional detector files.

1. Find the scan file

If you just ran the scan, copy the file path from the File: line in the scan report.

If you want to inspect an older scan, retrieve it from history first:

scan = bec.history[-1]
scan

Printing the scan container shows a summary that includes the file path.

2. Open the file with h5py

In Python, open the HDF5 file in read-only mode:

Open the HDF5 file in read-only mode
import h5py

path = "/path/to/S01234_master.h5"

with h5py.File(path, "r") as f:
    print(list(f.keys()))

For a standard BEC file, the printed list usually contains the top-level entry group.
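
Before opening a file, it can help to confirm that the path really points to an HDF5 file. The following is a minimal sketch rather than part of BEC: the helper name top_level_keys is hypothetical, and it relies only on h5py.is_hdf5 and h5py.File.

```python
from pathlib import Path

import h5py


def top_level_keys(path):
    """Return the top-level keys of an HDF5 file (hypothetical helper)."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"No such file: {p}")
    # is_hdf5 checks the file signature without opening the file.
    if not h5py.is_hdf5(str(p)):
        raise ValueError(f"Not an HDF5 file: {p}")
    # Read-only mode, so the scan file cannot be modified by accident.
    with h5py.File(str(p), "r") as f:
        return list(f.keys())


# Example call (placeholder path):
# print(top_level_keys("/path/to/S01234_master.h5"))
```

Failing early with a clear message is often easier to debug than the lower-level error h5py raises when it cannot open a file.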

3. Inspect the main BEC groups

The BEC-specific content is typically under /entry/collection.

Inspect the main BEC groups
import h5py

path = "/path/to/S01234_master.h5"

with h5py.File(path, "r") as f:
    collection = f["entry"]["collection"]
    print(list(collection.keys()))

The groups that are usually most useful are:

  • metadata for scan metadata
  • readout_groups for recorded scan data grouped by readout priority
  • devices for device-oriented access
  • file_references for external file links
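
Not every file necessarily contains all four groups, for example when no detector wrote an external file. As a small sketch, the hypothetical helper missing_groups below reports which of the group names listed above are absent from a collection group.

```python
import h5py

# Group names taken from the list above; a given file may lack some of them.
EXPECTED_GROUPS = ("metadata", "readout_groups", "devices", "file_references")


def missing_groups(collection, expected=EXPECTED_GROUPS):
    """Return the expected group names that are absent from `collection`."""
    # `name in group` tests for a direct member of an h5py group.
    return [name for name in expected if name not in collection]


# Example call (placeholder path):
# with h5py.File("/path/to/S01234_master.h5", "r") as f:
#     print(missing_groups(f["entry"]["collection"]))
```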

4. Read metadata and one dataset

To inspect metadata:

Inspect metadata keys
import h5py

path = "/path/to/S01234_master.h5"

with h5py.File(path, "r") as f:
    metadata = f["entry"]["collection"]["metadata"]
    print(list(metadata.keys()))
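
Listing the keys shows which metadata entries exist, but not their values. The sketch below assumes that metadata entries are stored as datasets directly under the metadata group; that layout is an assumption, and entries stored as subgroups or attributes would need separate handling. The helper name read_metadata is hypothetical.

```python
import h5py


def read_metadata(group):
    """Collect the datasets directly under `group` into a plain dict."""
    out = {}
    for name, item in group.items():
        if not isinstance(item, h5py.Dataset):
            continue  # skip subgroups in this simple sketch
        value = item[()]  # read the whole dataset, scalar or array
        if isinstance(value, bytes):
            # String datasets can come back as bytes; decode for readability.
            value = value.decode("utf-8", errors="replace")
        out[name] = value
    return out


# Example call (placeholder path):
# with h5py.File("/path/to/S01234_master.h5", "r") as f:
#     print(read_metadata(f["entry"]["collection"]["metadata"]))
```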

To read one dataset into memory:

Read one dataset into memory
import h5py

path = "/path/to/S01234_master.h5"

with h5py.File(path, "r") as f:
    data = f["entry"]["collection"]["readout_groups"]["monitored"]["samx"]["samx"]["value"][...]
    print(data)

The exact path below readout_groups depends on the devices and signals recorded in your scan. In the HDF5 file, the standard group names are monitored, baseline, and async.
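
Before reading a full dataset with [...], it is worth checking its shape and data type, and reading only a small slice if it is large. A minimal sketch, with preview as a hypothetical helper name:

```python
import h5py


def preview(dset, n=5):
    """Return shape, dtype, and at most the first `n` entries of a dataset."""
    info = {"shape": dset.shape, "dtype": dset.dtype}
    if dset.ndim == 0:
        # Scalar datasets cannot be sliced; read the single value instead.
        info["head"] = dset[()]
    else:
        # Slicing reads only the requested entries, not the whole dataset.
        info["head"] = dset[:n]
    return info


# Example call (placeholder path; device and signal names depend on the scan):
# with h5py.File("/path/to/S01234_master.h5", "r") as f:
#     dset = f["entry/collection/readout_groups/monitored/samx/samx/value"]
#     print(preview(dset))
```

The shape and dtype come from the dataset header, so inspecting them does not load the data itself.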

5. Explore the tree when you do not know the exact path

If you want to inspect the structure first, walk the file:

Walk the HDF5 tree
import h5py

path = "/path/to/S01234_master.h5"


def show_tree(name, obj):
    # visititems calls this once for every group and dataset in the file.
    print(name)


with h5py.File(path, "r") as f:
    f.visititems(show_tree)

This is useful when you know the file contains the data you need but do not yet know the exact dataset path.
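
When the full listing is too long, the same traversal can be narrowed to datasets whose path contains a given name. This is a sketch; find_datasets is a hypothetical helper and the search fragment is whatever device or signal name you are looking for.

```python
import h5py


def find_datasets(h5file, name_fragment):
    """Return (path, shape) for every dataset whose path contains `name_fragment`."""
    matches = []

    def visit(path, obj):
        # Only datasets are collected; groups are skipped.
        if isinstance(obj, h5py.Dataset) and name_fragment in path:
            matches.append((path, obj.shape))
        # Returning None tells visititems to keep going.

    h5file.visititems(visit)
    return matches


# Example call (placeholder path):
# with h5py.File("/path/to/S01234_master.h5", "r") as f:
#     for path, shape in find_datasets(f, "samx"):
#         print(path, shape)
```

Note that visititems does not follow soft or external links, so data that is only reachable through such links does not appear in the results.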

6. Check external file references

If the scan includes detectors that write their own files, inspect file_references:

Inspect file references
import h5py

path = "/path/to/S01234_master.h5"

with h5py.File(path, "r") as f:
    refs = f["entry"]["collection"]["file_references"]
    print(list(refs.keys()))

The master file is still the main entry point, but detector data can live in separate files linked from there.
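
To see which file a reference points to without opening the target, you can inspect the link itself. The sketch below assumes that the entries in file_references are HDF5 external links stored directly in that group; this is an assumption about the file layout, and the helper name external_links is hypothetical. It uses Group.get with getlink=True, which returns the link object instead of resolving it.

```python
import h5py


def external_links(group):
    """Map link name -> (filename, path) for external links directly in `group`."""
    found = {}
    for name in group:
        # getlink=True returns the link description rather than the target,
        # so this also works when the target file is missing.
        link = group.get(name, getlink=True)
        if isinstance(link, h5py.ExternalLink):
            found[name] = (link.filename, link.path)
    return found


# Example call (placeholder path):
# with h5py.File("/path/to/S01234_master.h5", "r") as f:
#     print(external_links(f["entry"]["collection"]["file_references"]))
```

An empty result means that no direct member of the group is an external link. In that case the references may be nested deeper or stored differently, and walking the tree is the safer way to find them.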

Congratulations!

You have successfully opened a BEC scan file with h5py and inspected the main metadata and data groups.

Common pitfalls

  • Opening a detector sidecar file and expecting the full BEC structure there.
  • Hard-coding one dataset path and assuming every scan writes the same device and signal names.
  • Forgetting to use [...] to read the dataset into memory. Without it you only hold a dataset handle that is tied to the open file, and it can no longer be read once the file is closed.
  • Using [...] on a very large dataset without checking its shape first.
  • Modifying the file accidentally by opening it in a writable mode instead of "r".
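
The difference between a dataset handle and a copied array can be illustrated with a throwaway in-memory file. This is a sketch: the function and file names are made up for the example, and nothing is written to disk.

```python
import h5py
import numpy as np


def demo_copy_vs_handle():
    """Return (handle_is_valid, copied_array) after the file has been closed."""
    # In-memory throwaway file (core driver without backing store).
    with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
        f["value"] = np.arange(5)
        handle = f["value"]        # lazy handle, tied to the open file
        copied = f["value"][...]   # NumPy array, independent of the file
    # The file is closed here: the handle evaluates to False,
    # while the copied array remains usable.
    return bool(handle), copied


valid, data = demo_copy_vs_handle()
print(valid, data)
```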

Next steps