CHAP.saxswaxs package

PipelineItems unique to SAXSWAXS data processing workflows.

This module contains all the PipelineItems (Processors, Readers and Writers) that are unique to the SAXSWAXS workflow. Any of these PipelineItems can be used as items in a CHAP Pipeline or instantiated from a user Python script.

Note

Running a CHAP Pipeline that uses the SAXSWAXS workflow pipeline items requires a SAXSWAXS conda environment or access to the appropriate CHAP SAXSWAXS executable; see SAXS/WAXS module (CHAP.saxswaxs).

Submodules summary

processor

Processors unique to the SAXSWAXS workflow.

reader

Readers unique to the SAXSWAXS workflow.

writer

Writers unique to the SAXSWAXS workflow.

Submodules

CHAP.saxswaxs.processor module

Module for Processors unique to the SAXSWAXS workflow.

Add description of SAXS/WAXS

class CfProcessor(*, root: Annotated[Path, PathType(path_type=dir)] | None = '/home/runner/work/ChessAnalysisPipeline/ChessAnalysisPipeline/docs', inputdir: Annotated[Path, PathType(path_type=dir)] | None = None, outputdir: Annotated[Path, PathType(path_type=dir)] | None = None, interactive: bool | None = False, log_level: Literal['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'] | None = 'INFO')[source]

Bases: Processor

Processor to calculate the correction factor Cf that, when multiplied by appropriately processed SAXS/WAXS data, converts data to absolute cross-section / intensity in inverse cm.

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Args:
  • self: The BaseModel instance.

  • context: The context.

process(data, interactive=False, save_figures=True, nxpath=None, radial_range=None, scan_step_indices=None, eps=1e-05)[source]

Compute the correction factor Cf and the configuration parameters.

Parameters:
  • data (list[PipelineData]) – Input data list containing the reference data labelled with ‘reference_data’ as well as the NeXus input data with the azimuthally integrated SAXS/WAXS data.

  • interactive (bool, optional) – Allows for user interactions, defaults to False.

  • save_figures (bool, optional) – Create Matplotlib correction factor image that can be saved to file downstream in the workflow, defaults to True.

  • nxpath (str, optional) – Path to a specific NeXus style NXdata object in the input NeXus file tree to read the measured data from.

  • radial_range (list[float, float] or tuple[float, float], optional) – q-range used to compute Cf.

  • scan_step_indices (int, str, list[int], optional) – Optional scan step indices to use for the calculation. If not specified, the correction factor will be computed on the average of all data for the scan.

  • eps (float, optional) – Minimum plotting value of the corrected azimuthally integrated SAXS/WAXS data, defaults to 1.e-5.

Returns:

Computed correction factor Cf and the configuration parameters plus the optional correction factor image as a PipelineData object.

Return type:

dict or (dict, PipelineData)
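The role of Cf and radial_range can be illustrated with a minimal NumPy sketch. It assumes Cf is estimated as the mean ratio of the reference curve to the measured curve inside the q-range; the function name estimate_cf, the ratio-averaging scheme, and the sample numbers are all hypothetical and are not taken from the CHAP source.

```python
import numpy as np

def estimate_cf(q_meas, i_meas, q_ref, i_ref, radial_range=None):
    """Hypothetical Cf estimate: mean of reference / measured in a q-range."""
    q_meas = np.asarray(q_meas, dtype=float)
    i_meas = np.asarray(i_meas, dtype=float)
    # Interpolate the reference curve onto the measured q grid.
    i_ref_on_meas = np.interp(q_meas, q_ref, i_ref)
    # Restrict the comparison to the requested q-range, if one is given.
    mask = np.ones(q_meas.shape, dtype=bool)
    if radial_range is not None:
        q_min, q_max = radial_range
        mask = (q_meas >= q_min) & (q_meas <= q_max)
    # Skip points where the measured intensity is not positive.
    mask &= i_meas > 0
    return float(np.mean(i_ref_on_meas[mask] / i_meas[mask]))

# Made-up data: the measured curve is the reference scaled down by 4.
q = np.linspace(0.01, 0.5, 50)
reference = 20.0 * np.exp(-10.0 * q)
measured = reference / 4.0
cf = estimate_cf(q, measured, q, reference, radial_range=(0.05, 0.3))
```

Multiplying the measured curve by cf recovers the reference scale in this toy case; the actual processor may weight or fit the data differently.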

class FluxAbsorptionBackgroundCorrectionProcessor(*, root: Annotated[Path, PathType(path_type=dir)] | None = '/home/runner/work/ChessAnalysisPipeline/ChessAnalysisPipeline/docs', inputdir: Annotated[Path, PathType(path_type=dir)] | None = None, outputdir: Annotated[Path, PathType(path_type=dir)] | None = None, interactive: bool | None = False, log_level: Literal['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'] | None = 'INFO')[source]

Bases: ExpressionProcessor

Processor for flux, absorption, and background correction as well as optional thickness correction.

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Args:
  • self: The BaseModel instance.

  • context: The context.

process(data, presample_intensity_reference_rate=None, sample_thickness_cm=None, sample_mu_inv_cm=None, nxprocess=False)[source]

Given input data for ‘intensity’, ‘presample_intensity’, ‘postsample_intensity’, ‘background_presample_intensity’, ‘background_postsample_intensity’, ‘background_intensity’, and ‘dwell_time_actual’, return flux, absorption and background corrected intensity signal.

Parameters:
  • data (list[PipelineData]) – Input data list containing all necessary data labelled with their proper names.

  • presample_intensity_reference_rate (float, optional) – Reference counting rate for the ‘presample_intensity’ signal. If not specified, it will be calculated with ‘numpy.nanmean(presample_intensity / dwell_time_actual)’.

  • sample_thickness_cm (float, optional) – Sample thickness in centimeters. If specified, this processor will additionally perform thickness correction. Use of this parameter is mutually exclusive with use of sample_mu_inv_cm.

  • sample_mu_inv_cm (float, optional) – Sample linear attenuation coefficient in inverse centimeters. If specified, this processor will additionally perform thickness correction. Use of this parameter is mutually exclusive with use of sample_thickness_cm.

  • nxprocess (bool, optional) – Flag to indicate the flux, absorption, and background corrected data should be returned as a NeXus style NXobject object. Defaults to False.

Returns:

Flux, absorption and background corrected version of input ‘intensity’ data.

Return type:

Any
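The mutually exclusive sample_thickness_cm and sample_mu_inv_cm parameters can be pictured with the following sketch. It assumes that, when only the attenuation coefficient is given, a thickness is derived from the measured transmission via the Beer-Lambert law; that assumption, the helper name resolve_thickness_cm, and the transmission argument are hypothetical and not confirmed by the CHAP source.

```python
import math

def resolve_thickness_cm(transmission, sample_thickness_cm=None,
                         sample_mu_inv_cm=None):
    """Hypothetical helper returning a thickness in cm, or None."""
    if sample_thickness_cm is not None and sample_mu_inv_cm is not None:
        raise ValueError(
            'sample_thickness_cm and sample_mu_inv_cm are mutually exclusive')
    if sample_thickness_cm is not None:
        # Thickness supplied directly by the user.
        return sample_thickness_cm
    if sample_mu_inv_cm is not None:
        # Beer-Lambert law: T = exp(-mu * t)  =>  t = -ln(T) / mu
        return -math.log(transmission) / sample_mu_inv_cm
    # Neither parameter given: no thickness correction.
    return None

# A 0.1 cm sample with mu = 2 / cm transmits exp(-0.2) of the beam.
thickness = resolve_thickness_cm(math.exp(-0.2), sample_mu_inv_cm=2.0)
```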

class FluxAbsorptionCorrectionProcessor(*, root: Annotated[Path, PathType(path_type=dir)] | None = '/home/runner/work/ChessAnalysisPipeline/ChessAnalysisPipeline/docs', inputdir: Annotated[Path, PathType(path_type=dir)] | None = None, outputdir: Annotated[Path, PathType(path_type=dir)] | None = None, interactive: bool | None = False, log_level: Literal['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'] | None = 'INFO')[source]

Bases: ExpressionProcessor

Processor for flux and absorption correction.

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Args:
  • self: The BaseModel instance.

  • context: The context.

process(data, presample_intensity_reference_rate=None, nxprocess=False)[source]

Given input data for ‘intensity’, ‘presample_intensity’, ‘postsample_intensity’, ‘background_presample_intensity’, ‘background_postsample_intensity’, and ‘dwell_time_actual’, compute the flux and absorption corrected intensity signal.

Parameters:
  • data (list[PipelineData]) – Input data list containing all necessary data labelled with their proper names.

  • presample_intensity_reference_rate (float, optional) – Reference counting rate for the ‘presample_intensity’ signal. If not specified, it will be calculated with ‘numpy.nanmean(presample_intensity / dwell_time_actual)’.

  • nxprocess (bool, optional) – Flag to indicate the flux and absorption corrected data should be returned as a NeXus style NXobject object. Defaults to False.

Returns:

Flux and absorption corrected version of input ‘intensity’ data.

Return type:

Any

class FluxCorrectionProcessor(*, root: Annotated[Path, PathType(path_type=dir)] | None = '/home/runner/work/ChessAnalysisPipeline/ChessAnalysisPipeline/docs', inputdir: Annotated[Path, PathType(path_type=dir)] | None = None, outputdir: Annotated[Path, PathType(path_type=dir)] | None = None, interactive: bool | None = False, log_level: Literal['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'] | None = 'INFO')[source]

Bases: ExpressionProcessor

Processor for flux correction.

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Args:
  • self: The BaseModel instance.

  • context: The context.

process(data, presample_intensity_reference_rate=None, nxprocess=False)[source]

Given input data for ‘intensity’, ‘presample_intensity’, and ‘dwell_time_actual’, compute the flux corrected intensity signal.

Parameters:
  • data (list[PipelineData]) – Input data list containing items with names ‘intensity’, ‘presample_intensity’, and ‘dwell_time_actual’ (if presample_intensity_reference_rate is not specified).

  • presample_intensity_reference_rate (float, optional) – Reference counting rate for the ‘presample_intensity’ signal. If not specified, it will be set to ‘numpy.nanmean(presample_intensity / dwell_time_actual)’.

  • nxprocess (bool, optional) – Flag to indicate the flux corrected data should be returned as a NeXus style NXobject object. Defaults to False.

Returns:

Flux corrected version of input ‘intensity’ data.

Return type:

Any
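A rough sketch of flux correction follows. The default reference rate is the numpy.nanmean expression documented above; the normalization itself (scaling each frame by the ratio of the reference rate to that frame's presample counting rate) is an assumption about how such a correction is commonly done, and the function name flux_correct and the array shapes are hypothetical.

```python
import numpy as np

def flux_correct(intensity, presample_intensity, dwell_time_actual,
                 presample_intensity_reference_rate=None):
    """Hypothetical flux correction for intensity of shape (n_points, n_q)."""
    presample_rate = presample_intensity / dwell_time_actual
    if presample_intensity_reference_rate is None:
        # Documented default for the reference counting rate.
        presample_intensity_reference_rate = np.nanmean(presample_rate)
    # Assumed normalization: rescale each frame to the reference rate.
    scale = presample_intensity_reference_rate / presample_rate
    return intensity * scale[:, np.newaxis]

# Made-up data: the second frame saw twice the incident flux.
intensity = np.array([[10.0, 20.0], [20.0, 40.0]])
presample = np.array([100.0, 200.0])
dwell = np.array([1.0, 1.0])
corrected = flux_correct(intensity, presample, dwell)
```

After the correction both frames carry the same signal in this toy case, which is the behavior one would expect from a sample that did not change between frames.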

class PyfaiIntegrationProcessor(*, root: Annotated[Path, PathType(path_type=dir)] | None = '/home/runner/work/ChessAnalysisPipeline/ChessAnalysisPipeline/docs', inputdir: Annotated[Path, PathType(path_type=dir)] | None = None, outputdir: Annotated[Path, PathType(path_type=dir)] | None = None, interactive: bool | None = False, log_level: Literal['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'] | None = 'INFO')[source]

Bases: Processor

A processor for azimuthally integrating images.

Variables:

config (dict, optional) – Initialization parameters for an instance of PyfaiIntegrationConfig.

config: PyfaiIntegrationConfig
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Args:
  • self: The BaseModel instance.

  • context: The context.

pipeline_fields: dict
process(data, idx_slices=None)[source]

Perform a set of integrations on 2D detector data.

Parameters:
  • data (list[PipelineData]) – Input data.

  • idx_slices (list[dict[str, int]], optional) – List of dictionaries identifying the sliced index at which the output data should be written in a dataset, defaults to [{‘start’:0, ‘step’: 1}].

Returns:

Integrated detector data ready for writing with ZarrResultsWriter or NexusResultsWriter.

Return type:

list[dict[str, Any]]
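One way to read an idx_slices entry is as the keyword form of a Python slice along the leading dataset dimension. The sketch below illustrates that reading only; the helper to_slice and the optional 'stop' key are assumptions, since the documentation above shows only 'start' and 'step'.

```python
import numpy as np

def to_slice(idx_slice=None):
    """Hypothetical conversion of an index-slice dict to a Python slice."""
    if idx_slice is None:
        # Documented default.
        idx_slice = {'start': 0, 'step': 1}
    # 'stop' is an assumed optional key; a missing key means "no bound".
    return slice(idx_slice.get('start'), idx_slice.get('stop'),
                 idx_slice.get('step'))

# Write a block of two integrated frames into rows 2 and 3 of a dataset.
dataset = np.zeros((6, 3))
block = np.ones((2, 3))
dataset[to_slice({'start': 2, 'stop': 4, 'step': 1})] = block
```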

class SetupProcessor(*, root: Annotated[Path, PathType(path_type=dir)] | None = '/home/runner/work/ChessAnalysisPipeline/ChessAnalysisPipeline/docs', inputdir: Annotated[Path, PathType(path_type=dir)] | None = None, outputdir: Annotated[Path, PathType(path_type=dir)] | None = None, interactive: bool | None = False, log_level: Literal['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'] | None = 'INFO')[source]

Bases: Processor

Convenience Processor for setting up a container for SAXS/WAXS experiments.

Variables:
  • map_config (dict, optional) – Map configuration.

  • pyfai_config (dict, optional) – Initialization parameters for an instance of PyfaiIntegrationConfig.

  • detectors (DetectorConfig) – Detector configurations.

  • dataset_shape (int or list[int]) – Shape of the completed dataset that will be processed later on (shape of the measurement itself, _not_ including the dimensions of any signals collected at each point in that measurement).

  • dataset_chunks (list[int] or Literal["auto"], optional) – Extent of chunks along each dimension of the completed dataset / measurement. Choose this according to how you will process your data – for example, if your dataset_shape is [m, n], and you are planning to process each of the m rows as chunks, dataset_chunks should be [1, n]. But if you plan to process each of the n columns as chunks, dataset_chunks should be [m, 1], defaults to “auto”.

  • num_chunk (int, optional) – Used only if dataset_chunks is “auto”. Preferred number of chunks in the dataset, defaults to 1.

  • raw_data (bool, optional) – Flag to indicate whether or not space for raw detector data should be included in the returned Zarr group, defaults to True.
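The effect of the dataset_chunks choice described above can be checked with simple arithmetic: the number of chunks along each dimension is the dataset extent divided by the chunk extent, rounded up. The helper chunk_grid below is a hypothetical illustration and does not call Zarr or CHAP.

```python
import math

def chunk_grid(dataset_shape, dataset_chunks):
    """Number of chunks along each dimension of the dataset."""
    return [math.ceil(s / c) for s, c in zip(dataset_shape, dataset_chunks)]

m, n = 4, 10
# Row-wise processing: each of the m rows is one chunk.
by_rows = chunk_grid([m, n], [1, n])
# Column-wise processing: each of the n columns is one chunk.
by_columns = chunk_grid([m, n], [m, 1])
```

With m = 4 and n = 10 this gives a 4 x 1 grid of chunks for row-wise processing and a 1 x 10 grid for column-wise processing.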

dataset_chunks: Literal['auto'] | Annotated[list[Annotated[int, None, Interval(gt=0, ge=None, lt=None, le=None), None]], Len(min_length=1, max_length=None)] | None
dataset_shape: Annotated[list[Annotated[int, None, Interval(gt=0, ge=None, lt=None, le=None), None]], Len(min_length=1, max_length=None)] | None
detectors: Annotated[list[Detector], Len(min_length=1, max_length=None)]
map_config: MapConfig
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Args:
  • self: The BaseModel instance.

  • context: The context.

num_chunk: Annotated[int, None, Interval(gt=0, ge=None, lt=None, le=None), None] | None
pipeline_fields: dict
process(data)[source]

Set up a container for SAXS/WAXS experiments.

Parameters:

data (list[PipelineData]) – Input data.

Returns:

Zarr group object representing a Zarr tree of the container contents.

Return type:

zarr.Group

pyfai_config: PyfaiIntegrationConfig
raw_data: bool | None
class SetupResultsProcessor(*, root: Annotated[Path, PathType(path_type=dir)] | None = '/home/runner/work/ChessAnalysisPipeline/ChessAnalysisPipeline/docs', inputdir: Annotated[Path, PathType(path_type=dir)] | None = None, outputdir: Annotated[Path, PathType(path_type=dir)] | None = None, interactive: bool | None = False, log_level: Literal['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'] | None = 'INFO')[source]

Bases: Processor

Processor for creating an initial Zarr group object with empty datasets for filling in by PyfaiIntegrationProcessor and ZarrValuesWriter.

Variables:
  • config (dict) – Initialization parameters for an instance of PyfaiIntegrationConfig.

  • dataset_shape (int or list[int]) – Shape of the completed dataset that will be processed later on (shape of the measurement itself, _not_ including the dimensions of any signals collected at each point in that measurement).

  • dataset_chunks (list[int] or Literal["auto"], optional) – Extent of chunks along each dimension of the completed dataset / measurement. Choose this according to how you will process your data – for example, if your dataset_shape is [m, n], and you are planning to process each of the m rows as chunks, dataset_chunks should be [1, n]. But if you plan to process each of the n columns as chunks, dataset_chunks should be [m, 1], defaults to “auto”.

config: PyfaiIntegrationConfig
dataset_chunks: Literal['auto'] | Annotated[list[Annotated[int, None, Interval(gt=0, ge=None, lt=None, le=None), None]], Len(min_length=1, max_length=None)] | None
dataset_shape: Annotated[list[Annotated[int, None, Interval(gt=0, ge=None, lt=None, le=None), None]], Len(min_length=1, max_length=None)]
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Args:
  • self: The BaseModel instance.

  • context: The context.

pipeline_fields: dict
process(data)[source]

Create and return a Zarr group object to hold SAXS/WAXS data processed by PyfaiIntegrationProcessor.

Parameters:

data (list[PipelineData]) – Input data.

Returns:

Empty structure for filling in SAXS/WAXS data.

Return type:

zarr.Group

zarr_setup(tree)[source]

Create a Zarr group object based on a dictionary representing a Zarr tree of groups and arrays.

Parameters:

tree (dict[str, Any]) – Nested dictionary representing a Zarr tree of groups and arrays.

Returns:

Zarr group corresponding to the contents of tree.

Return type:

zarr.Group

class UnstructuredToStructuredProcessor(*, root: Annotated[Path, PathType(path_type=dir)] | None = '/home/runner/work/ChessAnalysisPipeline/ChessAnalysisPipeline/docs', inputdir: Annotated[Path, PathType(path_type=dir)] | None = None, outputdir: Annotated[Path, PathType(path_type=dir)] | None = None, interactive: bool | None = False, log_level: Literal['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'] | None = 'INFO')[source]

Bases: Processor

Processor to aggregate “unstructured” data into a single NeXus style NXdata object with a “structured” representation.

get_common_axes(signals)[source]

Determine the common leading axes shared by all signals.

This method computes the longest common prefix of axis names across all signal definitions. Only axes that appear in the same order at the beginning of each signal’s axes list are included in the result.

This is used to identify the shared coordinate dimensions for a structured NeXus style NXdata object (https://manual.nexusformat.org/classes/base_classes/NXdata.html#index-0).

Parameters:

signals (list[dict]) – Validated signal definitions. Each signal must define an "axes" key containing an ordered list of axis names.

Returns:

Axis names that form the common leading axes for all signals. Returns an empty list if no common prefix exists.

Return type:

list[str]
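The longest-common-prefix rule described above can be sketched in a few lines of plain Python. This is an illustration of the technique, not the CHAP implementation; the function name common_leading_axes and the sample axis names are hypothetical.

```python
def common_leading_axes(signals):
    """Longest common prefix of the 'axes' lists of all signals."""
    if not signals:
        return []
    common = []
    # zip stops at the shortest axes list, so the prefix cannot overrun it.
    for names in zip(*(signal['axes'] for signal in signals)):
        if all(name == names[0] for name in names):
            common.append(names[0])
        else:
            # First position where the signals disagree ends the prefix.
            break
    return common

signals = [
    {'name': 'saxs', 'axes': ['x', 'y', 'q_saxs']},
    {'name': 'waxs', 'axes': ['x', 'y', 'q_waxs']},
]
shared = common_leading_axes(signals)
```

Here both signals share the leading axes x and y, while their trailing q axes differ and are therefore excluded.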

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Args:
  • self: The BaseModel instance.

  • context: The context.

process(data, fields, name='data', attrs=None)[source]

Create and return a NeXus style NXdata object containing a single structured dataset composed from multiple unstructured input datasets.

This method validates the field configuration, validates and reshapes the input data, determines common axes across all signals, and constructs a NXdata group containing signal and axis fields.

Parameters:
  • data (list[PipelineData]) – Input data objects containing unstructured datasets.

  • fields (list[dict[str, Any]]) –

    Configuration describing how to structure the input data. This is a list of dictionaries. Each dictionary must contain the required keys:

    • "name": Name of the data item, which must correspond to the name field of an item in data.

    • "type": Either "signal" or "axis".

    • "axes": Required only for items where "type" is "signal". List of the names of the fields containing coordinate axes data for each dimension of the signal.

    Optional keys include:

    • "attrs": Dictionary of NeXus attributes to attach to the field.

  • name (str, optional) – Name of the resulting NXdata group, defaults to “data”.

  • attrs (dict[str, Any], optional) – Attributes to attach to the resulting NXdata group. The common axes determined during processing will be added to this dictionary under the "axes" key.

Returns:

A structured NXdata object containing all signals and axes defined by the configuration.

Return type:

nexusformat.nexus.NXdata

structure_signal_values(signals, axes, common_axes)[source]

Reshape and populate structured signal arrays using common axes.

This method computes index mappings from raw axis values to their unique sorted representations, and inserts each signal’s unstructured data into its preallocated structured array.

Only the common axes are used for structuring; any trailing, signal-specific axes are assumed to have already been handled when allocating the structured signal arrays.

Parameters:
  • signals (list[dict]) – Signal definitions with raw and preallocated structured data arrays.

  • axes (list[dict]) – Axis definitions containing raw values and unique values.

  • common_axes (list[str]) – Ordered list of the names of the dataset’s common axes.

Returns:

  • Updated signal definitions with populated structured arrays.

  • Unmodified axis definitions.

  • Common axis names shared by all signals.

Return type:

tuple[list[dict], list[dict], list[str]]
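The index mapping from raw axis values to their unique sorted representations can be sketched with numpy.unique and its return_inverse option. The example assumes a scalar signal measured at each point of a small two-axis scan; the variable names and values are made up for illustration and the CHAP implementation may differ in detail.

```python
import numpy as np

# Unstructured input: one entry per measured point, in acquisition order.
x_raw = np.array([1.0, 0.0, 1.0, 0.0])
y_raw = np.array([5.0, 5.0, 6.0, 6.0])
signal_raw = np.array([10.0, 20.0, 30.0, 40.0])

# Unique sorted axis values, plus the index of each raw value within them.
x_unique, x_idx = np.unique(x_raw, return_inverse=True)
y_unique, y_idx = np.unique(y_raw, return_inverse=True)

# Preallocate the structured array, then scatter the raw values into it.
structured = np.full((x_unique.size, y_unique.size), np.nan)
structured[x_idx, y_idx] = signal_raw
```

After this step structured[i, j] holds the signal measured at x_unique[i], y_unique[j], regardless of the order in which the points were acquired.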

validate_config_fields(fields)[source]

Validate and normalize the field configuration.

This method separates the input field configuration into signal and axis definitions, performs basic validation, and ensures that all axes referenced by signals are defined as axis fields.

The returned signal and axis dictionaries are normalized into a consistent internal representation used by later processing stages.

Parameters:

fields (list[dict[str, Any]]) – Configuration describing how input data should be structured. Each item must define a "name" and "type" key, where "type" is either "signal" or "axis". Signal entries must additionally define an "axes" list.

Raises:

ValueError – If a signal references an axis that is not defined, or if a signal is defined before any axis exists.

Returns:

Validated signal and axis definitions.

Return type:

tuple[list[dict], list[dict]]
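A fields configuration and the kind of check described above might look like the sketch below. The function split_fields is hypothetical and simplified: it is independent of the order of the entries, whereas the documented method also rejects a signal defined before any axis exists. The field names are made up.

```python
def split_fields(fields):
    """Hypothetical split of a fields list into signals and axes."""
    for field in fields:
        if field.get('type') not in ('signal', 'axis'):
            raise ValueError(f'Invalid type for field {field.get("name")!r}')
    axes = [f for f in fields if f['type'] == 'axis']
    signals = [f for f in fields if f['type'] == 'signal']
    axis_names = {axis['name'] for axis in axes}
    for signal in signals:
        # Every axis referenced by a signal must be defined as an axis field.
        missing = [n for n in signal.get('axes', []) if n not in axis_names]
        if missing:
            raise ValueError(
                f'Signal {signal["name"]!r} references undefined axes '
                f'{missing}')
    return signals, axes

fields = [
    {'name': 'x', 'type': 'axis', 'attrs': {'units': 'mm'}},
    {'name': 'y', 'type': 'axis', 'attrs': {'units': 'mm'}},
    {'name': 'intensity', 'type': 'signal', 'axes': ['x', 'y']},
]
signals, axes = split_fields(fields)
```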

validate_data(data, signals, axes)[source]

Validate and normalize input data for axes and signals.

This method retrieves raw input data for each axis and signal, propagates metadata attributes, computes unique axis values, and allocates structured arrays for signal data.

For each axis:
  • Raw data is loaded.

  • Attributes are merged (without overwriting user-specified ones).

  • Unique axis values are computed.

For each signal:
  • Raw data is loaded.

  • Attributes are merged (without overwriting user-specified ones).

  • A structured output array is allocated based on its axes.

  • Total signal size is validated against the expected shape.

Parameters:
  • data (list[PipelineData]) – Input data.

  • signals (list[dict]) – Validated signal field definitions.

  • axes (list[dict]) – Validated axis field definitions.

Raises:

ValueError – If a signal’s data size does not match the expected size derived from its axes.

Returns:

Updated signal and axis definitions with populated values and derived metadata.

Return type:

tuple[list[dict], list[dict]]
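The allocation and size check described above can be sketched as follows for the simplest case of a scalar signal with no trailing signal-specific axes. The function allocate_structured and the NaN fill value are assumptions for illustration, not the CHAP implementation.

```python
import numpy as np

def allocate_structured(signal_raw, axes_raw):
    """Hypothetical allocation of a structured array for a scalar signal."""
    # One dimension per axis, sized by the number of unique axis values.
    shape = tuple(np.unique(axis).size for axis in axes_raw)
    expected_size = int(np.prod(shape))
    if signal_raw.size != expected_size:
        raise ValueError(
            f'Signal size {signal_raw.size} does not match the expected '
            f'size {expected_size} for shape {shape}')
    # NaN marks positions that have not been filled yet (assumed fill value).
    return np.full(shape, np.nan)

x_raw = np.array([0.0, 0.0, 1.0, 1.0])
y_raw = np.array([5.0, 6.0, 5.0, 6.0])
signal_raw = np.array([1.0, 2.0, 3.0, 4.0])
structured = allocate_structured(signal_raw, [x_raw, y_raw])
```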

class UpdateValuesProcessor(*, root: Annotated[Path, PathType(path_type=dir)] | None = '/home/runner/work/ChessAnalysisPipeline/ChessAnalysisPipeline/docs', inputdir: Annotated[Path, PathType(path_type=dir)] | None = None, outputdir: Annotated[Path, PathType(path_type=dir)] | None = None, interactive: bool | None = False, log_level: Literal['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'] | None = 'INFO')[source]

Bases: Processor

Processes a slice of data for updating values in an existing container for a SAXS/WAXS experiment.

Variables:
  • map_config (dict, optional) – Map configuration.

  • pyfai_config (dict, optional) – Initialization parameters for an instance of PyfaiIntegrationConfig.

  • spec_file (str) – SPEC file containing the scan from which to read and process a slice of raw data.

  • scan_number (int) – Scan number from which to read and process a slice of raw data.

  • detectors (list[CHAP.common.models.map.Detector]) – Detector configurations.

  • raw_data (bool, optional) – Flag to indicate whether or not space for raw detector data should be included in the values returned, defaults to True.

detectors: Annotated[list[Detector], Len(min_length=1, max_length=None)]
map_config: MapConfig
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Args:
  • self: The BaseModel instance.

  • context: The context.

pipeline_fields: dict
process(data, idx_slice=None)[source]

Processes a slice of data for updating values in an existing container for a SAXS/WAXS experiment.

Parameters:
  • data (list[PipelineData]) – Input data.

  • idx_slice (dict[str, int], optional) – Dictionary identifying the sliced index at which the output data should be written in a dataset, defaults to {‘start’:0, ‘step’: 1}.

Returns:

Integrated detector data for updating values in an existing container for a SAXS/WAXS experiment, ready for writing with ZarrResultsWriter or NexusResultsWriter.

Return type:

list[dict[str, Any]]

pyfai_config: PyfaiIntegrationConfig
raw_data: bool | None
scan_number: Annotated[int, None, Interval(gt=0, ge=None, lt=None, le=None), None]
spec_file: Annotated[Path, PathType(path_type=file)]

CHAP.saxswaxs.reader module

Module for Readers unique to the SAXSWAXS workflow.

CHAP.saxswaxs.writer module

Module for Writers unique to the SAXSWAXS workflow.