SAXS/WAXS module (CHAP.saxswaxs)
The CHAP.saxswaxs module contains processing tools unique to SAXS/WAXS processing. This document describes how to use them in a pipeline configuration YAML file so that SAXS/WAXS workflows can be run from the terminal on the CHESS Linux system.
Workflow description
General overview
Set up a Zarr container for the complete raw and processed dataset
First, we set up a container to hold the raw and processed datasets before any processing begins. This step needs the following information to allocate an appropriate amount of space:
Dataset size
Detector size(s)
Integrated data size(s)
In practice, you should prepare two supplementary configuration files in addition to the parameters for the tools involved in this step: one for a MapConfig object and another for a PyFaiIntegrationProcessorConfig object. This step also sets the number of chunks for each data array in the container. Selecting the right number of chunks is important for optimizing performance during the next step.
Example pipeline configuration
config:
  root: .
  log_level: debug
setup:
  # Step 1: read in configuration files
  - common.YAMLReader:
      filename: map_config.yaml
      schema: common.models.map.MapConfig
  - common.YAMLReader:
      filename: pyfai_integration_processor_config.yaml
      schema: common.models.integration.PyfaiIntegrationConfig
  # Step 2: set up Zarr container for processed data based on
  # configurations provided from Step 1
  - saxswaxs.SetupProcessor:
      dataset_shape:
        - 50
      dataset_chunks:
        - 10
      detectors:
        - id: PIL9
          shape:
            - 407
            - 487
        - id: PIL11
          shape:
            - 407
            - 487
  # Step 3: Write the Zarr container to file
  - common.ZarrWriter:
      filename: data.zarr
      force_overwrite: true
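To get a feel for how much space the setup step has to allocate, the numbers in the example above can be plugged into a short calculation. This is only a rough sketch: it assumes that dataset_chunks gives the chunk length along each dataset axis (the usual Zarr convention), and that raw detector frames use 4 bytes per pixel. The helper name describe_container and the bytes_per_pixel default are illustrative and are not part of CHAP.

```python
import math


def describe_container(dataset_shape, dataset_chunks, detector_shapes,
                       bytes_per_pixel=4):
    """Return (chunks per axis, total chunk count, raw bytes per detector).

    Assumptions (not taken from CHAP): `dataset_chunks` is the chunk length
    along each dataset axis, and every pixel takes `bytes_per_pixel` bytes.
    """
    # Number of chunks along each dataset axis, rounding up for a partial chunk
    chunks_per_axis = [math.ceil(n / c)
                       for n, c in zip(dataset_shape, dataset_chunks)]
    n_frames = math.prod(dataset_shape)
    # Uncompressed size of the raw frames for each detector
    raw_bytes = {
        det_id: n_frames * math.prod(shape) * bytes_per_pixel
        for det_id, shape in detector_shapes.items()
    }
    return chunks_per_axis, math.prod(chunks_per_axis), raw_bytes


chunks_per_axis, total_chunks, raw_bytes = describe_container(
    dataset_shape=[50],
    dataset_chunks=[10],
    detector_shapes={"PIL9": (407, 487), "PIL11": (407, 487)},
)
print(chunks_per_axis, total_chunks, raw_bytes)
```

Under these assumptions the example dataset splits into five chunks, which is also the number of parallel update jobs needed to fill the container.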
Fill container with processed data
Next, we read in the raw data and perform the configured integration(s). To get the best performance for large datasets, this step should be performed across multiple pipeline processes running in parallel. Each process handles reading, processing, and writing just one part of the whole dataset: exactly one chunk of each array in the container set up previously. To fill the container from the previous step with processed data, each parallel process needs to know the following information:
The spec file for the scan whose data will be processed
The scan number for the scan whose data will be processed
Parameters indicating a specific slice of the scan to process
Example pipeline configuration
config:
  root: .
  log_level: debug
update:
  # Step 1: read in configuration files
  - common.YAMLReader:
      filename: map_config.yaml
      schema: common.models.map.MapConfig
  - common.YAMLReader:
      filename: pyfai_integration_processor_config.yaml
      schema: common.models.integration.PyfaiIntegrationConfig
  # Step 2: read and process the data
  - saxswaxs.UpdateValuesProcessor:
      spec_file: spec_file
      scan_number: 1
      detectors:
        - id: PIL9
          shape:
            - 407
            - 487
        - id: PIL11
          shape:
            - 407
            - 487
      idx_slice:
        start: 0
        stop: 10
        step: 1
  # Step 3: write processed data to the container set up earlier
  - common.ZarrValuesWriter:
      filename: data.zarr
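Because each parallel process should cover exactly one chunk, the idx_slice values for all jobs follow directly from the dataset length and the chunk length. The sketch below enumerates them for the example dataset of 50 points with a chunk length of 10. The function name chunk_slices is hypothetical, and the sketch assumes a one-dimensional dataset axis.

```python
def chunk_slices(n_points, chunk_size):
    """Yield one idx_slice-style dict per chunk along the dataset axis."""
    for start in range(0, n_points, chunk_size):
        # The last chunk may be shorter if n_points is not a multiple
        # of chunk_size
        stop = min(start + chunk_size, n_points)
        yield {"start": start, "stop": stop, "step": 1}


slices = list(chunk_slices(n_points=50, chunk_size=10))
for job_index, idx_slice in enumerate(slices):
    print(job_index, idx_slice)
```

The first entry matches the idx_slice in the example configuration above; the remaining entries are what the other four processes would use.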
Perform final user adjustments
These “final adjustments” are all optional, but they can include the following:
converting from .zarr format to NeXus format
performing flux, absorption, and / or background corrections on integrated data
inserting links to coordinate axes next to processed data arrays
reshaping the dataset from an “unstructured” to a “structured” representation
Example format conversion pipeline configuration
config:
  root: .
  log_level: debug
convert:
  # One tool only: ZarrToNexusProcessor takes care of reading old
  # zarr file and writing new NeXus file, too
  - common.ZarrToNexusProcessor:
      zarr_filename: data.zarr
      nexus_filename: data.nxs
Example flux correction pipeline configuration
config:
  root: .
  log_level: debug
flux_correct:
  # Step 1: Read in uncorrected data and flux data to perform flux correction
  - common.NexusReader:
      filename: data.nxs
      nxpath: waxs_azimuthal/data/I
      nxmemory: 100000
      name: intensity
  - common.NexusReader:
      filename: data.nxs
      nxpath: spec_file_012/scalar_data/presample_intensity
      nxmemory: 100000
      name: presample_intensity
  # Step 2: Perform flux correction calculations
  - saxswaxs.FluxCorrectionProcessor:
      nxprocess: true
      presample_intensity_reference_rate: 50000
  # Step 3: Write flux corrected results back to original file as their own
  # new NXprocess group
  - common.NexusWriter:
      filename: data.nxs
      nxpath: /waxs_azimuthal_flux_corrected
      force_overwrite: true
Example coordinate linking pipeline configuration
config:
  root: .
  log_level: debug
linkdims:
  # Step 1: Read in the NeXus file in which we're adding linked data fields
  - common.NexusReader:
      filename: data.nxs
      mode: rw
  # Step 2: Create new linked data arrays from all the groups listed
  # in `link_from` to all the existing data arrays listed in
  # `link_to`
  - common.NexusMakeLinkProcessor:
      link_from:
        - waxs_azimuthal/data
        - waxs_cake/data
        - waxs_radial/data
        - waxs_azimuthal_flux_abs_bg_corrected/data
        - waxs_azimuthal_flux_abs_corrected/data
        - waxs_azimuthal_flux_corrected/data
        - waxs_cake_flux_abs_bg_corrected/data
        - waxs_cake_flux_abs_corrected/data
        - waxs_cake_flux_corrected/data
        - waxs_radial_flux_abs_bg_corrected/data
        - waxs_radial_flux_abs_corrected/data
        - waxs_radial_flux_corrected/data
      link_to:
        - spec_file_012/independent_dimensions/samx
        - spec_file_012/independent_dimensions/samz
  # No writing tool necessary, `common.NexusMakeLinkProcessor` will
  # modify the file in place as long as it's read in with `mode: rw`.
Example unstructured-to-structured pipeline configuration
config:
  root: .
  log_level: debug
struct:
  # Step 1: Read in the data arrays to structure, and all their
  # coordinate axes arrays too
  - common.NexusReader:
      filename: data.nxs
      nxpath: spec_file_012/independent_dimensions/samx
      name: samx
  - common.NexusReader:
      filename: data.nxs
      nxpath: spec_file_012/independent_dimensions/samz
      name: samz
  - common.NexusReader:
      filename: data.nxs
      nxpath: waxs_azimuthal/data/q_A^-1
      name: waxs_azimuthal_q
  - common.NexusReader:
      filename: data.nxs
      nxpath: waxs_radial/data/chi_deg
      name: waxs_radial_chi
  - common.NexusReader:
      filename: data.nxs
      nxpath: waxs_azimuthal/data/I
      name: waxs_azimuthal_I
  - common.NexusReader:
      filename: data.nxs
      nxpath: waxs_radial/data/I
      name: waxs_radial_I
  # Step 2: restructure signal arrays in a single shared NXdata object
  - saxswaxs.UnstructuredToStructuredProcessor:
      name: structured_data
      fields:
        - name: samx
          type: axis
        - name: samz
          type: axis
        - name: waxs_azimuthal_q
          type: axis
        - name: waxs_radial_chi
          type: axis
        - name: waxs_azimuthal_I
          axes: [samx, samz, waxs_azimuthal_q]
          type: signal
        - name: waxs_radial_I
          axes: [samx, samz, waxs_radial_chi]
          type: signal
  # Step 3: Write the structured data to a new group in the original NeXus file
  - common.NexusWriter:
      filename: data.nxs
      nxpath: /structured
      force_overwrite: true
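To illustrate what moving from an "unstructured" to a "structured" representation means, the sketch below takes one value per scan point together with that point's samx and samz coordinates and arranges the values on a grid indexed by the sorted unique coordinates. It is a conceptual illustration only, not how saxswaxs.UnstructuredToStructuredProcessor is implemented. The function name to_structured and the sample values are made up. In real data each point would carry a whole integrated pattern (for example, intensity versus q) rather than a single number.

```python
def to_structured(x, z, signal):
    """Arrange per-point values on a grid of sorted unique coordinates."""
    x_axis = sorted(set(x))
    z_axis = sorted(set(z))
    x_index = {value: i for i, value in enumerate(x_axis)}
    z_index = {value: i for i, value in enumerate(z_axis)}
    # None marks grid positions that no scan point covered
    grid = [[None] * len(z_axis) for _ in x_axis]
    for xi, zi, value in zip(x, z, signal):
        grid[x_index[xi]][z_index[zi]] = value
    return x_axis, z_axis, grid


# Unstructured input: one entry per scan point, in acquisition order
samx = [0.0, 1.0, 0.0, 1.0]
samz = [0.0, 0.0, 0.5, 0.5]
intensity = [10, 11, 12, 13]

x_axis, z_axis, grid = to_structured(samx, samz, intensity)
print(x_axis, z_axis, grid)
```

After restructuring, grid[i][j] holds the value measured at x_axis[i] and z_axis[j], which mirrors the axes: [samx, samz, ...] declaration for the signal fields in the configuration above.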
Optimizing performance
Guide on selecting appropriate values for dataset_chunks and running multiple update jobs in parallel
Data in / output formats
CHESS data in, .zarr data out
Notes on corrections calculations
There are currently three convenience tools available for performing corrections: saxswaxs.FluxCorrectionProcessor, saxswaxs.FluxAbsorptionCorrectionProcessor, and saxswaxs.FluxAbsorptionBackgroundCorrectionProcessor. The exact calculations that each one performs are detailed below.
Definitions
\(I_{uncorrected}\) is the uncorrected, integrated dataset. When reading this data as input for a saxswaxs.*CorrectionProcessor, use name: intensity.
\(I_{incident}\) is the presample intensity dataset. When reading this data as input for a saxswaxs.*CorrectionProcessor, use name: presample_intensity.
\(I_{transmitted}\) is the postsample intensity dataset. When reading this data as input for a saxswaxs.*CorrectionProcessor, use name: postsample_intensity.
\(t_{dwell}\) is the scan’s dwell time at each point in the map configuration to which the correction tool is applied. When reading this data as input for a saxswaxs.*CorrectionProcessor, use name: dwell_time_actual.
\(I_{background}\) is the background integrated detector data. When reading this data as input for a saxswaxs.*CorrectionProcessor, use name: background_intensity.
\(I_{incident, background}\) is the background presample intensity. When reading this data as input for a saxswaxs.*CorrectionProcessor, use name: background_presample_intensity.
\(I_{transmitted, background}\) is the background postsample intensity. When reading this data as input for a saxswaxs.*CorrectionProcessor, use name: background_postsample_intensity.
\(\phi_{reference}\) is presample_intensity_reference_rate in the correction tool’s own parameters. However, if presample_intensity_reference_rate was not given in the tool’s configuration, a value for this quantity will be calculated with:
\(T = \frac{I_{transmitted}}{I_{incident}} / \frac{I_{transmitted, background}}{I_{incident, background}}\)
\(t\) is the sample thickness in \(cm\). This quantity only appears in saxswaxs.FluxAbsorptionBackgroundCorrectionProcessor. The value of this parameter can be set in one of two ways:
Use the sample_thickness_cm parameter of saxswaxs.FluxAbsorptionBackgroundCorrectionProcessor, OR
Use the sample_mu_inv_cm (“\(\mu\)”) parameter of saxswaxs.FluxAbsorptionBackgroundCorrectionProcessor. \(\mu\) is known as the linear attenuation coefficient of the sample, and is related to the mass attenuation coefficient, \((\mu/\rho)\) [cm\(^2\)/g], by the sample density. Tabulated values of \((\mu/\rho)(E)\) for each element are available here: https://physics.nist.gov/PhysRefData/XrayMassCoef/tab3.html. For our purposes:
NB: When using saxswaxs.FluxAbsorptionBackgroundCorrectionProcessor, do not use both the sample_thickness_cm and sample_mu_inv_cm parameters at the same time. Specifying both parameters makes the definition of the sample thickness ambiguous. The Processor will raise an error if both parameters are supplied.
\(C_f\) is the scalar factor for putting flux, absorption, background, and thickness corrected data into absolute intensity units. It is taken directly from the absolute_intensity_scalar parameter in the correction tool config file.
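The definitions of \(T\), \(\mu\), and \(t\) above can be tied together in a few lines. The sketch below computes \(T\) exactly as defined above and obtains \(\mu\) from a mass attenuation coefficient and a density. When only \(\mu\) is supplied, it derives a thickness from the Beer-Lambert law, \(T = e^{-\mu t}\). Using Beer-Lambert here is an assumption about what the processor does, not something this page confirms. All function names and numerical values are hypothetical.

```python
import math


def transmission(i_trans, i_inc, i_trans_bg, i_inc_bg):
    """T = (I_transmitted / I_incident) / (I_transmitted,bg / I_incident,bg)."""
    return (i_trans / i_inc) / (i_trans_bg / i_inc_bg)


def linear_attenuation(mass_attenuation_cm2_per_g, density_g_per_cm3):
    """mu [1/cm] = (mu/rho) [cm^2/g] * rho [g/cm^3]."""
    return mass_attenuation_cm2_per_g * density_g_per_cm3


def sample_thickness(T, sample_thickness_cm=None, sample_mu_inv_cm=None):
    """Resolve the thickness t [cm] from exactly one of the two parameters."""
    if sample_thickness_cm is not None and sample_mu_inv_cm is not None:
        # Mirrors the documented rule: supplying both is ambiguous
        raise ValueError(
            "Give sample_thickness_cm or sample_mu_inv_cm, not both")
    if sample_thickness_cm is not None:
        return sample_thickness_cm
    if sample_mu_inv_cm is not None:
        # Assumption: Beer-Lambert law, T = exp(-mu * t)
        return -math.log(T) / sample_mu_inv_cm
    return None  # neither given: no thickness is defined


# Hypothetical numbers, chosen only to make the arithmetic easy to follow
T = transmission(i_trans=400.0, i_inc=1000.0,
                 i_trans_bg=800.0, i_inc_bg=1000.0)
mu = linear_attenuation(mass_attenuation_cm2_per_g=5.0,
                        density_g_per_cm3=2.0)
t = sample_thickness(T, sample_mu_inv_cm=mu)
print(T, mu, t)
```

With these made-up inputs the sample transmits half as much as the background, so the derived thickness is \(\ln 2 / \mu\).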
saxswaxs.FluxCorrectionProcessor
saxswaxs.FluxAbsorptionCorrectionProcessor
saxswaxs.FluxAbsorptionBackgroundCorrectionProcessor
This tool functions differently depending on what parameters are provided:
If neither sample_thickness_cm nor sample_mu_inv_cm is provided:
If either sample_thickness_cm or sample_mu_inv_cm is provided:
Running a workflow
At CHESS
Use environment: source /nfs/chess/sw/miniforge3_chap/bin/activate; conda activate CHAP_saxswaxs
And/or add this to your ~/.bashrc: alias CHAP_saxswaxs='/nfs/chess/sw/miniforge3_chap/envs/CHAP_saxswaxs/bin/CHAP'
Batch jobs
Drop in & replace: PyfaiIntegrationProcessorConfig
idx_slice on update jobs will need adjustments to drop in & replace: MapConfig
Script for constructing and / or sending off multiple update jobs in parallel
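As a starting point for such a script, the sketch below renders one update configuration per chunk by filling the spec file, scan number, and idx_slice into a text template based on the update example earlier in this document. It only builds the configuration text; writing the files and submitting the jobs are left out because they depend on the local batch system. The template layout (in particular, idx_slice nested under saxswaxs.UpdateValuesProcessor), the function name render_update_configs, and the update_NNN.yaml file names are assumptions made for illustration.

```python
UPDATE_TEMPLATE = """\
config:
  root: .
  log_level: debug
update:
  - common.YAMLReader:
      filename: map_config.yaml
      schema: common.models.map.MapConfig
  - common.YAMLReader:
      filename: pyfai_integration_processor_config.yaml
      schema: common.models.integration.PyfaiIntegrationConfig
  - saxswaxs.UpdateValuesProcessor:
      spec_file: {spec_file}
      scan_number: {scan_number}
      detectors:
        - id: PIL9
          shape: [407, 487]
        - id: PIL11
          shape: [407, 487]
      idx_slice:
        start: {start}
        stop: {stop}
        step: 1
  - common.ZarrValuesWriter:
      filename: data.zarr
"""


def render_update_configs(spec_file, scan_number, n_points, chunk_size):
    """Return {hypothetical file name: config text}, one entry per chunk."""
    configs = {}
    for job, start in enumerate(range(0, n_points, chunk_size)):
        # Clamp the last slice in case n_points is not a multiple
        # of chunk_size
        stop = min(start + chunk_size, n_points)
        configs[f"update_{job:03d}.yaml"] = UPDATE_TEMPLATE.format(
            spec_file=spec_file, scan_number=scan_number,
            start=start, stop=stop)
    return configs


configs = render_update_configs("spec_file", 1, n_points=50, chunk_size=10)
for filename in configs:
    print(filename)
```

Each rendered text could then be written to its own file and run as a separate pipeline process, so that every job fills exactly one chunk of the container.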