BcdiPipeline
- class cdiutils.pipeline.BcdiPipeline(params=None, param_file_path=None)
Bases: Pipeline
A class to handle the BCDI workflow, from pre-processing to post-processing, including phase retrieval (using the PyNX package). Provide either a path to a parameter file or the parameter dictionary directly.
- Parameters:
params (dict, optional) – the parameter dictionary. Defaults to None.
param_file_path (str, optional) – the path to the parameter file. Defaults to None.
Main Methods
preprocess(**params) – Preprocess BCDI detector data for phase retrieval.
phase_retrieval([jump_to_cluster, ...]) – Execute phase retrieval using PyNX.
postprocess(**params) – Postprocess phase retrieval results to extract physical properties.
Configuration
update_from_file(path) – Update pipeline instance with parameters from CXI file.
Visualisation
phase_retrieval_gui() – Launch the interactive phase retrieval GUI.
show_3d_final_result() – Show plotly interactive figure of the final post-processed reconstruction.
Inherited from Pipeline
- voxel_pos = ('ref', 'max', 'com')
- class_isosurface = 0.1
- __init__(params=None, param_file_path=None)
Initialisation method.
- Parameters:
params (dict, optional) – the parameter dictionary. Defaults to None.
param_file_path (str, optional) – the path to the parameter file. Defaults to None.
- update_from_file(path)
Update pipeline instance with parameters from CXI file.
Loads parameters and SpaceConverter from a CXI file and updates the current instance. Useful for resuming analysis from saved reconstruction results.
- Parameters:
path (str) – path to CXI file.
- Raises:
ValueError – if file format is unsupported or q_lab_ref is missing in parameters.
- classmethod load_from_cxi(path)
Load pipeline parameters and SpaceConverter from CXI file.
Extracts stored parameters and reconstruction metadata from a CXI file, rebuilding the SpaceConverter for coordinate transformations.
- Parameters:
path (str) – path to CXI file.
- Returns:
extracted parameters and configured SpaceConverter instance.
- Return type:
tuple[dict, SpaceConverter]
- Raises:
ValueError – if path does not end with ‘.cxi’.
- preprocess(**params)
Preprocess BCDI detector data for phase retrieval.
Handles complete preprocessing workflow: data loading, Bragg peak centring, cropping, filtering (hot pixels, flat-field, background subtraction), and Q-space coordinate system initialisation.
- Parameters:
**params – optional parameters to override instance params. Common overrides include ‘preprocess_shape’, ‘hot_pixel_filter’, ‘flat_field’, ‘background_level’.
- Side effects:
Updates instance attributes: detector_data, cropped_detector_data, mask, voi (voxels of interest), converter, q_lab_pos, atomic_params.
- Raises:
ValueError – if the requested shape and the voxel reference are not compatible.
- phase_retrieval_gui()
Launch the interactive phase retrieval GUI.
This method lazily imports and launches the PhaseRetrievalGUI from cdiutils.interactive.phase_retrieval. The lazy import avoids requiring the optional GUI-related dependencies (pynx and ipywidgets) unless this method is invoked.
The GUI is initialized with the pipeline instance and the pipeline’s pynx_phasing_dir as the working directory, and it searches for CXI files matching the pattern “*Run*.cxi”. After initialization, the GUI’s show() method is called to display the interface.
- Raises:
ImportError – If the PhaseRetrievalGUI cannot be imported because the required dependencies (‘pynx’ and ‘ipywidgets’) are not installed. The raised error suggests installing them via: pip install pynx ipywidgets
- Returns:
None
- Return type:
None
Example
pipeline.phase_retrieval_gui()
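The lazy-import behaviour described for this method can be illustrated with a minimal sketch. The helper name lazy_import and its arguments are hypothetical and not part of cdiutils; only the standard-library importlib is used, and the real method may be structured differently.

```python
import importlib


def lazy_import(module_name, install_hint):
    """Import a module only when it is actually needed.

    Hypothetical helper for illustration; not part of cdiutils.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with an actionable message, keeping the original cause.
        raise ImportError(
            f"Could not import '{module_name}'. {install_hint}"
        ) from exc


# A standard-library module imports without trouble.
json_module = lazy_import("json", "It ships with Python.")

# A missing optional dependency produces a helpful error instead.
try:
    lazy_import(
        "a_module_that_does_not_exist",
        "Try: pip install pynx ipywidgets",
    )
except ImportError as error:
    message = str(error)
```

Deferring the import to call time means that users who never open the GUI do not need the optional packages installed.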
- phase_retrieval(jump_to_cluster=False, pynx_slurm_file_template=None, clear_former_results=False, cmd=None, search_pattern='*Run*.cxi', **pynx_params)
Execute phase retrieval using PyNX.
Runs PyNX either locally (direct subprocess) or on a SLURM cluster. Generates PyNX input file from parameters and manages job submission/monitoring if cluster execution is requested.
- Parameters:
jump_to_cluster (bool, optional) – submit job to SLURM cluster. Defaults to False (local execution).
pynx_slurm_file_template (str, optional) – path to SLURM script template. Defaults to None (uses built-in template).
clear_former_results (bool, optional) – delete previous reconstruction CXI files. Defaults to False.
cmd (str, optional) – command for local PyNX execution. Defaults to None (uses “pynx-cdi-id01 pynx-cdi-inputs.txt”).
search_pattern (str, optional) – glob pattern for finding result CXI files. Defaults to “*Run*.cxi”.
**pynx_params – PyNX parameters (e.g., nb_run, nb_raar, support_threshold). Override defaults.
- Raises:
PyNXScriptError – if PyNX execution fails.
subprocess.CalledProcessError – if subprocess command fails.
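To make the default search_pattern concrete, the sketch below applies the glob pattern to a few file names using the standard-library fnmatch module. The file names are hypothetical, and how the pipeline itself performs the search is not stated here, so this only illustrates glob semantics.

```python
from fnmatch import fnmatch

# Hypothetical file names, for illustration only.
candidates = [
    "S100_Run0001_LLK_1.23.cxi",
    "Run0002.cxi",
    "support.cxi",
    "S100_Run0003.h5",
]

search_pattern = "*Run*.cxi"

# Keep names that contain "Run" and end with ".cxi".
matches = [name for name in candidates if fnmatch(name, search_pattern)]
print(matches)
```

A narrower pattern can be passed to restrict later analysis to a subset of runs.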
- analyse_phasing_results(sorting_criterion='mean_to_max', search_pattern='*Run*.cxi', plot=True, plot_phasing_results=True, plot_phase=False, init_analyser=True)
Analyse and sort phase retrieval results by quality metrics.
Wrapper for PhasingResultAnalyser that evaluates reconstruction quality using various criteria. Sorts results and generates comparison plots.
- Parameters:
sorting_criterion (str, optional) – quality metric for sorting. Defaults to “mean_to_max”. Options:
- ‘mean_to_max’: amplitude homogeneity (Gaussian mean vs. max)
- ‘sharpness’: sum of amplitude^4 within support
- ‘std’: amplitude standard deviation
- ‘llk’: log-likelihood
- ‘llkf’: free log-likelihood
search_pattern (str, optional) – glob pattern for CXI files. Defaults to “*Run*.cxi”.
plot (bool, optional) – enable/disable all plots. Defaults to True.
plot_phasing_results (bool, optional) – plot result comparisons. Defaults to True.
plot_phase (bool, optional) – plot phase (with amplitude as opacity) instead of amplitude. Defaults to False.
init_analyser (bool, optional) – force reinitialisation of PhasingResultAnalyser. Defaults to True.
- Raises:
ValueError – if sorting_criterion is unknown.
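Two of the listed criteria are simple enough to sketch in plain Python. The function names, the toy data, the use of a population standard deviation and the ascending sort order are all assumptions made for illustration; PhasingResultAnalyser may compute and rank these quantities differently.

```python
from statistics import pstdev


def sharpness(amplitude, support):
    """Sum of amplitude**4 over the voxels inside the support."""
    return sum(a ** 4 for a, s in zip(amplitude, support) if s)


def amplitude_std(amplitude, support):
    """Population standard deviation of the amplitude inside the support."""
    return pstdev([a for a, s in zip(amplitude, support) if s])


# Toy flattened reconstructions: run number -> (amplitude, support mask).
runs = {
    1: ([1.0, 1.0, 1.0, 0.0], [1, 1, 1, 0]),
    2: ([2.0, 0.5, 0.5, 0.0], [1, 1, 1, 0]),
}

# Assumption: smaller metric values rank first.
ranked_by_sharpness = sorted(runs, key=lambda run: sharpness(*runs[run]))
```

A homogeneous amplitude gives both a lower sharpness and a lower standard deviation than a strongly peaked one, which is why such metrics can serve as sorting criteria.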
- generate_support_from(run='best', output_path=None, fill=False, verbose=True, search_pattern='*Run*.cxi')
Extract and save support from a specific reconstruction run.
Generates a support mask from a phase retrieval result and saves it as a CXI file. Can be used directly in subsequent phasing by setting: support = <output_path> in PyNX params.
- Parameters:
run (int | str, optional) – run selection. Use “best” for top-ranked result or integer for specific run number. Defaults to “best”.
output_path (str, optional) – save path for support CXI file. If None, saves to pynx_phasing_dir/support.cxi. Defaults to None.
fill (bool, optional) – fill holes in support using morphological operations. Defaults to False.
verbose (bool, optional) – print info and plot support. Defaults to True.
search_pattern (str, optional) – glob pattern for CXI files. Defaults to “*Run*.cxi”.
- select_best_candidates(nb_of_best_sorted_runs=None, best_runs=None, search_pattern='*Run*.cxi')
Select best phase retrieval candidates for mode decomposition.
Wrapper for PhasingResultAnalyser.select_best_candidates. Choose candidates either by count (top N sorted) or explicit run numbers.
- Parameters:
nb_of_best_sorted_runs (int, optional) – number of top-sorted runs to select. Requires prior call to analyse_phasing_results(). Defaults to None.
best_runs (list[int], optional) – explicit list of run numbers (e.g., [2, 5, 7]). Defaults to None.
search_pattern (str, optional) – glob pattern for CXI files. Defaults to “*Run*.cxi”.
- Raises:
ValueError – if result_analyser is not initialised, i.e. the results have not been analysed yet (call analyse_phasing_results() first).
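The two selection modes can be sketched as follows. The function select_candidates is hypothetical, and the precedence of best_runs over nb_of_best_sorted_runs, as well as the fallback when neither is given, are assumptions of this sketch rather than documented behaviour.

```python
def select_candidates(sorted_runs, nb_of_best_sorted_runs=None, best_runs=None):
    """Pick candidate runs by explicit list or by top-N count.

    Hypothetical helper; `sorted_runs` stands for run numbers already
    ordered from best to worst, or None if no analysis was done.
    """
    if best_runs is not None:
        # Explicit run numbers need no prior sorting (assumed precedence).
        return list(best_runs)
    if sorted_runs is None:
        raise ValueError("Results have not been analysed yet.")
    if nb_of_best_sorted_runs is None:
        # Assumption: keep every sorted run when no count is given.
        return list(sorted_runs)
    return list(sorted_runs[:nb_of_best_sorted_runs])


top_two = select_candidates([4, 2, 7, 1], nb_of_best_sorted_runs=2)
explicit = select_candidates(None, best_runs=[2, 5, 7])
```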
- mode_decomposition(cmd=None, search_pattern='*Run*.cxi')
Perform mode decomposition on selected reconstruction candidates.
Extracts principal modes from multiple phase retrieval results using PyNX’s pynx-cdi-analysis (similar to PCA). Modes represent consistent features across reconstructions.
- Parameters:
cmd (str, optional) – command for mode decomposition if PyNX is unavailable locally. Defaults to None (uses “pynx-cdi-analysis candidate_*.cxi --modes 1 --modes_output mode.h5”).
search_pattern (str, optional) – glob pattern for candidate CXI files. Defaults to “*Run*.cxi”.
- Side effects:
Saves modes to S{scan}_pynx_reconstruction_mode.cxi in dump_dir.
- postprocess(**params)
Postprocess phase retrieval results to extract physical properties.
Comprehensive workflow: loads reconstruction, orthogonalises to lab frame, optionally flips/apodizes, estimates support isosurface, and computes structural properties (phase, displacement, strain, d-spacing, lattice parameter).
- Parameters:
**params – optional parameters to override instance params. Common overrides:
- ‘voxel_size’: target voxel size (nm)
- ‘isosurface’: support threshold (0-1)
- ‘apodize’: window function (‘blackman’, ‘hann’, etc.)
- ‘flip’: flip reconstruction (complex conjugate)
- ‘convention’: ‘xu’ or ‘cxi’
- ‘handle_defects’: enable defect-aware processing
- Side effects:
Updates instance attributes: reconstruction, structural_props, extra_info. Generates amplitude distribution plot.
- Raises:
ValueError – if unrecognised parameter provided.
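As an illustration of two of the structural quantities named above, the sketch below uses the textbook relations d = 2π/|Q| and, for a cubic lattice only, a = d·sqrt(h² + k² + l²). The function names are hypothetical and the cubic assumption is ours; the exact formulas and units used by postprocess are not stated here.

```python
import math


def d_spacing(q_norm):
    """Interplanar spacing from the scattering vector norm: d = 2*pi/|Q|.

    With |Q| in inverse angstroms, d is returned in angstroms.
    """
    return 2 * math.pi / q_norm


def cubic_lattice_parameter(d, hkl):
    """Lattice parameter of a cubic crystal from d and Miller indices."""
    h, k, l_index = hkl
    return d * math.sqrt(h * h + k * k + l_index * l_index)


# Toy value chosen so that the arithmetic is easy to follow.
d = d_spacing(math.pi)                        # 2*pi / pi
a = cubic_lattice_parameter(d, (1, 1, 1))     # d * sqrt(3)
```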
- show_3d_final_result()
Show plotly interactive figure of the final post-processed reconstruction.
- cancel_job(job_id)
Cancel running SLURM job via scancel.
- Parameters:
job_id (str) – SLURM job ID to cancel.
- Raises:
subprocess.CalledProcessError – if scancel command fails.
- get_job_state(job_id)
Retrieve SLURM job state and exit code via sacct.
Queries sacct for job status information and parses output to extract state (e.g., COMPLETED, FAILED, RUNNING) and exit code (format: signal:status).
- Parameters:
job_id (str) – SLURM job ID to query.
- Returns:
job state and exit code (e.g., (‘COMPLETED’, ‘0:0’)).
- Return type:
tuple[str, str]
- Raises:
ValueError – if job ID not found in sacct output.
subprocess.CalledProcessError – if sacct command fails.
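The parsing step can be sketched without a SLURM installation by working on a captured string. The sketch assumes pipe-separated output such as that produced by `sacct -j <job_id> --format=JobID,State,ExitCode --noheader --parsable2`; the exact sacct invocation used by the pipeline is not stated here, and parse_sacct_output is a hypothetical name.

```python
def parse_sacct_output(output, job_id):
    """Return (state, exit_code) for `job_id` from pipe-separated sacct text.

    Hypothetical helper; assumes the columns JobID|State|ExitCode.
    """
    for line in output.splitlines():
        fields = line.strip().split("|")
        # Match the job itself, not its ".batch" or ".extern" steps.
        if len(fields) >= 3 and fields[0] == job_id:
            return fields[1], fields[2]
    raise ValueError(f"Job {job_id} not found in sacct output.")


# Sample text in the assumed format (not captured from a real cluster).
sample = (
    "12345|COMPLETED|0:0\n"
    "12345.batch|COMPLETED|0:0\n"
    "12345.extern|COMPLETED|0:0\n"
)

state, exit_code = parse_sacct_output(sample, "12345")
```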
- is_job_running(job_id)
Check if SLURM job is currently running.
Queries squeue for job presence. Job is considered running if its ID appears in squeue output.
- Parameters:
job_id (str) – SLURM job ID to check.
- Returns:
True if job is in queue, False otherwise.
- Return type:
bool
- Raises:
subprocess.CalledProcessError – if squeue command fails.
- load_parameters(file_path=None)
Load pipeline parameters from YAML configuration file.
Uses yaml.full_load() to support Python-specific types like tuples that are serialised by yaml.dump().
- Parameters:
file_path (str, optional) – path to YAML parameter file. Defaults to None (uses self.param_file_path).
- Returns:
loaded parameter dictionary.
- Return type:
dict
- Raises:
FileNotFoundError – if parameter file does not exist.
yaml.YAMLError – if file contains invalid YAML.
- make_dump_dir()
Create output directory specified in params[‘dump_dir’].
- Raises:
ValueError – if dump_dir parameter is None.
- monitor_job(job_id, output_file, retries=10, delay=1)
Monitor SLURM job and verify final completion status.
Streams job output in real-time and validates final state via sacct after job leaves queue. Retries state check if job shows RUNNING but is not in squeue (handles race conditions).
- Parameters:
job_id (str) – SLURM job ID to monitor.
output_file (str) – path to slurm-{job_id}.out file.
retries (int) – number of sacct retries for lingering RUNNING state. Defaults to 10.
delay (int) – seconds between retries. Defaults to 1.
- Raises:
JobFailedError – if job terminates with FAILED state or non-zero exit code.
Notes
Successfully completed jobs have state=’COMPLETED’ and exit_code=’0:0’. Other terminal states log a warning but do not raise exceptions.
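The retry logic for a lingering RUNNING state can be sketched with injected callables, so that it runs without a cluster. The function wait_for_final_state, its arguments and the local JobFailedError class are stand-ins written for this sketch; the real method also streams the job output, which is left out here.

```python
import time


class JobFailedError(RuntimeError):
    """Local stand-in for the JobFailedError named above."""


def wait_for_final_state(get_state, is_running, retries=10, delay=1):
    """Re-query the job state while it reports RUNNING after leaving the queue."""
    state, exit_code = get_state()
    attempts = 0
    # The accounting database can lag behind the queue: retry a few times.
    while state == "RUNNING" and not is_running() and attempts < retries:
        time.sleep(delay)
        state, exit_code = get_state()
        attempts += 1
    if state == "FAILED" or exit_code != "0:0":
        raise JobFailedError(f"state={state}, exit_code={exit_code}")
    return state, exit_code


# Fake accounting answers: two stale RUNNING replies, then the final state.
answers = iter([("RUNNING", "0:0"), ("RUNNING", "0:0"), ("COMPLETED", "0:0")])
final = wait_for_final_state(lambda: next(answers), lambda: False, delay=0)
```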
- static pretty_print(text, max_char_per_line=79, do_print=True, return_text=False)
Format text with decorative star border.
Creates a framed message with star borders and centred text wrapped to specified line width. Useful for logging section headers or important messages.
- Parameters:
text (str) – text to format.
max_char_per_line (int) – maximum line width including border. Defaults to 79.
do_print (bool) – whether to print formatted text. Defaults to True.
return_text (bool) – whether to return formatted string. Defaults to False.
- Returns:
formatted text if return_text=True, else None.
- Return type:
None | str
Examples
>>> pretty_print("Hello World", max_char_per_line=30)
******************************
* Hello World *
******************************
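A star-bordered frame of this kind can be approximated with the standard library. The function star_frame is a hypothetical stand-in, and its padding and wrapping choices are assumptions, so its output may differ in detail from that of pretty_print.

```python
import textwrap


def star_frame(text, max_char_per_line=79):
    """Return `text` centred inside a frame of stars.

    Hypothetical sketch; each line is exactly `max_char_per_line` wide.
    """
    border = "*" * max_char_per_line
    # Reserve "* " on the left and " *" on the right of every text line.
    inner_width = max_char_per_line - 4
    lines = textwrap.wrap(text, width=inner_width)
    body = [f"* {line.center(inner_width)} *" for line in lines]
    return "\n".join([border, *body, border])


framed = star_frame("Hello World", max_char_per_line=30)
print(framed)
```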
- static process(func)
Decorate pipeline methods to add logging and error handling.
Wraps process methods with file logging, stdout redirection, and structured error reporting. Creates process-specific log files in dump_dir with format {func_name}_output.log.
- Parameters:
func (Callable) – pipeline method to decorate.
- Returns:
wrapped function with logging infrastructure.
- Return type:
Callable
- Raises:
Exception – re-raises any exception from decorated function after logging.
Notes
Temporarily redirects sys.stdout to logger during execution to capture print statements. Original stdout is always restored in finally block.
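The stdout redirection and error handling described above can be sketched as a small decorator. This is not the cdiutils implementation: the logger naming, the _LoggerWriter class and the omission of per-process log files are simplifications made for illustration.

```python
import functools
import io
import logging
import sys


class _LoggerWriter(io.TextIOBase):
    """File-like object that forwards written text to a logger."""

    def __init__(self, logger, level=logging.INFO):
        self.logger = logger
        self.level = level

    def write(self, message):
        # print() sends the text and the newline separately; skip blanks.
        for line in message.rstrip().splitlines():
            self.logger.log(self.level, line)
        return len(message)


def process(func):
    """Log print output and errors of `func`, then restore stdout."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger = logging.getLogger(f"sketch.{func.__name__}")
        logger.setLevel(logging.INFO)
        original_stdout = sys.stdout
        sys.stdout = _LoggerWriter(logger)
        try:
            return func(*args, **kwargs)
        except Exception:
            logger.exception("%s failed", func.__name__)
            raise  # re-raise after logging
        finally:
            sys.stdout = original_stdout  # always restored

    return wrapper
```

Restoring stdout in the finally block keeps an exception in one step from silencing the output of everything that runs afterwards.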
- stream_job_output(job_id, output_file)
Stream SLURM job output in real-time.
Waits for output file creation, then continuously reads and logs new lines until job stops running or interrupted flag is set. Logs at JOB level (custom level between INFO and WARNING).
- Parameters:
job_id (str) – SLURM job ID being monitored.
output_file (str) – path to slurm-{job_id}.out file.
- Raises:
FileNotFoundError – if output file cannot be accessed after creation.
Notes
Checks file existence every 0.5s until found. Polls running status and reads new lines with 0.5s interval. Respects self.interrupted flag for early termination.
- submit_job(job_file, working_dir)
Submit SLURM job and return job ID with output file path.
Executes sbatch command in bash login shell to ensure proper environment loading. Sets up keyboard interrupt handler for job cancellation.
- Parameters:
job_file (str) – path to SLURM batch script.
working_dir (str) – directory to execute sbatch from.
- Returns:
job ID and absolute path to output file (slurm-{job_id}.out).
- Return type:
tuple[str, str]
- Raises:
subprocess.CalledProcessError – if sbatch command fails.
ValueError – if job ID cannot be extracted from sbatch output.
Notes
Registers SIGINT handler that calls _handle_interrupt with job_id when Ctrl+C is pressed.
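Extracting the job ID can be sketched with a regular expression, relying on the usual "Submitted batch job <id>" line that sbatch prints on success. The function extract_job_id is a hypothetical name, and the sketch builds only the file name of the output file, not the absolute path returned by submit_job.

```python
import re


def extract_job_id(sbatch_stdout):
    """Pull the numeric job ID out of sbatch's confirmation message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_stdout)
    if match is None:
        raise ValueError(f"No job ID found in: {sbatch_stdout!r}")
    return match.group(1)


# Sample text in the expected format (not captured from a real cluster).
job_id = extract_job_id("Submitted batch job 123456\n")
output_file = f"slurm-{job_id}.out"
```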
Examples
Basic usage:
from cdiutils.pipeline import BcdiPipeline
# Create pipeline from configuration file
pipeline = BcdiPipeline(param_file_path="config.yml")
# Run complete workflow
pipeline.preprocess()
pipeline.phase_retrieval()
pipeline.postprocess()
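A longer workflow that also ranks the phasing results and performs mode decomposition might look like the sketch below. It uses only methods and keyword arguments documented on this page, but the helper run_full_workflow, the chosen ordering and the argument values are assumptions for illustration, not a prescribed recipe.

```python
def run_full_workflow(pipeline, nb_candidates=3):
    """Chain the documented pipeline steps on a BcdiPipeline-like object.

    Hypothetical helper; `pipeline` is expected to expose the methods
    documented for BcdiPipeline.
    """
    pipeline.preprocess()
    # Remove earlier reconstructions before starting new runs.
    pipeline.phase_retrieval(clear_former_results=True)
    # Rank the results, keep the best ones and merge them into modes.
    pipeline.analyse_phasing_results(sorting_criterion="mean_to_max")
    pipeline.select_best_candidates(nb_of_best_sorted_runs=nb_candidates)
    pipeline.mode_decomposition()
    pipeline.postprocess()
```

It would be called as run_full_workflow(pipeline) with a pipeline created as in the basic example.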
See Also
Pipeline : Base pipeline class
PyNXPhaser : Phase retrieval engine
PostProcessor : Post-processing tools