Simplifying Pipelines
Have uv? ⚡
If you have uv installed, you can instantly open this page as a Jupyter notebook using opennb:
uvx --with "pipefunc[docs]" opennb pipefunc/pipefunc/docs/source/concepts/simplifying-pipelines.md
This command creates an ephemeral environment with all dependencies and launches the notebook in your browser in 1 second, with no manual setup needed! ✨
Alternatively, run:
uv run https://raw.githubusercontent.com/pipefunc/pipefunc/refs/heads/main/get-notebooks.py
to download all documentation as Jupyter notebooks.
This section is about pipefunc.Pipeline.simplified_pipeline(), which is a convenient way to simplify a pipeline by merging multiple nodes into a single node (creating a pipefunc.NestedPipeFunc).
Consider the following pipeline (look at the visualize() output to see the structure of the pipeline):
from pipefunc import Pipeline


def f1(a, b, c, d):
    return a + b + c + d


def f2(a, b, e):
    return a + b + e


def f3(a, b, f1):
    return a + b + f1


def f4(f1, f3):
    return f1 + f3


def f5(f1, f4):
    return f1 + f4


def f6(b, f5):
    return b + f5


def f7(a, f2, f6):
    return a + f2 + f6


# If the functions are not decorated with @pipefunc,
# they will be wrapped and the output_name will be the function name
pipeline_complex = Pipeline([f1, f2, f3, f4, f5, f6, f7])
pipeline_complex("f7", a=1, b=2, c=3, d=4, e=5)
pipeline_complex.visualize_matplotlib(
    color_combinable=True,
)  # combinable functions have the same color
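To make the dependency structure concrete, the same chain can be evaluated by hand in plain Python, without pipefunc. This is only a sketch for checking the arithmetic: the `r1` ... `r7` variable names are introduced here for illustration and are not part of the library.

```python
# The same seven functions as in the pipeline above.
def f1(a, b, c, d):
    return a + b + c + d

def f2(a, b, e):
    return a + b + e

def f3(a, b, f1):
    return a + b + f1

def f4(f1, f3):
    return f1 + f3

def f5(f1, f4):
    return f1 + f4

def f6(b, f5):
    return b + f5

def f7(a, f2, f6):
    return a + f2 + f6

# Evaluate each node manually, in dependency order.
a, b, c, d, e = 1, 2, 3, 4, 5
r1 = f1(a, b, c, d)  # 10
r2 = f2(a, b, e)     # 8
r3 = f3(a, b, r1)    # 13
r4 = f4(r1, r3)      # 23
r5 = f5(r1, r4)      # 33
r6 = f6(b, r5)       # 35
r7 = f7(a, r2, r6)   # 44
print(r7)
```

The chain `f1 -> f3 -> f4 -> f5 -> f6` depends only on `a`, `b`, `c`, and `d`, and only its last value feeds into `f7`, while `f2` is the only function that needs `e`.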
In the example above, the pipeline composed of the functions f1, f2, f3, f4, f5, f6, and f7 can be simplified by merging the nodes f1, f3, f4, f5, and f6 into a single node.
Merging simplifies the pipeline and reduces the number of functions whose results need to be cached or saved.
The simplified_pipeline method of the Pipeline class generates this simplified version of the pipeline.
simplified_pipeline_complex = pipeline_complex.simplified_pipeline("f7")
simplified_pipeline_complex.visualize() # A `NestedPipeFunc` will have a red edge
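The effect of merging on the number of cached results can be illustrated without pipefunc. The sketch below uses `functools.lru_cache` as a stand-in for a per-function cache; it is an assumption made for illustration and not how pipefunc implements caching. The helper names `chain` and `merged` are hypothetical.

```python
from functools import lru_cache

# Per-node caching: every intermediate function keeps its own cache.
@lru_cache(maxsize=None)
def f1(a, b, c, d):
    return a + b + c + d

@lru_cache(maxsize=None)
def f3(a, b, f1_out):
    return a + b + f1_out

@lru_cache(maxsize=None)
def f4(f1_out, f3_out):
    return f1_out + f3_out

@lru_cache(maxsize=None)
def f5(f1_out, f4_out):
    return f1_out + f4_out

@lru_cache(maxsize=None)
def f6(b, f5_out):
    return b + f5_out

def chain(a, b, c, d):
    """Run the five cached nodes one after another."""
    r1 = f1(a, b, c, d)
    r3 = f3(a, b, r1)
    r4 = f4(r1, r3)
    r5 = f5(r1, r4)
    return f6(b, r5)

# Merged caching: only the final result of the whole chain is stored.
@lru_cache(maxsize=None)
def merged(a, b, c, d):
    r1 = a + b + c + d
    r3 = a + b + r1
    r4 = r1 + r3
    r5 = r1 + r4
    return b + r5

per_node_result = chain(1, 2, 3, 4)
merged_result = merged(1, 2, 3, 4)

# Count how many results each approach had to store for one evaluation.
per_node_entries = sum(f.cache_info().currsize for f in (f1, f3, f4, f5, f6))
merged_entries = merged.cache_info().currsize
print(per_node_entries, merged_entries)
```

Both approaches return the same value, but the per-node version stores five results for a single evaluation while the merged version stores one.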
However, simplifying a pipeline comes with a trade-off. The simplification process removes intermediate nodes that may be necessary for debugging or inspection.
For instance, if a developer wants to monitor the output of f3 while processing the pipeline, they would not be able to do so in the simplified pipeline as f3 has been merged into a pipefunc.NestedPipeFunc.
The simplified pipeline now contains a pipefunc.NestedPipeFunc object, which is a subclass of PipeFunc that wraps an internal pipeline.
simplified_pipeline_complex.functions
[PipeFunc(f2),
PipeFunc(f7),
NestedPipeFunc(pipefuncs=[PipeFunc(f6), PipeFunc(f1), PipeFunc(f3), PipeFunc(f4), PipeFunc(f5)])]
nested_func = simplified_pipeline_complex.functions[-1]
print(f"{nested_func.parameters=}, {nested_func.output_name=}, {nested_func(a=1, b=2, c=3, d=4)=}")
nested_func.pipeline.visualize()
nested_func.parameters=('a', 'b', 'c', 'd'), nested_func.output_name='f6', nested_func(a=1, b=2, c=3, d=4)=35
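The reported `parameters` and `output_name` follow from the signatures of the merged functions. The sketch below reproduces them with `inspect` and set arithmetic. It only illustrates the idea for this example and is not pipefunc's actual implementation; the names `group`, `produced`, `needed`, `external_inputs`, and `final_outputs` are hypothetical.

```python
import inspect

# The five functions that were merged into the nested node.
def f1(a, b, c, d):
    return a + b + c + d

def f3(a, b, f1):
    return a + b + f1

def f4(f1, f3):
    return f1 + f3

def f5(f1, f4):
    return f1 + f4

def f6(b, f5):
    return b + f5

group = [f6, f1, f3, f4, f5]

# Each undecorated function produces an output named after itself.
produced = {fn.__name__ for fn in group}

# Every argument name required by any function in the group.
needed = {name for fn in group for name in inspect.signature(fn).parameters}

# Inputs that must come from outside the group.
external_inputs = tuple(sorted(needed - produced))

# Outputs that no function inside the group consumes.
final_outputs = sorted(produced - needed)

print(external_inputs)
print(final_outputs)
```

For this group the external inputs are `a`, `b`, `c`, and `d`, and the only output not consumed internally is `f6`, which matches the `nested_func` attributes printed above. In general, a merged node may also need to expose an output that is used both inside and outside the group, which this simple set difference would not detect.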