Add a pre- and post-execution hooks #22

krassowski · 2024-07-22T11:57:42Z

Problem

When the notebook is closed the metadata for cell timing are not updated
When the notebook is closed and execution of a long-running cell completes, there is no way for extensions to offer notifications (e.g. sending an email)

Proposed Solution

Allow extensions to register pre- and post-execution functions to run which would be allowed to populate the cell metadata.

Maybe special-case execution timing as a hook that is available by default, and enabled based on an argument passed to the JSON API?

Additional context

The timing metadata population is implemented in JupyterLab here:

https://github.com/jupyterlab/jupyterlab/blob/ef5a055976e02fee6c4d877a6a6df756f3323f79/packages/cells/src/widget.ts#L1736-L1787

I see there is a TODO note in this repository about it:

jupyter-server-nbmodel/src/executor.ts

Line 122 in da00ed8

// FIXME quid of deletedCells and timing record

krassowski · 2025-01-07T16:18:49Z

To give some more context here:

the pre-post execution hooks should be implemented in jupyter-server extension to allow to perform a side-effect even if the browser is not connected (for frontend there are signals already)
there is a couple of options for how to enable the registration of hooks
- via an entry points mechanism similar to how jupyter-ai uses for model providers
- via a Python API on the jupyter-server extension, similarly to how jupyter-fileid exposes a small public API or how jupyter-server-ydoc does
- via configuration as jupyter-server does for pre_save_hook and post_save_hook

timing could be implemented either using such hooks, or directly in _execute_snippet by measuring time before and after client.execute_interactive call:

jupyter-server-nbmodel/jupyter_server_nbmodel/handlers.py

Lines 203 to 249 in ef8181c

    
           async def _execute_snippet( 
        
               client: jupyter_client.asynchronous.client.AsyncKernelClient, 
        
               ydoc: jupyter_server_ydoc.app.YDocExtension | None, 
        
               snippet: str, 
        
               metadata: dict | None, 
        
               stdin_hook: t.Callable[[dict], None] | None, 
        
           ) -> dict[str, t.Any]: 
        
               """Snippet executor 
        
               Args: 
        
                   client: Kernel client 
        
                   ydoc: Jupyter server YDoc extension 
        
                   snippet: The code snippet to execute 
        
                   metadata: The code snippet metadata; e.g. to define the snippet context 
        
                   stdin_hook: The stdin message callback 
        
               Returns: 
        
                   The execution status and outputs. 
        
               """ 
        
               ycell = None 
        
               if metadata is not None: 
        
                   ycell = await _get_ycell(ydoc, metadata) 
        
                   if ycell is not None: 
        
                       # Reset cell 
        
                       with ycell.doc.transaction(): 
        
                           del ycell["outputs"][:] 
        
                           ycell["execution_count"] = None 
        
                           ycell["execution_state"] = "running" 
        
               outputs = [] 
        
               # FIXME we don't check if the session is consistent (aka the kernel is linked to the document) 
        
               #   - should we? 
        
               reply = await ensure_async( 
        
                   client.execute_interactive( 
        
                       snippet, 
        
                       # FIXME stream partial results 
        
                       output_hook=partial(_output_hook, outputs, ycell), 
        
                       stdin_hook=stdin_hook if client.allow_stdin else None, 
        
                   ) 
        
               ) 
        
               reply_content = reply["content"] 
        
               if ycell is not None: 
        
                   with ycell.doc.transaction(): 
        
                       ycell["execution_count"] = reply_content.get("execution_count") 
        
                       ycell["execution_state"] = "idle"

special-casing execution timing would be desirable for two reasons:
- if another pre/post execution hook is added which takes non-negligible time to execute, it could be omitted from timing if we special case it (otherwise the timing results could depend on order of hook registration which I think would be confusing)
- we can add an argument to the REST API to enable it, improving the feature parity with the default code execution mechanism if the runCell in typescript automatically passes this argument if user enabled the recordTiming option.

krassowski · 2025-01-08T11:41:58Z

@fcollonval @echarles any preferences on how we should implement hooks? On frontend these are just ISignals. Let me number the options I listed above:

a) via an entry points mechanism similar to how jupyter-ai uses for model providers
b) via a Python API on the jupyter-server extension, similarly to how jupyter-fileid exposes a small public API or how jupyter-server-ydoc does
c) via configuration as jupyter-server does for pre_save_hook and post_save_hook

My own thoughs:

(a) is convenient to use once you know how entry points work:
- requires writing a python package
- the hook implementation can be a pure Python function
- possibly slightly harder to test on CI
(b) seems easier from maintenance point as this is just an extra method on extension app but documented as public:
- requires writing a server extension to get access to this extension's API
- thus this also requires creating a python package
(c) would be in harmony with how jupyter-server does hooks:
- can be used without requiring a python package,
- could use a Set or List traitlet to allow registering multiple hooks

fcollonval · 2025-01-08T13:03:19Z

Thanks for raising the question @krassowski

Small question related to the need and setting aside the specific case of timing, would using the Jupyter Server events answer the need or do you see other cases like the timing that the reaction should be called immediately?

krassowski · 2025-01-08T13:15:35Z

Great point! For the use case of "send me an email when execution finished", I think Jupyter Server event could be good enough.

For feature parity: frontend has onCellExecuted, selectionExecuted, onCellExecutionScheduled signals which can perform side effects on the cell or notebook, but cannot modify the code being executed. Jupyter Server event needs to be serialized to JSON, so these could not include the equivalent cell or notebook object, but I think it would be fine if the event contained:

cellId
documentId
success
kernelError (if any)

It could be interesting to explore pre-execution hooks that would allow to modify the code but I think this is a larger conversation and not required right now.

Shall we just add event emitter then?

fcollonval · 2025-01-08T13:21:03Z

Shall we just add event emitter then?

I would go that road as it is anyway a good addition. But let's keep this issue opened for further exploration.

Darshan808 mentioned this issue Jan 7, 2025

Add execution start and end time metadata for code cells #29

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a pre- and post-execution hooks #22

Add a pre- and post-execution hooks #22

krassowski commented Jul 22, 2024

krassowski commented Jan 7, 2025

krassowski commented Jan 8, 2025

fcollonval commented Jan 8, 2025

krassowski commented Jan 8, 2025 •

edited

Loading

fcollonval commented Jan 8, 2025

Add a pre- and post-execution hooks #22

Add a pre- and post-execution hooks #22

Comments

krassowski commented Jul 22, 2024

Problem

Proposed Solution

Additional context

krassowski commented Jan 7, 2025

krassowski commented Jan 8, 2025

fcollonval commented Jan 8, 2025

krassowski commented Jan 8, 2025 • edited Loading

fcollonval commented Jan 8, 2025

krassowski commented Jan 8, 2025 •

edited

Loading