A lot of threads and performance hits for executing a few notebooks simultaneously #326

@Jacob-Stevens-Haas

When I run notebooks using nbclient.NotebookClient().execute(), each one spawns a lot of threads (137), top/bpytop/htop show all cores at 100%, and ssh grows noticeably laggy when typing. This is on an AMD Ryzen Threadripper with 32 cores, running a maximum of eight simultaneous notebooks. None of the notebooks include any multiprocessing, threading, or async; it's a lot of work, but it's all numpy, scipy, etc. I AM writing to a file, if that matters (the process adds a logging.FileHandler to the root logger and pickles some output to a file).

I don't necessarily know much about how to troubleshoot this kind of a problem, so I'm just going to start by sharing assumptions I have that people with more experience can correct. I thought that:

  • Non-interactive Python will only ever use around 100% of one CPU core, because it is single-threaded and can therefore only execute on a single core.
  • If some Python code would normally use around 100% of CPU, running it as a notebook would add only a small amount of overhead on other cores.
  • The overhead of executing via nbclient should be less than or equal to the overhead of running a notebook interactively, because nbclient is less (not at all) interactive.
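
One way to test the first assumption from inside a notebook or script is to count the process's OS-level threads directly. The snippet below is a minimal sketch, not part of my actual setup: `count_own_threads` is a hypothetical helper name, and it assumes a Linux-style /proc filesystem. Where /proc is missing it falls back to `threading.active_count()`, which only sees threads started through Python and misses any created by native libraries.

```python
import os
import threading


def count_own_threads():
    """Return the number of threads in the current process.

    Assumes Linux, where each thread appears as a directory under
    /proc/self/task. On other platforms this falls back to Python's
    own count, which ignores threads created by native libraries.
    """
    task_dir = "/proc/self/task"
    if os.path.isdir(task_dir):
        return len(os.listdir(task_dir))
    return threading.active_count()


if __name__ == "__main__":
    print("threads in this process:", count_own_threads())
```

Calling this before and after the heavy imports (numpy, scipy, etc.) in a notebook cell would show whether the thread count jumps at import time or only once the notebook is run through nbclient.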

I've seen a few Jupyter-themed questions about threads, but none that I found use nbclient. I'm just trying to capture the text and image output of a block of code in a formatted HTML file, for which I'm using nbconvert.exporters.HTMLExporter and nbconvert.writers.files.FilesWriter. If there's a different way to get that result using the Jupyter ecosystem, please let me know.
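
For reference, the pipeline described above looks roughly like the following. This is a simplified sketch rather than my exact code: the function name `run_and_export`, its parameters, and the default timeout and kernel name are placeholders. The Jupyter imports sit inside the function only so the snippet can be loaded without those packages installed.

```python
from pathlib import Path


def run_and_export(notebook_path, build_dir="out", timeout=600, kernel_name="python3"):
    """Execute a notebook with nbclient and write it out as HTML.

    The argument defaults are illustrative assumptions, not values
    taken from the real project.
    """
    # Imported lazily so this module loads even without Jupyter installed.
    import nbformat
    from nbclient import NotebookClient
    from nbconvert.exporters import HTMLExporter
    from nbconvert.writers.files import FilesWriter

    # Read the notebook and run every cell in a fresh kernel.
    nb = nbformat.read(str(notebook_path), as_version=4)
    NotebookClient(nb, timeout=timeout, kernel_name=kernel_name).execute()

    # Render the executed notebook (text and image outputs) to HTML.
    body, resources = HTMLExporter().from_notebook_node(nb)

    # FilesWriter appends the extension recorded in `resources` (".html").
    writer = FilesWriter(build_directory=str(build_dir))
    return writer.write(body, resources, notebook_name=Path(notebook_path).stem)
```

Each of the (up to eight) simultaneous runs goes through the equivalent of one `run_and_export("analysis.ipynb")` call, where the notebook filename here is made up.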
