Propagate enable_profiling through ProcessedContainerSettings#483
Propagate enable_profiling through ProcessedContainerSettings#483jhiemstrawisc wants to merge 3 commits into
Conversation
ProcessedContainerSettings.from_container_settings() rebuilds the processed container settings from the raw ContainerSettings, but it never copied the enable_profiling field, so the processed settings always used the dataclass default of False. run_container_singularity() reads the flag off the processed settings (config.enable_profiling), so the profiling branch never ran: no peer cgroup was created, cgroup_wrapper.sh never executed, and no usage-profile.tsv was written, regardless of the user's containers.enable_profiling config value. This regressed in Reed-CompBio#390, which moved enable_profiling from a top-level config field to the nested container settings and added the field to both ContainerSettings and ProcessedContainerSettings, but missed wiring it through the conversion in from_container_settings(). Add the missing assignment, plus a regression test in test_config.py asserting that enable_profiling survives the full config-parsing path.
agitter
left a comment
There was a problem hiding this comment.
The code changes look good. I have not tested it yet, which one of us may want to do before merging. I tried testing in CHTC but hit other problems described in #459 (review)
|
I updated my code on my current CHTC set up with these changes. I am currently waiting for my jobs to get scheduled to test if this fixed the bug. |
|
I accidentally approved this. I still need to test this on CHTC, I realized that the SPRAS Docker image I was using wasn't including this code change so it still looked like a bug. |
|
I'm testing this now, and I've found a few more subtle issues -- don't merge yet! |
subprocess.run rejects capture_output=True together with an explicit stderr argument (capture_output is shorthand for stdout=PIPE, stderr=PIPE). In the profiling/cgroup branch of run_container_singularity this raised "ValueError: stdout and stderr arguments may not be used with capture_output." on every Apptainer run with enable_profiling=true. Set stdout=PIPE and stderr=STDOUT explicitly so the container's stderr is merged into the captured stdout, preserving the original intent of capturing combined output via proc.stdout.
The profiling file is written next to raw-pathway.txt inside the reconstruct rule, but it was never declared as a Snakemake output. With a shared filesystem this is harmless, but under shared-fs-usage: none (e.g. HTCondor) Snakemake only transfers declared outputs back from execute nodes, so the profiling data was created on the EP and then discarded. Declare usage-profile.tsv as an additional reconstruct output, gated on enable_profiling and the singularity/apptainer framework so the output is only required when it is actually produced (run_container_singularity), matching the existing warn-only behavior for profiling under docker.
|
The most recent commits address two other issues I ran into while testing. The first fixes a runtime error that I think I introduced at some point related to the way I used At this point I've tested profiling end to end and this works for me. |
|
I tested this update and was able to get the profiling files. However, every value inside the files is N/A: |
|
Looking at one of the reconstruct stdout outputs I get: |
the outputs for the profiling are all NA, seems like the cgroup files do not exist
ntalluri
left a comment
There was a problem hiding this comment.
Something is broken where the profiling files all have N/A values.
|
The error Lines 23 to 32 in 11091cd Did that work previously and no longer does? Should we return None or fail differently in that block of code if we can't create the directory? |
ProcessedContainerSettings.from_container_settings()rebuilds the processed container settings from the rawContainerSettings, but it never copied theenable_profilingfield, so the processed settings always used the dataclass default of False.run_container_singularity()reads the flag off the processed settings (config.enable_profiling), so the profiling branch never ran: no peer cgroup was created, cgroup_wrapper.sh never executed, and no usage-profile.tsv was written, regardless of the user'scontainers.enable_profilingconfig value.This regressed in #390, which moved
enable_profilingfrom a top-level config field to the nested container settings and added the field to bothContainerSettingsandProcessedContainerSettings, but missed wiring it through the conversion infrom_container_settings().Add the missing assignment, plus a regression test asserting that
enable_profilingsurvives the full config-parsing path.