Implementing EPW file management (renaming/moving) #2597
Replies: 3 comments 13 replies
-
Thanks for your post. I'm out of town through Jan 3rd but will aim to reply shortly after I return. File handling is tricky... but looks like you have given this quite a bit of thought!
-
@hironori-kondo Just to make sure, is this the Python script in question?
-
@hironori-kondo: Thank you for the detailed writeup. I have a question that may help me provide better feedback. Is there an issue with having the post-processing step be its own job? This job would be very short and simple since it is just calling I am imagining something like:
-
Happy New Year! I'm returning to the task of implementing EPW in `quacc`, and I was hoping to consult you on how to implement the file management, @Andrew-S-Rosen & @tomdemeyere.
The EPW code interfaces with Espresso outputs, but the files are expected to follow a slightly different organization/naming scheme. Two previous jobs are required: an nscf job and a phonon job. After running the phonon job, we run a post-processing Python script (which ships with EPW). This script performs a couple of checks on how the phonon job was run, then copies/renames the necessary files into a new directory that the EPW job reads.
I've read through some of the `quacc` code, and my thought is to modify `quacc.utils.files.copy_decompress_files()`. The current implementation accepts a single destination directory (fed in as the tmp folder), to which the filenames are appended. I propose the following modifications to `quacc`'s file management behavior:
- Add `output_filenames` to `copy_decompress_files()`. The default behavior would be unchanged, using the existing `filenames` argument for the output. Should `output_filenames` be specified, however, this behavior would be overridden, resulting in changed file paths.
- Add `output_filenames` to `quacc.runners.prep.calc_setup()`.
- Add `output_filenames` to `quacc.runners._base.BaseRunner.setup()`.
- Add `output_filenames` to `quacc.runners.ase.Runner.__init__()`.
- Add `output_filenames` to `quacc.recipes.espresso._base.run_and_summarize()`.
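To make the first modification concrete, here is a minimal sketch of a copy step with an optional `output_filenames`. The function `copy_files_with_rename` is a hypothetical stand-in, not quacc's `copy_decompress_files()`. It assumes `output_filenames` is a list parallel to `filenames`, and it omits decompression entirely.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def copy_files_with_rename(
    source_directory: str | Path,
    filenames: list[str | Path],
    destination_directory: str | Path,
    output_filenames: list[str | Path] | None = None,
) -> list[Path]:
    """Copy files, optionally renaming/moving them at the destination.

    Hypothetical illustration only: this is not quacc's actual function,
    and it assumes output_filenames is a list parallel to filenames.
    """
    source_directory = Path(source_directory)
    destination_directory = Path(destination_directory)

    # Default: reuse the source-relative names, i.e. the unchanged behavior.
    if output_filenames is None:
        output_filenames = filenames
    if len(output_filenames) != len(filenames):
        raise ValueError("output_filenames must match filenames one-to-one")

    copied = []
    for src_name, dst_name in zip(filenames, output_filenames):
        src = source_directory / src_name
        dst = destination_directory / dst_name
        # Create any intermediate directories implied by the new name.
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

Leaving `output_filenames` as `None` keeps existing call sites working, which is why the argument could be threaded through the other functions without changing their defaults.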
If I haven't missed anything, the above should enable `output_filenames` to be optionally specified for any given Espresso job, enabling file movement/renaming during the copying step. The EPW file management would then look like the following:
1. Start from the `copy_files` argument.
2. Call `quacc.calculators.espresso.utils.prepare_copy_files()`. Let's call this output `updated_copy_files`.
3. Build a `dict` of destination filenames for the above files. Let's call this output `output_filenames`.
4. Call `run_and_summarize()` with `updated_copy_files` and `output_filenames` as arguments.
This approach is a tad convoluted because the destination directory used by `copy_decompress_files()` (i.e., the tmp directory for the job) is not created until `quacc.runners.prep.calc_setup()` is called during runner initialization. As such, the desired behavior has to be inserted more deeply, somewhere between the directory's creation and the calculator call. My breakdown of some pros/cons of this approach:
Pros
- `copy_decompress_files()` has a neat argument structure: source directory, source filenames, destination directory, destination filenames.
Cons
How does the above sound? Do you have any wisdom/suggestions/preferences?
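For concreteness, the `dict`-building step in the EPW flow could be sketched as below. Everything here is an assumption for illustration: `build_output_filenames` and `plan_copies` are hypothetical helpers, `updated_copy_files` is assumed to map a source directory to a list of filenames, and the renaming rule is a placeholder rather than the scheme EPW actually expects.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable


def build_output_filenames(
    updated_copy_files: dict[str, list[str]],
    rename: Callable[[str], str],
) -> dict[str, list[str]]:
    """Map each source directory to its destination filenames.

    Hypothetical helper: `rename` stands in for whatever rule turns an
    Espresso filename into the name the EPW job reads.
    """
    return {
        src_dir: [rename(name) for name in names]
        for src_dir, names in updated_copy_files.items()
    }


def plan_copies(
    updated_copy_files: dict[str, list[str]],
    output_filenames: dict[str, list[str]],
    destination_directory: str | Path,
) -> list[tuple[Path, Path]]:
    """Pair every source path with its destination path (no I/O is done)."""
    destination_directory = Path(destination_directory)
    plan = []
    for src_dir, names in updated_copy_files.items():
        # Fall back to the original names when no override is given.
        new_names = output_filenames.get(src_dir, names)
        if len(new_names) != len(names):
            raise ValueError(f"Name count mismatch for {src_dir}")
        for name, new_name in zip(names, new_names):
            plan.append((Path(src_dir) / name, destination_directory / new_name))
    return plan
```

Keeping the planning step free of I/O means the source/destination pairing can be checked before the job's tmp directory exists.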