[Bug]: e3sm_to_cmip raises an obscure concurrency error when the grid is invalid #257

@forsyth2

Description

What happened?

I ran into this error:

2024-04-19 19:12:28,039_039:INFO:cmorize:pr: creating CMOR variable with CMOR axis objects.
  File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.10.0rc1_chrysalis/lib/python3.10/site-packages/e3sm_to_cmip/__main__.py", line 931, in _run_parallel
    out = res.result()
  File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.10.0rc1_chrysalis/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.10.0rc1_chrysalis/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
 50%|█████     | 1/2 [00:00<00:00,  5.38it/s]
  File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.10.0rc1_chrysalis/lib/python3.10/site-packages/e3sm_to_cmip/__main__.py", line 931, in _run_parallel
    out = res.result()
  File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.10.0rc1_chrysalis/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.10.0rc1_chrysalis/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
100%|██████████| 2/2 [00:00<00:00, 10.74it/s]

...

mv: cannot stat '/lcrc/group/e3sm/ac.forsyth2/zppy_p1_output/v3.LR.piControl/post/atm/native/cmip_ts/monthly/tmp_ts_atm_monthly_0001-0005-0005/CMIP6/CMIP/*/*/*/*/*/*/*/*/*.nc': No such file or directory

This happened because I did not specify a mapping_file in zppy. See E3SM-Project/zppy#549 (comment) and E3SM-Project/zppy#549 (comment) for the original comments.

What did you expect to happen? Are there any possible answers you came across?

It would be good to catch the grid error earlier and raise that, instead of letting e3sm_to_cmip fail with a concurrency error that isn't particularly informative. For example, is it easy to say "zppy's mapping_file includes a certain substring and therefore will/won't be compatible"?
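One way to make the failure more informative is to record which variable each future belongs to and report the underlying exception next to it. The following is only a minimal sketch of that idea, not e3sm_to_cmip's actual `_run_parallel`: the names `run_and_collect`, `summarize_failures`, and `_bad_grid` are hypothetical, and a thread pool stands in for whatever executor the tool really uses.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_and_collect(tasks, max_workers=2):
    """Run zero-argument callables in a pool; return (results, failures).

    `tasks` maps a variable name (e.g. "pr") to a callable. This task shape
    is an assumption made for illustration only.
    """
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        future_to_name = {pool.submit(fn): name for name, fn in tasks.items()}
        for future in as_completed(future_to_name):
            name = future_to_name[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                # Keep the original exception, keyed by the variable that failed.
                failures[name] = exc
    return results, failures


def summarize_failures(failures):
    """Build one readable message naming each failed variable and its cause."""
    lines = [
        f"{name}: {type(exc).__name__}: {exc}"
        for name, exc in sorted(failures.items())
    ]
    return "\n".join(lines)


def _bad_grid():
    # Stand-in for a handler that fails because of an incompatible grid.
    raise ValueError("input grid does not match the CMOR axes")


results, failures = run_and_collect({"pr": _bad_grid, "tas": lambda: "ok"})
print(summarize_failures(failures))
```

With this shape, the caller can log one line per failed variable (or raise a single error containing the summary) rather than surfacing only the `concurrent.futures` frames.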

@TonyB9000 keeps a table of mapping file information (/p/user_pub/e3sm/staging/resource/derivatives.conf on acme1, /lcrc/group/e3sm2/DSM/Staging/Resource/derivatives.conf on chrysalis), which has "selections based upon realm, resolution, and model_version. It can be USED programmatically, but cannot be MAINTAINED programmatically".

Minimal Complete Verifiable Example (MCVE)

# Relevant section of ts_atm_monthly_0001-0005-0005.bash
  srun -N 1 e3sm_to_cmip \
  --output-path \
  ${dest_cmip}/${tmp_dir} \
  --var-list \
  'pr, tas, rsds, rlds, rsus' \
  --realm \
  atm \
  --input-path \
  ${input_dir} \
  --user-metadata \
  /lcrc/group/e3sm/ac.forsyth2/zppy_p1_output/v3.LR.piControl/post/scripts/${workdir}/default_metadata.json \
  --num-proc \
  12 \
  --tables-path \
  ${cmortables_dir}

Relevant log output

No response

Anything else we need to know?

The zppy cfg:

[default]
input = /lcrc/group/e3sm2/ac.golaz/E3SMv3/v3.LR.piControl
output = /lcrc/group/e3sm/ac.forsyth2/zppy_p1_output/v3.LR.piControl
case = v3.LR.piControl
www = /lcrc/group/e3sm/public_html/diagnostic_output/ac.forsyth2/zppy_p1_www
partition = compute
environment_commands = "source /lcrc/soft/climate/e3sm-unified/test_e3sm_unified_1.10.0rc1_chrysalis.sh"

[ts]
active = True
walltime = "00:50:00"

  [[ atm_monthly ]]
  frequency = "monthly"
  input_files = "eam.h0"
  input_subdir = "archive/atm/hist"
  ts_fmt = "cmip"
  years = "0001:0020:5",

  [[ land_monthly ]]
  extra_vars = "landfrac"
  frequency = "monthly"
  input_files = "elm.h0"
  input_subdir = "archive/lnd/hist"
  ts_fmt = "cmip"
  vars = "FSH,RH2M,LAISHA,LAISUN"
  years = "0001:0020:5",

  [[ atm_monthly_glb ]]
  input_subdir = "archive/atm/hist"
  input_files = "eam.h0"
  frequency = "monthly"
  mapping_file = "glb"
  years = "0001:0020:10",

  [[ lnd_monthly_glb ]]
  input_subdir = "archive/lnd/hist"
  input_files = "elm.h0"
  frequency = "monthly"
  mapping_file = "glb"
  vars = "FSH,RH2M,LAISHA,LAISUN"
  years = "0001:0020:10",

Note that no mapping_file is specified for the ts_atm_monthly subtask.
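A pre-flight check along these lines could flag the problem before e3sm_to_cmip is ever launched. This is a rough sketch under stated assumptions, not zppy's real validation: it assumes the `[ts]` section has already been parsed into a nested dict, that a subtask with `ts_fmt = "cmip"` needs a non-empty `mapping_file`, and that a subtask inherits `mapping_file` from the task level when it does not set its own. The function name `find_missing_mapping_files` is hypothetical.

```python
def find_missing_mapping_files(ts_section):
    """Return names of cmip subtasks that end up with no mapping_file.

    `ts_section` is a nested dict: scalar task-level options plus one dict
    per subtask. The inheritance rule below is an assumption.
    """
    inherited = str(ts_section.get("mapping_file", "")).strip()
    missing = []
    for name, value in ts_section.items():
        if not isinstance(value, dict):
            continue  # task-level scalar such as "active" or "walltime"
        if value.get("ts_fmt") != "cmip":
            continue  # only cmip-formatted subtasks are checked here
        mapping_file = str(value.get("mapping_file", inherited)).strip()
        if not mapping_file:
            missing.append(name)
    return missing


# Trimmed-down version of the [ts] section from the cfg above.
ts = {
    "active": "True",
    "walltime": "00:50:00",
    "atm_monthly": {
        "frequency": "monthly",
        "input_files": "eam.h0",
        "ts_fmt": "cmip",
    },
    "atm_monthly_glb": {
        "frequency": "monthly",
        "input_files": "eam.h0",
        "mapping_file": "glb",
    },
}
print(find_missing_mapping_files(ts))
```

A check like this only catches a missing value. Deciding whether a given mapping file is compatible with the grid would need extra information, such as the derivatives.conf table mentioned earlier.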

Environment

source /lcrc/soft/climate/e3sm-unified/test_e3sm_unified_1.10.0rc1_chrysalis.sh -> e3sm_unified_1.10.0rc1_login

Metadata

Assignees

No one assigned

    Labels

    bug: Something isn't working
    logging: Related to logging output, either in the console or log file.

    Type

    No type

    Projects

    Status

    To do

    Relationships

    None yet

    Development

    No branches or pull requests