Commit 47409b0

scheibelp, slabasan, pearce8, Riyaz Haque, rfhaque authored
benchpark mirror (#620)
* command skeleton
* utility functionality
* everything else that needs to be copied, except for ramble workspace mirror
* update run_command to allow for dumping output directly into file
* update invocation
* update dry run
* various syntax errors and minor bugs
* rm format indicator from non-format string
* the ramble workspace stores a copy of the installed software, omit that for mirror
* another approach to copying a git dir with all of its uncommitted state, but not including e.g. opt in spack
* new strat for copying subset of git repo also does not work
* more successful copying of tracked files (but hangs on .communicate)
* proper workspace mirror command
* rm debug line
* script dir location calc fails when sourced: do source-friendly location
* style fix
* more style edit
* dont copy logs from ramble workspace
* need to copy benchpark application repo and modifier repo; partial work to rewrite abspaths in the ramble/spack configs
* more on collecting abspaths
* more partial work on sed replacement of paths (I think I can avoid this entirely though by reconstructing repo references via ramble/spack)
* expanded first-time script, and thereby avoid a whole bunch of sed
* dont delete defaults, just non-default site scope
* resolve relative script invocation path into absolute path script location
* copied spack still has its db (and the db has stale references)
* remove unnecessary imports
* generate bootstrap mirror for spack
* first-time setup script adds bootstrap mirror
* style fix
* add docs
* ramble workspace mirror seems to require non-symlink
* packages that use SCM branch tips need to be mirrored differently
* style fixes
* disable codespell for external property reference
* style fix
* another attempt to sidestep codespell
* try again
* Revert "try again"
  This reverts commit 302c66c.
* try again
* try again
* try again
* yamlfix?
* Revert "yamlfix?"
  This reverts commit 433e9c4.
* yaml was invalid, introduced outside of this pr but triggered since I modify a yaml file in this pr
* there was since a change to this same file, that yet again fails yamlfix in a different way
* docs check also runs codespell on files

---------

Co-authored-by: Stephanie Brink <brink2@llnl.gov>
Co-authored-by: pearce8 <pearce8@llnl.gov>
Co-authored-by: Riyaz Haque <your-email@example.com>
Co-authored-by: Riyaz Haque <5333387+rfhaque@users.noreply.github.com>
1 parent de0e139 commit 47409b0

8 files changed, +336 -2 lines changed


.codespellignore

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+cachable

.github/workflows/docs.yml

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ jobs:
           make html WORKSPACE_PATH=/tmp/workspace
       - name: Check for Typos using Codespell
         run: |
-          codespell
+          codespell --ignore-words=.codespellignore
       - name: Upload artifact
         uses: actions/upload-pages-artifact@56afc609e74202658d3ffba0e8f6dda462b719fa
         if: github.ref == 'refs/heads/develop'

.github/workflows/style.yml

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ jobs:
       - name: Lint and Format Check
         run: |-
           black --diff --check .
-          codespell
+          codespell --ignore-words=.codespellignore
           isort
           flake8
           yamlfix --check --verbose .

docs/create-mirror.rst

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+.. Copyright 2023 Lawrence Livermore National Security, LLC and other
+   Benchpark Project Developers. See the top-level COPYRIGHT file for details.
+
+   SPDX-License-Identifier: Apache-2.0
+
+==================================
+Create a mirror for another system
+==================================
+
+If you build a benchmark on a networked system, you can use `benchpark mirror`
+to create a directory that bundles all necessary resources to install and run
+that benchmark on another system, such that the destination system does not
+need network access.
+
+On the networked system, if you created/built the benchmark with::
+
+    benchpark experiment init --dest=def-raja-perf raja-perf
+    benchpark system init --dest=def-ruby llnl-cluster cluster=ruby compiler=gcc
+    benchpark setup def-raja-perf/ def-ruby/ workspace/
+    . `pwd`/workspace/setup.sh
+    ramble --disable-progress-bar --workspace-dir `pwd`/workspace/def-raja-perf/def-ruby/workspace workspace setup
+
+You can then create a directory that bundles all the resources needed to build
+that benchmark with::
+
+    benchpark mirror create `pwd`/workspace/def-raja-perf/def-ruby/workspace/ test-benchpark-mirror/
+
+You can copy `test-benchpark-mirror/` to another system, and on that system,
+within that directory you can do::
+
+    python3 -m venv mirror-env && . mirror-env/bin/activate
+    pip install --no-index --find-links=pip-cache pip-cache/*
+    bash first-time.sh
+    . `pwd`/setup.sh
+    ramble --workspace-dir `pwd`/def-raja-perf/def-ruby/workspace/ workspace setup
+
+This will install the benchmark on the new system, and also configure
+Ramble to use mirror resources that were bundled in `test-benchpark-mirror/`
+(so it does not need internet access to build the benchmark).
+
+Limitations
+-----------
+
+For now, benchpark can only create mirrors that are useful for destination
+systems that match the host system in terms of:
+
+* available compilers
+* provided external software
+
+Also, `benchpark mirror` can only create mirrors for benchmarks that have been
+built on the source system.
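
Reviewer note: before copying the mirror directory to the destination system, it may help to confirm that the bundle is complete. The sketch below is not part of this commit. It assumes the top-level entries that `mirror_create()` in `lib/benchpark/cmd/mirror.py` writes (the `git-repos/` directory is left out because it only matters when a package installs from a branch tip), and the helper name `missing_mirror_entries` is hypothetical.

```python
import os
import tempfile

# Top-level entries written by mirror_create() in this commit.
# Assumption: this list is read off the diff, not from any published spec.
EXPECTED_ENTRIES = [
    ".benchpark-mirror-dir",
    "pip-cache",
    "spack",
    "ramble",
    "setup.sh",
    "first-time.sh",
    "modifiers",
    "repo",
    "spack-bootstrap-mirror",
    "ramble-workspace-mirror",
]


def missing_mirror_entries(mirror_dir):
    """Return the expected top-level entries that are absent from mirror_dir."""
    present = set(os.listdir(mirror_dir))
    return [name for name in EXPECTED_ENTRIES if name not in present]


if __name__ == "__main__":
    # Demonstration on a throwaway directory that holds only the two scripts.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("setup.sh", "first-time.sh"):
            open(os.path.join(tmp, name), "w").close()
        print(missing_mirror_entries(tmp))
```

An empty list suggests the bundle has every entry the command is expected to produce; it says nothing about whether the destination system matches the host's compilers and external software.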

docs/index.rst

Lines changed: 1 addition & 0 deletions
@@ -42,6 +42,7 @@
    modifiers
    set-of-experiments
    run-binary
+   create-mirror
 
 .. toctree::
    :maxdepth: 1

lib/benchpark/cmd/mirror.py

Lines changed: 252 additions & 0 deletions
@@ -0,0 +1,252 @@
+# Copyright 2023 Lawrence Livermore National Security, LLC and other
+# Benchpark Project Developers. See the top-level COPYRIGHT file for details.
+#
+# Copyright 2013-2023 Spack Project Developers.
+#
+# SPDX-License-Identifier: Apache-2.0
+
+import os
+import os.path
+import pathlib
+import re
+import shutil
+import tempfile
+
+import benchpark.paths
+from benchpark.runtime import run_command, working_dir
+
+
+def _dry_run_command(cmd, *args, **kwargs):
+    print(cmd)
+    if args:
+        print(f"\n\t{args}")
+    if kwargs:
+        print(f"\n\t{kwargs}")
+
+
+def copytree_part_of(basedir, dest, include):
+    def _ignore(dirpath, dirlist):
+        if pathlib.Path(dirpath) == pathlib.Path(basedir):
+            return sorted(set(dirlist) - set(include))
+        else:
+            return []
+
+    shutil.copytree(basedir, dest, ignore=_ignore)
+
+
+def delete_configs_in(basedir):
+    collected = []
+    for fname in os.listdir(basedir):
+        if fname.endswith(".yaml"):
+            collected.append(os.path.join(basedir, fname))
+    for path in collected:
+        run_command(f"rm {path}")
+
+
+def copytree_tracked(basedir, dest):
+    tracked = set()
+    with working_dir(basedir):
+        if not os.path.isdir(os.path.join(basedir, ".git")):
+            raise RuntimeError(f"Not a git repo: {basedir}")
+        with tempfile.TemporaryDirectory() as tempdir:
+            results_path = os.path.join(tempdir, "output.txt")
+            with open(results_path, "w") as f:
+                run_command("git ls-files", stdout=f)
+            with open(results_path, "r") as f:
+                for line in f.readlines():
+                    tracked.add(pathlib.Path(line.strip()).parts[0])
+
+    tracked = sorted(tracked)
+    copytree_part_of(basedir, dest, include=tracked + [".git"])
+
+
+def locate_benchpark_workspace_parent_of_ramble_workspace(ramble_workspace_dir):
+    ramble_workspace = pathlib.Path(ramble_workspace_dir)
+    found_parent = None
+    for parent in ramble_workspace.parents:
+        if {"setup.sh", "spack", "ramble"} <= set(os.listdir(parent)):
+            found_parent = parent
+            break
+    if not found_parent:
+        raise RuntimeError(
+            "Cannot locate Benchpark workspace as a parent of Ramble workspace"
+        )
+    return found_parent, ramble_workspace.relative_to(found_parent)
+
+
+def find_one(basedir, basename):
+    for root, dirs, files in os.walk(basedir):
+        for x in dirs + files:
+            if re.match(basename, x):
+                return os.path.join(root, x)
+
+
+_CACHE_MARKER = ".benchpark-mirror-dir"
+
+
+def mirror_create(args):
+    if args.dry_run:
+        global run_command
+        run_command = _dry_run_command
+
+    dest = os.path.abspath(args.destdir)
+    marker = os.path.join(dest, _CACHE_MARKER)
+
+    ramble_workspace = os.path.realpath(os.path.abspath(args.workspace))
+
+    workspace, ramble_workspace_relative = (
+        locate_benchpark_workspace_parent_of_ramble_workspace(ramble_workspace)
+    )
+    spack_instance = os.path.join(workspace, "spack")
+    ramble_instance = os.path.join(workspace, "ramble")
+
+    if not os.path.isdir(workspace):
+        raise RuntimeError(f"{workspace} does not exist")
+
+    if not os.path.exists(dest):
+        os.makedirs(dest)
+        with open(marker, "w"):
+            pass
+    elif not os.path.isdir(dest):
+        raise RuntimeError(f"{dest} is not a directory")
+    elif not os.path.exists(marker):
+        raise RuntimeError(
+            f"{dest} was not created by `benchpark mirror` (no {marker})"
+        )
+
+    cache_storage = os.path.join(dest, "pip-cache")
+    ramble_pip_reqs = os.path.join(benchpark.paths.benchpark_root, "requirements.txt")
+    if not os.path.exists(cache_storage):
+        run_command(f"pip download -r {ramble_pip_reqs} -d {cache_storage}")
+
+    ramble_workspace_dest = os.path.join(dest, ramble_workspace_relative)
+    penultimate = pathlib.Path(*pathlib.Path(ramble_workspace_dest).parts[:-1])
+    os.makedirs(penultimate, exist_ok=True)
+
+    def _ignore(path, dir_list):
+        if pathlib.Path(path) == pathlib.Path(ramble_workspace):
+            # The ramble workspace contains a copy of the experiment binaries
+            # in 'software/', and also puts dynamically generated logs for
+            # workspace commands in 'logs/' (if the latter is not removed,
+            # it generates an error on the destination)
+            return ["software", "logs"]
+        else:
+            return []
+
+    if not os.path.exists(ramble_workspace_dest):
+        shutil.copytree(ramble_workspace, ramble_workspace_dest, ignore=_ignore)
+
+    spack_dest = os.path.join(dest, "spack")
+    if not os.path.exists(spack_dest):
+        copytree_tracked(spack_instance, spack_dest)
+
+    ramble_dest = os.path.join(dest, "ramble")
+    if not os.path.exists(ramble_dest):
+        copytree_tracked(ramble_instance, ramble_dest)
+
+    setup_dest = os.path.join(dest, "setup.sh")
+    if not os.path.exists(setup_dest):
+        with open(setup_dest, "w", encoding="utf-8") as f:
+            f.write(
+                """\
+if [ -n "${_BENCHPARK_INITIALIZED:-}" ]; then
+    return 0
+fi
+
+this_script_dir=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
+
+. $this_script_dir/spack/share/spack/setup-env.sh
+. $this_script_dir/ramble/share/ramble/setup-env.sh
+
+export SPACK_DISABLE_LOCAL_CONFIG=1
+
+export _BENCHPARK_INITIALIZED=true
+"""
+            )
+
+    env_dir = os.path.dirname(find_one(ramble_workspace, "spack.yaml"))
+    git_repo_dst = os.path.join(dest, "git-repos")
+    repo_copy_script = os.path.join(
+        benchpark.paths.benchpark_root, "lib", "scripts", "env-collect-branch-tips.py"
+    )
+    out, err = run_command(
+        f"spack -e {env_dir} python {repo_copy_script} {git_repo_dst}"
+    )
+    copied_pkgs = [x for x in out.strip().split("\n") if x]
+    git_redirects = list()
+    for pkg_name in copied_pkgs:
+        git_url = f"$this_script_dir/git-repos/{pkg_name}"
+        git_redirects.append(
+            f"spack config --scope=site add packages:{pkg_name}:package_attributes:git:{git_url}"
+        )
+    git_redirects = "\n".join(git_redirects)
+
+    delete_configs_in(os.path.join(spack_dest, "etc", "spack"))
+    delete_configs_in(os.path.join(ramble_dest, "etc", "ramble"))
+    first_time_dest = os.path.join(dest, "first-time.sh")
+    if not os.path.exists(first_time_dest):
+        with open(first_time_dest, "w", encoding="utf-8") as f:
+            f.write(
+                f"""\
+this_script_dir=$(cd "$(dirname "${{BASH_SOURCE[0]}}")" && pwd)
+
+. $this_script_dir/setup.sh
+
+spack uninstall -ay
+spack repo add --scope=site $this_script_dir/repo
+spack config --scope=site add "config:misc_cache:$this_script_dir/spack-misc-cache"
+spack bootstrap add --scope=site --trust local-sources "$this_script_dir/spack-bootstrap-mirror/metadata/sources/"
+# We store local copies of git repos for packages that install branch tips
+{git_redirects}
+
+# We deleted the repo config because it may have absolute paths;
+# it is reinstantiated here
+ramble repo add --scope=site $this_script_dir/repo
+ramble repo add -t modifiers --scope=site $this_script_dir/modifiers
+ramble config --scope=site add "config:disable_progress_bar:true"
+ramble config --scope=site add \"config:spack:global:args:'-d'\"
+"""
+            )
+
+    modifiers_dest = os.path.join(dest, "modifiers")
+    modifiers_src = os.path.join(benchpark.paths.benchpark_root, "modifiers")
+    if not os.path.exists(modifiers_dest):
+        shutil.copytree(modifiers_src, modifiers_dest)
+
+    bp_repo_dest = os.path.join(dest, "repo")
+    bp_repo_src = os.path.join(benchpark.paths.benchpark_root, "repo")
+    if not os.path.exists(bp_repo_dest):
+        shutil.copytree(bp_repo_src, bp_repo_dest)
+
+    spack_bootstrap_mirror_dest = os.path.join(dest, "spack-bootstrap-mirror")
+    if not os.path.exists(spack_bootstrap_mirror_dest):
+        run_command(f"spack bootstrap mirror {spack_bootstrap_mirror_dest}")
+
+    ramble_workspace_mirror_dest = os.path.join(dest, "ramble-workspace-mirror")
+    if not os.path.exists(ramble_workspace_mirror_dest):
+        run_command(
+            f"ramble --disable-progress-bar --workspace-dir {ramble_workspace} workspace mirror -d file://{ramble_workspace_mirror_dest}"
+        )
+
+
+def setup_parser(root_parser):
+    mirror_subparser = root_parser.add_subparsers(dest="system_subcommand")
+
+    create_parser = mirror_subparser.add_parser("create")
+    create_parser.add_argument(
+        "--dry-run", action="store_true", default=False, help="For debugging"
+    )
+    create_parser.add_argument(
+        "workspace", help="The Ramble workspace you want to mirror"
+    )
+    create_parser.add_argument("destdir", help="Put all needed resources here")
+
+
+def command(args):
+    actions = {
+        "create": mirror_create,
+    }
+    if args.system_subcommand in actions:
+        actions[args.system_subcommand](args)
+    else:
+        raise ValueError(f"Unknown subcommand for 'mirror': {args.system_subcommand}")
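
Reviewer note: `copytree_part_of` relies on the `ignore` callback of `shutil.copytree`, filtering only at the top level of the source tree so that an excluded name (for example `opt`) is still copied when it appears deeper down. The following is a minimal standalone sketch of that technique, not the code from this commit; the function name `copy_top_level_subset` and the demo directory names are made up for illustration.

```python
import os
import pathlib
import shutil
import tempfile


def copy_top_level_subset(basedir, dest, include):
    """Copy basedir to dest, keeping only the top-level entries named in include."""
    base = pathlib.Path(basedir)

    def _ignore(dirpath, names):
        # shutil.copytree calls this once for every directory it visits and
        # skips whatever names are returned. Filter only at the root so that
        # nested directories are copied in full.
        if pathlib.Path(dirpath) == base:
            return sorted(set(names) - set(include))
        return []

    shutil.copytree(basedir, dest, ignore=_ignore)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        os.makedirs(os.path.join(src, "lib", "opt"))  # nested 'opt' is kept
        os.makedirs(os.path.join(src, "opt"))  # top-level 'opt' is dropped
        open(os.path.join(src, "README"), "w").close()

        dst = os.path.join(tmp, "dst")
        copy_top_level_subset(src, dst, include=["lib"])
        print(sorted(os.listdir(dst)))  # ['lib']
        print(os.listdir(os.path.join(dst, "lib")))  # ['opt']
```

Comparing `pathlib.Path` objects rather than raw strings avoids false mismatches from trailing slashes in the caller's `basedir`.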

lib/main.py

Lines changed: 5 additions & 0 deletions
@@ -45,6 +45,7 @@
 import benchpark.cmd.setup  # noqa: E402
 import benchpark.cmd.show_build  # noqa: E402
 import benchpark.cmd.unit_test  # noqa: E402
+import benchpark.cmd.mirror  # noqa: E402
 import benchpark.cmd.info  # noqa: E402
 import benchpark.cmd.list  # noqa: E402
 import benchpark.paths  # noqa: E402
@@ -186,6 +187,9 @@ def init_commands(subparsers, actions_dict):
     )
     benchpark.cmd.audit.setup_parser(audit_parser)
 
+    mirror_parser = subparsers.add_parser("mirror", help="Copy a benchpark workspace")
+    benchpark.cmd.mirror.setup_parser(mirror_parser)
+
     info_parser = subparsers.add_parser(
         "info", help="Get information about Systems and Experiments"
     )
@@ -211,6 +215,7 @@ def init_commands(subparsers, actions_dict):
     actions_dict["setup"] = benchpark.cmd.setup.command
     actions_dict["unit-test"] = benchpark.cmd.unit_test.command
     actions_dict["audit"] = benchpark.cmd.audit.command
+    actions_dict["mirror"] = benchpark.cmd.mirror.command
     actions_dict["info"] = benchpark.cmd.info.command
     actions_dict["show-build"] = benchpark.cmd.show_build.command
     actions_dict["list"] = benchpark.cmd.list.command
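
Reviewer note: the registration above follows a two-level `argparse` pattern. A top-level subparser is created for `mirror`, a nested subparser for `create`, and a dictionary maps the parsed subcommand to a handler. A self-contained sketch of that pattern follows, under the assumption that it mirrors the shape of the real wiring. The `dest` names, `build_parser`, `dispatch`, and the stub handler are illustrative and do not come from Benchpark.

```python
import argparse


def mirror_create(args):
    # Stub handler: the real one copies files; this one only reports its inputs.
    return f"would mirror {args.workspace} into {args.destdir}"


def build_parser():
    parser = argparse.ArgumentParser(prog="benchpark")
    subparsers = parser.add_subparsers(dest="subcommand")

    mirror_parser = subparsers.add_parser("mirror", help="Copy a benchpark workspace")
    mirror_sub = mirror_parser.add_subparsers(dest="mirror_subcommand")

    create = mirror_sub.add_parser("create")
    create.add_argument("--dry-run", action="store_true", default=False)
    create.add_argument("workspace")
    create.add_argument("destdir")
    return parser


def dispatch(argv):
    args = build_parser().parse_args(argv)
    actions = {"create": mirror_create}
    if args.subcommand == "mirror" and args.mirror_subcommand in actions:
        return actions[args.mirror_subcommand](args)
    raise ValueError(f"Unknown subcommand: {argv}")


if __name__ == "__main__":
    print(dispatch(["mirror", "create", "workspace/", "out/"]))
```

Because nested subparsers are optional by default, running `mirror` with no subcommand parses successfully and leaves the `dest` attribute as `None`, which is why the dispatcher needs an explicit fallback error.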
lib/scripts/env-collect-branch-tips.py

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+import spack.environment as ev
+from spack.fetch_strategy import GitFetchStrategy
+import sys
+import os
+import shutil
+
+
+def main():
+    destination = sys.argv[1]
+
+    e = ev.active_environment()
+    for _, spec in e.concretized_specs():
+        df = spec.package.stage[0].default_fetcher
+        if not df.cachable and isinstance(df, GitFetchStrategy):
+            df.get_full_repo = True
+            pkg_dst = os.path.join(destination, spec.name)
+            if not os.path.exists(pkg_dst):
+                spec.package.stage.fetch()
+                shutil.move(spec.package.stage.source_path, pkg_dst)
+            print(f"{spec.name}")
+
+
+if __name__ == "__main__":
+    main()
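
Reviewer note: this script prints one package name per line, and `mirror_create()` turns each name into a `spack config add` line for `first-time.sh`. The sketch below shows that conversion in isolation. It assumes the script's stdout contains only package names, and the helper name `git_redirect_commands` is hypothetical. Blank lines are filtered so that empty output produces no commands.

```python
def git_redirect_commands(script_stdout):
    """Turn newline-separated package names into site-scope config commands."""
    pkg_names = [line.strip() for line in script_stdout.splitlines() if line.strip()]
    return [
        "spack config --scope=site add "
        f"packages:{name}:package_attributes:git:$this_script_dir/git-repos/{name}"
        for name in pkg_names
    ]


if __name__ == "__main__":
    # Package names here are placeholders for whatever the environment contains.
    for command in git_redirect_commands("pkg-a\npkg-b\n"):
        print(command)
```

The `$this_script_dir` reference is left unexpanded on purpose: it is resolved by the shell when `first-time.sh` runs on the destination system.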
