Rework sharding tests #4293
Open
simone-silvestri wants to merge 95 commits into main from ss/fix-sharding-tests
Changes from 47 commits
Commits
3c64dfe Extend set_to_function for non-sharded field (glwagner)
474db60 Bugfix in stencil not using default FloatType (glwagner)
3239d42 Import R_Earth
c110972 Fix broken import (glwagner)
1b051be Merge branch 'tpu-fixes' of https://github.com/CliMA/Oceananigans.jl … (glwagner)
66efc57 Fix sharded grids
f4ea5c0 Rm shows
3b435db Merge branch 'main' into tpu-fixes (simone-silvestri)
34163cd build grid on CPU and switch it to sharded ractant (simone-silvestri)
9f89b14 try removing the xla forcing (simone-silvestri)
8633c24 correct architecture for lat lon (simone-silvestri)
239e3b4 make sure sharding is initialized (simone-silvestri)
4c2197d fix lat lon grid (simone-silvestri)
093de34 import r_Earth (simone-silvestri)
2fbd8eb bugfix (simone-silvestri)
c5d7576 try running with IFRT (simone-silvestri)
4e8d811 quite a large bug (simone-silvestri)
f436384 a little cleanup (simone-silvestri)
cf14457 we get to compiling of the first timestep (simone-silvestri)
8bab678 reduce the MPI show madness (simone-silvestri)
ab58214 Change to constant_with_arch (glwagner)
649b448 create grid (simone-silvestri)
db7e49a remove comment (simone-silvestri)
2122d08 remove the tripolar shard (simone-silvestri)
afd8e7d add some info (simone-silvestri)
552736e use (Base.julia_md()) (simone-silvestri)
750392b add replicate in z (simone-silvestri)
65e08b1 add sharding to the clock (simone-silvestri)
456e299 sharding the z direction (simone-silvestri)
919823b add a comment (simone-silvestri)
58f43a5 different tests (simone-silvestri)
1f8acad sharding tests (simone-silvestri)
9f6854b we don't need preferences for the moment (simone-silvestri)
7d245bd Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
33d4476 see where it runs from (simone-silvestri)
4d890ce another check (simone-silvestri)
43547e7 LocalPreferences in the correct folder (simone-silvestri)
d4037db Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
016fa6c remove the reactant test (simone-silvestri)
2cd1416 one host 4 devices (simone-silvestri)
76a4cc7 Merge branch 'ss/fix-sharding-tests' of github.com:CliMA/Oceananigans… (simone-silvestri)
cd574b8 improve tests (simone-silvestri)
d699a42 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
8fea6d5 fix a couple of bugs (simone-silvestri)
86c0fbd Merge branch 'ss/fix-sharding-tests' of github.com:CliMA/Oceananigans… (simone-silvestri)
a89f44e some improvements (simone-silvestri)
5e57a3f and add the bottom height (simone-silvestri)
d46b324 remove immersed boundary for now (simone-silvestri)
4e4acc8 at least fix this issue (simone-silvestri)
5385609 fix latitude longitude coordinates (simone-silvestri)
9a9bd5f run the tests (simone-silvestri)
6fbf1bf Merge branch 'main' into ss/fix-coordinates (simone-silvestri)
41de112 these should pass now if everything is correct (simone-silvestri)
a5ad48f Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
082bb1d add sharding lat lon (simone-silvestri)
7e0a004 add the tripolar test (simone-silvestri)
e5c6886 back to 5 minutes timestep (simone-silvestri)
5903257 add sharding tests (simone-silvestri)
1f9a7c2 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
37b6c42 Merge remote-tracking branch 'origin/ss/fix-coordinates' into ss/fix-… (simone-silvestri)
557da39 correct stuff (simone-silvestri)
02187a9 Merge branch 'ss/fix-sharding-tests' of github.com:CliMA/Oceananigans… (simone-silvestri)
0eda368 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
aeea174 try this (simone-silvestri)
ec728d3 Merge branch 'ss/fix-sharding-tests' of github.com:CliMA/Oceananigans… (simone-silvestri)
bcf2a04 MPITripolarGrid (simone-silvestri)
9c9e0c0 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
0fa87f5 try without immersed boundary (simone-silvestri)
de178c9 Merge branch 'ss/fix-sharding-tests' of github.com:CliMA/Oceananigans… (simone-silvestri)
0ea86ad reinclude everything (simone-silvestri)
d16f536 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
e5cd0a9 try a new arch (simone-silvestri)
583f2df Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
6c94f19 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
088a944 try like this (simone-silvestri)
f7f24a9 Merge branch 'ss/fix-sharding-tests' of github.com:CliMA/Oceananigans… (simone-silvestri)
7c8ef5d also for this (simone-silvestri)
93cc68c Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
c256996 remove the test for the moment (simone-silvestri)
e10b020 Merge branch 'ss/fix-sharding-tests' of github.com:CliMA/Oceananigans… (simone-silvestri)
d249017 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
3e4ac64 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
ed0b1fa Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
811dda2 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
2ede33c bugfix (simone-silvestri)
55d8a44 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
2767749 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
fafdff1 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
5de9004 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
dcd53d2 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
343c164 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
b84bec2 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
24a0ab1 Merge branch 'main' into ss/fix-sharding-tests (simone-silvestri)
bef9073 remove distributed (simone-silvestri)
c7e933a Merge branch 'ss/fix-sharding-tests' of github.com:CliMA/Oceananigans… (simone-silvestri)
@@ -0,0 +1,30 @@

    # We need to initialize MPI for sharding because we are using a multi-host implementation:
    # i.e. we are launching the tests with `mpiexec`, and on GitHub Actions the default MPI
    # implementation is MPICH, which requires calling MPI.Init(). In the case of OpenMPI,
    # MPI.Init() is not necessary.

    using MPI
    MPI.Init()
    include("distributed_tests_utils.jl")

    if Base.ARGS[1] == "tripolar"
        run_function = run_distributed_tripolar_grid
        suffix = "trg"
    else
        run_function = run_distributed_latitude_longitude_grid
        suffix = "llg"
    end

    Reactant.Distributed.initialize(; single_gpu_per_process=false)

    arch = Distributed(ReactantState(), partition = Partition(4, 1))
    filename = "distributed_xslab_$(suffix).jld2"
    run_function(arch, filename)

    arch = Distributed(ReactantState(), partition = Partition(1, 4))
    filename = "distributed_yslab_$(suffix).jld2"
    run_function(arch, filename)

    arch = Distributed(ReactantState(), partition = Partition(2, 2))
    filename = "distributed_pencil_$(suffix).jld2"
    run_function(arch, filename)
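
The script above repeats the same three steps for each partition layout and indexes `ARGS[1]` directly, which throws a `BoundsError` when the script is launched without arguments. As a rough sketch only, the argument handling and the slab/pencil file naming could be driven from a small table. `select_suffix`, `LAYOUTS` and `output_filenames` are hypothetical names that do not appear in the PR, and the block is plain Julia: it makes no Oceananigans, Reactant or MPI calls, so the rank tuples merely mirror the arguments passed to `Partition` in the diff.

```julia
# Hypothetical helpers for illustration; not part of the PR.

# Choose the output suffix from the command-line arguments. Falls back to the
# latitude-longitude suffix when no argument is given (an assumption made here
# so that an empty argument list does not throw).
select_suffix(args) = (!isempty(args) && args[1] == "tripolar") ? "trg" : "llg"

# The three layouts exercised by the script: (label, ranks in x, ranks in y).
const LAYOUTS = [("xslab", 4, 1), ("yslab", 1, 4), ("pencil", 2, 2)]

# Build one ((Rx, Ry), filename) pair per layout, using the same file naming
# pattern as the diff.
function output_filenames(suffix)
    return [((Rx, Ry), "distributed_$(label)_$(suffix).jld2") for (label, Rx, Ry) in LAYOUTS]
end

for (ranks, filename) in output_filenames(select_suffix(ARGS))
    println(ranks, " => ", filename)
end
```

In the actual test driver each `(Rx, Ry)` pair would be handed to `Partition` and the resulting architecture to `run_function`, as the diff does explicitly for each layout.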
is there a reason not to save `parent` for all?

hmm, I have removed that `b` field.

we could save `parent`.