
Use testthat 3e and rely on posterior for ESS #289


Merged: 11 commits into master on Jun 25, 2025

Conversation

@VisruthSK (Collaborator) commented Jun 12, 2025

One test was failing due to minor numerical differences between the algorithms (average diff: 0.00975, largest diff: 0.01472). I updated the expected test result at reference-results/relative_eff.rds.
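For the record, a minimal sketch of how such a comparison can be made (hypothetical object names, not the PR's actual code):

# Compare freshly computed values against the stored reference object
ref <- readRDS("reference-results/relative_eff.rds")
new <- relative_eff(llmat, chain_id = chain_id)
mean(abs(new - ref))  # average diff, here ~0.00975
max(abs(new - ref))   # largest diff, here ~0.01472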

posterior::autocovariance() is already used.

Starts work on #249

@VisruthSK linked an issue on Jun 12, 2025 that may be closed by this pull request.
@jgabry (Member) left a comment

Seems like it should be fine to make this change. There are additional tests failing due to differences with the saved reference objects. They seem small, but it would be good for @avehtari to confirm that these differences are reasonable to expect due to (I assume) minor differences between this older ESS code and the ESS algorithm used in posterior.

@avehtari (Member)

Differences in ESS with magnitude on the order of 1e-2 are small and can be ignored.

Instead of posterior::ess_basic(), I would use posterior::ess_mean(), as the name makes it more explicit what is computed. They use the same approach, but ess_basic() also accepts an additional argument that can be used to turn off the splitting.
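To illustrate the relationship, a minimal sketch (not loo's internals), assuming x is an iterations-by-chains matrix of draws:

library(posterior)

# Hypothetical draws: 1000 iterations from each of 4 chains
x <- matrix(rnorm(4000), nrow = 1000, ncol = 4)

# Relative efficiency is ESS divided by the total number of draws
r_eff <- ess_mean(x) / length(x)

# ess_basic() computes the same quantity; assuming its extra argument
# is named split (as in recent posterior releases), splitting can be
# turned off like this:
ess_basic(x, split = FALSE)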

@VisruthSK (Collaborator, Author)

Is there a neat way to update all expected outputs to the new values so that tests pass?

@jgabry (Member) commented Jun 13, 2025

I think if we updated to the newer testthat::expect_snapshot() then there's a way to update all of them at once. The older expect_equal_to_reference() doesn't seem to have that ability, unfortunately. At some point we should update to snapshot testing (it wasn't part of testthat when I wrote these tests a long time ago).
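For example, a snapshot-style version of one of these tests might look like this (a sketch with hypothetical object names):

# Records the printed output on the first run and compares on later runs
test_that("relative_eff values are stable", {
  expect_snapshot(relative_eff(llmat, chain_id = chain_id))
})

After reviewing intentional changes, testthat::snapshot_accept() then updates all snapshots in one step.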

If you want to do that as part of this PR we could, but also fine to hold off on that and just update the reference files.

@jgabry (Member) commented Jun 13, 2025

If you don't want to update to snapshot testing, then I think the easiest way might be to just delete the out-of-date reference files and then run the tests locally, which should generate new files. But I think you need to run the tests interactively for the new files to be generated. That is, you can't just do devtools::test(); you need to open the relevant test file and run all its tests (you can run them all at once, you don't need to go one by one).
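For context, the old-style tests look roughly like this (hypothetical object names); when the .rds file is missing, an interactive run saves the current value as the new reference instead of failing:

# Superseded reference-file style: compares against a saved .rds and
# creates the file on the first run if it does not exist
expect_equal_to_reference(
  relative_eff(llmat, chain_id = chain_id),
  "reference-results/relative_eff.rds"
)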

@VisruthSK (Collaborator, Author)

I think swapping to snapshots is worthwhile, so I'll get started on that. It'll make this PR easier, so we might as well include it here. I briefly tried updating the reference files but couldn't get them to regenerate easily.

@avehtari (Member)

Looks good to me.

I assume you only updated the snapshots where the differences were small, and in cases of big differences investigated the reason first?

@codecov-commenter commented Jun 21, 2025

Codecov Report

Attention: Patch coverage is 66.66667% with 1 line in your changes missing coverage. Please review.

Project coverage is 92.78%. Comparing base (b94b2b1) to head (c3107f0).
Report is 2 commits behind head on master.

Files with missing lines     Patch %   Lines
R/effective_sample_sizes.R   66.66%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #289      +/-   ##
==========================================
+ Coverage   92.54%   92.78%   +0.23%     
==========================================
  Files          31       31              
  Lines        3017     2964      -53     
==========================================
- Hits         2792     2750      -42     
+ Misses        225      214      -11     

@jgabry (Member) commented Jun 23, 2025

Thanks! Is this ready for me to take another look or are you still working on this? Either way is fine, just checking.

@VisruthSK (Collaborator, Author) commented Jun 23, 2025

Yup, it should be ready for a new review; sorry for not requesting one.

There is one thing: I haven't updated the "relative_eff with multiple cores runs" test and don't understand how it is currently set up. That test is failing due to the move to posterior, and I want to change it but I'm not sure exactly how.

@jgabry (Member) commented Jun 23, 2025

> There is one thing: I haven't updated the "relative_eff with multiple cores runs" test and don't understand how it is currently set up. That test is failing due to the move to posterior, and I want to change it but I'm not sure exactly how.

When I switch to your branch and run that test it passes for me. It also seems to be passing on GitHub Actions, right? It's failing for you locally?

We're talking about this test, right?

test_that("relative_eff with multiple cores runs", {
  skip_on_cran()
  source(test_path("data-for-tests/function_method_stuff.R"))
  dim(llmat_from_fn) <- c(nrow(llmat_from_fn), 1, ncol(llmat_from_fn))
  r_eff_arr <- relative_eff(llmat_from_fn, cores = 2)
  r_eff_fn <-
    relative_eff(
      llfun,
      chain_id = rep(1, nrow(draws)),
      data = data,
      draws = draws,
      cores = 2
    )
  expect_identical(r_eff_arr, r_eff_fn)
})

@VisruthSK (Collaborator, Author) commented Jun 24, 2025

Yes, it's failing locally. I thought the GitHub Action might have skipped the tests marked skip_on_cran(), which would explain it passing, but if it works locally for you that's interesting. Maybe I have the wrong version of posterior?

Edit: for future reference, the test was failing because I was running that specific file with test_active_file(), which loads the loo package at the start. I hadn't built and installed the development version of loo, so the test wouldn't pass. Running check() or installing the local version of the package solved the issue.
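A sketch of two workflows that avoid the stale-package problem, assuming devtools is available:

# Option 1: install the development version of loo first, so that when
# the test file loads loo it picks up the new code
devtools::install()
devtools::test_active_file()  # run with the relevant test file open

# Option 2: run the full check, which builds and installs a temporary
# copy of the package before running the tests
devtools::check()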

@jgabry changed the title from "Removed unused fft function and relying on posterior" to "Use testthat 3e and rely on posterior for ESS" on Jun 25, 2025.
@jgabry (Member) commented Jun 25, 2025

I think this looks good. Will merge now!

@jgabry merged commit ca283a8 into master on Jun 25, 2025; 6 checks passed.
@jgabry deleted the 249-use-more-functions-from-the-posterior-package branch on Jun 25, 2025 at 17:54.