Skip to content

Use merge commits for atime tests + cleanup#7701

Open
tdhock wants to merge 13 commits intomasterfrom
atime-merge
Open

Use merge commits for atime tests + cleanup#7701
tdhock wants to merge 13 commits intomasterfrom
atime-merge

Conversation

@tdhock
Copy link
Copy Markdown
Member

@tdhock tdhock commented Apr 8, 2026

Closes #7363

Change an atime historical commit from a commit inside a PR branch, to the merge commit of that PR, and deleted branches, in

also some more general atime cleanup which should not affect test results:

  • reduce density of elements in N (reduce false positives)
  • move expr to last arg of atime_test, so easy to comment historical versions.
  • added and fixed comments to document commit ID sources.

Toby Dylan Hocking added 2 commits April 8, 2026 11:24
@tdhock tdhock requested a review from Anirban166 as a code owner April 8, 2026 15:28
@tdhock tdhock changed the title Atime merge Use merge commits for atime tests Apr 8, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 8, 2026

  • HEAD=atime-merge much slower for isoweek improved in #7144
  • HEAD=atime-merge slower P<0.001 for setDT improved in #5427
    Comparison Plot

Generated via commit c9ebb95

Download link for the artifact containing the test results: ↓ atime-results.zip

Task Duration
R setup and installing dependencies 3 minutes and 2 seconds
Installing different package versions 4 minutes and 8 seconds
Running and plotting the test cases 6 minutes and 52 seconds

@tdhock tdhock marked this pull request as draft April 8, 2026 16:52
@tdhock tdhock added the atime Requests related to adding/improving/monitoring performance regression tests via atime. label Apr 13, 2026
@tdhock tdhock marked this pull request as ready for review April 21, 2026 03:55
@tdhock tdhock changed the title Use merge commits for atime tests Use merge commits for atime tests + cleanup Apr 21, 2026
@tdhock
Copy link
Copy Markdown
Member Author

tdhock commented Apr 21, 2026

I checked atime results before and after this PR, and they look consistent. The biggest difference is in this test, which is normal because old test code was starting at a custom N=1e3 whereas new test code uses the standard N=1e1 for this case (custom N code deleted).
image

Above is a screenshot from
image

which I made using data from

library(data.table)
bench_dt <- data.table(test_code=c("old","new"))[, {
  result_dir <- paste0("atime-results-",test_code)
  tests.RData <- file.path(result_dir, "tests.RData")
  load(tests.RData)
  bench.dt[, .(Test, unit, N, empirical, expr.name)]
}, by=test_code]

library(ggplot2)
gg <- ggplot()+
  geom_line(aes(
    N, empirical, color=expr.name),
    data=bench_dt)+
  scale_color_manual(values=atime:::default.version.colors)+
  facet_grid(unit ~ Test + test_code, scales="free", labeller=label_both)+
  scale_x_log10()+
  scale_y_log10()
png("atime-results.png", width=120, height=8, units="in", res=100)
print(gg)
dev.off()

@tdhock
Copy link
Copy Markdown
Member Author

tdhock commented Apr 21, 2026

I updated https://github.com/Rdatatable/data.table/wiki/Performance-testing to mention

  • only use merge commits on master (not other commits in PR branches which will be deleted)
  • put expr as last argument of atime_test() so we can easily comment other args.

@tdhock
Copy link
Copy Markdown
Member Author

tdhock commented Apr 24, 2026

@Anirban166 @MichaelChirico @aitap can you please review?

Copy link
Copy Markdown
Member

@Anirban166 Anirban166 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to tests.R and the wiki look good to me. Thanks!

Copy link
Copy Markdown
Member

@aitap aitap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A commit is "on a branch" if its common ancestor with the HEAD of the branch is the commit itself:

library(codetools)
w = makeCodeWalker(leaf = \(e, w)
  if (is.character(e) && nchar(e) > 4 && grepl('^[0-9a-f]+$', e, perl = TRUE)) {
    mergebase = system2('git', c('merge-base', 'master', e), stdout = TRUE)
    if (mergebase != e) message(sprintf(
      "merge-base(master, %s) = %s", e, mergebase
    ))
  }
)
for (e in parse('.ci/atime/tests.R')) walkCode(e, w)

Looks like we need just one more fix.

Comment thread .ci/atime/tests.R
Fixed = "b6ad1a4",
seconds.limit = 1),
Before = "2cb03162a21328cc5f68a8c3b0e554f5edfcb5b9", # Parent of the first commit (https://github.com/Rdatatable/data.table/commit/4cc77c617435b46a0faac35c56e7fb7b81c629fc) in the regression PR (https://github.com/Rdatatable/data.table/pull/6890/commits)
Regression = "e5e10a09b32f851465790cef98526ab63d5cee3a", # Parent of first commit (https://github.com/Rdatatable/data.table/commit/4acabf0bf8541f6db629bccc6d5f7c806199416a) of fix PR (https://github.com/Rdatatable/data.table/pull/7480/commits), another choice would be 6f49bf1935a3009e85ea1e6f9752ff68ffa47d9b which is merge commit in regression PR https://github.com/Rdatatable/data.table/pull/6890
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the merge commit of the regression-introducing PR is the better choice here.

Suggested change
Regression = "e5e10a09b32f851465790cef98526ab63d5cee3a", # Parent of first commit (https://github.com/Rdatatable/data.table/commit/4acabf0bf8541f6db629bccc6d5f7c806199416a) of fix PR (https://github.com/Rdatatable/data.table/pull/7480/commits), another choice would be 6f49bf1935a3009e85ea1e6f9752ff68ffa47d9b which is merge commit in regression PR https://github.com/Rdatatable/data.table/pull/6890
Regression = "6f49bf1935a3009e85ea1e6f9752ff68ffa47d9b", # Merge commit of the regression PR https://github.com/Rdatatable/data.table/pull/6890

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much cleaner!

Comment thread .ci/atime/tests.R
expr = data.table::fwrite(L, out.csv, compress="gzip"),
Before = "f339aa64c426a9cd7cf2fcb13d91fc4ed353cd31", # Parent of the first commit https://github.com/Rdatatable/data.table/commit/fcc10d73a20837d0f1ad3278ee9168473afa5ff1 in the PR https://github.com/Rdatatable/data.table/pull/6393/commits with major change to fwrite with gzip.
PR = "3630413ae493a5a61b06c50e80d166924d2ef89a"), # Close-to-last merge commit in the PR.
PR = "3630413ae493a5a61b06c50e80d166924d2ef89a", # Close-to-last merge commit in the PR.
Copy link
Copy Markdown
Member

@aitap aitap Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3630413 was also on the PR branch, not on master

Suggested change
PR = "3630413ae493a5a61b06c50e80d166924d2ef89a", # Close-to-last merge commit in the PR.
PR = "e0abdfcd79ba30efcf679e33cbb8eba897a46f9c", # merge commit of PR6393

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

atime Requests related to adding/improving/monitoring performance regression tests via atime.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

allow to fail atime job in GHCI / use merge commits to allow branch deletion

3 participants