Misc #20254
Closed
Add Launchable into Ruby CI
Description
I’m a software engineer who works at Launchable, Inc.
Some Ruby CI maintainers have granted me (Naoto Ono) the permission to integrate our service, Launchable, into ruby/ruby. This decision was made during the meeting.
I have therefore created this ticket to share the benefits and the progress.
Progress
Currently, both ruby/debug and ruby/vscode-rdbg have started using Launchable.
Benefits
By introducing Launchable, we can enjoy the following benefits:
Rich UI to see stderr and stdout in failed tests
Some Ruby developers find it challenging to review test failure logs in GitHub Actions. Launchable provides a rich UI that makes it easier to examine the stderr and stdout related to a test failure.
Test reports in GitHub comments
Identifying which CI job failed in the GitHub UI can be cumbersome. Launchable addresses this by creating a comment that summarizes test results directly in the GitHub pull request.
This enhancement will significantly improve the development experience.
Flaky test insight
As Ruby is developed, some flaky tests may be added. Launchable has an Unhealthy Tests page that analyzes flaky tests. For instance, the page shows a flakiness score, which expresses how flaky a test is. Let's say the foo test has a flakiness score of 0.5. This means that if the foo test failed against 10 different commits, the failure was not a true failure in half of those commits. By looking at this information, we can prioritize which tests to deal with. In addition, we can tell whether a test is flaky or not.
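The arithmetic behind that reading of the score can be sketched in a few lines of Ruby. This is only an illustration of the interpretation given above; it is not how Launchable computes the score, and the method names and the scores for bar and baz are made up.

```ruby
# Assumption: a flakiness score is read as the fraction of a test's
# failures that were not true failures (the interpretation in this ticket).
def expected_flaky_failures(flakiness_score, failing_commits)
  (flakiness_score * failing_commits).round
end

# Hypothetical helper: list test names from most flaky to least flaky,
# so the worst offenders can be dealt with first.
def rank_by_flakiness(scores)
  scores.sort_by { |_name, score| -score }.map(&:first)
end

# Made-up scores for illustration only.
SCORES = { "foo" => 0.5, "bar" => 0.1, "baz" => 0.8 }.freeze

puts expected_flaky_failures(SCORES["foo"], 10) # foo: half of 10 failures
puts rank_by_flakiness(SCORES).join(", ")
```

With these values, foo comes out at 5 flaky failures out of 10, and baz would be the first test to look at.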
Predictive Test Selection
Ruby has an extensive suite of tests, and the number of tests is expected to grow. However, waiting for all tests to pass can be time-consuming. To address this challenge, Launchable offers the Predictive Test Selection feature. Predictive Test Selection leverages machine learning to identify the right tests to run for a specific code change. By analyzing data from past test runs and considering the changes being tested, Launchable determines which tests are most relevant. Here is an execution strategy with Launchable:
- Initial Selection: Launchable selects a subset of tests (let's say approximately 25% of all tests) based on the predictive analysis. These tests are executed first.
- Remaining Tests: The remaining 75% of tests are then executed. Since the initial selection focuses on potentially problematic areas, there's a high likelihood of identifying any failed tests sooner.
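The two-phase ordering above can be sketched as follows. The likelihood values and the two_phase_order method are hypothetical; in practice the subset is chosen by Launchable's model, not by local code like this.

```ruby
# Split tests into an initial subset (highest predicted failure
# likelihood) and the remainder, which runs afterwards.
# `likelihood` maps a test name to a made-up failure likelihood.
def two_phase_order(likelihood, initial_ratio: 0.25)
  ranked = likelihood.sort_by { |_test, p| -p }.map(&:first)
  initial_count = (ranked.size * initial_ratio).ceil
  [ranked.first(initial_count), ranked.drop(initial_count)]
end

# Hypothetical likelihoods for four tests.
likelihood = {
  "test_a" => 0.02, "test_b" => 0.40, "test_c" => 0.05, "test_d" => 0.01
}

initial, remaining = two_phase_order(likelihood)
puts "run first: #{initial.join(', ')}"
puts "run next:  #{remaining.join(', ')}"
```

All tests still run; only the order changes, so a likely failure such as test_b here surfaces in the first phase.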
Updated by mame (Yusuke Endoh) 11 months ago
In short, Launchable is willing to provide us their service for free. I find it convenient to see formatted test results instead of the GitHub Actions log.
As the CI results are already public information, I don't see any problem in trying it out. If there is a problem by any chance, it is easy to remove it.
Updated by jaruga (Jun Aruga) 10 months ago
Conflict of interest: I work at Red Hat and am paid by Red Hat. I am a committer in the Ruby project. In the project, one of my interests is maintaining the CI services and infrastructure, because the downstream Ruby RPM packages in Fedora, CentOS Stream, and Red Hat Enterprise Linux that I work on benefit from it. But I am trying to make the best decision for the Ruby project.
@ono-max (Naoto Ono) Thank you for opening this issue ticket, and thank you for your work to onboard Launchable in Ruby. I am positive about trying this CI service, Launchable.
Some Ruby CI maintainers have granted me (Naoto Ono) the permission to integrate our service, Launchable, into ruby/ruby. This decision was made during the meeting.
Let me clarify the context. An event at which people from Launchable (ono-max, yoshiori) explained the service was announced on the Ruby committer Slack channel and held in January 2024. As far as I can see, at most 6 people (ko1, mrkn, znz, mame, hsbt) attended the meeting. They may be the ones who granted the permission, and they may have information about Launchable to guide people reading this ticket.
I thought we should transparently share this information with other committers and contributors who are working on the unit tests in ruby/ruby. Therefore I asked @ono-max (Naoto Ono) to open a new issue ticket. I thought we needed to show our intention to select the best tools and to be vendor-neutral for the Ruby project, since everyone has cognitive biases and some people may have conflicts of interest.
Below is my main response.
As the CI results are already public information, I don't see any problem in trying it out. If there is a problem by any chance, it is easy to remove it.
I don't see any problem in trying it out either.
Below are the pull requests that add Launchable.
- https://github.com/ruby/ruby/pull/9757
  tool/lib/test/unit.rb
- https://github.com/ruby/ruby/pull/9777
  tool/lib/test/unit.rb
  tool/lib/test/unit/parallel.rb
  .github/workflows/macos.yml
As a future improvement, I would love to see Launchable's code loosely coupled, so that we can drop Launchable even more easily if we ever want to.
- For tool/lib/test/*, we may be able to split Launchable's logic in tool/lib/test/unit.rb into another file such as tool/lib/test/unit/launchable.rb.
- For the GitHub Actions YAML file, we may be able to manage Launchable's YAML settings in a separate file by using the uses: syntax, as with https://github.com/ruby/setup-ruby. This looks useful when we use Launchable in GitHub YAML files other than macos.yml.
Updated by ko1 (Koichi Sasada) 10 months ago
Some more background:
1. Together with some committers, I heard an introduction to the Launchable service from my former colleague (as described in #5) in a closed meeting.
2. At the meeting, we learned about the service and found that the work needed to introduce it is small enough.
Although the impetus came from a referral from an acquaintance, by comparing the benefit from the service and the cost of introduction, I think trying it is reasonable and this does not constitute a conflict of interest (it is a normal selection of services for development).
In this case, the cost of introduction is low enough.
I agree there is room to improve the source code, but we can improve it after merging and trying it out, IMO.
(And we can revert the code if we find other issues.)
Updated by jaruga (Jun Aruga) 10 months ago
I agree there is room to improve the source code, but we can improve it after merging and trying it out, IMO.
(And we can revert the code if we find other issues.)
I agree that we should first merge the PRs necessary to run Launchable on CI before working on further improvements, and I understand that we can revert the code if necessary. Let's keep in mind to squash commits into one commit, or to squash-merge Launchable-related PRs where possible, so that the relevant commits are easy to revert. I see that we forgot to ask for the commits to be squashed on the past PR: https://github.com/ruby/ruby/pull/9757.
Updated by byroot (Jean Boussier) 10 months ago
So if I understand correctly, this isn't an alternative CI, just a side system that collects statistics and also formats test failures better.
I haven't looked at it in detail, but some form of flakiness analysis would indeed be very welcome.
The one small bit of feedback I have about what I've seen, and that is mostly down to personal preference, is I tend to dislike bots that post GitHub comments, as I find them very noisy given my GitHub notification setup (email).
So I wonder if it would be possible for it to register itself as a normal commit status instead? That said, GitHub Actions already publishes so many of these that I understand one worry might be that it would be hard to find in the list.
Also it would be nice to see a PR with it enabled on ruby/ruby. The result on ruby/debug looks nice, but the ruby/ruby CI output is way more complicated, so I'd be curious to see how well it adapts to that.
Updated by hsbt (Hiroshi SHIBATA) 10 months ago
- Subject changed from FYI: Add Launchable into Ruby CI to Add Launchable into Ruby CI
Updated by ono-max (Naoto Ono) 10 months ago
Thank you for the feedback. It makes sense to minimize notification noise, especially considering different preferences in notification settings. To address this, we will adjust the GitHub comments feature as follows:
- Launchable will create a single comment summarizing test results when a pull request is created.
- Subsequently, test updates will be reflected by updating this comment.
This approach aims to consolidate notifications into a single comment upon pull request creation, reducing overall noise while still providing valuable information on test results.
Updated by hsbt (Hiroshi SHIBATA) 10 months ago
- Status changed from Open to Assigned
In the dev meeting at #20193, we discussed Launchable. No one was against integrating Launchable into our GitHub Actions. In my opinion, Launchable helps to improve the CI results of GitHub Actions and makes flaky tests easy to find. As the administrator of the dev infrastructure, I am supporting @ono-max (Naoto Ono) in integrating Launchable.
Updated by byroot (Jean Boussier) 10 months ago
Subsequently, test updates will be reflected by updating this comment.
Thank you!
Updated by ono-max (Naoto Ono) 10 months ago
I wonder if I should introduce the following options to customize the behavior of the GitHub comments feature in the Launchable Web Console:
Configurable Option 1:
Allow users to choose when Launchable creates a GitHub comment.
- Option 1.1: Create a GitHub comment every time a commit is pushed.
- Option 1.2: (Default) Launchable creates a single comment summarizing test results when a pull request is created. Subsequent test updates are reflected by updating this comment.
Configurable Option 2:
Allow users to choose the visibility of GitHub comments based on test results.
- Option 2.1: Hide GitHub comments when all tests pass.
- Option 2.2: (Default) Do not hide comments, regardless of test results.
I welcome your thoughts on these proposed options.
Updated by duerst (Martin Dürst) 10 months ago
@ono-max (Naoto Ono) About customizing options, whatever results in less email by default is highly preferable.
Updated by jaruga (Jun Aruga) 10 months ago
FYI: I added documentation about Launchable to ruby/ruby's wiki page.
https://github.com/ruby/ruby/wiki/CI-Servers#launchable-ci
Updated by ono-max (Naoto Ono) 10 months ago
I updated the GitHub comments feature based on https://bugs.ruby-lang.org/issues/20254#note-10. Thank you for your feedback!
Updated by jaruga (Jun Aruga) 10 months ago
jaruga (Jun Aruga) wrote in #note-5:
...
Below are the pull-requests about adding the Launchable....
- https://github.com/ruby/ruby/pull/9777
tool/lib/test/unit.rb
tool/lib/test/unit/parallel.rb
.github/workflows/macos.yml
Note that https://github.com/ruby/ruby/pull/9777, mentioned above, was merged and integrated into the GitHub Actions CI yesterday. Everyone can check what the test reports look like on the page below.
https://app.launchableinc.com/organizations/ruby/workspaces/ruby/data/test-sessions
Updated by jaruga (Jun Aruga) 10 months ago
As a reference for what the Launchable CI results look like, below is one example showing the flaky, most failed, and longest tests.
https://github.com/ruby/ruby/pull/10118#issuecomment-1967550964
Updated by ono-max (Naoto Ono) 10 months ago · Edited
FYI: I'm going to change the GitHub comment feature as follows today:
If a test is executed multiple times, we'll treat it as a single test, and the status of the test will reflect the latest result. By implementing this behavior, retried tests will be marked as passed if the final result after retrying is passed. This feedback was given by Shibata-san and Kokubun-san.
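The "latest result wins" rule described above can be sketched as below. The data shape (an ordered list of name and status pairs) and the collapse_retries method are assumptions for illustration, not Launchable's actual implementation.

```ruby
# Collapse repeated executions of the same test into a single entry whose
# status is the most recent one. `results` is assumed to be in execution
# order, each element being [test_name, status].
def collapse_retries(results)
  results.each_with_object({}) do |(name, status), latest|
    latest[name] = status # a later run overwrites an earlier one
  end
end

# Hypothetical run: test_nested_timeout fails once, then passes on retry.
results = [
  ["test_nested_timeout", :failed],
  ["test_parse", :passed],
  ["test_nested_timeout", :passed]
]

collapse_retries(results).each do |name, status|
  puts "#{name}: #{status}"
end
```

Here the retried test is reported once, as passed, because its final execution passed.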
Updated by ono-max (Naoto Ono) 10 months ago
Thank you for introducing the feature, @jaruga (Jun Aruga). Also, I'd like to introduce the page https://app.launchableinc.com/organizations/ruby/workspaces/ruby/insights/unhealthy-tests. This page displays Flaky Tests, Never Failing Tests, Longest Tests, and Most Failed Tests.
For instance, the flakiness score is displayed in the Flaky Tests section. The flakiness score expresses how much the test exhibits flakiness. Let's say "test_nested_timeout" has a flakiness score of 0.2. This means that if "test_nested_timeout" fails against 10 different commits, the test failure is not considered a true failure in two of those commits.
For more detailed information, you can visit the document page at https://www.launchableinc.com/docs/features/unhealthy-tests/.
Updated by hsbt (Hiroshi SHIBATA) 10 months ago
- Status changed from Assigned to Closed
@ono-max (Naoto Ono) and @jaruga (Jun Aruga) Thank you for your work. Launchable has been added to our repository.
We discovered https://bugs.ruby-lang.org/issues/20314 with this integration.
We should close this and create a new ticket for each concern.