Third & GroveThird & Grove
Jun 20, 2021 - Nathaniel Catchpole

Issue Queue Analysis by TAG's Drupal Core Contributor

dog at computer

I decided to look at the Drupal core issue queue to understand better how many issues are open and the direction things are going. The results were a bit of a shock.

First, let's get the shock out of the way, then let's take a look at what's happening, and finally, what we can do to change things.
 

As of February 2021, there are 18,000 open issues against Drupal core.

chart

The above stats come from the “8.x, 9.x, 10.x versions.” Meanwhile, 7.x also has 4,906 open issues, although these will only be relevant to Drupal 7 and not carried forward when support for Drupal 7 ends.

The number of open issues at each status has approximately tripled in the past eight years.

Why is the queue growing so fast?

  • We have adopted several large issue queues from contrib. There are 1,660 open core issues against Views/Views UI; 420 are against Migrate and 370 are against Media. Just these three subsystems account for more than 10% percent of the total issues against core.
  • Very few core modules have been removed during the same period. 
  • Even when modules don't have many issues against them specifically, they will be included in cross-core issues (cspell, phpunit deprecations, etc.), adding to the amount of code to be modified and reviewed.
  • Issue queue size itself causes a feedback loop. If you find a bug, it is harder to find among 18,000 issues than 300, so you are more likely to create a duplicate bug report. Every duplicate bug report could have been a review/test of an existing patch instead. If you're looking for an issue to work on, it is harder to know where to start. If you're involved in 100 open issues, it is harder to keep track of them and requires more context switching than if only 10 are open.
  • A lot of people are using core and opening issues against it. We may want to look into statistics for how many issues are opened per year, too.

What is preventing throughput?

The RTBC queue is a significant bottleneck. At many points in the past three to five years, we have reached over 150 RTBC issues. It can then take weeks of sustained effort to get back under 50. The queue very rarely goes under 30 issues. This is despite adding more committers to the team. However, this is not a new situation — there are always several issues RTBCed every day from a very large pull of issues at “needs review,” and the RTBC queue snapshot from 2013 shows 90 issues RTBC.

While there are more committers now than in 2013, there’s also a lot more core committer work that doesn't involve committing patches — minor releases every six months with alpha/beta/rcs, core initiatives, security support for two minor branches at a time plus the LTS branch, and often co-ordinating security releases with upstream projects.

RTBC issues take more time to review since Drupal 8 released, particularly backward compatibility and backport eligibility decisions. Steps such as assigning contribution credits can double or triple the time they usually take to commit a trivial issue. Therefore the slower commit rate doesn't necessarily indicate less activity; it also shows more effort spent on each issue.

When the RTBC queue is long, there is a knock-on effect on other contributions. Core contributors will have a threshold of patches they've worked on that are stuck at RTBC, after which other patches may be blocked on those commits, or they may just want to see progress on their RTBC issues before picking up the next ones. This varies for different people, but there is a limit to how many issues anyone can work on simultaneously.

A long RTBC queue also increases the chance that any one individual issue will be hit by either a random test failure or conflict with another patch that's committed. At the same time, it's RTBC, artificially taking it out and returning it to the RTBC queue.

The quality of RTBCs is not always high, meaning one issue can take multiple rounds of feedback from a committer before it can be committed. In many cases, these could have been found by an experienced reviewer earlier in the process.

line graph

There are approximately 1,500 commits made to the development branch of core per year, a rate that has been steady since 2017. This is lower than the peaks of 4,300 in 2013-14 (just before the release of Drupal 8) and 2,700 in 2009-10 (just before the release of Drupal 7).

Several commits don’t necessarily indicate the actual throughput of work. For example, code style scoping guidelines encourage a handful of issues with large patches in each, vs. dozens of issues with small patches, which was common in previous years for identical changes to the code base. Additionally, it says nothing about the type of issues being committed — i.e., whether they're critical bug fixes or typos in code comments. However, it does show that the frenzy of commits around Drupal's old big bang major releases has flattened out considerably with the new release cycle.

Possible solutions

  • Add more core committers, especially people who primarily focus on the RTBC queue.
  • Ensure that RTBC queue time is funded for new and existing core committers.
  • Fix or disable random test failures to reduce pointless RTBC queue noise. Some random failures have been around for over a year without resolution.
  • Look at ways to streamline the commit process, although these almost always have trade-offs vs. consistency/quality/issue credit etc.
  • Try to improve the quality of RTBCs by cultivating more experienced reviewers and rewarding more high-quality reviews.

Needs review, needs work, and active

While the RTBC queue is very busy and sometimes congested, the needs review queue is more like a giant car park, with nearly 3,500 issues waiting for review.

When an issue is RTBC, you have a reasonable expectation that a core committer will review the issue within x amount of time and either provide feedback or commit it. This is a small but concrete group of people.

“Needs review” issues can be reviewed by anyone other than the patch author — theoretically a very large number of people—but in practice, a lot less than those opening issues or even posting patches.

An issue at “needs review” for more than a week without any feedback is effectively stalled, and 3.5k needs review issues indicate this happens a lot.

The oldest issue in the needs review queue (at the time of this writing) is nine years and nine months old. The most recent patch was posted in 2011, passed 190 simpletest test, and was never retested. The most recent comment was in 2018, suggesting it should be closed as no longer relevant but not actually changing the status. In general, an issue that needs review is either ready to commit, needs more work, or isn’t applicable. The ideal state of the needs review queue then would be a shortlist of issues (tens, not hundreds) that have just been marked needs review, then quickly get feedback so they can go back for another round of work or on to commit.

On the other hand, there are many reasons why issues might be “active” or “needs work.” Many issues are legitimately at that state while solutions and patches are being worked on, while many more will be essentially unviable — duplicates, bugs that should have been posted against contrib modules, cannot reproduce, etc.

With a large number of issues against core, many are closely related or duplicates. This often results in split efforts and isolation; for example, people experiencing a bug post a new issue rather than finding an existing issue with a working patch, which they might be able to contribute to reviewing and getting committed. Or two sets of people work on patches in different issues, one ultimately getting discarded after weeks, months, or years when the other is committed.

The Drupal community is extremely reluctant to move issues to won't fix/by design/duplicate/outdated. While marking an issue duplicate usually connects it to another active one, the other statuses are a dead end. On the one hand, this makes the issue queue friendlier, where other projects might immediately reply to a bug report or feature request with “won't fix”; on the other, it makes it hard to know what has a viable route to commit and what doesn't.

For example, there are 549 issues marked “postponed, maintainer needs more info.” The oldest was put into that state 10 years and one month ago.

Possible solutions

  • Retesting of “needs review” issues. Many “needs review” issues haven't had a test run in several years.  
  • Show related issues via apachesolr-related content in addition to the current manual links, and display these in a block when viewing issues may enable duplicates to be more quickly identified.
  • When posting a new issue, we could try to find similar issues based on the contents of the form before it is posted, potentially preventing duplicate issues from being posted in the first place.
  • The above and a few other issues are collected under this meta.
  • Develop more precise criteria for invalid issues to be closed (cannot reproduce etc.) and deal with both.
  • Critical and major triage, carried out by core committers, sometimes with subsystem maintainers, effectively reduced the length of both queues (especially as Drupal 8 was nearing release). We could look at reviving this. However, the number of critical issues has remained relatively steady since Drupal 8’s release.
  • Bugsmash has lowered, or at least flattened, the total number of bugs against core, including fixing some very old issues. Could we set up a similar triage initiative for tasks?