
# Release Engineering Analytics

## Motivation

Release engineering is a software engineering discipline
concerned with the development, implementation, and improvement of
processes to deploy high-quality software reliably and predictably [@dyck2015a].
The changes made by developers of a software system
should eventually be integrated and deployed such that end users may benefit
from them.
In recent years, release engineers have developed and
adopted techniques to build infrastructures and pipelines which automate the
process of releasing software to an increasingly large degree.
These modern approaches have resulted
in various practices such as releasing new versions of a software system in
significantly shorter cycles.

Due to these developments being industry-driven, release engineering forms a
largely uncharted territory for software engineering research.
Efforts to close this gap would be relevant for both practitioners and
researchers [@adams2016a].
On the one hand, claims and rationales are presented by the industry to justify
practices in release engineering, but these are often not empirically validated.
For this reason, research should aim to build an understanding of the actual
effects of release engineering practices on the software development process.
On the other hand, software engineering researchers need to be aware of modern
release engineering practices in order to account for them in their analyses.
Otherwise, their lack of familiarity with these practices will likely result
in biases in their study results.

This systematic literature review aims to provide an overview of
the software analytics research that has been conducted so far on modern
release engineering.
Its main purpose is to identify the apparent gap
between research and practice, in order to guide further research efforts.

### Research Questions

Contrary to what is usually the case, advances in release engineering
practices are driven by industry rather than by scientific research.
Building on this idea, our questions are constructed to identify in which ways
existing modern release engineering practices should still be studied in
software analytics research.
Our review thus aims to answer the following questions.

- **RQ1:** _How is modern release engineering done in practice?_

    This question aims to identify the so-called "state of the practice" in
    release engineering.
    We will summarize practices that have been adopted to drive release
    engineering forward. In addition, we will identify the tools utilized to bring
    this about.

- **RQ2:** _What aspects of modern release engineering have been studied
  in software analytics research so far?_

    In order to answer this question, we investigate the practices that previous
    empirical case studies have focused on.
    In doing so, we identify the associated costs and benefits
    that have been found, and the analysis methods used.

- **RQ3:** _What aspects of modern release engineering make for relevant
  study objects in future software analytics research?_

    In answering this question, we aim to identify the gap between practice and
    research in release engineering.
    This way, our intent is not only to guide but also to motivate future research.

## Research Protocol

Our research has been performed following the procedures for performing
systematic reviews by Kitchenham @kitchenham2004procedures.
In this process, we have set up strategies for searching, selecting and
quality-assessing studies.
Subsequently, we have extracted data from the selected studies and
synthesized the answers to our research questions.

All papers that were found were stored in a custom-built web-based tool for
conducting literature reviews.
The source code of this tool is published in a GitHub repository.
^[See https://github.com/jessetilro/research]
The tool was hosted on a virtual private server, such that all retrieved
publications were stored centrally, accessible to all reviewers.

To save space, we have omitted the full research protocol from this chapter.
The interested reader can find the protocol in detail in
[Section 7.5](#appendix).

## Answers

In this section, we will give an answer to each of the research questions
that we have presented in [Section 7.1.1](#research-questions-1).

### RQ1: Modern Release Engineering Practices

This section aims to answer the question:
_How is modern release engineering done in practice?_

Adams et al. @adams2016a and Karvonen et al. @karvonen2017a have described release engineering
practices that are currently in use in the industry.
Adams et al. @adams2016a focused on modern release engineering,
while Karvonen et al. @karvonen2017a investigated agile release engineering,
which is a subset of all modern release engineering.
Both studies agreed that modern release engineering consists
of the following components:

- **Rapid Releases (RR).**

    In contrast with traditional release cycles,
    RRs push new releases to users on a short, regular schedule.
    As an example, Firefox releases a new version every six weeks.

- **DevOps.**

    According to Dyck et al. @dyck2015a: "DevOps is an organizational approach
    that stresses empathy and cross-functional collaboration
    within and between teams (especially development and IT operations)
    in software development organizations, in order to operate
    resilient systems and accelerate delivery of changes."

- **Continuous Integration (CI).**

    To quote Adams et al. @adams2016a: "[CI] refers to the activity of continuously polling the
    [version control system] for new commits or merges,
    checking these revisions out on dedicated build machines,
    compiling them and running an initial set of tests to check for regressions."

- **Continuous Deployment or Continuous Delivery.**

    When the CI tests pass, the code can be automatically deployed
    to the production environment.
    The difference between these terms is that continuous delivery
    does not require that changes are automatically deployed,
    but continuous deployment always automatically deploys changes.
    However, the change might not yet be released,
    because it can be hidden using feature toggles [@laukkanen2018a].
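
The distinction between deploying and releasing a change can be illustrated with a minimal Python sketch of a feature toggle.
The toggle name `new_checkout`, the `FeatureToggles` class and the `checkout` function are hypothetical and are not taken from any of the reviewed studies.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureToggles:
    """In-memory toggle store; real systems usually load this from configuration."""
    enabled: set = field(default_factory=set)

    def is_enabled(self, name: str) -> bool:
        return name in self.enabled


def checkout(toggles: FeatureToggles) -> str:
    # The new code path is already deployed, but stays hidden until the toggle is on.
    if toggles.is_enabled("new_checkout"):
        return "new checkout flow"
    return "old checkout flow"


toggles = FeatureToggles()
deployed_but_hidden = checkout(toggles)   # change is deployed, not yet released
toggles.enabled.add("new_checkout")
released = checkout(toggles)              # change is now released to users
```

In this sketch, switching the toggle releases the change without a new deployment, which is why a deployed change is not necessarily a released one.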

Besides these four components, Adams et al. @adams2016a also identified three other concepts
that are used in modern release engineering,
specifically to release software as often as possible
and thus enable continuous deployment or delivery:

- **Branching and Merging.**

    There are several possible strategies of branching and merging.
    Typically, merging must be done as often as possible
    in order to release as often as possible.

- **Build System.**

    Building the software must be done in a consistent way,
    such that each build produces the same result.
    With the build configuration stored inside the project,
    every developer (or automated tool) only needs to issue a single command
    in order to build the project,
    instead of manually having to configure the build process every time.

- **Infrastructure-as-Code.**

    In the same vein as "storing configuration",
    infrastructure-as-code means that the server (or virtual machine)
    on which the software product is running
    can also be automatically configured with code,
    instead of having to configure each server manually.

Besides these seven technical components, Poo-Caamaño @poo-caamano2016a has
identified social aspects of modern release engineering. Specifically, most
large software projects have a dedicated release team that decides on the release strategies
and communicates them to others.

### RQ2: Studied Parts of Release Engineering

This section aims to answer the question:
_What aspects of modern release engineering
have been studied in software analytics research so far?_

Over the years, the software industry has come up with inventive approaches to deliver new 
features and fixes in a more efficient and faster manner. This has resulted in case studies 
being done to assess the associated risk and cost factors, and what benefits certain 
strategies can give.

Khomh et al. @khomh2015a have looked into the effects of switching from traditional to rapid
release cycles in the case of Mozilla Firefox. The paper concluded that users do not experience
significantly more post-release bugs and that bugs are fixed faster, but that users experience
bugs earlier in the software execution. Mantyla et al. @mantyla2015a have also considered data
from Mozilla Firefox and have examined the impact of release engineering on testing efforts.
The paper observed that, under the rapid release cycle, more tests are executed per day, but
that these tests focus on a smaller subset of the test case corpus, and that testing happens
closer to release and is more continuous. A limitation of this study is that it measures
correlation rather than causation. Da Costa et al. @da2014a have zoomed in on the integration of
addressed issues, considering data from Mozilla Firefox as well as from ArgoUML and Eclipse.
The paper found that addressed issues are usually delayed in a rapid release cycle and are often
excluded from releases.
Similar conclusions based on Mozilla Firefox were made by Da Costa et al. @da2016a,
who found that minor-traditional releases tend to have
less integration delay than major/minor-rapid releases.

Castelluccio et al. @castelluccio2017a have examined the practice of *patch uplifting* in
release management at Mozilla Firefox, where patches that fix critical issues or implement
high-value features are often promoted directly from the development channel to a
stabilization channel. The paper evaluated the characteristics of patch uplift decisions and
interviewed three Mozilla release managers. The paper concluded that the majority of patch
uplift decisions are made to address wrong functionality or a crash. The specificity and code
author of patches that are requested to be uplifted are also major factors for release managers.

In response to case studies being done on many prominent open source software projects,
Teixeira @teixeira2017a has described OpenStack's shift to a liberal six-month release cycle.
As this is an ongoing study, the results given by the paper are preliminary and only observe the
process. OpenStack's release process can be considered as a hybrid of feature-based and 
time-based releases. OpenStack encourages regular releases but also attempts to include new 
features at each regular release.

Rather than focussing on topics such as issues and delays, Poo-Caamaño @poo-caamano2016a
focusses on communication in release engineering in the cases of GNOME and OpenStack.
Through analyzing over 2.5 years of communication, the paper has made a number of observations.
The paper found that developers tend to communicate through
blogs, bug trackers, conferences, and hackfests.
Another finding is that a release team is set up to define
requirements, quality standards, and coordination through (direct) communication.
Although only the mailing lists of the projects were studied,
the challenges identified include keeping everyone informed and engaged,
monitoring changes, and setting priorities in cross-project coordination.

Laukkanen et al. @laukkanen2018a have described the effects that modern release engineering
practices have on software in different organizational contexts.
This study specifically focusses on continuous deployment practices.
The paper found that high internal quality standards, combined with the
distributed organizational context of large corporations, slowed the verification process
down and therefore had a negative impact on release capability. However, in small
corporations, the lack of internal verification measures due to a lack of resources was
mitigated by code review, disciplined CI and external verification by customers in customer
environments. Further factors that can play a role are addressed by Rodríguez et al.
@rodriguez2017a,
who define and categorize contributing factors in continuous deployment
based on literature published between 2001 and 2014.

As rapid release cycles and continuous deployment are new and emerging topics, not
enough research has been done to generalize the conclusions of the case studies
discussed in this section. This is why all the empirical studies in this survey have one major
caveat in common: more case studies are needed. Open challenges such as these will be
discussed in the next section.

### RQ3: Future Research

This section aims to answer the question:
_What aspects of modern release engineering make for
relevant study objects in future software analytics research?_

#### General Suggestions

The body of literature that we analyzed for this survey mostly comprised case
studies that employed quantitative analysis methods.
From these studies, interesting conclusions have been drawn about the effects of
release engineering practices on software development processes in specific
contexts.
However, the generalizability of the findings in these case studies is very
limited.
Therefore, in general many studies suggest that future research efforts focus on
performing additional case studies, both to verify existing findings and to
study new relationships and new contexts [@karvonen2017a; @teixeira2017a;
@khomh2015a; @claes2017a; @laukkanen2018a; @adams2016a; @castelluccio2017a].
It also seems worthwhile to triangulate findings by complementing data analyses
with other quantitative (e.g. a survey) or qualitative (e.g. an interview)
methods [@karvonen2017a].
Finally, additional literature reviews will allow researchers to keep an
overview of the most recent developments and findings in the area of release
engineering [@rodriguez2017a; @laukkanen2018a].

Apart from verifying results, it might be worthwhile to leverage them by
constructing analysis tools for practitioners.
For example, Castelluccio et al. @castelluccio2017a suggest exploring possibilities to leverage
their research by building classifiers capable of automatically assessing the
risk associated with patch uplift candidates and recommend patches that can be
uplifted safely.
Also, companies seem to be struggling with the adoption of continuous delivery
and deployment, so a checklist for analyzing readiness for these practices
might be developed [@karvonen2017a].

The review by Karvonen et al. @karvonen2017a makes a number of general suggestions for future
research.
In particular, there should be more attention to comprehensively reporting how
practices are implemented and in which context they are embedded, instead of
just stating that they are used.
Also, the viewpoints of different stakeholders other than developers can be
taken into account.
For example, the customer perceptions regarding the adoption of a certain
practice can be investigated.

#### Directions for Specific Practices

**Rapid Releases**

As established, rapid releases are a prevalent topic in current research on
modern release engineering.
However, it will be useful to verify these results given the fact that this
research mainly involves case studies (most of which are only concerned with
Mozilla Firefox due to the availability of data).
To this end, there are opportunities to further investigate the effects of
switching to rapid releases on:

- code integration [@da2014a; @da2016a; @souza2015a; @castelluccio2017a],
- testing efforts [@mantyla2015a],
- software quality [@khomh2015a; @khomh2012a],
- (library) adoption [@fujibayashi2017a], and
- time pressure and work patterns [@claes2017a].

**DevOps**

When it comes to DevOps, future research is needed to refine its definition such
that it is uniform and valid for many situations [@dyck2015a]. According to
Karvonen et al. @karvonen2017a, it seems that the goals in DevOps are congruent with those
in release engineering, and future research on this topic is therefore highly
relevant in order to study modern release engineering.

**Continuous Delivery / Deployment**

Research on continuous deployment still seems to be in its infancy. Rodríguez et al.
@rodriguez2017a have therefore suggested a significant number of concrete
opportunities for future research.
In general, they conclude that the topic needs an increase in the number and
rigor of empirical studies, and thus it presents opportunities for software
analytics research.
In a systematic literature review, Laukkanen et al. @laukkanen2017a identified 40 problems,
28 causal relationships and 29 solutions related to the adoption of continuous
delivery.
These problems and solutions can be studied further to deepen the
understanding of the nature of these problems and how to apply their solutions.
Some of the problems that are more of a human or organizational nature might
be involved with a broader spectrum of changes, so it should also be
investigated to what extent these problems are specific to continuous delivery.

**Continuous Integration / Build system**

One of the main issues that seems to be obstructing organizations in adopting
modern release engineering practices is build design [@laukkanen2017a]. In a
case study by Laukkanen et al. @laukkanen2018a, it was found that a complex automated build and
integration system led to less disciplined integration practice, which in turn slowed down
the verification and release processes. Therefore, future research might
investigate how developers can make their builds more maintainable and of
higher quality, how anti-patterns in the design of the build can be refactored,
and how continuous integration can be made faster and more energy efficient
[@adams2016a].

## Conclusion

In this literature survey,
we have provided an answer to the following three research questions:

- **RQ1:** _How is modern release engineering done in practice?_

    We found that there are seven important technical aspects to
    modern release engineering: Rapid Releases, DevOps, Continuous Integration,
    Continuous Deployment or Delivery, Branching and Merging, Build System and
    Infrastructure-as-Code.
    The most important social aspect of modern release engineering is communication.

- **RQ2:** _What aspects of modern release engineering have been studied
  in software analytics research so far?_

    At this point in time, case studies have mainly focussed on
    the effects of switching from a traditional release cycle
    to a rapid release cycle, and on how these effects play out in various
    organizational contexts.
    As all included studies suggest, more empirical studies are needed to be
    able to make general conclusions in the novel field of release engineering.

- **RQ3:** _What aspects of modern release engineering make for relevant
  study objects in future software analytics research?_

    In general, more empirical research is required to validate and generalize
    the results of many previous case studies within the field of release
    engineering.
    In addition to performing case studies involving quantitative analyses,
    it may be beneficial to triangulate results using various research methods.
    Also, future research should more comprehensively describe how practices
    are implemented, and consider different stakeholders.
    For each practice, future research is suggested on the one hand to further
    investigate their effects on the development process, and on the other hand
    to investigate problems involved with their adoption.

## Appendix

This appendix contains sections that were part of our process during
this literature survey.
They are not directly needed to answer the research questions,
but are still relevant in order to validate our survey.
It contains our project timetable, the research protocol in full detail,
and the raw extracted data from the selected studies.

### Project Timetable

The literature review was conducted over the course of four weeks. We worked iteratively and 
planned for four weekly milestones.

+-----------+----------+-----------------------------------------------------------+
| Milestone | Deadline | Goals                                                     |
+===========+==========+===========================================================+
| 1         | 16/9/18  | - Develop the search strategy                             |
|           |          | - Collect initial publications                            |
+-----------+----------+-----------------------------------------------------------+
| 2         | 23/9/18  | - Write full research protocol                            |
+-----------+----------+-----------------------------------------------------------+
| 3         | 30/9/18  | - Collect additional literature according to the protocol |
|           |          | - Perform data extraction                                 |
+-----------+----------+-----------------------------------------------------------+
| 4         | 7/10/18  | - Perform data synthesis                                  |
|           |          | - Write final version of the chapter                      |
+-----------+----------+-----------------------------------------------------------+

### Research Protocol

In this appendix, we will describe in detail how we applied the protocol for
performing systematic literature reviews by Kitchenham @kitchenham2004procedures.
In order, we will go over the search strategy, study selection,
study quality assessment, and data extraction.
The last subsection will list which studies we included in this review
and which we have found, but excluded from the review for a specific reason.

#### Search Strategy

Since release engineering is a relatively new research topic,
we took an exploratory approach in collecting any literature revolving around
the topic of release engineering from the perspective of software analytics.
This helped us determine a narrower scope for our survey,
which subsequently allowed us to find additional literature fitting this scope.

At the start of this project, we were provided with an initial seed of five
papers as a starting point for our literature survey
[@adams2016a; @da2014a; @da2016a; @khomh2012a; @khomh2015a].

We collected other publications using two search engines:
Scopus and Google Scholar.
Each of the two search engines indexes several databases such as
ACM Digital Library, Springer, IEEE Xplore and ScienceDirect.
The main query that we constructed is displayed in Figure 1.
The publications found using this query were:

- @kaur2019a
- @kerzazi2013a
- @castelluccio2017a
- @karvonen2017a
- @claes2017a
- @fujibayashi2017a
- @souza2015a
- @laukkanen2018a
- @dyck2015a

```
TITLE-ABS-KEY(
  (
    "continuous release" OR "rapid release" OR "frequent release"
    OR "quick release" OR "speedy release" OR "accelerated release"
    OR "agile release" OR "short release" OR "shorter release"
    OR "lightning release" OR "brisk release" OR "hasty release"
    OR "compressed release" OR "release length" OR "release size"
    OR "release cadence" OR "release frequency"
    OR "continuous delivery" OR "rapid delivery" OR "frequent delivery"
    OR "fast delivery" OR "quick delivery" OR "speedy delivery"
    OR "accelerated delivery" OR "agile delivery" OR "short delivery"
    OR "lightning delivery" OR "brisk delivery" OR "hasty delivery"
    OR "compressed delivery" OR "delivery length" OR "delivery size"
    OR "delivery cadence" OR "continuous deployment" OR "rapid deployment"
    OR "frequent deployment" OR "fast deployment" OR "quick deployment"
    OR "speedy deployment" OR "accelerated deployment" OR "agile deployment"
    OR "short deployment" OR "lightning deployment" OR "brisk deployment"
    OR "hasty deployment" OR "compressed deployment" OR "deployment length"
    OR "deployment size" OR "deployment cadence"
  ) AND (
    "release schedule" OR "release management" OR "release engineering"
    OR "release cycle" OR "release pipeline" OR "release process"
    OR "release model" OR "release strategy" OR "release strategies"
    OR "release infrastructure"
  )
  AND software
) AND (
  LIMIT-TO(SUBJAREA, "COMP") OR LIMIT-TO(SUBJAREA, "ENGI")
)
AND PUBYEAR AFT 2014
```

_Figure 1. Query used for retrieving release engineering publications via Scopus._
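
The first clause of the query in Figure 1 largely pairs speed-related qualifiers with the nouns "release", "delivery" and "deployment".
As a rough illustration of how such a clause could be assembled programmatically, consider the following Python sketch.
The term lists are abbreviated and the helper names `or_clause` and `build_query` are our own assumptions, so the output is not identical to the query in Figure 1.

```python
from itertools import product

# Abbreviated, illustrative term lists (not the full lists behind Figure 1).
QUALIFIERS = ["continuous", "rapid", "frequent"]
SUBJECTS = ["release", "delivery", "deployment"]
CONTEXT_TERMS = ["release engineering", "release cycle"]


def or_clause(phrases):
    """Join quoted phrases with OR and wrap them in parentheses."""
    return "(" + " OR ".join(f'"{p}"' for p in phrases) + ")"


def build_query(qualifiers, subjects, context_terms):
    # Iterate over subjects first, so phrases read
    # "continuous release", "rapid release", ..., "frequent deployment".
    pairs = [f"{q} {s}" for s, q in product(subjects, qualifiers)]
    inner = f"{or_clause(pairs)} AND {or_clause(context_terms)} AND software"
    return f"TITLE-ABS-KEY({inner})"


query = build_query(QUALIFIERS, SUBJECTS, CONTEXT_TERMS)
```

Generating the phrase combinations rather than typing them by hand makes it easier to check that no qualifier and noun pair has been left out unintentionally.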

In addition to querying search engines as described above,
references related to retrieved papers were analyzed.
These reference lists were obtained from Google Scholar and from the
_References_ section in the papers themselves.
We selected all papers on release engineering that are citing or being cited
by the initial set of papers.
Using this approach, we have found six additional papers.
The results of the reference analysis are listed in Table 1.

_Table 1. Papers found indirectly by investigating citations of/by other papers._

+-----------------+-------------+-------------------+
| Starting point  | Type        | Result            |
+=================+=============+===================+
| @souza2015a     | has cited   | @plewnia2014a     |
|                 |             | @mantyla2015a     |
+-----------------+-------------+-------------------+
| @khomh2015a     | is cited by | @poo-caamano2016a |
|                 |             | @teixeira2017a    |
+-----------------+-------------+-------------------+
| @mantyla2015a   | is cited by | @rodriguez2017a   |
|                 |             | @cesar2017a       |
+-----------------+-------------+-------------------+
| @laukkanen2018a | has cited   | @laukkanen2017a   |
+-----------------+-------------+-------------------+
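
The reference analysis amounts to a single step of backward and forward citation search.
The following Python sketch shows one possible way to express this step.
The citation graph `CITES` is a toy example that reuses a few citation keys from Table 1 purely for illustration, and the function name `snowball` is hypothetical.

```python
# Toy citation graph: paper -> papers it cites. The keys reuse citation keys from
# Table 1 purely for illustration; the graph is not a complete record.
CITES = {
    "souza2015a": {"plewnia2014a", "mantyla2015a"},
    "poo-caamano2016a": {"khomh2015a"},
    "teixeira2017a": {"khomh2015a"},
    "laukkanen2018a": {"laukkanen2017a"},
}


def snowball(seeds, cites):
    """Return papers reachable in one step: cited by a seed or citing a seed."""
    backward = set()
    for seed in seeds:
        backward |= cites.get(seed, set())
    forward = {paper for paper, refs in cites.items() if refs & seeds}
    return (backward | forward) - seeds


seeds = {"souza2015a", "khomh2015a", "laukkanen2018a"}
candidates = snowball(seeds, CITES)
```

The resulting candidates would still have to pass the study selection step before being included.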

All papers that were found were stored in a custom-built web-based tool for
conducting literature reviews.
The source code of this tool is published in a GitHub repository.
^[See https://github.com/jessetilro/research]
The tool was hosted on a virtual private server, such that all retrieved
publications were stored centrally, accessible to all reviewers.

#### Study Selection

We selected the studies that we wanted to include in the survey
with aid of the aforementioned tool for storing the papers.
In this tool,
it is possible to label papers with tags and leave comments and ratings.
Every paper was reviewed against the inclusion and exclusion criteria.
Based on this review, the tool allowed us to filter out all papers
that appeared not to be relevant for this literature survey.

We used only one exclusion criterion: studies published before 2014
are not included in our survey (this is enforced by our search query).
The inclusion criteria are as follows:

- The study must show (at least) one release engineering technique.
- The study must not just show a release engineering technique,
  but analyze its performance compared to other techniques.
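
The selection criteria can be read as a simple filter over the collected papers.
The Python sketch below is only an illustration of that reading: the `Study` record, its boolean flags and the example entries are hypothetical and do not describe how our review tool stores papers.

```python
from dataclasses import dataclass


@dataclass
class Study:
    key: str
    year: int
    shows_technique: bool      # shows at least one release engineering technique
    compares_technique: bool   # analyzes its performance against other techniques


def is_selected(study: Study, min_year: int = 2014) -> bool:
    """Apply the exclusion criterion first, then both inclusion criteria."""
    if study.year < min_year:
        return False
    return study.shows_technique and study.compares_technique


# Hypothetical records; the flags are invented for this example.
studies = [
    Study("example-a", 2016, True, True),
    Study("example-b", 2012, True, True),
    Study("example-c", 2017, True, False),
]
selected = [s.key for s in studies if is_selected(s)]
```

Here only `example-a` is selected: `example-b` is too old and `example-c` shows a technique without comparing it to others.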

The last subsection of this appendix lists which studies
were selected and which were discarded.

#### Study Quality Assessment

Based on Kitchenham @kitchenham2004procedures, the quality of a paper will be assessed
by the evidence it provides, based on the following scale.
All levels of quality in this scale will be accepted,
except for level 5 (evidence obtained from expert opinion).

1. Evidence obtained from at least one properly-designed
   randomised controlled trial.
2. Evidence obtained from well-designed pseudo-randomised controlled trials
   (i.e. non-random allocation to treatment).
3. Comparative studies in a real-world setting:
    1. Evidence obtained from comparative studies with concurrent controls and
       allocation not randomised, cohort studies, case-control studies or
       interrupted time series with a control group.
    2. Evidence obtained from comparative studies with historical control,
       two or more single arm studies,
       or interrupted time series without a parallel control group.
4. Experiments in artificial settings:
    1. Evidence obtained from a randomised experiment performed in an
       artificial setting.
    2. Evidence obtained from case series,
       either post-test or pre-test/post-test.
    3. Evidence obtained from a quasi-random experiment performed in an
       artificial setting.
5. Evidence obtained from expert opinion based on theory or consensus.

Also, the studies will be examined to see if they contain any type of bias.
For this, the same types of biases will be used as described by
Kitchenham @kitchenham2004procedures:

- Selection/Allocation bias: Systematic difference between comparison groups
  with respect to treatment.
- Performance bias: Systematic difference in the conduct of comparison groups
  apart from the treatment being evaluated.
- Measurement/Detection bias: Systematic difference between the groups in how
  outcomes are ascertained.
- Attrition/Exclusion bias: Systematic differences between comparison groups in
  terms of withdrawals or exclusions of participants from the study sample.

The studies will be labeled by their quality level and possible biases.
This information can be used during the Data Synthesis phase
to weigh the importance of individual studies [@kitchenham2004procedures].
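
As an illustration of how such labels could be recorded and used, the Python sketch below stores a quality level and a set of biases per study.
The `Assessment` record and in particular the `weight` formula are assumptions made for this example only; the protocol does not prescribe a specific weighting, and the sub-levels of levels 3 and 4 are left out for brevity.

```python
from dataclasses import dataclass, field

BIAS_TYPES = {"selection", "performance", "measurement", "attrition"}


@dataclass
class Assessment:
    key: str
    level: int                      # 1 (strongest evidence) to 5 (expert opinion)
    biases: set = field(default_factory=set)

    def __post_init__(self):
        if not 1 <= self.level <= 5:
            raise ValueError("quality level must be between 1 and 5")
        unknown = self.biases - BIAS_TYPES
        if unknown:
            raise ValueError(f"unknown bias types: {sorted(unknown)}")


def is_accepted(a: Assessment) -> bool:
    """Every level is accepted except level 5 (expert opinion)."""
    return a.level < 5


def weight(a: Assessment) -> float:
    # Hypothetical weighting: stronger evidence and fewer biases weigh more.
    return (5 - a.level) / (1 + len(a.biases))
```

For example, under this assumed formula `Assessment("example", 3, {"selection"})` is accepted and receives a weight of `(5 - 3) / (1 + 1)`.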

#### Data Extraction

To accurately capture the information contributed by each publication in our
survey, we will use a systematic approach to extracting data.
To guide this process, we will be using a data extraction form which describes
what aspects of a publication are crucial to record.
Besides general publication information (title, author etc.), the form contains
questions that are based on our defined research questions.
Furthermore, the form contains a section for quantitative research, where
aspects such as population and evaluation will be documented.
The form that is used for this is shown below:

```
General information:

- Name of person extracting data:
- Date form completed (dd/mm/yyyy):
- Publication title:
- Author information:
- Publication type:
- Conference/Journal:
- Type of study:

What practices in release engineering does this publication mention?

Are these practices to be classified under dated, state of the art or state of
the practice? Why?

What open challenges in release engineering does this publication mention?

What research gaps does this publication contain?

Are these research gaps filled by any other publications in this survey?

Quantitative research publications:

- Study start date:
- Study end date or duration:
- Population description:
- Method(s) of recruitment of participants:
- Sample size:
- Evaluation/measurement description:
- Outcomes:
- Limitations:
- Future research:

Notes:

```

#### Data Synthesis

To summarize the contributions and limitations of each of the included
publications, we will apply a descriptive synthesis approach.
In this part of our survey, we will compare the data that was extracted from the
included publications.
Publications with similar findings will be grouped and evaluated, and
differences between groups of publications will be structured and elaborated on.
In doing so, we will compare them on specifics such as their study types, time of
publication and study quality.

If the extracted data allows for a structured tabular visualization of
similarities and differences between publications, this will serve as an additional
form of synthesis. However, this depends on the final set of publications included in
this survey.

#### Included and Excluded Studies

**Included:**

- @adams2016a
- @castelluccio2017a
- @cesar2017a
- @claes2017a
- @da2014a
- @da2016a
- @dyck2015a
- @fujibayashi2017a
- @karvonen2017a
- @kerzazi2013a
- @khomh2015a
- @laukkanen2017a
- @laukkanen2018a
- @mantyla2015a
- @plewnia2014a
- @poo-caamano2016a
- @rodriguez2017a
- @souza2015a
- @teixeira2017a

**Excluded:**

- @khomh2012a has been excluded, because it presents the same results as
  @khomh2015a, while the latter is more extensive because it is a journal article
  instead of a conference article.
- @kaur2019a has been excluded, because we could not obtain the actual paper
  since it has not yet been officially released.

### Raw Extracted Data

#### Understanding the impact of rapid releases on software quality -- The Case of Firefox

Reference: @khomh2015a

General information:

- Name of person extracting data: Maarten Sijm
- Date form completed: 27-09-2018
- Author information: Foutse Khomh, Bram Adams, Tejinder Dhaliwal, Ying Zou
- Publication type: Journal/Magazine Article
- Journal: Empirical Software Engineering
- Type of study: Quantitative, empirical case study

What practices in release engineering does this publication mention?

- Changing from traditional to rapid release cycles in Mozilla Firefox

Are these practices to be classified under dated,
state of the art or state of the practice? Why?

- State of the practice, because they study Firefox
  and Firefox is still using rapid release cycles.
  However, it is dated because the data is six years old.

What open challenges in release engineering does this publication mention?

- More case studies are needed

What research gaps does this publication contain?

- More case studies are needed

Are these research gaps filled by any other publications in this survey?

-

Quantitative research publications:

- Study start date: 01-01-2010 (Firefox 3.6)
- Study end date or duration: 20-12-2011 (Firefox 9.0)
- Population description: Mozilla Wiki, VCS, Crash Repository, Bug Repository
- Method(s) of recruitment of participants: N/A (case study)
- Sample size: 25 alpha versions, 25 beta versions,
  29 minor versions and 7 major versions.
  The number of bugs, commits, etc. is not specified.
- Evaluation/measurement description: Wilcoxon rank sum test
- Outcomes:
    - With shorter release cycles,
      users do not experience significantly more post-release bugs
    - Bugs are fixed faster
    - Users experience these bugs earlier during software execution
      (the program crashes earlier)
- Limitations: Results are specific to Firefox
- Future research: More case studies are needed
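
The Wilcoxon rank-sum test compares two independent samples by ranking the pooled observations. As a hedged illustration (not the authors' analysis script), the sketch below computes the equivalent Mann-Whitney U statistic. The sample values are hypothetical and do not come from the paper, and the function name `rank_sum_u` is an assumption.

```python
from itertools import chain


def rank_sum_u(sample_a, sample_b):
    """Return the Mann-Whitney U statistic for sample_a.

    Tied values receive the average of the ranks they span.
    """
    pooled = sorted(chain(sample_a, sample_b))
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Positions i..j-1 hold equal values; their 1-based ranks are i+1..j.
        ranks[pooled[i]] = (i + 1 + j) / 2
        i = j
    rank_sum_a = sum(ranks[x] for x in sample_a)
    n_a = len(sample_a)
    return rank_sum_a - n_a * (n_a + 1) / 2


# Hypothetical values, e.g. days needed to fix a bug (NOT data from the paper).
traditional = [3, 5, 8, 9]
rapid = [1, 2, 4, 6]

u_traditional = rank_sum_u(traditional, rapid)
u_rapid = rank_sum_u(rapid, traditional)
print(u_traditional, u_rapid)
```

The two U values always sum to the product of the sample sizes. In practice a statistics package (for example `scipy.stats.mannwhitneyu`) would also report the p-value.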

#### On the influence of release engineering on software reputation

Reference: @plewnia2014a

General information:

- Name of person extracting data: Maarten Sijm
- Date form completed: 27-09-2018
- Author information: Christian Plewnia, Andrej Dyck, Horst Lichter
- Publication type: Paper in Conference Proceedings
- Conference: 2nd International Workshop on Release Engineering
- Type of study: Quantitative, empirical case study on multiple software systems

What practices in release engineering does this publication mention?

- Rapid releases

Are these practices to be classified under dated,
state of the art or state of the practice? Why?

- Dated practice, data is from before 2014

What open challenges in release engineering does this publication mention?

- Identifying software reputation can better be done using a qualitative study.

What research gaps does this publication contain?

- Identifying software reputation can better be done using a qualitative study.

Are these research gaps filled by any other publications in this survey?

-

Quantitative research publications:

- Study start date: Q3 2008
- Study end date or duration: Q4 2013
- Population description: Chrome, Firefox, Internet Explorer
- Method(s) of recruitment of participants: N/A (case study)
- Sample size: 3 browsers
- Evaluation/measurement description: No statistical analysis,
  just presenting market share results
- Outcomes:
    - Chrome's market share increased after adopting rapid releases
    - Firefox's market share decreased after adopting rapid releases
    - IE's market share decreased
- Limitations:
    - Identifying software reputation can better be done
      using a qualitative study.
- Future research:
    - Identifying software reputation can better be done
      using a qualitative study.

#### On rapid releases and software testing: a case study and a semi-systematic literature review

Reference: @mantyla2015a

General information:

- Name of person extracting data: Maarten Sijm
- Date form completed: 28-09-2018
- Author information: Mäntylä, Mika V. and Adams, Bram and Khomh, Foutse and Engström, Emelie
  and Petersen, Kai
- Publication type: Journal/Magazine Article
- Journal: Empirical Software Engineering
- Type of study: Empirical case study and semi-systematic literature review

What practices in release engineering does this publication mention?

- Impact of rapid releases on testing effort

Are these practices to be classified under dated,
state of the art or state of the practice? Why?

- State of the practice for the case study
- State of the art for the literature review

What open challenges in release engineering does this publication mention?

- Future work should focus on empirical studies of these factors
  that complement the existing qualitative observations
  and perceptions of rapid releases.

What research gaps does this publication contain?

- See open challenges

Are these research gaps filled by any other publications in this survey?

-

Quantitative research publications:

- Study start date: June 2006 (Firefox 2.0)
- Study end date or duration: June 2012 (Firefox 13.0)
- Population description: System-level test execution data
- Method(s) of recruitment of participants: N/A (case study)
- Sample size: 1,547 unique test cases, 312,502 executions,
  performed by 6,058 individuals on 2,009 software builds,
  22 OS versions and 78 locales.
- Evaluation/measurement description: Wilcoxon rank-sum test, Cliff's delta,
  Cohen's Kappa for Firefox Research Question (FF-RQ) 5.
- Outcomes (FF-RQs; RR = rapid release; TR = traditional release):
    1. RRs perform more test executions per day, but these tests focus on a
       smaller subset of the test case corpus.
    2. RRs have fewer testers, but they have a higher workload.
    3. RRs test fewer, but larger builds.
    4. RRs test fewer platforms in total,
       but test each supported platform more thoroughly.
    5. RRs have higher similarity of test suites and testers
       within a release series than TRs had.
    6. RR testing happens closer to the release date and is more continuous,
       yet these findings were not confirmed by the QA engineer.
- Limitations:
    - Study measures correlation, not causation
    - Not generalizable, as it is a case study on FF
- Future research: More empirical studies
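
Cliff's delta, used above as an effect size, is the probability that a value from one group exceeds a value from the other, minus the reverse probability. A minimal sketch with hypothetical numbers (not taken from the paper); the function name `cliffs_delta` is an assumption:

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs (x, y)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Hypothetical test executions per day (NOT data from the paper).
rapid_release = [12, 15, 20]
traditional_release = [5, 9, 14]

delta = cliffs_delta(rapid_release, traditional_release)
print(round(delta, 3))
```

The value ranges from -1 to 1, where 0 means the two groups overlap completely and swapping the groups flips the sign.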

Semi-systematic literature survey:

- Study date: Unknown (before 2015)
- Population description: Papers with main focus on:
    - Rapid Releases (RRs)
    - Aspect of software engineering largely impacted by RRs
    - An agile, lean or open source process having results of RRs
    - Excluding: opinion papers without empirical data on RRs
- Method(s) of recruitment of participants: Scopus queries
- Sample size: 24 papers
- Outcomes:
    - Evidence is scarce.
      Often RRs are implemented as part of agile adoption.
      This makes it difficult to separate the impact of RRs
      from other process changes.
    - Originates from several software development paradigms: Agile, FOSS, Lean,
      internet-speed software development
    - Prevalence
        - Practiced in many software engineering domains, not just web applications
        - Between 23% and 83% of practitioners do RRs
    - (Perceived) Problems:
        - Increased technical debt
        - RRs are in conflict with high reliability and high test coverage
        - Customers might be displeased with RRs (many updates)
        - Time-pressure / Deadline oriented work
    - (Perceived) Benefits:
        - Rapid feedback leading to increased quality focus of the devs and testers
        - Easier monitoring of progress and quality
        - Customer satisfaction
        - Shorter time-to-market
        - Continuous work / testing
    - Enablers:
        - Sequential development where multiple releases are under work simultaneously
        - Tools for automated testing and efficient deployment
        - Involvement of product management and productive customers
- Limitations:
    - Not all papers that present results about RRs
      mention "rapid release" in the abstract.
- Future research:
    - Systematically search for agile and lean adoption papers

#### Release management in free and open source software ecosystems

Reference: @poo-caamano2016a

General information:

- Name of person extracting data: Maarten Sijm
- Date form completed: 28-09-2018
- Author information: Germán Poo-Caamaño
- Publication type: PhD Thesis
- Type of study: Empirical case study on two large-scale FOSS ecosystems: GNOME and OpenStack

What practices in release engineering does this publication mention?

- Communication in release engineering

Are these practices to be classified under dated,
state of the art or state of the practice? Why?

- State of the practice, because case study

What open challenges in release engineering does this publication mention?

- Is the ecosystem [around the studied software] shrinking or expanding?
- How have communications in the ecosystem changed over time?

What research gaps does this publication contain?

- More case studies are needed

Are these research gaps filled by any other publications in this survey?

-

Quantitative research publications (GNOME):

- Study start date: January 2009 (GNOME 2.x)
- Study end date or duration: August 2011 (GNOME 3.x)
- Population description: Mailing lists
- Method(s) of recruitment of participants:
  GNOME's website recommends this channel of communication.
  IRC is also recommended, but its history is not stored.
- Sample size: 285 mailing lists, 6947 messages, grouped into 945 discussions.
- Evaluation/measurement description: Counting
- Outcomes:
    - Developers also communicate via blogs, bug trackers, conferences,
      and hackfests.
    - The Release Team has direct contact with almost all participants
      in the mailing list
    - The tasks of the Release Team:
        - defining requirements of GNOME releases
        - coordinating and communicating with projects and teams
        - shipping a release within defined quality and time specifications
    - Major challenges of the Release Team:
        - coordinate projects and teams of volunteers without direct power over them
        - keep the build process manageable
        - monitor for unplanned changes
        - monitor for changes during the stabilization phase
        - test the GNOME release
- Limitations:
    - Only mailing list was investigated, other channels were not
    - Possible subjective bias in manually categorizing email subjects
    - Not very generalizable, as it's just one case study
- Future research:
    - Fix the limitations

Quantitative research publications (OpenStack):

- Study start date: May 2012
- Study end date or duration: July 2014
- Population description: Mailing lists
- Method(s) of recruitment of participants: Found on OpenStack's website
- Sample size: 47 mailing lists, 24,643 messages, grouped into 7,650 discussions.
  Filtered data: 14,486 messages grouped into 2,682 discussions.
- Evaluation/measurement description: Counting
- Outcomes:
    - Developers communicate via email, blogs, launchpad, wiki, gerrit,
      face-to-face, IRC, video-conferences, and etherpad.
    - Project Team Leaders and the Release Team members are the key players
      in the communication and coordination across projects in the context of
      release management
    - The tasks for the Release Team and Project Team Leaders:
        - defining the requirements of an OpenStack release
        - coordinating and communicating with projects and teams to reach
          the objectives of each milestone
        - coordinating feature freeze exceptions at the end of a release
        - shipping a release within defined quality and time specifications
    - Major challenges of these teams:
        - coordinate projects and teams without direct power over them
        - keep everyone informed and engaged
        - decide what becomes part of the integrated release
        - monitor changes
        - set priorities in cross-project coordination
        - overcome limitations of the communication infrastructure
- Limitations:
    - Only studies mailing list, to compare with GNOME case study
    - Possible subjective bias in manually categorizing email subjects
    - Not very generalizable, as it's just one case study
- Future research:
    - Fix the limitations

Notes:

- Since there are two case studies, the results become a bit more generalizable
- The author set up a theory that encapsulates the communication and
  coordination regarding release management in FOSS ecosystems,
  and can be summarized as:
    1. The size and complexity of the integrated product is constrained
       by the release managers' capacity
    2. The release management should reach the whole ecosystem
       to increase awareness and participation
    3. The release managers need social and technical skills

#### Release Early, Release Often and Release on Time. An Empirical Case Study of Release Management

Reference: @teixeira2017a

General information:

- Name of person extracting data: Maarten Sijm
- Date form completed: 28-09-2018
- Author information: Jose Teixeira
- Publication type: Paper in Conference Proceedings
- Conference: Open Source Systems: Towards Robust Practices
- Type of study: Empirical case study

What practices in release engineering does this publication mention?

- Shifting towards rapid releases in OpenStack

Are these practices to be classified under dated,
state of the art or state of the practice? Why?

- State of the practice, because it is a recent case study on OpenStack

What open challenges in release engineering does this publication mention?

- More case studies are needed.

What research gaps does this publication contain?

- More case studies are needed.

Are these research gaps filled by any other publications in this survey?

-

Quantitative research publications:

- Study start date: Not specified
- Study end date or duration: Not specified
- Population description: Websites and blogs
- Method(s) of recruitment of participants:
  Random clicking through OpenStack websites
- Sample size: Not specified
- Evaluation/measurement description: Not specified
- Outcomes:
    - OpenStack releases in a cycle of six months
    - The release management process is a hybrid of feature-based and time-based
    - Having a time-based release strategy is a challenging cooperative task
      involving multiple people and technology
- Limitations:
    - Study is not completed yet, these are preliminary results
- Future research:
    - Not indicated

#### Kanbanize the release engineering process

Reference: @kerzazi2013a

General information:

- Name of person extracting data: Jesse Tilro
- Date form completed: 29-09-2018
- Author information: Kerzazi, N. and Robillard, P.N.
- Publication type: Paper in Conference Proceedings
- Conference: 2013 1st International Workshop on Release Engineering, RELENG 2013 -
  Proceedings
- Type of study: Action research

What practices in release engineering does this publication mention?

- Following principles of the Kanban agile software development life-cycle model
  that implicitly describe the release process
- (Switching to) more frequent (daily) release cycles
- (Transitioning to) a structured release process

Are these practices to be classified under dated,
state of the art or state of the practice? Why?

- Either dated or state of the practice, not sure. Would have to do some
  additional research on the adoption of Kanban

What open challenges in release engineering does this publication mention?

- Release effectiveness: minimize system failure and customer impact
- Problems with releasing encountered in practice

What research gaps does this publication contain?

-

Are these research gaps filled by any other publications in this survey?

-

Quantitative research publications:

- Study start date:
- Study end date or duration:
- Population description:
- Method(s) of recruitment of participants:
- Sample size:
- Evaluation/measurement description:
- Outcomes:
    1.
- Limitations:
- Future research:

Notes:

-

#### Is it safe to uplift this patch? An empirical study on mozilla firefox

Reference: @castelluccio2017a

General information:

- Name of person extracting data: Jesse Tilro
- Date form completed: 29-09-2018
- Author information: Castelluccio, M. and An, L. and Khomh, F.
- Publication type: Paper in Conference Proceedings
- Conference: Proceedings - 2017 IEEE International Conference on Software Maintenance and Evolution,
  ICSME 2017
- Type of study: Case study, both quantitative (data analysis) and qualitative
  (interviews)

What practices in release engineering does this publication mention?

- Patch uplift (meaning the promotion of patches from development directly
  to a stabilization channel, potentially skipping several channels)

Are these practices to be classified under dated,
state of the art or state of the practice? Why?

- State of the practice: case study of what is being done in the field,
  quite recently (2017).

What open challenges in release engineering does this publication mention?

- Exploring possibilities to leverage this research by building classifiers
  capable of automatically assessing the risk associated with patch uplift
  candidates and recommending patches that can be uplifted safely.
- Validate and extend results of this study for generalizability.

What research gaps does this publication contain?

- Study aimed to fill two gaps identified in the literature:
    - How do urgent patches in rapid release models affect software quality
      (in terms of fault proneness)?
    - How can the reliability of the integration of urgent patches be improved?

Are these research gaps filled by any other publications in this survey?

- The paper itself

Quantitative research publications:

- Study start date:
- Study end date or duration:
- Population description:
- Method(s) of recruitment of participants:
- Sample size:
- Evaluation/measurement description:
- Outcomes:
    1.
- Limitations:
- Future research:

Notes:

-

#### Systematic literature review on the impacts of agile release engineering practices

Reference: @karvonen2017a

General information:

- Name of person extracting data: Jesse Tilro
- Date form completed: 29-09-2018
- Author information: Karvonen, T. and Behutiye, W. and Oivo, M. and Kuvaja, P.
- Publication type: Journal/Magazine Article
- Journal: Information and Software Technology
- Type of study: Systematic literature review

What practices in release engineering does this publication mention?

- Agile release engineering (ARE) practices
    - Continuous integration (CI)
    - Continuous delivery (CD)
    - Rapid Release (RR)
    - Continuous deployment
    - DevOps (similar to CD, congruent with release engineering practices)


Are these practices to be classified under dated,
state of the art or state of the practice? Why?

- State of the art, for it concerns a state of the art report and was published
  recently (2017).

What open challenges in release engineering does this publication mention?

- Claims that modern release engineering practices allow for software
  to be delivered faster and cheaper should be further empirically validated.
- This analysis could be extended with industry case studies, to develop a
  checklist for analyzing company and ecosystem readiness for continuous
  delivery and continuous deployment.
- The comprehensive reporting of the context and how the practice is
  implemented instead of merely referring to usage of the practice should be
  considered by future research.
- Different stakeholders' points of view, such as customer perceptions regarding
  practices, require further research.
- Research on DevOps would be highly relevant for release engineering and the
  continuous software engineering research domain.
- Future research on the impact of RE practices could benefit from more
  extensive use of quantitative methodologies from case studies, and the
  combination of quantitative with qualitative (e.g. interviews) methods.

What research gaps does this publication contain?

- Refer to challenges

Are these research gaps filled by any other publications in this survey?

-

Quantitative research publications:

- Study start date: N/A
- Study end date or duration: N/A
- Population description: N/A
- Method(s) of recruitment of participants: N/A
- Sample size: N/A
- Evaluation/measurement description: N/A
- Outcomes: N/A
- Limitations: N/A
- Future research: N/A

Notes:

-

#### Abnormal Working Hours: Effect of Rapid Releases and Implications to Work Content

Reference: @claes2017a

General information:

- Name of person extracting data: Jesse Tilro
- Date form completed: 29-09-2018
- Author information: Claes, M. and Mäntylä, M. and Kuutila, M. and Adams, B.
- Publication type: Paper in Conference Proceedings
- Conference: IEEE International Working Conference on Mining Software Repositories
- Type of study: Quantitative case study

What practices in release engineering does this publication mention?

- Faster release cycles

Are these practices to be classified under dated,
state of the art or state of the practice? Why?

-

What open challenges in release engineering does this publication mention?

- Future research might further study the impact of time pressure and work
  patterns - indirectly release practices - on software developers.

What research gaps does this publication contain?

-

Are these research gaps filled by any other publications in this survey?

-

Quantitative research publications:

- Study start date: first data item 2012-12-21
- Study end date or duration: last data item 2016-01-03
- Population description: N/A
- Method(s) of recruitment of participants: N/A
- Sample size: 145,691 bug tracker contributors (1.8% with timezone information),
  11.11 million comments (53% by authors with timezone information)
- Evaluation/measurement description: measure distributions on number of
  comments per day of the week and time of the day, before and after transition
  to rapid release cycles. Test distribution difference using Mann-Whitney U
  test and test effect size using Cohen's d and Cliff's delta.
  Also evaluate general development of number of comments, working day against
  weekend and day against night.
- Outcomes:
    1.  Switching to rapid releases has reduced the amount of work performed
        outside of office hours. (Supported by results in psychology.)
    2.  Thus, rapid release cycles seem to have a positive effect on
        occupational health.
    3.  Comments posted during the weekend contained more technical terms.
    4.  Comments posted during weekdays contained more positive and polite
        vocabulary.
- Limitations:
- Future research:
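
Cohen's d, one of the effect sizes named above, expresses the difference between two group means in units of their pooled standard deviation. The following is a minimal sketch under the assumption of two independent samples; the numbers are hypothetical and not taken from the paper, and `cohens_d` is an illustrative name.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(xs, ys):
    """Cohen's d using the pooled sample standard deviation."""
    n_x, n_y = len(xs), len(ys)
    pooled_var = ((n_x - 1) * variance(xs) + (n_y - 1) * variance(ys)) / (n_x + n_y - 2)
    return (mean(xs) - mean(ys)) / sqrt(pooled_var)


# Hypothetical numbers of comments posted outside office hours per week,
# before and after a switch to rapid releases (NOT data from the paper).
before = [10, 12, 14]
after = [6, 8, 10]

d = cohens_d(before, after)
print(d)
```

Cohen's d assumes roughly normal data, which is presumably why it is reported alongside the non-parametric Cliff's delta.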

Notes:

-

#### Does the release cycle of a library project influence when it is adopted by a client project?

Reference: @fujibayashi2017a

General information:

- Name of person extracting data: Jesse Tilro
- Date form completed: 29-09-2018
- Author information: Fujibayashi, D. and Ihara, A. and Suwa, H. and Kula, R.G. and Matsumoto, K.
- Publication type: Paper in Conference Proceedings
- Conference: SANER 2017 -
  24th IEEE International Conference on Software Analysis, Evolution, and Reengineering
- Type of study: Quantitative study

What practices in release engineering does this publication mention?

- Rapid release cycles

Are these practices to be classified under dated,
state of the art or state of the practice? Why?

- State of the art and practice: practitioners currently practice it,
  researchers currently research it.

What open challenges in release engineering does this publication mention?

- Gaining an understanding of the effect of a library's release cycle on its
  adoption.

What research gaps does this publication contain?

- First step towards solving the above challenge.

Are these research gaps filled by any other publications in this survey?

- This paper

Quantitative research publications:

- Study start date: 21-07-2016 (data extraction)
- Study end date or duration:
- Population description:
- Method(s) of recruitment of participants:
- Sample size: 23 libraries, 415 client projects
- Evaluation/measurement description: Scott-Knott test to group libraries with
  similar release cycles.
- Outcomes:
    1.  There is a relationship between release cycle of a library project
        and the time for clients to adopt it: quicker release seems to be
        associated with quicker adoption.

- Limitations:
    - Small sample size
    - Not controlled for many factors
    - No statistical significance tests?
- Future research:

Notes:

- Very short, probably not very strong evidence, refer to limitations
- Nice that the focus is libraries here, very interesting population because
  most studies focus on end-user targeting software systems

#### Rapid releases and patch backouts: A software analytics approach

Reference: @souza2015a

General information:

- Name of person extracting data: Jesse Tilro
- Date form completed: 29-09-2018
- Author information: Souza, R. and Chavez, C. and Bittencourt, R.A.
- Publication type: Journal/Magazine Article
- Journal: IEEE Software
- Type of study: Quantitative case study (Mozilla Firefox)

What practices in release engineering does this publication mention?

- Rapid release
- Backing out of broken patches (patch backouts)
- Stabilization channels / monitored integration repository

Are these practices to be classified under dated,
state of the art or state of the practice? Why?

- State of the practice (case study)

What open challenges in release engineering does this publication mention?

- How rapid release cycles affect code integration, where patch backouts are
  a proxy for studying code integration
- Integrate backout rate analysis in an analytics tool to provide release
  engineers with up-to-date information on the process

Quantitative research publications:

- Study start date: first data item 30 June 2009
- Study end date or duration: last data item 17 September 2013
- Population description:
- Method(s) of recruitment of participants:
- Sample size: 43198 bug fixes; no further sample sizes of the raw data are
  reported. (Data from the Mozilla Firefox project.)
- Evaluation/measurement description: Associate commit log, bug reports and
  releases. Classify backouts. Measure rate of backouts against all fixed bugs,
  per month and per release strategy period. Test for statistical significance
  using Fisher's exact test and Wilcoxon signed-rank test.
- Outcomes:
    1.  Absolute numbers of bug fixes and backouts increased under rapid
        releases (probably the increase in regular contributors played a role,
        cannot conclude anything about workload.)
    2.  Backout rate increased under rapid releases (sheriff managed integration
        repositories may have increased the prevalence of backout culture)
    3.  Higher early backout rate and lower late backout rate indicate a shift
        towards earlier problem detection (the proportion of early backouts rose
        from 57% to 88%). The time-to-backout also dropped.
- Limitations:
    - Sample size not mentioned
    - Quite trivial statistics
- Future research:
    - Integrate backout rate analysis in an analytics tool to provide release
      engineers with up-to-date information on the process
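
To illustrate the measurement described above, the sketch below computes a backout rate per release period and a two-sided Fisher's exact test on the resulting 2x2 table. This is a hedged example, not the authors' tooling: the counts are hypothetical, the function names are assumptions, and the two-sided p-value sums all tables that are at most as likely as the observed one.

```python
from math import comb


def backout_rate(backouts, fixes):
    """Fraction of bug fixes that were backed out."""
    return backouts / fixes


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x):
        # Unnormalized hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    support = range(max(0, col1 - row2), min(row1, col1) + 1)
    extreme = sum(weight(x) for x in support if weight(x) <= observed)
    return extreme / comb(row1 + row2, col1)


# Hypothetical counts (NOT data from the paper):
# rows are release periods, columns are (backed out, not backed out).
traditional = (2, 8)
rapid = (7, 3)

print(backout_rate(traditional[0], sum(traditional)))
print(backout_rate(rapid[0], sum(rapid)))
p_value = fisher_exact_two_sided(*traditional, *rapid)
print(round(p_value, 4))
```

Integer arithmetic is used for the comparison of table weights so that equally likely tables are not separated by floating-point rounding.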

Interview triangulation:

- Explanations of quantitative outcomes:
    - larger code base and more products -> more conflicts
    - evolution of automated testing toolset -> earlier and more backouts
    - sheriff managed integration repos -> earlier and more backouts
- Explanations of impact:
    - cultural shift reduced testing efforts beforehand, and higher early
      backout rate eventually reduced the effort to integrate bug fixes for
      developers
    - given the many stabilization channels and the rarity of very late backouts
      both in traditional and rapid release cycles, changes in backouts do not
      seem to influence users' perception of quality (even though frequent
      update notifications and broken compatibilities caused upset users)

Notes:

- Also reviews existing literature well.
- Treats transitional period from traditional to rapid releases as a separate
  period.

#### Comparison of release engineering practices in a large mature company and a startup

Reference: @laukkanen2018a

General information:

- Name of person extracting data: Jesse Tilro
- Date form completed: 29-09-2018
- Author information: Laukkanen, E. and Paasivaara, M. and Itkonen, J. and Lassenius, C.
- Publication type: Journal/Magazine Article
- Journal: Empirical Software Engineering
- Type of study: Case study (2 cases)

What practices in release engineering does this publication mention?

- Continuous Integration (mainly)
- Code review
- Internal Verification Scope
- Domain Expert Testing
- Testing with customers

Are these practices to be classified under dated,
state of the art or state of the practice? Why?

-

What open challenges in release engineering does this publication mention?

- The results in this study can be verified by additional case studies
  or even surveys to close the gap in empirical research on release engineering

Quantitative research publications:

- Study start date:
- Study end date or duration:
- Data acquisition period: 22 weeks (BigCorp) and 24 weeks (SmallCorp)
- Population description:
- Method(s) of recruitment of participants:
- Sample size: 1889 builds (BigCorp) and 760 builds (SmallCorp)
- Evaluation/measurement description:
- Outcomes:
    - High internal quality standards combined with the large distributed
      organizational context of BigCorp slowed the verification process
      down and therefore had a negative impact on release capability
    - In SmallCorp, the lack of internal verification measures due to a lack
      of resources was mitigated by code review, disciplined CI and external
      verification by customers in customer environments. This allowed for
      fast release capability and gaining feedback from production.
    - Variables
        - Multiple customers -> High quality standards
        - High quality standards -> Complex CI
        - High quality standards -> Slow verification
        - Complex CI -> Undisciplined CI
        - Large distributed organization -> Undisciplined CI
        - Undisciplined CI -> Slow verification
        - Slow verification -> Slow release capability
- Limitations:
    - Only a case study, so difficult to generalize
- Future research:
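
The variable relations listed under the outcomes form a small directed graph. As an illustrative sketch (the encoding and the helper names `build_graph` and `paths` are assumptions, and the arrows are read here as "contributes to"), the graph can be traversed to list the root causes and every chain that ends in slow release capability:

```python
# Edges copied from the variable list above, lower-cased for consistency.
EDGES = [
    ("multiple customers", "high quality standards"),
    ("high quality standards", "complex ci"),
    ("high quality standards", "slow verification"),
    ("complex ci", "undisciplined ci"),
    ("large distributed organization", "undisciplined ci"),
    ("undisciplined ci", "slow verification"),
    ("slow verification", "slow release capability"),
]


def build_graph(edges):
    """Adjacency list that also contains nodes without outgoing edges."""
    graph = {}
    for cause, effect in edges:
        graph.setdefault(cause, []).append(effect)
        graph.setdefault(effect, [])
    return graph


def paths(graph, start, goal, prefix=()):
    """Yield every directed path from start to goal (graph must be acyclic)."""
    prefix = prefix + (start,)
    if start == goal:
        yield prefix
        return
    for successor in graph[start]:
        yield from paths(graph, successor, goal, prefix)


graph = build_graph(EDGES)
# Root causes are variables that never appear as an effect.
root_causes = sorted(set(graph) - {effect for _, effect in EDGES})
print(root_causes)
for path in paths(graph, "multiple customers", "slow release capability"):
    print(" -> ".join(path))
```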

Notes:

- Quantitative results triangulated with interviews

#### Modern Release Engineering in a Nutshell

Reference: @adams2016a

General information:

- Name of person extracting data: Nels Numan
- Date form completed (dd/mm/yyyy): 28/09/2018
- Publication title: Modern Release Engineering in a Nutshell
- Author information: Bram Adams and Shane McIntosh
- Conference: 23rd International Conference on Software Analysis, Evolution, and Reengineering (2016)
- Publication type: Conference paper
- Type of study: Survey

What practices in release engineering does this publication mention?

- Branching and merging
    - Software teams rely on Version Control Systems
    - Quality assurance activities like code reviews are used before doing a merge or even
      allowing a code change to be committed into a branch
    - Keep branches short-lived and merge often. If this is impossible, a rebase can be done.
    - "trunk-based development" can be applied to eliminate most branches below the master
      branch.
    - Feature toggles are used to provide isolation for new features in the absence of
      branches.
- Building and testing
    - To help assess build and test conflicts, many projects also provide "try" servers to
      development teams, which automatically run a build and test process referred to as CI.
    - The CI process often does not run the full test suite, but a representative subset.
    - The more intensive tests, such as integration, system or performance tests, typically get run
      nightly or on weekends.
- Build system:
    - GNU Make is the most popular file-based build system technology.
      Ant is the prototypical task-based build system technology.
      Lifecycle-based build technologies like Maven consider the build system of a project to
      have a sequence of standard build activities that together form a "build lifecycle."
    - "Reproducible builds" require that, for a given feature and hardware configuration of the
      code base, every build invocation yields bit-for-bit identical build results.
- Infrastructure-as-code
    - Containers or virtual machines are used to deploy new versions of the system for testing
      or even production.
    - It has been recommended that infrastructure code be stored in a separate VCS
      repository from the source code, in order to restrict access to infrastructure code.
- Deployment
    - The term "dark launching" corresponds to deploying new features without releasing them to the
      public, in which parts of the system automatically make calls to the hidden features in a way
      invisible to end users.
    - "Blue green deployment" deploys the next software version on a copy of the production
      environment, and makes this copy the main environment on release.
    - In "canary deployment" a prospective release of the software system is loaded onto a subset
      of the production environments for only a subset of users.
    - "A/B testing" deploys alternative A of a feature to the environment of a subset of the user
      base, while alternative B is deployed to the environment of another subset.
- Release
    - Once a deployed version of a system is released, the release engineers monitor telemetry data
      and crash logs to track the performance and quality of releases. Several frameworks and
      applications have been introduced for this.
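
Two of the deployment practices listed above, canary deployment and feature toggles, can be
illustrated with a minimal sketch. It is not taken from the paper: the function names, the salt
value and the percentage-based routing rule are assumptions made for illustration only.

```python
import hashlib


def bucket(user_id: str, salt: str = "release-42") -> int:
    """Map a user ID to a stable bucket in the range [0, 100).

    The salt is a hypothetical per-release value, so that different
    releases expose different subsets of users to the canary.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def route(user_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent of the users to the canary version."""
    return "canary" if bucket(user_id) < canary_percent else "stable"


def feature_enabled(toggles: dict, name: str) -> bool:
    """Feature toggle lookup: features that are not listed stay switched off."""
    return toggles.get(name, False)
```

Because the bucket is derived from a hash of the user ID, a given user keeps seeing the same
version for the duration of a release, which is one possible way to select "a subset of users".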

Are these practices to be classified under dated, state of the art or state of
the practice? Why?

- The majority of these practices are classified by the paper as state of the practice, but state
  of the art practices are also mentioned.

What open challenges in release engineering does this publication mention?

- Branching and merging
    - No methodology or insight exists on how to empirically validate the best branching structure
      for a given organization or project, and what results in the smallest amount of merge
      conflicts.
    - Release engineers need to pay particular attention to conflicts and incompatibilities caused
      by evolving library and API dependencies.
- Building and testing
    - Speeding up CI might be the major concern of practitioners. This speed-up can be achieved
      by predicting whether a code change will break the build, or by "chunking" code changes
      into groups and compiling and testing each group only once.
    - The concept of "green builds" is slowly becoming an issue, in the sense that frequent
      triggering of the CI server consumes energy.
    - Security of the release engineering pipeline in general, and the CI server in particular,
      also has become a major concern.
- Build system
    - Qualitative studies are not only essential to understand the rationale behind quantitative
      findings, but also to identify design patterns and best practices for build systems.
        - How can developers make their builds more maintainable and of higher quality?
        - What refactorings should be performed for which build system anti-patterns?
    - Identification and resolution of build bugs, i.e., source code or build specification changes
      that cause build breakage, possibly on a subset of the supported platforms.
    - Basic tools have a hard time determining what part of the system is necessary to build.
    - Studies on non-GNU Make build systems are missing.
    - Apart from identifying bottlenecks, such approaches should also suggest concrete refactorings
      of the build system specifications or source code.
- Infrastructure-as-code
    - Research on differences between infrastructure languages is lacking.
    - Best practices and design patterns for infrastructure-as-code need to be documented.
    - Qualitative analysis of infrastructure code will be necessary to understand how developers
      address different infrastructure needs.
    - Quantitative analysis of the version control and bug report systems can then help to
      determine which patterns were beneficial in terms of maintenance effort and/or quality.
- Deployment
    - More empirical studies are needed to answer questions such as:
        - Is blue-green deployment the fastest means to deploy a new version of a web app?
        - Are A/B testing and dark launching worth the investment and risk?
        - Should one use containers or virtual machines for a medium-sized web app in order to meet
          application performance and robustness criteria?
        - If an app is part of a suite of apps built around a common database, should each app be
          deployed in a different container?
    - Better tools for quality assurance are required to prevent showstopper bugs from slipping
      through and requiring re-deployment of a mobile app version (with corresponding vetting).
      These include:
        - Defect prediction (either file- or commit-based)
        - Smarter/safer update mechanisms
        - Tools for improving code review
        - Generating tests
        - Filtering and interpreting crash reports
        - Prioritization and triaging of defect reports
- Release
    - More research is needed on determining which code change is the right one to trigger a
      release, or whether a canary is good enough to be released to another data centre.
    - Questions such as the following should be investigated:
        - Should one release on all platforms at the same time?
        - In the case of defects, which platform should receive priority?
        - Should all platforms use the same version numbering, or should that be feature-dependent?
    - Research on continuous delivery and rapid releases from other systems should be explored.
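
The "chunking" idea listed under building and testing can be sketched as follows. This is an
illustration under stated assumptions, not the approach of any specific paper: the function name
and the `passes` callback are hypothetical, and the sketch assumes that a group fails exactly
when at least one of its changes breaks the build on its own.

```python
from typing import Callable, List, Sequence


def find_breaking_changes(
    changes: Sequence[str],
    passes: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Return the changes that break the build, testing groups before individuals.

    `passes` is a hypothetical callback that builds and tests a group of
    changes together and reports whether the result is green.
    """
    if not changes:
        return []
    if passes(changes):
        # One build clears the whole group at once.
        return []
    if len(changes) == 1:
        # A failing group of size one identifies the culprit.
        return [changes[0]]
    # Otherwise split the failing group in half and test each half separately.
    mid = len(changes) // 2
    return (find_breaking_changes(changes[:mid], passes)
            + find_breaking_changes(changes[mid:], passes))
```

When most changes are harmless, a whole group is cleared by a single build. The saving shrinks
as breakages become more frequent, because failing groups have to be split and rebuilt.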

What research gaps does this publication contain?

- As is common with surveys, it does not reflect the state of the field today. More quantitative
  and qualitative research has been done since, which the paper could not have included.

Are these research gaps filled by any other publications in this survey?

- An example of further research that expands on this study is @da2016a.

#### The Impact of Switching to a Rapid Release Cycle on the Integration Delay of Addressed Issues

Reference: @da2016a

General information:

- Name of person extracting data: Nels Numan
- Date form completed (dd/mm/yyyy): 28/09/2018
- Publication title: The Impact of Switching to a Rapid Release Cycle on the Integration Delay of
  Addressed Issues
- Author information: Daniel Alencar da Costa, Shane McIntosh, Uira Kulesza, Ahmed E. Hassan
- Journal: 13th Working Conference on Mining Software Repositories (2016)
- Publication type: Conference paper
- Type of study: Empirical study

What practices in release engineering does this publication mention?

- To give a context to the study, the paper describes the concept of traditional releases, rapid
  releases, their differences, and how issue reports are structured.

Are these practices to be classified under dated, state of the art or state of the practice? Why?

- State of the practice. The paper describes common practices that were in use at the time of the
  publication.

What open challenges in release engineering does this publication mention?

- The study mentions that comparing systems with different release structures is difficult, since
  one has to distinguish to what extent the results are due to the release strategy and to what
  extent they are due to intricacies of the systems or organizations themselves.

What research gaps does this publication contain?

- The main gap in this study is the specificity of the data. Only Mozilla has been considered, and
  external factors such as other organizational challenges which could have an effect on release
  time could not be included. More research that looks further into comparing this case to that of
  other organizations is needed.

Are these research gaps filled by any other publications in this survey?

-

Quantitative research publications:

- Study start date: Used data starts from 1999
- Study end date or duration: Used data ends in 2010
- Population description: The paper describes its data collection approach in multiple steps. It
  collected the date and version number of each Firefox release. Tags within the VCS were used to
  link issue IDs to releases. The paper discards issues that are potential false positives: IDs
  that have fewer than five digits, issues that refer to tests instead of bugfixes, and any
  potential ID that is the name of a file. Since the commit logs are linked to the VCS tags,
  the paper is able to link the issue IDs found within these commit logs to the releases that
  correspond to those tags.
- Method(s) of recruitment of participants: Firefox release history wiki and VCS logs
- Sample size: 72114 issue reports from the Firefox system (34673 for traditional releases and
  37441 for rapid releases)
- Evaluation/measurement description: The paper aims to answer three research questions:
    - Are addressed issues integrated more quickly in rapid releases?
        - Approach: Through beanplots to compare the distributions, the paper first observes the
          lifetime of the issues of traditional and rapid releases. Next, it looks at the time span
          of the triaging, fixing, and integration phases within the lifetime of an issue.
    - Why can traditional releases integrate addressed issues more quickly?
        - Approach: the paper groups traditional and rapid releases into major and minor releases
          and studies their integration delay through beanplots, Mann-Whitney-Wilcoxon tests,
          Cliff's delta, and the median absolute deviation (MAD).
    - Did the change in the release strategy have an impact on the characteristics of delayed
      issues?
        - Approach: the paper builds linear regression models for both release approaches. First,
          the paper estimates the degrees of freedom that can be spent on the models. Second, it
          checks for metrics that are highly correlated using Spearman rank correlation tests
          and performs a redundancy check to remove redundant metrics. The paper then assesses the
          fit of the models using the ROC area and the Brier score. The ROC area is used to
          evaluate the degree of discrimination achieved by the models. The Brier score is used to
          evaluate the accuracy of probabilistic predictions. The metrics used include reporter
          experience, resolver experience, issue severity, issue priority, project queue rank,
          number of impacted files and fix time. A full list of metrics can be found in Table 2 of
          the paper.
- Outcomes:
    - Are addressed issues integrated more quickly in rapid releases?
        - Results: There is no significant difference between traditional and rapid releases
          regarding issue lifetime.
    - Why can traditional releases integrate addressed issues more quickly?
        - Results: Minor-traditional releases tend to have less integration delay than
          major/minor-rapid releases.
    - Did the change in the release strategy have an impact on the characteristics of delayed
      issues?
        - Results: The models achieve a Brier score of 0.05-0.16 and ROC areas of 0.81-0.83.
          Traditional releases prioritize the integration of backlog issues, while rapid releases
          prioritize the integration of issues of the current release cycle.
- Limitations: Defects in the tools that were developed to perform the data collection and
  evaluation could have an effect on the outcomes. Furthermore, the way that issue IDs are linked
  to releases may not represent the total addressed issues per release. The results cannot be
  generalized as the evaluation was solely done on the Firefox system.
- Future research: Further research can look into applying the same evaluation strategy to other
  organizations that switched from traditional to rapid release.
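
Two of the statistics named in the approach for the second research question can be sketched in
a few lines. This is a minimal illustration, not the paper's tooling: it reads MAD as the
unscaled median absolute deviation, which is an assumption, and the function names are
hypothetical.

```python
from statistics import median
from typing import Sequence


def cliffs_delta(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Effect size in [-1, 1]: how often values in xs exceed values in ys.

    A value of 0 means the two groups overlap completely, while -1 or 1
    means every value of one group lies below or above the other group.
    """
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def mad(values: Sequence[float]) -> float:
    """Median absolute deviation around the median (no scaling constant)."""
    center = median(values)
    return median([abs(v - center) for v in values])
```

With integration delays per release type as input, `cliffs_delta(minor_traditional, rapid)`
would quantify how consistently one group has shorter delays than the other, and `mad` would
describe the spread of each group without being dominated by outliers. Both variable names are
placeholders.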

Notes:
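
The two fit measures used for the third research question can be computed as sketched here.
This is an illustrative sketch that assumes binary labels (1 for a delayed issue, 0 otherwise)
and predicted probabilities. The function names are hypothetical and the code is not the
paper's implementation.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and outcome.

    0 is a perfect score; lower values indicate more accurate probabilities.
    """
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def roc_area(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed by pairwise comparison.

    It equals the share of (positive, negative) pairs in which the positive
    case receives the higher probability; ties count as half a win.
    """
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

An ROC area of 0.5 corresponds to random guessing and 1.0 to perfect discrimination, which
places the reported values of 0.81-0.83 well above chance.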

#### An Empirical Study of Delays in the Integration of Addressed Issues

Reference: @da2014a

General information:

- Name of person extracting data: Nels Numan
- Date form completed (dd/mm/yyyy): 29/09/2018
- Publication title: An Empirical Study of Delays in the Integration of Addressed Issues
- Author information: Daniel Alencar da Costa, Surafel Lemma Abebe, Shane McIntosh, Uira Kulesza,
  Ahmed E. Hassan
- Journal: 2014 IEEE International Conference on Software Maintenance and Evolution
- Publication type: Conference paper
- Type of study: Empirical study

What practices in release engineering does this publication mention?

- This publication discusses the usage of issue tracking systems and defines the term "issue" to
  provide context for the study.

Are these practices to be classified under dated, state of the art or state of
the practice? Why?

- State of the practice.

What open challenges in release engineering does this publication mention?

- The results based on the investigated open source projects may not be generalizable and
  replication of the study is required on a larger set of projects to form a more general
  conclusion. Another challenge is finding metrics that are truly correlated with the integration
  delay of issues.

What research gaps does this publication contain?

- See the previous question.

Are these research gaps filled by any other publications in this survey?

- @da2016a

Quantitative research publications:

- Study start date:
- Used data start dates:
    - ArgoUML: 18/08/2003
    - Eclipse: 03/11/2003
    - Firefox: 05/06/2012
- Used data end dates:
    - ArgoUML: 15/12/2011
    - Eclipse: 12/02/2007
    - Firefox: 04/02/2014
- Population description:
- Method(s) of recruitment of participants: The data was collected from both ITSs and VCSs of the
  studied systems.
- Sample size: 20,995 issues from ArgoUML, Eclipse and Firefox projects
- Evaluation/measurement description:
    - How long are addressed issues typically delayed by the integration process?
        - Approach: models are created using metrics from four dimensions: reporter, issue,
          project, and history. Please refer to Table 2 in the paper for all of the metrics
          considered. The models are trained using the random forest technique. Precision, recall,
          F-measure, and ROC area are used to evaluate the models.
- Outcomes:
    - How long are addressed issues typically delayed by the integration process?
        - Addressed issues are usually delayed in a rapid release cycle. Many delayed issues were
          addressed well before releases from which they were omitted.
    - Can we accurately predict when an addressed issue will be integrated?
        - The prediction models achieve a weighted average precision between 0.59 and 0.88 and a
          recall between 0.62 and 0.88, with ROC areas above 0.74. The models achieve better
          F-measure values than Zero-R.
    - What are the most influential attributes for estimating integration delay?
        - The integrator workload has a bigger influence on integration delay than the other
          attributes. Severity and priority have little influence on issue integration delay.
- Limitations: See open challenges.
- Future research: See open challenges.
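
The evaluation measures reported above (weighted average precision, recall and F-measure,
compared against a Zero-R baseline) can be sketched as follows. This is a minimal illustration
with hypothetical function names and class labels, not the tooling used in the paper.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def weighted_scores(actual: Sequence[str],
                    predicted: Sequence[str]) -> Tuple[float, float, float]:
    """Precision, recall and F-measure, averaged over classes by class size."""
    support = Counter(actual)
    total = len(actual)
    precision = recall = f_measure = 0.0
    for label, count in support.items():
        true_pos = sum(1 for a, p in zip(actual, predicted) if a == p == label)
        predicted_count = sum(1 for p in predicted if p == label)
        p_l = true_pos / predicted_count if predicted_count else 0.0
        r_l = true_pos / count
        f_l = 2 * p_l * r_l / (p_l + r_l) if (p_l + r_l) else 0.0
        weight = count / total  # larger classes contribute more to the average
        precision += weight * p_l
        recall += weight * r_l
        f_measure += weight * f_l
    return precision, recall, f_measure


def zero_r(train_labels: Sequence[str], n: int) -> List[str]:
    """Zero-R baseline: always predict the most frequent training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return [majority] * n
```

Zero-R ignores all issue attributes, so a prediction model is only informative if it
outperforms this baseline, which is the comparison the outcome above refers to.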

Notes:

#### Towards Definitions for Release Engineering and DevOps

Reference: @dyck2015a

General information:

- Name of person extracting data: Nels Numan
- Date form completed (dd/mm/yyyy): 30/09/2018
- Publication title: Towards Definitions for Release Engineering and DevOps
- Author information: Andrej Dyck, Ralf Penners, Horst Lichter
- Journal:
- Publication type:
- Type of study: Survey

What practices in release engineering does this publication mention?

- This paper talks about approaches to improve the collaboration between development and IT
  operations teams, in order to streamline software engineering processes. The paper proposes
  definitions for release engineering and DevOps.

Are these practices to be classified under dated, state of the art or state of the practice? Why?

- Not applicable.

What open challenges in release engineering does this publication mention?

- The paper mentions that a definition which is uniform and valid for many situations is
  difficult to find and that further research is needed.

What research gaps does this publication contain?

- This paper aims to form a uniform definition for release engineering and DevOps, in collaboration
  with experts. It is unclear how many experts were consulted for this definition, and more
  consultations and research could be done to further improve the definition.

Are these research gaps filled by any other publications in this survey?

-

Quantitative research publications:

- Study start date:
- Study end date or duration:
- Population description:
- Method(s) of recruitment of participants:
- Sample size:
- Evaluation/measurement description:
- Outcomes:
- Limitations:
- Future research:

Notes:

#### Continuous deployment of software intensive products and services: A systematic mapping study

Reference: @rodriguez2017a

General information:

- Name of person extracting data: Nels Numan
- Date form completed (dd/mm/yyyy): 30/09/2018
- Publication title: Continuous deployment of software intensive products and services: A
  systematic mapping study
- Author information: Pilar Rodríguez, Alireza Haghighatkhah, Lucy Ellen Lwakatare, Susanna
  Teppola, Tanja Suomalainen, Juho Eskeli, Teemu Karvonen, Pasi Kuvaja, June M. Verner,
  Markku Oivo
- Journal:
- Publication type:
- Type of study: Systematic mapping study

What practices in release engineering does this publication mention?

- This paper discusses the developments of continuous deployment over the years until June 2014.
  The paper performs a systematic mapping study to identify, classify and analyze primary studies
  related to continuous deployment. The paper has found the following major points:
    - Almost all primary studies refer in one way or another to accelerating the release
      cycle by shortening the release cadence and turning it into a continuous flow.
    - Some reviewed publications claim that accelerating the release cycle can make it harder to
      perform re-engineering activities.
    - CD challenges and changes traditional planning towards continuous planning in order to
      achieve fast and frequent releases.
    - Tighter integration between planning and execution is required in order to achieve a more
      holistic view on planning in CD.
    - It is important for the engineering and QA teams to ensure backward compatibility of
      enhancements, so that users perceive only improvements rather than experience any loss of
      functionality.
    - Code change activities tend to focus more on bug fixing and maintenance than functionality
      expansion.
    - The architecture must be robust enough to allow the organization to invest its resources in
      offensive initiatives such as new functionality, product enhancements and innovation rather
      than defensive efforts such as bugfixes.
    - A major challenge in CD is to retain the balance between speed and quality. Some approaches
      reviewed by this study propose a focus on measuring and monitoring source code and
      architectural quality.
    - To avoid issues such as duplicated testing efforts and slow feedback loops it is important to
      make all testing activities transparent to individual developers.

What open challenges in release engineering does this publication mention?

- Continuous and rapid experimentation is an emerging research topic with many possibilities for
  future work. This is why it's important to keep up with the newly contributed studies and add
  them to future reviews to compare their findings.

What research gaps does this publication contain?

-

Notes:

#### Frequent Releases in Open Source Software: A Systematic Review

Reference: @cesar2017a

General information:

- Name of person extracting data: Nels Numan
- Date form completed (dd/mm/yyyy): 30/09/2018
- Publication title: Frequent Releases in Open Source Software: A Systematic Review
- Author information: Antonio Cesar Brandão Gomes da Silva, Glauco de Figueiredo Carneiro, Fernando
  Brito e Abreu and Miguel Pessoa Monteiro
- Journal: Information
- Publication type: Journal
- Type of study: Survey

What practices in release engineering does this publication mention?

- This paper discusses the developments of continuous development over the years. The paper
  performs a systematic review to identify, classify and analyze primary studies related to
  continuous development. The paper finds:
    - Two main motivations for the implementation of frequent software releases in the context of
      OSS projects: project attractiveness/increase of participants, and maintenance
      and increase of market share.
    - Four main strategies are adopted by practitioners to implement frequent software releases in
      the context of OSS projects: time-based release, automated release, test-driven development
      and continuous delivery/deployment.
    - The main positive points associated with rapid releases are: quick return on customer needs,
      rapid delivery of new features, quick bug fixes, immediate release of security patches,
      increased efficiency, entry of new collaborators, and greater focus on quality on the part of
      developers and testers.
    - The main negative points associated with rapid releases are reliability of new versions,
      increase in the "technical debt", pressure felt by employees and community dependence.

Are these practices to be classified under dated, state of the art or state of
the practice? Why?

- The practices discussed are a combination of state of the art
  and state of the practice approaches.

What open challenges in release engineering does this publication mention?

- A meta-model for the mining of open source bases in view of gathering data that leads to
  assessment of the quality of projects adopting the frequent release approach.

What research gaps does this publication contain?

-

Are these research gaps filled by any other publications in this survey?

-