Scaling continuous integration to the enterprise
== Enterprise Scale Continuous Integration ==
These are the notes captured from the discussions. We started by describing the problems we had seen in trying to run continuous integration on very large code bases, and in large organizations that are moving from monthly or weekly integration cycles to continuous integration.
'''Problem Definition'''
* 300 devs, 1 build break per year per developer means:
** the build will be broken every day (300 breaks spread over roughly 250 working days is more than one per day; see the sketch after this list)
** slows us down
** distrust of the source master
** various defensive behaviors so programmers can get work done even though the source master is frequently broken
*** Only sync from the master when it is "known good"
*** Private branches to insulate from noise on the master
*** Project branches to insulate from noise on the master
* 2 hour build, 3 day acceptance test
** A 2 hour build means that the time between a submit and feedback on the result of the build could be as much as 4 hours (I just missed the start of the current build, so my build will start in almost two hours and will take two more hours to confirm success or failure)
* hard to assign blame for a failure when there are multiple commits per build (many developers may submit during the 2 hour build window, so it becomes harder to diagnose which submit caused a build failure)
* long cycle time on failure (hours pass before you know you broke something)
* failures affect more people and are more expensive
* Understanding the root cause of a failure is not obvious
* How to handle 300 applications, each with a few devs: how to scale to many projects and still manage them at an enterprise level
* How to manage many branches merging to many mainlines
* Managing build time dependencies (unexpected, undetected coupling)
** incorrect incremental builds
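
The arithmetic behind the first two bullets is worth making explicit. Below is a minimal Python sketch of it; the inputs (300 developers, one break per developer per year, a serialized two hour build) come from the discussion, while the 250-day working year is an assumed figure added for illustration.

<pre>
# Minimal sketch of the arithmetic above. The 250-day working year is
# an assumption; the other numbers come from the discussion.
developers = 300
breaks_per_dev_per_year = 1
working_days = 250

breaks_per_day = developers * breaks_per_dev_per_year / working_days
print(f"expected build breaks per working day: {breaks_per_day:.1f}")  # ~1.2

# Worst-case feedback latency with a serialized 2 hour build: you just
# missed the start of the current build, so you wait for it to finish
# (2 h) and then for your own build to run (2 h).
build_hours = 2
print(f"worst-case submit-to-feedback time: {build_hours + build_hours} hours")
</pre>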
'''Addressing the problems, alternatives, risks, and trade-offs'''
* subcomponents
** reduces build time, but
** increases integration time
* build acceleration technology
** parallel builds across multiple machines and multiple cores (Electric Accelerator, for instance); see the sketch after this list
** buy fast machines (although disk I/O may dominate)
* modularize so teams consume a recent successful build of other components instead of compiling them from source
** faster, and less to build (narrows the impact of a failure to a smaller team)
* Use a "pre-flight" build (a production build including many changes that are not yet on the source master)
** integration race conditions
** faster hardware
** parallel builds
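
To make the parallel-build idea concrete, here is a minimal Python sketch that builds independent subcomponents concurrently in a thread pool. The component names and the <code>make</code> invocation are hypothetical, not from the discussion; a tool such as Electric Accelerator goes further and parallelizes targets within a single build.

<pre>
# Hypothetical sketch: build independent subcomponents in parallel.
# Component names and the build command are invented for illustration.
import subprocess
from concurrent.futures import ThreadPoolExecutor

COMPONENTS = ["core", "persistence", "web-ui", "reports"]  # hypothetical

def build(name):
    """Run one component's build; threads suffice here because the real
    work happens in the child process."""
    result = subprocess.run(["make", "-C", name])
    return name, result.returncode == 0

with ThreadPoolExecutor(max_workers=4) as pool:
    for name, ok in pool.map(build, COMPONENTS):
        print(f"{name}: {'OK' if ok else 'FAILED'}")
</pre>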
'''Alternatives (2)'''
* 3 day acceptance test
** throw bodies at the problem (but that does not scale)
** review the acceptance process for automation opportunities
** increase automated testing inside the application (at the interfaces)
** modularize tests and make them independent so they can run in parallel (see the sketch after this list)
** run human acceptance tests less frequently, with automation running continuously
** use assistive automation to support more effective exploratory testing
*** Brian Marick has some work going in this area
*** Michael Bolton describes his use of Watir as assistive automation
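
As a sketch of what "independent tests run in parallel" can look like, the following assumes the acceptance suite has already been split into modules that share no database state and make no ordering assumptions. The suite paths and the choice of pytest as the runner are assumptions added for illustration.

<pre>
# Hypothetical sketch: run independent acceptance-test modules in parallel.
# Suite paths and the runner (pytest) are assumptions; independence of the
# suites is what makes parallel execution safe.
import subprocess
from concurrent.futures import ThreadPoolExecutor

SUITES = ["tests/orders", "tests/billing", "tests/search"]  # hypothetical

def run_suite(path):
    # Each suite runs in its own child process against its own fixtures.
    return path, subprocess.run(["python", "-m", "pytest", path]).returncode == 0

with ThreadPoolExecutor() as pool:
    failed = [path for path, ok in pool.map(run_suite, SUITES) if not ok]

print("all suites passed" if not failed else "failed: " + ", ".join(failed))
</pre>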
Lisa Crispin suggested that Jared Richardson had done the continuous integration work for SAS and might share insights and ideas.
'''Alternatives (3)'''
300 applications, small teams on each
* Either many independent CI systems or a single enterprise CI system, weighing:
** unified view
** shared configuration
** reuse between teams
** security
** usability for small teams
* Dependency management
** component level dependencies managed by tools
*** Anthill / Codestation
*** Maven
*** Ivy
** scheduling builds: which build should be run first?
** how do I express the rules by which I select a component? (see the sketch after this list)
*** version (a specific version, a pattern match on the version, a relational operator against the version string, etc.)
*** acceptance test results
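
Here is a minimal sketch of such selection rules, assuming an invented rule syntax (an exact version, a wildcard pattern, or a <code>>=</code> comparison) and an invented list of candidate versions tagged with their acceptance-test results. Ivy and Maven each have their own, different, notations for version ranges; this only illustrates the idea.

<pre>
# Hypothetical sketch of component-selection rules: pick the newest
# available version that matches a rule AND has passed acceptance tests.
# The data and rule syntax are invented for illustration.
import fnmatch

# (version, passed_acceptance) pairs for an imaginary component
AVAILABLE = [("1.2.0", True), ("1.3.0", True), ("1.4.0", False), ("1.4.1", True)]

def version_key(v):
    return tuple(int(part) for part in v.split("."))

def matches(rule, version):
    if rule.startswith(">="):        # relational operator on the version
        return version_key(version) >= version_key(rule[2:])
    if "*" in rule:                  # pattern match on the version
        return fnmatch.fnmatch(version, rule)
    return version == rule           # a specific version

def select(rule):
    candidates = [v for v, passed in AVAILABLE if passed and matches(rule, v)]
    return max(candidates, key=version_key) if candidates else None

print(select("1.3.0"))   # -> 1.3.0  (exact version)
print(select("1.4.*"))   # -> 1.4.1  (pattern; 1.4.0 failed acceptance)
print(select(">=1.3"))   # -> 1.4.1  (relational operator plus test results)
</pre>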