#TStech: Managing Jenkins job configurations with Puppet

When you have a Jenkins server in your pipeline that builds or tests branches of code, the number of jobs you have to manage very quickly becomes too large to handle manually. At Tradeshift, our Jenkins server builds and tests many different branches of 10+ application components, and runs integration tests on the combined set of components for each branch.

Virtually any change to the configuration of our pipeline touches all jobs, meaning we currently have to update 70-100 job configurations just to roll out a new job priority or add an email notification receiver. Moreover, as a lean startup, we refocus development efforts quite frequently to work on the stuff that matters. Every time we refocus, job names, dependencies and properties change, which means creating new jobs and making numerous changes to existing ones.

The need to automate

Instead of making all these redundant changes manually in the Jenkins UI, we use a module in our Puppet infrastructure that handles them for us automatically.

We have a core set of build and test jobs which take a git commit hash as a parameter. The build jobs are triggered by branch- and component-specific trigger jobs, which monitor the branches in the git repositories, translate branch HEADs into git hashes, handle pre- and post-build dependencies, trigger the build jobs with the correct git hashes, send out email notifications, and collect and archive test reports and build artifacts.

As the build jobs do not change very often, nor share much configuration, they are not managed by Puppet. All the other jobs, however, are.

One of the reasons we don't have just one trigger job per build job with wildcard monitoring on the git repositories is that for a subset of our branches, we need to trigger a build of a certain branch of our integration test when a developer pushes to that branch of a component.

Also, build history and build state changes (e.g. from successful to failed) become unusable if multiple branches are built by the same job. Finally, recording the history of when a test failure first appeared is only useful if the test runs on the same branch every time, which it does not in a generic, parameterized job.

To manage all the necessary jobs, we've built a single Puppet module that produces and updates a config.xml for all our managed jobs using only three templates (one per job type: unit test, integration test, or build trigger).

Our Puppet module

In our jenkins_jobs Puppet module, we define a job resource:

class jenkins_jobs {
  $jenkins_home     = '/var/lib/jenkins'
  $conf_dir         = "${jenkins_home}/confs"
  $jenkins_jobs_dir = "${jenkins_home}/jobs"  # Jenkins' job directory

  define job($branch = undef, $component = undef, $type) {
    $job_name      = $name
    $job_dir       = "${jenkins_jobs_dir}/${job_name}"
    $config_file   = "${job_dir}/config_by_puppet.xml"
    $template_file = "jenkins_jobs/${type}-config.xml.erb"

    file { $config_file:
      content => template($template_file),
      require => File[$conf_dir],
      owner   => 'jenkins',
      group   => 'jenkins',
      notify  => Exec["${job_name}-update"],
    }
    ...
  }
}

To materialize the configurations, we define an aggregate resource that declares which jobs to create for a specific branch. The aggregate resource is then realized with a list of branches:

define branch_jobs() {
  $branch = $name

  job { "frontend-build-trigger-${branch}":
    branch    => $branch,
    component => 'frontend',
    type      => 'build-trigger',
  }

  # ... more jobs ...

  job { "${branch}-ubl-unittest":
    branch    => $branch,
    component => 'ubl',
    type      => 'unittest',
  }

  # ...

  job { "integration-test-build-${branch}":
    branch => $branch,
    type   => 'integration-test-build',
  }

  # ...
}

branch_jobs { ['master', 'team1branch', 'team2branch', ...]:
  require => Service[jenkins],
}

In the job resource, we can then configure the properties and conditions that separate the configurations for the different branches.

For example, job priority by branch:

$priority_by_branch = {
  'master'      => '150',
  'team1branch' => '100',
  ...
}

$output_priority = $priority_by_branch[$branch]

We employ the $output_ prefix pattern for variables used in templates, since special conditions might dictate special values that are not easily stored in predefined lists, and variables cannot be re-assigned in Puppet.
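As a minimal sketch of how such a special case folds into the single assignment (the build-trigger override here is a hypothetical example, not our actual rule):

```puppet
$priority_by_branch = {
  'master'      => '150',
  'team1branch' => '100',
}

# Hypothetical special case: trigger jobs always jump the queue;
# all other job types fall back to the per-branch priority.
case $type {
  'build-trigger': { $output_priority = '200' }
  default:         { $output_priority = $priority_by_branch[$branch] }
}
```

Because each $output_ variable is assigned exactly once, the templates never need to know which rule produced the value.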

We then use these properties in the config.xml templates:

<?xml version='1.0' encoding='UTF-8'?>
<project>
...
  <properties>
    <hudson.queueSorter.PrioritySorterJobProperty>
      <priority><%= output_priority %></priority>
    </hudson.queueSorter.PrioritySorterJobProperty>
...
</project>

Whenever something in a managed configuration needs to change, it is either updated in the puppet module manifest or in a template.
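For instance, rolling out a per-branch email notification receiver (one of the changes mentioned above) could look like this in the manifest, paired with a matching `<recipients><%= output_recipients %></recipients>` element in the trigger-job template. The hash name and addresses are hypothetical:

```puppet
# Hypothetical per-branch notification receivers, resolved in the
# job resource and exposed to the ERB template as $output_recipients.
$recipients_by_branch = {
  'master'      => 'release@example.com',
  'team1branch' => 'team1@example.com',
}
$output_recipients = $recipients_by_branch[$branch]
```
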

Update jobs automatically upon changes

To update jobs after changes, the job resource contains the following two exec resources, which are notified whenever the $config_file content changes and push the updated configuration to the Jenkins API.

If the config.xml does not exist in the first place, the job itself is created via the API, allowing new jobs to be created. Since all of this happens via the Jenkins API, no configuration reload or Jenkins restart is required to roll out configuration updates.

exec { "${job_name}-update":
  command     => "curl -u \"username:password\" -X POST --data-binary @${config_file} \"http:///job/${job_name}/config.xml\"",
  path        => '/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin',
  user        => 'jenkins',
  cwd         => '/var/lib/jenkins',
  refreshonly => true,
  require     => Exec["${job_name}-create"],
}

exec { "${job_name}-create":
  command => "curl -u \"username:password\" -X POST --data-binary @${config_file} -H \"Content-Type:text/xml\" \"http:///createItem?name=${job_name}\"",
  path    => '/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin',
  user    => 'jenkins',
  cwd     => '/var/lib/jenkins',
  creates => "${job_dir}/config.xml",
}

In conclusion

All in all, it's really neat to have managed job configurations: we don't introduce regressions through copy/paste, and we don't waste time manually updating a lot of redundant configurations. The fact that we can control which jobs are enabled and disabled, based on knowledge about branches, components and job types directly in the Puppet module, is another big advantage that allows for quick reconfigurations.
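Gating a job type on branch knowledge amounts to a small condition in the aggregate resource. A sketch, with a hypothetical branch list:

```puppet
# Hypothetical gate: only long-lived branches get integration-test jobs,
# so short-lived feature branches don't clutter Jenkins.
if $branch in ['master', 'team1branch'] {
  job { "integration-test-build-${branch}":
    branch => $branch,
    type   => 'integration-test-build',
  }
}
```
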

The downside of this approach is that manual changes are overwritten as soon as something is rolled out by Puppet. This means that any permanent change needs to go into the Puppet module through our operations team, which can become a bottleneck over time.
Also, whenever a new component or job type in the pipeline gets branched, we need to puppetize its configuration, which incurs some overhead. However, so far the benefits have by far outweighed the downsides.

Next up: How we use Nexus in our build pipeline to build and test once on all pushes to git, which in turn allows us to verify multi-repository pull requests in our release pipeline.

About the Author

Anders Nickelsen

Anders is a quality assurance engineer at Tradeshift, optimizing the performance of the Tradeshift platform and its software release process. He holds a PhD in service migration in dynamic and resource-constrained networks and tweets as @anickelsen.