#release-engineering (2020-12)
All things CI/CD. Specific emphasis on Codefresh and CodeBuild with CodePipeline.
CI/CD Discussions
Archive: https://archive.sweetops.com/release-engineering/
2020-12-10
I have an interesting issue and would love some input on how to continue troubleshooting: we have a jenkins pipeline that runs hourly, at the top of the hour, and for a while now it has reliably failed every job it starts at midnight, recovering on the following run. The reason for failing also differs from time to time: one time “docker login” failed, another time the “helm” command timed out, or there was an error about port forwarding. The failures are not seen during the rest of the day, only during that one execution at midnight. Any pointers would be wonderful!
so it sounds like a networking issue
is that on-prem or cloud?
timeouts, port forwarding, docker login: they all need to connect somehow
if it were on-prem, I would guess maybe an automated backup of some router that needs to do a failover, and it happens to run at the same time as the jenkins job
a NAT device losing its ARP tables… way too many possibilities
or it could be a simple DNS issue?
if I were you, I would install some network monitoring software on the jenkins host, capture network traffic, and run a continuous test of DNS lookups and connectivity to external endpoints and internal gateways every 15 seconds or so; record it for a day and then analyse all of that
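A minimal sketch of such a probe, assuming Python is available on the Jenkins host; the hostnames, ports, and gateway addresses below are placeholders for whatever docker login, helm, and the port-forwards actually talk to:

```python
import datetime
import socket
import time

# Placeholders: substitute the registry, chart repo, and gateways
# that the midnight job actually depends on.
DNS_NAMES = ["localhost", "registry.example.com", "charts.example.com"]
TCP_TARGETS = [
    ("127.0.0.1", 8080),            # Jenkins itself, to rule out local issues
    ("registry.example.com", 443),  # docker login endpoint (assumed)
    ("10.0.0.1", 443),              # internal gateway (assumed)
]

while True:
    stamp = datetime.datetime.now().isoformat()
    for name in DNS_NAMES:
        try:
            print(stamp, "dns", name, "->", socket.gethostbyname(name))
        except OSError as err:
            print(stamp, "dns", name, "FAILED:", err)
    for host, port in TCP_TARGETS:
        try:
            with socket.create_connection((host, port), timeout=5):
                print(stamp, "tcp", f"{host}:{port}", "ok")
        except OSError as err:
            print(stamp, "tcp", f"{host}:{port}", "FAILED:", err)
    time.sleep(15)  # sample every 15s; grep the log around 00:00
```

Run it under nohup or a systemd unit for a day, then diff the entries around midnight against the rest of the day.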
Thanks a bunch for the questions and tips. I’ll definitely go for the network monitoring testing, and work from that!
if you add tests to different IPs
make sure to also test localhost and the jenkins server’s own IP
to rule out any internal issue
What if it ran at a different time, not at 00:00? There could be lots of time-of-day factors, including simple resource contention with other things that also start at midnight. If that fixes it, then that narrows down your search space.
I’m wondering what the best practices are, if any, for managing the version of an app as a whole in a microservices world. Say we have tens or hundreds of microservices and a monolith. In the past, we versioned the monolith and that was the version of the app. Dead simple. Now every microservice gets its own version, since each of them has its own lifecycle. And still, by inertia, people tend to consider the version of the monolith as the version of the whole system, which is only partially true now.
What are your thoughts on this?
One thing that came to my mind is the following. Since we define all deployed releases via some tooling (helmfile, for example) and store this in git, we can make a reference to a snapshot of the repo with helmfile values (a commit hash or a tag), which represents a set of services at certain versions. And we can label this with a good old monolith-style version which everybody is aware of, or a calendar version.
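For illustration, a rough sketch of that idea, assuming Python and a local clone of the helmfile-values repo; the path and the calver tag scheme are made up, not from any real setup:

```python
import datetime
import subprocess

def tag_system_release(helmfile_repo: str) -> str:
    """Stamp the deploy-config repo with a calendar-version tag, so the
    human-friendly "app version" resolves to one commit that pins every
    microservice's release."""
    tag = datetime.date.today().strftime("v%Y.%m.%d")
    subprocess.run(["git", "-C", helmfile_repo, "tag", tag], check=True)
    subprocess.run(["git", "-C", helmfile_repo, "push", "origin", tag], check=True)
    return tag

print(tag_system_release("./helmfile-values"))  # e.g. v2020.12.10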
“why does the version matter to anyone?” is usually the question i would start with to figure out what makes sense as a version
that’s not a rhetorical question either; why do people look at the version and what do they want to learn when they do?
Mm, we thought it was too hard to version everything and couldn’t see the value, so we only version the microservices themselves, by git sha.
If dev wants to run everything, as is typically the case for local dev, then they use the latest tag in docker-compose and similar. They can obviously pin as need be.
I’ve been pushing for services to have a ‘version’ only in the sense that it helps them say ‘deploy based on this commit ref’. System ‘version’ is meaningless and just serves as a talking point for Product when dealing with clients. Product was very concerned about knowing what ‘release’ we’re on, and I told them they could just make something up
Yes it’s mostly for POs and clients. They want to differentiate a “state” of a whole system, especially when it’s an enterprise and the system is deployed per client.
2020-12-11
Semver is nice, and in theory tells you if anything has changed that you should be concerned about… in reality it doesn’t get abided by, so… for our services we moved to tagging images with the short hash of the git commit that produced them:
What generated this screenshot? It’s pretty cool. I use gitversion to do automatic semver versioning, so it’s almost the same: you have to bump major versions, but all minor versions and patches are automatically calculated.
Doesn’t work very well in a monorepo though :-)
monorepos and java projects with parent POMs are a different creature… and bring back a little PTSD from code shared across boundaries in all the worst ways…
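A rough sketch of that short-hash tagging scheme; the registry and image name are placeholders:

```python
import subprocess

# Tag the image with the short hash of the commit that produced it.
sha = subprocess.check_output(
    ["git", "rev-parse", "--short", "HEAD"], text=True
).strip()
image = f"registry.example.com/myapp:{sha}"  # placeholder registry/name

subprocess.run(["docker", "build", "-t", image, "."], check=True)
subprocess.run(["docker", "push", image], check=True)
```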
Anybody using the GitHub Actions deployment feature? I use azure devops, but I’m the only one on the team that uses it actively. I was thinking about leveraging it for some more controlled, pull-request-driven terraform deployments. Curious if anybody’s evaluated the pros and cons. I think the deployment capability has been there for a while, but now they have the new visual feature for it, I guess.
Are you referring to GitHub Actions, or something else?
Yes.
I’m most familiar with azure devops, and I find that the central management is easier to control. However, because my repos are in GitHub, I’m trying to work towards more pull request integrations that comment back on the work, as well as some simple package and deployment features. I haven’t heard quite as much about the deployment feature and how smooth a process it is.
Oh, the deployment itself I have less experience with. However, if you’re looking to integrate security analysis (github action), I have experience on that front.
I love GitHub Actions, but holy moly, I really want manual approvers before I deploy into an environment!!!! My typical workflow would send my pull request to a dynamically provisioned PR-specific staging environment and then through test, staging, and production. And I totally need manual approvers between environments!
Requires Enterprise plan for private repos, per the lead dev at Github Universe when I asked.
I really hope that’s not the case
https://github.com/githubevents/universe2020
Kind of weird wording but it seemed like ‘enterprise only’ was the answer
All repos, both public and private, across all GitHub SKUs will get support for deployment history tracking. Public repos on all plans will have full support for Environments, protection rules, and Secrets. Only private repos under an Enterprise plan will have the ability to use those features.
oh and the first rep said ‘under Enterprise plans’ which is why I asked about the other paid tier
2020-12-14
2020-12-17
I’m not sure if this has been discussed before but couldn’t really find anything when searching… Preview environments, branch based environments, ephemeral environments… call them what you will, is anybody doing it? It’s always the dream that a developer can create a branch and some bot pings them on Slack to say to visit the magic url…
Check out the office hours recording from yesterday. Env0, one of the vendors that did a presentation, looks to have some pretty cool stuff around creating ephemeral environments with an automatic destroy after a configurable period of time
I meant to catch up on that today but my headphones died - will take a look, thanks!
Hey yeah I can also help with this @tim.j.birkett. I was doing some research on the topic the other day.
There are many vendors providing a managed solution for this, but I would say building it on your own isn’t a great pain either
Where do you stand?
Thinking of building but wanted to seek out some of the higher level design opinions. For example, you have an application made up of a few microservices and a UI.
The UI would be the simplest thing to deal with, but backend services and data sources (and other infrastructure), not so easy.
Do you clone the entire application (all microservices, UI, and data), or just the microservice being worked on? What are the patterns and views around this?
Maybe starting with a fully distributed microlith isn’t the way to think about it. Starting with something simple, a single container and a DB, like Ghost, or Wordpress perhaps…
I think, as you say, a simple FE app is the easier part. Perhaps it’s a little bit slower when spinning up a huge database. But I was only talking about the FE part when I was thinking of ephemeral environments.
I am keen to see what others think about the BE part
Perhaps a simple MVP would be a way to start with it
you can also check Jenkins X, which has native support for preview env: https://jenkins-x.io/
15 years ago I did that with PHP, lighttpd, and wildcard hostnames mapped to dynamic web roots. It’s amazing how complex the old simple things have gotten.
Can you even start a complete environment in docker compose? Start with the simple things first.
@mfridh good point, for some reason I thought OP was talking about infra. If this is just an app then this gets a heck of a lot easier with things like kubernetes, and having a docker-compose to be able to spin one up locally is a great start
Yes if your compute is containers or functions, ephemeral test environments are achievable. If you are using legacy compute (EC2) then I suggest sticking with hard coded test environments. The complexity is too high
Agree. Containerize all the things and this becomes pretty simple
(We actually use ephemeral environments with AWS Elastic Beanstalk. But I’d argue this is basically equivalent to ECS/k8s, rather than self-managed EC2)
Fargate or Beanstalk is beautiful for that. But I wouldn’t shy away from doing it with EC2 either if I had to. In some situations you could even argue plain EC2 is easier because some of the other fancy stuff even have a dependency on EC2 itself.
“So you want to solve your problem with Kubernetes? Now you have two problems” is not a completely exaggerated silliness.
I may be saying this because of the beautiful 5+ year old VPC DNS caching/forwarding solution I just modernized yesterday and today: on EC2, fully terraformed, on spot instances with ASG lifecycle hooks and all… I bloody like it, to be honest.
When it comes to ECS and fargate there are so many interesting tools I haven’t tried yet.
Here’s one I stumbled on the other day…
Quickly create disposable QA environments: https://github.com/askwonder/wonqa
I managed to have this working with a combination of terraform w/ terragrunt and ECS. Essentially, I grouped all the terragrunt.hcl files in one folder, and then I just do a cp on my CI, which then runs another apply. Not perfect, but it served its purpose.
We’ve achieved this at my work with Kubernetes and internally built tooling. We have hundreds of QA/PR/Demo/Trial sites where every engineer, designer, sales, bdr, marketing person has their own site (or multiple) “environments”, each one essentially being a group of pods deployed into a single namespace (and some aws resources that get auto-provisioned for each site via the internal tool). The internally built tool is the brains that allows granular control of a single QA or PR site (override-able env vars, feature flags, different version, etc), deploy a percentage of sites, or all of them.
It’s actually one of the things I love to highlight when chatting w/ candidates. I too am familiar with being at a company where “dev is down” or “qa is broken” brings the entire engineering team’s efficiency to a halt.
We use IaC (CloudFormation), and the first part of every pipeline is to spin up a clean infra, then deploy the microservices (all containerized) to it. Once the infra & services are deployed, then we run integration tests on the whole thing. Once those pass, we tear it down. In the meantime, we are free to hit the services manually for debugging purposes.
We do this for PRs as well as for deployments. It’s nice, repeatable, and isolated. There are a few shared resources (CloudWatch log groups, VPCs, DBs) but it’s really nice.
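A minimal sketch of that spin-up/test/tear-down loop, assuming plain boto3; the stack name, template file, and test hook are placeholders, not the actual pipeline:

```python
import boto3

def run_integration_tests(stack_name: str) -> None:
    """Placeholder: point the real test suite at the stack's outputs."""

cfn = boto3.client("cloudformation")
stack = "pr-1234-env"  # hypothetical, derived from the PR/branch

with open("infra.yaml") as tpl:  # hypothetical CloudFormation template
    cfn.create_stack(StackName=stack, TemplateBody=tpl.read(),
                     Capabilities=["CAPABILITY_IAM"])
cfn.get_waiter("stack_create_complete").wait(StackName=stack)

try:
    run_integration_tests(stack)
finally:
    # Tear down even if tests fail, so environments don't pile up.
    cfn.delete_stack(StackName=stack)
    cfn.get_waiter("stack_delete_complete").wait(StackName=stack)
```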
@Jonathan Marcus sounds good! I am curious, do you have databases that you populate with data? How long does populating the database usually take, in your experience?
When I say populate with data, I mean a meaningfully large amount of data, very close to the amount a production env might have.
–
And one more question. You say you run integration tests. Do you run them on every push, or how do you handle that?
If we abort a pipeline early then we can end up with orphaned environments, but we have a few ways to mitigate that.
• Each env is named with a hash of the branch the PR is for, and each pipeline starts by tearing down whatever is there. So if you reuse the same branch name a lot (I’m usually on jm-dev by default) then you’ll reuse the same env (see the sketch after this list).
• Our CI pipeline lets us fix errors in place, instead of starting over at each error. This lets us ensure that we can get to the end & clean up each time, so no orphaned environments. Also we don’t have to do a slow spin-up/teardown each time we want to iterate with new code.
• We also have a check that shows us each env and how old it is
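The first bullet might look something like this minimal sketch; the prefix and hash length are assumptions:

```python
import hashlib

def env_name(branch: str, prefix: str = "qa") -> str:
    """Deterministic name: the same branch always maps to the same
    environment, so a re-run tears down and reuses it instead of
    orphaning a fresh one."""
    digest = hashlib.sha1(branch.encode()).hexdigest()[:8]
    return f"{prefix}-{digest}"

print(env_name("jm-dev"))  # same branch -> same env name, every run
```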
@Christos the DB is part of the infra that is shared (along with VPC, log groups, etc). We do that because RDS takes a super long time to spin up. We therefore don’t have to load fresh data into it either.
On each PR, we:
- Deploy new infra
- Unit test microservices
- Deploy microservices to new infra
- Run integration tests against new env
- Tear down
On each push we do the same, except step 5 is replaced with 5) manual QA, 6) blue/green deployment.
And per @tim.j.birkett’s original question, we do also get a Slack notification with the URL to the new env. Yes that is possible, and yes it is as amazing as you’re hoping.
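For anyone wiring this up, the notification step can be as small as an incoming-webhook call; a sketch assuming the requests library and a placeholder Slack webhook URL:

```python
import requests

WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"  # placeholder

def notify(branch: str, env_url: str) -> None:
    # Slack incoming webhooks accept a simple {"text": ...} payload.
    requests.post(
        WEBHOOK_URL,
        json={"text": f"Environment for {branch} is ready: {env_url}"},
        timeout=10,
    )

notify("jm-dev", "https://pr-1234.example.com")  # hypothetical values
```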
I see, thanks for taking the time to explain. It makes sense about the DB, because we are having the problem you mention: spinning up the db is very slow.
We wanted to run some e2e tests on our webapps on each push to the PR, but this is very time-consuming, at least if you want to spin up a new clean db on every push the developer makes. We wanted a clean database so that tests don’t fail unexpectedly when someone has manipulated data in a shared database. This way tests always run against the same data.
Maybe try reusing the same DB instance but making a new table each time. It’ll probably be much faster to do INSERT INTO new_tbl SELECT * FROM src_tbl than to load it fresh from an external source.
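A toy illustration of that trick using sqlite3 so it runs self-contained; the table names and run id are placeholders, and a real setup would target the shared RDS instance instead:

```python
import sqlite3

conn = sqlite3.connect("qa.db")
run_id = "pr1234"  # hypothetical: one cloned table per test run

# Seed table stands in for the shared, pre-populated source data.
conn.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice'), (2, 'bob')")

# Clone the seeded table instead of reloading from an external source,
# so each run gets clean data without the slow import.
conn.execute(f"CREATE TABLE users_{run_id} AS SELECT * FROM users")
conn.commit()

# ... run tests against users_pr1234 ...

conn.execute(f"DROP TABLE users_{run_id}")  # cheap teardown
conn.commit()
```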
Right, that’s smart
Yeah, we’ve done this as well for 15 years. Also using LVM snapshots historically for repeated db-migration-script verification tests against production data snapshots. My thought is: what am I really testing? My applications? Or Amazon’s infrastructure? So I remove as much of the cloud-provider parts as possible; they get tested enough in many other ways anyway.
Nice, very clever.
@Jonathan Marcus - when you create the PR infra / environment, is it just the microservice you’re changing that gets deployed in step 3? Or all other dependencies that make up the full software “product”?
All of them. We want to test how the new microservice interacts with all the others, so for full coverage we deploy the full product.
The aws amplify console build service does this pretty well, with branches and pr preview builds
2020-12-18
2020-12-20
anyone know of tooling for managing select files across numerous github repositories? mostly thinking about a handful of files that are often identical (or nearly), like .editorconfig, LICENSE, or maybe a github actions yaml… i did find one github action, curious if anyone has experience with it or any other options… https://github.com/kbrashears5/github-action-file-sync
figuring some kind of file templating will be necessary, also…
first thing that comes to mind is a template repo and a makefile or similar to iterate over a list of concrete repos.
I recently copied Cloud Posse’s build-harness / gomplate pattern that they use for READMEs for a client to handle this type of thing. I use it to template drone.yml (Drone CI, similar to Jenkinsfile), gitignore, and .editorconfig files right now across 30+ repos.
It solves the problem of allowing centralized management of files that are cross cutting over many projects, but you do still have to execute commands against the repos to pull the file updates and then go through the PR process. If I had the time, I’d invest in going the mergify route and make sure that the automated PRs that I put up when I do updates across all repos would merge automagically when CI passes.
For sure, it’s the entirety of the workflow that I’m looking for… Pull from one central repo with the templates, for every managed repo compare the current contents of the default branch, if different then branch, update, commit, and open a pr…
The github action I linked is actually pretty close, though it doesn’t look like it supports file templating… I’ve also considered using terraform, with the resource github_repository_file, which can do the diff and the templating but doesn’t feel like it can handle the workflow of checking the files in the default branch while making any changes in a (new) bug/feature branch…
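A minimal sketch of that compare-branch-commit-PR workflow, assuming PyGithub and a template already rendered to disk; the token, repo names, and paths are placeholders:

```python
from github import Github  # PyGithub

TOKEN = "<token>"                     # placeholder
REPOS = ["org/repo-a", "org/repo-b"]  # hypothetical managed repos
PATH = ".editorconfig"

with open(f"rendered/{PATH}") as f:   # output of the templating step
    desired = f.read()

gh = Github(TOKEN)
for name in REPOS:
    repo = gh.get_repo(name)
    current = repo.get_contents(PATH, ref=repo.default_branch)
    if current.decoded_content.decode() == desired:
        continue  # default branch already in sync
    branch = "chore/file-sync"
    base_sha = repo.get_branch(repo.default_branch).commit.sha
    repo.create_git_ref(f"refs/heads/{branch}", base_sha)
    repo.update_file(PATH, "chore: sync shared files", desired,
                     current.sha, branch=branch)
    repo.create_pull(title="chore: sync shared files",
                     body="Automated sync from the template repo",
                     head=branch, base=repo.default_branch)
```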
I feel like the file templating is a must.
I’ve had rough going with the GH provider, and while I love Terraform obviously… using that repo_file resource always seemed like a bad idea to me. That’s without much research though, so if you go down that path I’d be interested in your results.
using the repo_file resource is definitely a little rough… it fails completely when branch protection is enabled, and also seems to have some api restrictions that i do not understand… for example, it currently cannot manage files under the .github/workflows path… https://github.com/terraform-providers/terraform-provider-github/issues/633
went ahead and made a feature request for file templating on that github action… we’ll see where that goes… https://github.com/kbrashears5/github-action-file-sync/issues/7
hmm, gomplate may be close to doing this all on its own… found a ticket opened by @Erik Osterman (Cloud Posse) for the feature i could abuse to make this work… https://github.com/hairyhenderson/gomplate/issues/589
linked from there to another gomplate issue maybe indicating the datasource feature could be abused to retrieve the templates also… https://github.com/hairyhenderson/gomplate/issues/963#issuecomment-710559472
Yes I would like to use gomplate for the same purpose as you mention
There is also https://github.com/vmware-tanzu/carvel-vendir (an easy way to declaratively vendor portions of git repos, github releases, helm charts, docker image contents, etc.)
But no templating
Can’t wait for https://github.com/github/roadmap/issues/98 (centrally managed workflows across multiple repositories in your organization)
@hairyhenderson any updates on https://github.com/hairyhenderson/gomplate/issues/589
oh hi
@Erik Osterman (Cloud Posse) sort of? there are a few new features that help with that, and I’ve been working on refactoring some stuff to help with how data/templates/etc are read in general (so you can read input templates from any URL, etc…). But it’s been slow-going… I’ve been working on switching jobs (starting new job on Monday) so I haven’t been able to dedicate much time to gomplate unfortunately
Congrats on the job transition! Yea, understood. Appreciate the update!
tx
one thing that’s bogging me down too is the changes I need to make are pretty major, and I may need to break API to do it… I don’t want to release gomplate 4.0 yet, but I may need to… we’ll see
We have more and more use-cases for generators/scaffolding (e.g. github actions, terraform modules) and I really want something simple like boilr (abandoned) that’s distributed as a single binary.
oh, just came across copier, which seems built for exactly this repo templating and updating use case… https://copier.readthedocs.io/en/stable/
Ah cool - also seems like an alternative to cookiecutter
I am considering introducing a tool like cookiecutter - leaning towards it because of critical mass. It would be a done deal if it were a standalone binary
copier actually has a comparison page, includes cookiecutter, reads pretty fair to me, https://copier.readthedocs.io/en/stable/comparisons/
What led you to copier? Are you seeking a solution for this right now? @loren
(Sorry I lost context and got confused by another thread)
…for templating GitHub actions and centralizing workflows
A few days ago we released the version we are using now for #codefresh
Our Library of GitHub Actions: https://github.com/cloudposse/actions
Uses gomplate :-)
the idea is still a little undefined… but something like that, yeah. maintain a central “template” repo. use it to create new repos. and use it to periodically “refresh” contents in other repos. the pattern in that pipeline-creator comes pretty close i think… github action in each project that clones the template repo, runs gomplate to template files. i would like to also open a pr if there is a changeset… copier supports git sources and templating, and has options to control whether to overwrite any given file.
Ya, for Codefresh it was a little bit easier because we didn’t need to commit the files back anywhere. We just deploy the pipeline manifests via the API.