#geodesic (2018-08)
Discussions related to https://github.com/cloudposse/geodesic
Archive: https://archive.sweetops.com/geodesic/
2018-08-01
2018-08-02
damn, I spun up an AWS Workspace for Windows 10 not realizing it doesn’t support Hyper-V, so no Docker. =(
heh, and no “Windows Server 1709” support yet for “AWS Workspaces”, so no WSL even.
=(
@Sebastian Nemeth i wanted to try this to test Geodesic on WSL/Docker
@rohit.verma going back to your question of simplifying IDE integration with geodesic
does running `mount --bind /localhost/Dev/cloudposse/terraform-root-modules/ /conf` inside of geodesic make things any better for you?
I’ve been using this today and it really has helped for this specific use-case
For example, I’m working on some kops automation. So I run:
mount --bind /localhost/Dev/cloudposse/terraform-root-modules/aws/kops /conf/kops
(replace Dev/cloudposse/terraform-root-modules/ with the path to your root-modules folder)
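A rough sketch of that bind-mount workflow (the paths are examples; substitute your own checkout under /localhost):
```
# Sketch only: overlay a host checkout onto the path geodesic reads from.
# SRC/DST are example paths, not defaults shipped with geodesic.
SRC=/localhost/Dev/cloudposse/terraform-root-modules/aws/kops
DST=/conf/kops

mkdir -p "$DST"
mount --bind "$SRC" "$DST"   # edits on the host are now visible at /conf/kops

# ...iterate with terraform inside geodesic...

umount "$DST"                # remove the overlay when finished
```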
I’ve started getting these errors. I suspect it could be related to bash not responding to kill -WINCH $$ within some kind of deadline
That’s crazy talk. POSIX signals were designed to be deadlock free. I’ve never heard of a signal deadline. Have you?
I don’t know the internals well enough anymore. Not strictly speaking of POSIX signals, though I have experienced something happening to a process that didn’t acknowledge a signal, but I confess it’s been 15 years since I was that low level and I’m probably entirely off.
I just don’t know what to attribute this behavior to
I think what I might be thinking of (confusing it with) is where signal handling has been used for heartbeating a process. If it doesn’t respond, then the process is reaped.
@tamsky Have you seen this before?
i started getting the errors too (after some period of inactivity in geodesic)
I’ve noticed strange behavior after long period of inactivity but was assuming it had to do with my aws-vault session expiring
what: Only call kill -WINCH when dimensions of screen change. why: Theory is that it contributes to this error… “I suspect it could be related to bash not responding to kill -WINCH” within s…
2018-08-03
2018-08-04
[2] Stopped indicates the process received a SIGTSTP
I should smile along with my crazy talk.
https://github.com/cloudposse/geodesic/pull/210#pullrequestreview-143392008
so in marked contrast to my bash shell’s default shopt, geodesic does have checkwinsize on…
so unless that setting changed only recently – I think that the shopt should have been solving this problem all along.
And I’m thinking: “where have you been all my life, checkwinsize?” Not having it forced me to learn the kill -WINCH trick in the meantime.
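For anyone following along, a minimal sketch of the two approaches being discussed (assumes an interactive bash shell; nothing geodesic-specific):
```
# Minimal sketch, interactive bash assumed.
shopt checkwinsize      # show whether the option is currently enabled
shopt -s checkwinsize   # when set, bash re-reads the terminal size after
                        # each external command and updates LINES/COLUMNS

# manual workaround when the resize signal never reaches this shell:
kill -WINCH $$          # tell bash the window dimensions changed
```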
The checkwinsize wasn’t working for me, but maybe that was masked by other problems
For example the ones you previously fixed
maybe docker run doesn’t pass WINCH to subshells?
because the terminal’s window change signal needs to propagate down this entire chain: (iterm/xterm) -> local shell -> docker run -> geodesic shell
Oh fascinating. Didn’t know that either
I guess it does propagate… here’s one where the signal was killing apache if docker run -it was used:
https://github.com/docker-library/php/issues/64
When making a derived image from php:5.6-apache, when the server starts, at the slightest movement it stops, and gives me this message: [Wed Jan 21 2005.736731 2015] [mpm_prefork:notice] [pid 1…
Regarding the other problem, with sigstop it seems to happen when AWS session expires. Probably related to aws-vault usage.
yes, that sounds like aws-vault trying to read/write stdin/out and being blocked
Yes, I think it is related to stdin
I just scanned their issue queue and didn’t see anything related.
next time this happens to someone, can they please run and report back:
pstree -p ; for i in $(jobs -p) ; do echo $p ; ls -l /proc/$i/fd ; done;
as well as the [N] Stopped aws-vault exec ...
Will do!
Man… can’t say how much I appreciate your insights. In the short time since you’ve joined, I’ve learned a lot of little tricks from you.
@tamsky this is the output.
pid 8 is aws-vault in server mode (mock metadata api)
2018-08-06
doh, pstree -p ; jobs -l; for i in $(jobs -p) ; do echo $i ; ls -l /proc/$i/fd ; done;
$p was in my example – should have been $i
oops, I missed that too.
will try again next time.
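For readability, the same diagnostic as the corrected one-liner above, just split out and annotated (no change to what it runs):
```
# Same commands as the one-liner, annotated.
pstree -p                 # process tree inside the geodesic shell
jobs -l                   # stopped/background jobs with their PIDs
for i in $(jobs -p); do
  echo "$i"
  ls -l /proc/$i/fd       # file descriptors the stopped job still holds open
done
```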
2018-08-07
@Sebastian Nemeth just cut 0.13.0, which adds WSL (Windows) support
Can you give that a shot?
Will do!
Hey man - so this still isn’t solving the problem convincingly, I think… There’s just one problem I can see…
The problem is that it’s common for WSL users to change the mount path for their local drives from /mnt/c to just /c (for example), which makes a lot of things easier for us. But this causes the new geodesic to fail with:
/usr/local/bin/root.potato.com: line 91: /mnt/c/Windows/System32/cmd.exe: No such file or directory
E.g. line here: https://github.com/cloudposse/geodesic/commit/a096ddf28314f0d7c9423f61b8516853663b4d24#diff-499f40d14b68a5dc159a3d3ebc5c4870R91
Looks like it’s looking for cmd.exe under /mnt - however, cmd.exe is something that should always be in PATH in WSL, so it might be fine to omit the path and just use cmd.exe everywhere?
- Add user environment preserving * fix(*): add env variables for changing $HOME varaible(for wsl) * fix(wrapper-on-wsl): Now windows and linux usernames get dynamically * refactor(wrapper…
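If that assumption holds (WSL’s default behaviour of appending the Windows directories to PATH), a sketch of the PATH-based lookup might look like this; the fallback path below is just the stock default, not anything geodesic ships today:
```
# Sketch only: prefer whatever cmd.exe is on PATH, fall back to the stock
# WSL location. Assumes appendWindowsPath has not been disabled.
CMD_EXE="$(command -v cmd.exe || true)"
CMD_EXE="${CMD_EXE:-/mnt/c/Windows/System32/cmd.exe}"

"$CMD_EXE" /c "echo %USERPROFILE%"   # example use: ask Windows for the user profile path
```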
The location of the mounted drives can be obtained from /etc/wsl.conf under [automount] > root.
https://docs.microsoft.com/en-us/windows/wsl/wsl-config#set-wsl-launch-settings
Reference listing and configuring multiple Linux distributions running on the Windows Subsystem for Linux.
@Erik Osterman (Cloud Posse)
Added a PR here: https://github.com/cloudposse/geodesic/pull/214
It uses a regex to look up the correct root mount path from wsl.conf. I tested the script on my system, and it works - however I wasn’t able to test the whole build.
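Not the exact script from that PR, but a sketch of the idea, assuming the [automount] root key is the only "root =" line in /etc/wsl.conf:
```
# Sketch: read the [automount] root setting, defaulting to the stock /mnt/.
# A stricter parser would scope the match to the [automount] section.
WSL_MOUNT_ROOT="$(awk -F'=' '/^[[:space:]]*root[[:space:]]*=/ { gsub(/[ \t]/, "", $2); print $2 }' /etc/wsl.conf 2>/dev/null)"
WSL_MOUNT_ROOT="${WSL_MOUNT_ROOT:-/mnt/}"

echo "Windows drives are mounted under: ${WSL_MOUNT_ROOT}"
```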
2018-08-08
2018-08-09
2018-08-15
@Dylan has joined the channel
2018-08-17
Lots of great UX fixes were merged today
- check for duplicate syslog-ng
- fancier banner
- prompt line-wrapping
- ^C ssh-agent no longer aborts subsequent scripts
2018-08-19
some discussion right now in #release-engineering related to #geodesic
2018-08-21
@tarrall has joined the channel
2018-08-22
@Adam has joined the channel
OK carrying on from #announcements
First up… we’ve followed the cold-start process and now have root.example.com, prod.example.com etc. repos, accounts stood up, and a k8s cluster up in prod. Which is cool, but the current Dockerfile is basically an intermingling of configuration and code, making it awkward to update it to track the “upstream” versions (e.g. prod.cloudposse.co/Dockerfile). Are there plans to extract the configuration into a separate file in order to make the existing repo more usable long-term, or is this intended more as a “here’s an example of how you might glue this all together” repo rather than a tool you’d use directly?
And yeah I’m picking up where Jonathan left off, or at least trying to
> the current Dockerfile is basically an intermingling of configuration and code, making updating it to track the “upstream” versions awkward
so there is a lot of versioning going on
(binaries, images, charts, helmfiles, modules)
which versions are you referring to?
hi @tarrall, welcome
maybe I asked this the wrong way, let me rephrase. When I got here, they had a copy of prod.cloudposse.co that was obviously from several commits ago, and things didn’t seem quite right, so I figured hey let me check out the latest version before I try to troubleshoot too far.
However our Dockerfile (and yours) has lines like this:
ENV TF_VAR_account_id="12345"
ENV TF_VAR_namespace="example"
ENV TF_VAR_stage="prod"
ENV TF_VAR_domain_name="prod.example.com"
ENV TF_VAR_zone_name="prod.example.com."
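(Context on why those lines tie the image to one environment: Terraform treats any TF_VAR_<name> environment variable as the value of the input variable <name>. A trivial illustration with made-up values:)
```
# Illustration only, made-up values: baking TF_VAR_* into the image means
# terraform inside that image is hard-wired to one namespace/stage/account.
export TF_VAR_namespace="example"
export TF_VAR_stage="prod"
export TF_VAR_domain_name="prod.example.com"

terraform plan   # picks up namespace/stage/domain_name without any -var flags
```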
aha, gotcha!
yes, so the reference architectures IMO are designed to be hardforked
which means I need to examine your Dockerfile, copy the relevant changes into ours, instead of having a portable Dockerfile with that stuff elsewhere & pulled in
I really don’t think it makes sense to try to use ours verbatim
OK gotcha
they are an example of how to use all of our tools
it’s an example of what we do and how we do it for our customers
additionally, terraform-root-modules is also more of a set of highly functional examples
you can definitely reference them and use them, but what you run in AWS will be different from what other people run. they are basically examples for how to invoke all our terraform modules. . . a demonstration of how we use it.
and then question #2 is … let’s say I want to slap an RDS instance in here — maybe RDS postgres, maybe Aurora. I see cloudposse/terraform-aws-rds-cluster and that’s likely where I’d start. What would the recommended approach be here? I could copypasta that into prod.example.com/rds-cluster and then have the Dockerfile pull that in, but that’s kinda no bueno — if my stage and prod envs both end up with RDS clusters, I should be sharing the same code for both.
ok, good question
so there are a few concepts here. let me try to explain.
- geodesic is our base image. that distributes the tools. so that’s our “opinionated” toolchain.
- then there are “geodesic modules” which are basically those reference architectures. those implement some architecture using the tool chain.
- then there are terraform-root-modules. those are basically a collection of patterns. usually those patterns are highly specific to your organization. for example, you would have a way of defining the infrastructure for your “API service”
this is basically our “MVC” of infrastructure.
let’s take your example.
You want to add an RDS cluster. How you do this is specific to your organization. You may choose postgres or mysql. You have some opinions on the parameter groups. You have some requirements for security groups, etc.
You add that to terraform-root-modules. The root modules have no “identity”.
root modules are versioned. also, we like to build a container for the root modules so we can easily copy that stuff around between images.
Now, to invoke that RDS database, you pull that into the prod.example.com image. this achieves many things:
- it’s super DRY
- it’s versioned infrastructure
- separation of concerns. the prod.example.com repo defines all parameters to run in that environment. this is basically the “identity” layer.
yup I like the versioning approach there — saves drama around your TF module’s interface changing
then there’s the question: how do we develop?
i mean, if you have to push changes to terraform-root-modules, rebuild the image, then rebuild your current account repo (staging.example.com) every time you make a change, you’ll NEVER get done.
for this reason, when we develop, we cd /localhost/path/to/my/root-modules/ and do all iteration there until we achieve the desired outcome
then commit/push that, open a PR against master in terraform-root-modules, merge that after approval, tag a release, and subsequently distribute that release across the various stages as needed.
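A sketch of that iteration loop as it looks from inside a geodesic shell (paths and module names here are examples, not a prescribed layout):
```
# Example paths/modules only.
cd /localhost/Dev/acme/terraform-root-modules/aws/backing-services

init-terraform     # geodesic helper that wires up the remote state backend
terraform plan
terraform apply

# once it works: commit, push, PR against terraform-root-modules, tag a
# release, then bump that tag in staging.example.com / prod.example.com.
```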
also, if you use dependabot.com it’s pretty cool - you can get these updates as PRs automatically
@tarrall we already have your example as an example https://github.com/cloudposse/terraform-root-modules/blob/master/aws/backing-services/aurora-postgres.tf
terraform-root-modules - Collection of Terraform root module invocations for provisioning reference architectures
(we’re also working on CI/CD of everything - but it will be a bit before that’s fully baked)
and here’s how we pull it:
prod.cloudposse.co - Example Terraform/Kubernetes Reference Infrastructure for Cloud Posse Production Organization in AWS
staging.cloudposse.co - Example Terraform Reference Architecture for Geodesic Module Staging Organization in AWS.
OK makes sense. I’m inclined here, I think, to have our own root modules container which is separate from yours, and develop in there…
exactly. that’s what I would recommend.
also, we have our https://github.com/cloudposse/helmfiles
helmfiles - Comprehensive Distribution of Helmfiles. Works with helmfile.d
@Andriy Knysh (Cloud Posse) thanks! Somehow all my google-fu was able to turn up was your TF module, not the one that was in root-modules
You might like to fork these too
(also, we welcome all PRs - so if you find/fix bugs, develop cool new things, would love to see it)
I’m waaaaaaay too much of a k8s n00b to want to fork someone’s “young” repo and develop on that. Good odds y’all will find and fix bugs & improve workflow faster than I can, which means I’m better off being able to follow your repo rather than blazing my own trail
so yeah PRs fersure, once I’ve got a vague clue of what I’m doing
haha, sounds good! you’re on a fast track.
BTW one other minor thing — I think AWS well-architected (or whatever they’re calling it these days) normally recommends a separate “identities” account where the humans are managed, rather than managing those out of the master account. At least, that’s what I did at my last place, and I kinda liked that because IMO the master account should be locked down hard. Might be something to consider adding to the reference architecture, though I realize it may be overkill for many places.
I think what we call root.example.com is that
(though I think we should rename ours to identity.example.com)
Aaaaah. Yeah to renaming, because I think we ended up with root.example.com == master. Not positive.
hrmmmm ok, so I need to re-review the well-architected doc
Heh I was at that re:Invent talk…
one of only like 3-4 I managed to make last year
Perhaps we have something we should rethink there. I’ve always treated root = identity = master
but deploy nothing other than identity in it
then prod, staging, audit (~security), dev, testing accounts
which share nothing
and identity delegates to those accounts
i need to adjust my mental model for how it would look if master != identity
so is master just billing?
yeah on identity delegating, fersure. I like having the “master account” (payer account, and where the service control policies are defined) separate from the “identities account” (where humans are defined)
master is root/Org
identity is our current root.cloudposse.co
ok cool so mostly just I got confused by the naming convention
something like that. yea, naming is hard.
2 hard things in CS, right? Naming things, cache invalidation, and counting
we’ve had a lot of discussion around this internally (fwiw) - we know we need to change root to something, or to rename terraform-root-modules (which has no relationship to it)
i’m inclined to rename root.cloudposse.co to identity.cloudposse.co and introduce a new master.cloudposse.co or billing.cloudposse.co
so we do DNS zone delegation from identity as well
where would that belong?
yea, there we have two diff hierarchies: AWS and DNS
Re DNS, oh man, that’s one of those that’s gonna be super company dependent right? I mean, for some places using Route53 maybe they do a zone per account, other places maybe have a single shared zone with cross-account access…
root of DNS is cloudposse.co
> I mean, for some places using Route53 maybe they do a zone per account, other places maybe have a single shared zone with cross-account access…
=(
LOLOLOL but when you migrated to the cloud LOOONG before AWS had orgs…
haha - yea, we’re probably going remain very strict about “share nothing”
you might happen to be proud of having almost everything migrated out of Classic
(we even recommend using different TLDs per account in some situations)
Yup, reasonable fersure. I’ve always liked at least separating “customer-facing” TLD from “internal ops” TLD
example.com / example.net kinda thing
yes - that’s a must
we call the customer facing one the “vanity domain”
so we provision the root DNS zone (e.g. cloudposse.co) in master.cloudposse.co?
An alternative there might be to have an account dedicated to “shared infrastructure.” I can’t decide if it is just massive overkill to have that as a separate account from the “identities” account or not… identities is certainly an instance of “shared infra”.
yes that’s possible
what we’ve found out after many iterations is that there is no perfect solution for this
Yup
you touch/change something in one place, you get a lot of issues in other places
And k8s is new enough that I’m confident that in a year or two, the “best practices” there today will be a laughingstock. This just based on my past experience with seeing workflows mature on Chef and Terraform…
that’s actually one of the main reasons we created https://github.com/cloudposse/terraform-root-modules and https://github.com/cloudposse/helmfiles - to introduce some patterns for TF and k8s
terraform-root-modules - Collection of Terraform root module invocations for provisioning reference architectures
they are not perfect, but at least we have the same structure between projects and consistent naming (which is hard)
2018-08-23
Latest “probably a n00b mistake I’m making” issue — init-terraform in terraform-root-modules/aws/ecr is erroring…
will cut/paste the error here in a sec
✓ (flowtune-prod-admin) ecr ⨠ init-terraform
Mounted buckets
Filesystem Mounted on
flowtune-prod-terraform-state /secrets/tf
Initializing modules...
- module.kops_ecr_app
  Getting source "git::https://github.com/cloudposse/terraform-aws-kops-ecr.git?ref=tags/0.1.0"
- module.kops_ecr_user
  Getting source "git::https://github.com/cloudposse/terraform-aws-iam-system-user.git?ref=tags/0.3.0"
- module.kops_ecr_app.label
  Getting source "git::https://github.com/cloudposse/terraform-null-label.git?ref=tags/0.3.3"
- module.kops_ecr_app.kops_metadata
  Getting source "git::https://github.com/cloudposse/terraform-aws-kops-metadata.git?ref=tags/0.1.1"
- module.kops_ecr_app.kops_ecr
  Getting source "git::https://github.com/cloudposse/terraform-aws-ecr.git?ref=tags/0.2.6"
- module.kops_ecr_app.kops_ecr.label
  Getting source "git::https://github.com/cloudposse/terraform-null-label.git?ref=tags/0.3.1"
- module.kops_ecr_user.label
  Getting source "git::https://github.com/cloudposse/terraform-null-label.git?ref=tags/0.3.1"
Initializing the backend...
Successfully configured the backend "s3"! Terraform will automatically
use this backend unless the backend configuration changes.
Error: output 'registry_url': "repository_url" is not a valid output for module "kops_ecr"
Error: output 'repository_name': "name" is not a valid output for module "kops_ecr"
Error: output 'kops_ecr_app_registry_url': "repository_url" is not a valid output for module "kops_ecr_app"
Error: output 'kops_ecr_app_repository_name': "name" is not a valid output for module "kops_ecr_app"
this is with terraform-root-modules:0.5.3 and cloudposse/geodesic:0.16.0
@tarrall I think this is fixed in an upcoming PR
What: Disabled default ecr. That can be a BREAKING CHANGE for some projects that use the default ecr. Why: The default ecr does not make sense for custom projects that need names for ecr.
yeah I was thinking that was likely, despite the misleading title
yea, the PR should be updated
code review changed the nature of the PR
btw, definitely recommend forking or creating your own root modules sooner rather than later
are you guys on codefresh?
yeah
not on codefresh no
I updated the PR description
repository_name was renamed (easy fix on your side)
and yeah time to fork. Kinda in a “chicken and egg” situation where I’m just starting to set up services — no build server yet, we have bitbucket for code (shockingly bad but this should surprise no one, it’s atlassian after all), etc. All of this “build and publish a Dockerfile” workflow would be easier if I wasn’t in the middle of trying to set up ECR to … publish Dockerfiles
yea, coldstart problems..
I’m just glad I already have a few years of experience in arguing with Terraform, or I’d probably be outside yelling at the cloud
(you can possibly use dockerhub automated builds)
since I’m comfortable with vanilla terraform, including magic tricks like data sources / finding stuff by tags, I’m gonna just do a combination of “copy and modify” on your code and just rolling my own from scratch to get this going. Some experience writing Dockerfiles but less with the day-to-day workflow stuff like dockerhub, compose etc
2018-08-24
@Erik Osterman (Cloud Posse) Do you know if there are any known issues with upgrading the nginx ingress image using the cloudposse/nginx-ingress chart? Just curious, I have an edge case scenario that results in a (known) race condition issue in 0.11
So it should be relatively straight forward
I think we recently added a helmfile for the official ingress
basically, our ingress gives you the fancy 404/500 pages
but I can see why you’d want to move to the official one
(we started ours before there was an official one)
right right
helmfiles - Comprehensive Distribution of Helmfiles. Works with helmfile.d
so long as the ingress class is the same, it should be a drop-in replacement
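A quick, rough way to double-check that before switching charts (annotation names can vary by controller version, so this is just a grep, not a definitive test):
```
# Rough check: list which ingress class existing Ingress resources declare.
kubectl get ingress --all-namespaces -o yaml | grep -i 'ingress.class'
```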
but definitely test in staging first!
HAH
who has time for that!
the stable ingress comes with prometheus exporters for monitoring
so it integrates nicely with grafana
i don’t mind being on the CP chart (we use the error pages), was moreso curious if you knew anyone using say… 0.13 nginx ingress
not yet =/ - as in i don’t know if anyone has tried upgrading
okay cool, no worries!
btw, what did you guys decide to do for monitoring?
still auditioning companies, i was in Ireland for two weeks in July so the monitoring got re-prioritized for.. next week
ok
we’ve gotten grafana working with the autodiscovery of dashboards in configmaps
it’s so freggin sweet
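For anyone curious what that looks like in practice, here is a sketch assuming the sidecar-style autodiscovery pattern, where Grafana picks up ConfigMaps carrying a well-known label (the exact label name is chart configuration; the name, namespace, and grafana_dashboard=1 below are just examples):
```
# Example only: dashboard name, namespace, and label are assumptions.
kubectl -n monitoring create configmap my-service-dashboard \
  --from-file=my-service.json=./dashboards/my-service.json
kubectl -n monitoring label configmap my-service-dashboard grafana_dashboard=1
```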
we’ve got the portal up and rocking in both stage and prod, it’s soooo nice
i think that alone is a BIG motivator to use it
maybe not for everything, but definitely as a first line of monitoring
right right, i dig it!
have you tried using the portal with argo yet?
should be really easy
not yet, we actually found our first use case for argo last week
so that should be coming along… swiftly
what: Add cloudflare ingress controller (aka argo / access). why: Expose services inside kubernetes securely and speedily using Argo tunnels
as per usual, saving me time!
it’s slightly out of date
also, re: race condition, will put some deets in #kubernetes for other folks?
yea, that would be cool
or even open up an issue in our helmfiles or charts repos
sounds good
will do that in a bit
2018-08-27
Here’s the slide I’ve been looking for that came from 2017 re:Invent talk on architecting security and governance across a multi-account strategy (SID331)
It’s very close to our reference architectures.
The “Enterprise Accounts” are more decomposed