#geodesic (2019-10)
Discussions related to https://github.com/cloudposse/geodesic
Archive: https://archive.sweetops.com/geodesic/
2019-10-02
@here public #office-hours starting now! join us to talk shop https://zoom.us/j/508587304
@Jeremy G (Cloud Posse) Eric suggested that I speak with you regarding my current issue with the reference-architecture, as you had a similar challenge. Here is a Gist of it - https://gist.github.com/dalekurt/7c451ba3914f066bf16b42392904aca1
@dalekurt I need more context to really understand what is going on here, but my guess is that generally the reference architecture scripts still require Terraform v0.11 but the `make root` scripts are using the default `terraform`, which is now v0.12. @Erik Osterman (Cloud Posse) when Terraform is being bootstrapped (initial S3 buckets created and initialized), where are the scripts getting the `terraform` binary from?
@Jeremy G (Cloud Posse) If I recall correctly, I have terraform v0.11 installed (at home). I did perform a `git pull` recently. The `make root` and `make children` executed successfully but `make finalize` failed with the error in the gist.
Those are executed in the Docker container and not my local installation of terraform, from my understanding.
@dalekurt Geodesic has both versions of terraform (0.11 and 0.12) installed so we can use both during the current transition period, but defaults to 0.12. We use `direnv` to read `.envrc` files in directories, and in those `.envrc` files we (should) have either `use terraform 0.11` or `use terraform 0.12` to pick which version to use. That mechanism works fine, we use it in all our current projects, but we use `reference-architecture` only rarely and it is therefore often broken around issues like this. My guess is that the bootstrap code is missing a `use terraform 0.11` somewhere.
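For illustration, a pinned `.envrc` is what selects the Terraform version per project directory; a minimal sketch using the direnv helpers Geodesic ships (the real reference-architecture files may contain more):
use terraform 0.11   # select the Terraform 0.11 binary bundled in Geodesic
use tfenv            # export terraform settings from the environment (as seen later in this archive)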
@dalekurt Looking into it some more, I can see a bunch of things that could have gone wrong. I would check which S3 buckets were created for the Terraform state files (just check `root` and one child account, like `audit`), if they were populated with state during `make root` and `make children`, and what name the Terraform module thinks the S3 bucket should have when it is erroring out. My current guess is that the buckets are being created correctly but the IAM role is lacking permissions. We recently started locking down S3 buckets by default, and ref-arch might have been relying on them being open until everything was finalized. If the bucket exists but you are getting `NoSuchBucket` errors, that is a permissions problem. If the bucket really does not exist, either you missed an earlier error where the creation failed, or the name of the bucket that was created does not match the name of the bucket TF is attempting to read, which can happen when we update one thing but forget to update another, or are using incompatible versions.
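One quick way to tell those two cases apart is a head-bucket call (a sketch; substitute the real bucket name and your assumed-role profile): a 403 means the bucket exists but you lack permission, a 404 means it really does not exist.
aws s3api head-bucket --bucket <namespace>-audit-terraform-state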
@Jeremy G (Cloud Posse) A follow up to this: I dumped the previously created sub accounts and cleaned up the root. With the latest commits I started over and the result was the same. The S3 bucket was created, in my case `lunarops-io-security-terraform-state`. Is there a fix or workaround for this?
@Erik Osterman (Cloud Posse) As I look at ref-arch now, I see that while the README lists `security` and `identity` as accounts that can be enabled, there is no `/config/security.tfvars` or `/config/identity.tfvars` file. What do you want to do about that?
@dalekurt Despite them being listed in the README, we do not have working support for the `security` or `identity` accounts. I suggest you stick to the list of accounts enabled by default: https://github.com/cloudposse/reference-architectures/blob/62f20f5bf365944f54e1bbd20a85993ffbae24f6/configs/root.tfvars#L78-L85
Jeremy is correct, we didn't finish implementing those 2 accounts
The idea is to offload what we have in root to those 2
Yes, that was explained to me. However it is not the only account that is affected. I just picked that from the list.
@dalekurt Also, it looks like you set `namespace = "lunarops-io"`. It should be a single word, and we try to keep it to 4-6 letters. Maybe just `lunar` or `rops`?
The other sub accounts are also affected by this error, as previously shared in the gist. Dev, staging, etc…
Ah ok. Could that be an issue?
Is the audit bucket named `lunarops-io-audit-terraform-state` or `lunarops-audit-terraform-state`?
It is the first.
Let me type that out properly: the audit bucket is named `lunarops-io-audit-terraform-state`.
Would it result in a breaking change if I changed the namespace?
Yes, changing the namespace breaks everything.
We use naming conventions all over the place.
Ok, does it require manual clean up, or would the tf state handle that?
I'm early in this deployment as it is, so I can handle a breaking change. I'm just hoping I don't have to do the clean up.
And the max the namespace should be is 6 characters?
The expected name of the terraform state bucket will change when you change the namespace, so terraform will not be able to automatically clean up. I think the best you can do is terraform destroy in the root account to do some of the clean up.
Do that before changing the namespace.
The length of the namespace is not strictly limited, but it is part of every identifier and the total length of some identifiers is limited, so it's best to keep it short.
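For context, the namespace is concatenated into nearly every resource name by the label convention, roughly like this (a sketch; the exact format comes from the Cloud Posse label module used by the reference architecture):
locals {
  tfstate_bucket = "${var.namespace}-${var.stage}-terraform-state"   # e.g. lunarops-audit-terraform-state
}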
Okay, can the clean up (`terraform destroy`) be executed from within the container that did the deployment? Or within the project directory?
Actually, now that I think of it, most of the terraform state of the bootstrap code is stored in the root directory of the checked-out repo. How many `*.tfstate` files do you have there?
Unfortunately, the whole setup is a bit complicated. We run some stuff on your host to get things started, and that includes creating stuff with terraform with the state in local `*.tfstate` files, then building a Docker container, then we run more terraform inside the Docker container. You probably didn't get that far, though, because the Docker container would try to use the S3 bucket for the Terraform state.
Worst comes to worst, you can use https://github.com/rebuy-de/aws-nuke to delete everything.
I had to use that myself when I created a set of accounts with the wrong `namespace`.
@Jeremy G (Cloud Posse) I have 10 `*.tfstate` files.
I’ve heard of aws-nuke, in fact from @Erik Osterman (Cloud Posse) on Office hours.
@Jeremy G (Cloud Posse) if I choose the nuclear option, should I perform that on each account or just the root?
I'm thinking your errors happened very early on in `make children`, is that right?
That is my assumption as well. It is shown as an error during `make finalize`, but if it has a dependency on the S3 buckets which are created in `make children`, then you are correct.
Wait, you ran `make children` and it exited successfully?
Yeah
That is going to be difficult to recover from. You cannot just delete the AWS accounts, and we do not have the mechanism (as far as I know) to tell the scripts that they have already been created. @Erik Osterman (Cloud Posse) what do you suggest?
To be accurate, there was no success message, just goodbye:
chamber_secret_access_key =
chamber_user_arn =
chamber_user_name =
chamber_user_unique_id =
secret_access_key = <sensitive>
user_arn =
user_name =
user_unique_id =
Skipping cloudtrail...
Goodbye
Whereas with the `make root` I got:
* provider.aws: version = "~> 2.31"
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
Goodbye
To be safe, I would make a backup copy of your entire repo (which will include a lot of generated artifacts by now, including the 10 *.tfstate files). Some of the artifacts you are going to need, particularly the list of account numbers, but could be erased if you are not careful.
Sounds like some DevOps surgery
I’ve made the backup
Yes, sorry, you are in a bad state. I think you are going to need to have @Erik Osterman (Cloud Posse) help you during office hours. You hit this failure where it is too late to start over but we suspect the problem is the dash in the namespace, and changing the namespace does break an awful lot of stuff.
No worries. I'm wondering if I should just do the nuke on the root and sub accounts, and re-run the `make root` and `make children`.
My hope is that you can get by with changing the config files and then resuming from `make children`, but I'm not sure.
The problem is that if you run `make root` from scratch, it will create another set of AWS accounts, and you really want to avoid that.
gotcha
Thank you for the time, I really appreciate it..
If you are feeling daring, I can suggest you edit /configs/root.tfvars to be what you want it to be (maybe post it or a redacted version of it), and then run `make root/init-resume`.
What I'm counting on here is that the names of the accounts do not change, and `init-resume` is specifically for picking up after errors, and it avoids cleaning up old state.
What I’m not sure of is whether or not it will really create new AWS accounts or not. It should not, but this is not well tested code.
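A redacted configs/root.tfvars along these lines is the file to edit (only namespace and accounts_enabled are confirmed by this thread; the other keys and comments are illustrative):
# configs/root.tfvars (redacted sketch)
namespace        = "lunarops"                               # single word, keep it short
accounts_enabled = ["dev", "staging", "prod", "audit"]      # leave security/identity out for now
# account email addresses, domain, and the rest omitted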
Thanks for the suggestion
I will tell you how I feel tomorrow
OK, I will leave you with this tip. Deleting AWS accounts is a long, painful process, because AWS does not want to be on the hook for deleting stuff that it cannot get back. One key thing is that the email address associated with the account will be _forever_ associated with that account. You will not be able to create a new account with that email address and you will not be able to change the email address later. So before you delete an account, change the email address to something you can consider a throwaway. Gmail and some other providers allow you to add “+anything” to your username to create a unique email that still routes to you, so I suggest you do that, using “+identity-del-201910” or something like that.
I wish I had known I could do that a couple of days ago before nuking the accounts. Thanks for the tip
I will deploy the changes to the namespace later today, as well as update the email addresses assigned to the existing sub-accounts. I may have to map this out on paper first so that I don't miss a step for clean up.
If the accounts are still in the waiting period for deletion, you can probably still change the email address. And if you cannot, you can stop the deletion process, change the email address, and start it again, so long as they are not already fully deleted.
I will check the emails for the cancellation
I've submitted a case to re-open the suspended accounts, then I will update the email addresses for those, including the accounts I recently created. Then have them terminated.
@Jeremy G (Cloud Posse) Once that is done, I will go ahead and re-deploy using the new namespace `lunar`, though I could get away with `lunarops`
my OCD is going to bug me
Phew I just went through all the accounts replying to the AWS case that was opened. So they should be reinstated soon enough.
@Jeremy G (Cloud Posse) I took your advice and updated the email addresses for the previously suspended AWS accounts. I will do the same for the second set of accounts I had created, then request a termination.
Once that is done I will re-deploy with the new namespace
@dalekurt before deleting the current set of accounts, I suggest you attempt some of the surgical options. If they work, that saves you deleting the accounts. If they mess up horribly, you can still resolve it by nuking and deleting the accounts.
sounds good.
@Jeremy G (Cloud Posse) Here is the plan: update the namespace to `lunarops` and re-run the `make root`.
No, don't rerun `make root`!
ok
Update the namespace in root.tfvars
Then run make root/init-resume
Ah ah.. okay let me give that a go.
I got a brief output of it.
terraform apply \
-var-file=artifacts/aws.tfvars \
-var-file=configs/root.tfvars \
-auto-approve \
-state=root.tfstate \
accounts/root
...
module.account.module.render.local_file.data[1]: Creation complete after 0s (ID: 6c6649ca2a9e8400634a22bb826e5ae0f6d01543)
module.account.module.render.local_file.data[0]: Creation complete after 0s (ID: ca981bf46f94b8366ea80bcf0f00466c62199b2e)
Apply complete! Resources: 2 added, 0 changed, 2 destroyed.
Outputs:
docker_image = cloudposse/root.lunarops.io:latest
terraform output -state=root.tfstate docker_image > artifacts/root-docker-image
make children
...
===============================================================================================
ARN for lunarops-dev-admin not found in /artifacts/.aws/config
* Please report this error here:
<https://github.com/cloudposse/reference-architectures/issues/new>
Goodbye
make: *** [dev/provision] Error 1
Not entirely relevant, but did you remove `security` and `identity` from the `accounts_enabled` list? If not, please do.
Verify that `artifacts/accounts.tfvars` has the correct account numbers
You have a backup of everything, right? In which case, delete everything under `repos/` and try `make root/init-resume` again
@Jeremy G (Cloud Posse) Thanks for that. So, I did just that. I removed `security` and `identity` from the `accounts_enabled` in `configs/root.tfvars`.
As well, I removed them from `artifacts/accounts.tfvars` and confirmed that the account IDs are correct.
Then removed everything from the `repos/` directory path and executed `make root/init-resume`, which created the `repos/root.lunarops.io` dir.
@Jeremy G (Cloud Posse) So I bit the bullet on this and re-deployed `make root`; the accounts got created successfully. `make children` and `make finalize` ran successfully as well. This was with the `lunarops` namespace. So you were right, the `lunarops.io` namespace created issues.
I’m doing clean up now by deleting the previous accounts created.
Omg! That’s awesome. You are one persistent guy. Shows it pays off.
Oh yeah. So I have a question, however: though the `identity` and `security` accounts are not supported, would there be any harm in still deploying them?
Yes, we haven’t finished supporting those
I think it should work, but they won’t do much
Here is the deal: I do want to have my users log in on the identity account, and write my policies in terraform to create the IAM Roles there.
So that I can isolate Root users to the root account and employees to the identity account.
I had it humming before with a manual deployment using the AWS Profile Extender extension, if that is what it is called.
AWS Extend Switch Roles extension for Chrome and Firefox
That sounds interesting
2019-10-03
@Jeremy G (Cloud Posse) Thank you, that was a pretty comprehensive response. I will have a look at this later today
2019-10-04
joys of windows update
dirname: missing operand
Try 'dirname --help' for more information.
/usr/local/bin/prod.xx.uk: line 115: cmd.exe: command not found
docker loads; just no aws-vault
i think the windows wsl interop died; god knows
no fstab entries; using the LOCAL_HOME="C:\Users\xx\AppData\Local\Packages\CanonicalGroupLimited.Ubuntu18.04onWindows_79rhkp1fndgsc\LocalState\rootfs\home\xxx" env set before calling it works for now though lol ohhhhhhh windows
seems related to the wsl conf; its not changed but its also not working
@chrism dang!!
yea, what a PIA
its fine; its all windows fault
is it something you think you can fix? we can review the PR
I don’t think the code needs fixing tbh
The wsl process is supposed to map all the windows drives to fstab on boot; but its not working. The bootstrap script uses the fstab to find the location
“potentially” this all magically starts to work proplerly in WSL2 as the “wsl home” is a virtual drive not just a windows folder
I just need to fix my wsl to fix it
ok, report back on what you come up with
If im torching it I might as well try wsl 2 and see if it solves all the other issues (hopefully all the hacks made to get wsl 1 working wont break wsl 2 )
i backup my user dir anyway so its just the tedium of waiting for windows to complete steps purging it all out. Not quite as good as when I could use vmware (damn hyperv) and snapshot ubuntu, but it’ll do
wsl2 still requires you to be on the ‘goat sacrifice’ branch (tech preview); reinstalled wsl and the mount issue wasnt fixed. My laptops up to the same build and is fine.
Fixed it in the end with `mount -t drvfs C: /c` in the /etc/wsl.conf
left the full path in my bashrc file though after testing; cuts down on the faff
dang, scared to reboot now… haven’t seen the mount issue yet on wsl myself…
$ cat /etc/wsl.conf
[automount]
enabled = true
options = "metadata,umask-22,fmask-11"
mountFsTab = true
$ cat /etc/fstab
LABEL=cloudimg-rootfs / ext4 defaults 0 0
C: /mnt/c drvfs rw,metadata,noatime,case=off,uid=1000,gid=1000,umask=22,fmask=11 0 0
2019-10-07
My laptops been fine; its weird af.
tbh if you just shove the fstab setting back in it works fine; It was only C that went to pot; my other shared drives worked fine.
There are no events this week
:zoom: Join us for “Office Hours” every Wednesday 11:30AM (PST, GMT-7) via Zoom.
This is an opportunity to ask us questions about geodesic
, get live demos and learn from others using it. Next one is Oct 16, 2019 11:30AM.
Register for Webinar
#office-hours (our channel)
@Erik Osterman (Cloud Posse) Where did this come from exactly? The time zone is half wrong. It should say PDT, GMT-7
2019-10-08
FYI I’ve gutted the way I use Geodesic. Instead of one module per account I now use the following structure
└── terraform
├── .envrc
└── aws
├── .envrc
├── bastion
│ ├── .envrc
│ └── Makefile
├── development
│ ├── .envrc
│ └── account
├── production
│ └── .envrc
└── shared
├── .envrc
├── .module
├── .terraform
├── Makefile
└── terraform.tfvars
If anyone is keen on reducing the maintenance of several modules and only holding one, I can share some practices
for clarification - how to use geodesic with a monorepo, right?
Yes.
Before my monorepo was:
geodesic:
- dev
- staging
- prod
- bastion
that looks good - but I would also add region if you want to future-proof yourself some more
also, I think I would do terraform/stage/cloud/region
e.g. terraform/prod/cloudflare and terraform/prod/aws/us-east-1
seems natural to bundle everything for prod together.
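Laid out as a tree, that terraform/stage/cloud/region suggestion looks roughly like this (paths are illustrative):
terraform/
├── prod/
│   ├── cloudflare/
│   └── aws/
│       └── us-east-1/
└── staging/
    └── aws/
        └── us-east-1/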
are you using github?
> if you want to future proof yourself some more
True, I was thinking about this
> are you using github?
No, this is on our internal VCS
aha
x)
i have some thoughts on the mono repo + geodesic + github actions
another thing to consider….
so terraform as a community seems to be going more the way of workspaces
Yes I have also shifted to workspaces
mainly to conform with terraform cloud
right
but I obviously have my IP Whitelist issue
but not workspaces as in `terraform workspace add`, just rather…
so then I'm thinking the stage folder makes less sense - as it doesn't help you be as DRY
├── terraform-aws-app
│ ├── README.md
│ ├── eks.tf
│ ├── environments
│ │ └── development
│ ├── terraform.tf
│ ├── variables.tf
│ ├── versions.tf
yea, something like that
Feature Request: Terraform to conditionally load a .tfvars or .tf file, based on the current workspace. Use Case: When working with infrastructure that has multiple environments (e.g. "staging")…
@sarkis shared this which piqued my interest
using `yamldecode` you can now define your own configs for environments
I’ve seen that
then load those based on the workspace name
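A rough sketch of that pattern (the file layout and key names here are made up for illustration):
# environments/<workspace>.yaml holds the per-environment settings; pick the file by workspace name
locals {
  env_config = yamldecode(file("${path.module}/environments/${terraform.workspace}.yaml"))
}
# then reference values like local.env_config["instance_type"]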
but I’ve got a super duper thing going on
it’s hard to type out
but hopefully I can make office hours tomorrow
yes!
i want to see it.
The gist is:
- One geodesic module for all envs.
- Terraform infra definitions (root modules that use child modules) have 'workspace'-like configs
- Important files like Makefiles to operate Terraform & terraform.tfvars are loaded at ‘terraform init’ run time
so this means that you don’t have to keep rebuilding the geodesic shell
it just pulls them from git at use-time meaning you always get the latest
And any variables set in Geodesic are extremely static and very unlikely to change (e.g. Account ID, VPC ID)
arguably VPC ID isn't even needed anymore as I get that from a data source; that's just a legacy variable
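For example, a VPC ID can be looked up with a data source instead of being passed in (the tag value here is hypothetical):
# look up the VPC by tag instead of hard-coding its ID
data "aws_vpc" "selected" {
  tags = {
    Name = "development"
  }
}
# reference it as data.aws_vpc.selected.id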
Could we bump https://github.com/cloudposse/packages/blob/master/vendor/terraform-0.12/VERSION to 0.12.10 ?
2019-10-09
I have a suspicion, but what is the reason behind the layers of indirection to clone the build-harness? i.e. https://github.com/cloudposse/geodesic/blob/master/Makefile#L7 => https://github.com/cloudposse/build-harness/blob/master/templates/Makefile.build-harness => https://github.com/cloudposse/build-harness/blob/master/bin/install.sh
to essentially clone the build-harness and include the targets?
With hundreds of repos we wanted a simple one-liner to import the build-harness that was future-proofed, so we didn't have to copy/paste a lot of boilerplate and try to keep it up to date.
Also note that git.io no longer supports vanity short links
If this isn’t in a public Github repo, it is going to require a token to clone the https repo, as well as already having an ssh key…
No token needed to clone public repos over https
But yes if you wanted to make it private it would require a token
Yup, wondered how much work to do this all over ssh rather than https
Guess you could do this instead of curl
ah, that maybe the magic I’m after
2019-10-10
2019-10-11
@oscar If you want to have Geodesic install the latest APKs during the build process, first run `make geodesic_apkindex.md5` to invalidate the Docker cache layer that caches the old APK indexes.
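The underlying trick (sketched here, not the exact Geodesic Dockerfile) is that copying a checksum file whose contents have changed invalidates the Docker cache for every layer after it:
# assumes the cloudposse apk repository is already configured, as it is in the Geodesic base image
COPY geodesic_apkindex.md5 /tmp/geodesic_apkindex.md5
RUN apk update && apk add --no-cache terraform@cloudposse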
Thanks!
2019-10-14
@Jeremy G (Cloud Posse) - I've updated to the latest Geodesic (0.123.1), and tried both `terraform@cloudposse` and `terraform-0.12@cloudposse` and still end up with TF v0.12.7. Any thoughts?
FROM cloudposse/geodesic:0.123.1
RUN apk add --update --no-cache \
terraform@cloudposse
No existing containers running either
N'aaww I got it. Don't worry. User error to do with changing the docker image name and not adding it to the bin, meaning `make run` was still using the old image.
I did try the `make *.md5` command though and that came up with no target found.
`make geodesic_apkindex.md5` is a safer bet, but either way you have to be in the `geodesic` root directory
As of my upgrade to 0.123.1 from 0.122.4 I'm no longer able to run `terraform init` using the TF_MODULE_CACHE setting (as per the logic in Geodesic's `terraform` script, below…)
# Starting with Terraform 0.12, you can no longer initialize a directory that contains any files (not even dot files)
# To mitigate this, define the `TF_MODULE_CACHE` variable with an empty directory.
# This directory will be used for `terraform init -from-module=...`, `terraform plan`, `terraform apply`, and `terraform destroy`
if [ -n "${TF_MODULE_CACHE}" ]; then
export TF_CLI_PLAN="${TF_MODULE_CACHE}"
export TF_CLI_APPLY="${TF_MODULE_CACHE}"
export TF_CLI_INIT="${TF_MODULE_CACHE}"
export TF_CLI_DESTROY="${TF_MODULE_CACHE}"
fi
}
The only thing that changed is the package version of terraform.
I’ve also reviewed terraform releases and can’t see what could have done this between 0.12.4 and 0.12.10
TF_VAR_tf_module_cache=.module
TF_CLI_ARGS_init=-backend-config=region=eu-west-1 -backend-config=bucket=development-terraform-state -backend-config=dynamodb_table=development-terraform-state-lock -from-module=git::[email protected]:xcv.git?ref=master
TF_MODULE_CACHE=.module
Ok, appears to be because I lost my `use terraform` along the way in my .envrc
Yarp. Re-added `use terraform` and `use tfenv`, and all gucci on the latest geodesic with just `terraform@cloudposse` instead of `terraform-0.12@cloudposse`
excellent!
I'm trying to `ssh-add /path/to/private/key` inside of BitBucket Pipelines, however I get:
> ssh-add /opt/atlassian/pipelines/agent/ssh/id_rsa
Could not open a connection to your authentication agent.
Am I missing something? Is this a BitBucket Pipelines thing, e.g. a non-exposed service etc.?
looks like your ssh agent is not started
we only start it by default in interactive sessions
ahhh
gotcha
source <(gosu ${ATLANTIS_USER} ssh-agent -s)
ssh-add - <<<${ATLANTIS_SSH_PRIVATE_KEY}
# Sanitize environment
unset ATLANTIS_SSH_PRIVATE_KEY
here’s how we do it for atlantis
just run
# Otherwise launch a new agent
if [ -z "${SSH_AUTH_SOCK}" ] || ! [ -e "${SSH_AUTH_SOCK}" ]; then
ssh-agent | grep -v '^echo' >"${SSH_AGENT_CONFIG}"
. "${SSH_AGENT_CONFIG}"
# Add keys (if any) to the agent
if [ -n "${SSH_KEY}" ] && [ -f "${SSH_KEY}" ]; then
echo "Add your local private SSH key to the key chain. Hit ^C to skip."
ssh-add "${SSH_KEY}"
fi
ah nice
you could maybe do sometihng similar
thanks
what’s gosu
very easy utility to drop permissions to a user
easier than using sudo
or su
super thx
you may not need it
2019-10-20
I'm having some trouble decrypting the IAM user password created. Using the decrypt command `echo "MY_PGP_MESSAGE" | base64 --decode | keybase pgp decrypt` I get the error message "ERROR decrypt error: unable to find a PGP decryption key for this message"
Am I missing something here?
Are you running that natively on your Mac?
Yes, on my mac.
Sounds like something wrong with keybase install
Or using wrong keybase username
The latter was my assumption.
2019-10-21
2019-10-22
@oscar you have this problem too? https://github.com/hashicorp/terraform/issues/17300
Looks like there's no way to elegantly work around it, other than creating a symlink at `.module/.terraform` pointing to `.terraform`
the `-state=...` argument is useless
Yep!
=(
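A minimal version of that symlink workaround, assuming TF_MODULE_CACHE=.module as shown later in this thread:
# let init/plan run from the module cache while still finding the project's .terraform data
mkdir -p .module
ln -s ../.terraform .module/.terraform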
2019-10-23
Starting to feel like TF init from module is going to be made harder and harder
Not sure…still not really a fan of the terragrunt wrapper style
Pick yer poisons:
- `terraform` lacks interpolation in `backend {` blocks.
- `terraform` (in 0.12) has materially altered what's allowed in the working directory where `tf init` is issued.
- `terragrunt` takes care of those issues for me.
and I really appreciate the simple two-layer cakes it bakes for me….
- layer 0 is the base: whatever I put in `terraform { source = ... }`, typically `source = "git::[email protected]:myOrg/myTerraformRepo//aws-blueprints/${path_relative_to_include()}?ref=master"`
- layer 1 is the environment config: all files in the directory where I run `terragrunt init` get "rsync'ed" on top of layer 0. Symlinks in layer 1 are resolved before rsync. Filenames in layer 1 overwrite identically named files that exist in layer 0.
I'd offer that the ability to remove (with a zero-length file) or replace an entire `.tf` file in your base module is this arrangement's super power.
It’s way better (imho) than dealing with conditional logic for stuff like:
count = var.is_this_module_enabled ? 1 : 0
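For reference, those two layers translate into a terragrunt.hcl roughly like this (repo URL, ref, and paths are placeholders):
# terragrunt.hcl (sketch)
terraform {
  # layer 0: the base module terragrunt downloads
  source = "git::ssh://git@example.com/myOrg/blueprints//vpc?ref=v1.2.3"
}
# layer 1 is simply the files sitting next to this terragrunt.hcl; terragrunt copies them over layer 0
include {
  path = find_in_parent_folders()   # pull shared settings from a parent terragrunt.hcl
}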
I have moved to Terraform Cloud to avoid these issues
How does TF cloud help with these types of things? Not using geodesic there?
Yes, I feel that way too
What would be another option if not using tf init from module…
Yea, I think using `terrafile` or `terragrunt`
`terragrunt` orchestrates the `-from-module` process
there's bazel, which clones dependencies
I like the "just clone this external repo at this version" approach, nothing too smart
From there you can still run bare terraform, which you wouldn't be able to with terragrunt
here’s something you can run with:
it still uses the `terraform init -from-module=...` but that could be replaced with "just clone this external repo at this version" and run with it
I know you are iffy about `make`, but this looks pretty nice to me…
Recovered Recording at Wed Oct 23 2019 1524 GMT-0700 (Pacific Daylight Time)
Nice video
I’m a big fan of make, but feels like a traditional wrapper script, may as well be a bash script/terragrunt. Realise in this case it is being used just for init
I wonder if a straight up git clone seems a bit more understandable across a big team
That needs to be stored as a configuration
A git clone doesn’t have that
Take for example EKS in 2 regions
The current way geodesic is setup this would all be in the same region as far as state goes though, I’m trying to think of the least intrusive way of swapping out cli init from module, and a git clone in the use_terraform function came to mind
IIRC geodesic doesn't currently clone a root module more than once https://github.com/cloudposse/testing.cloudposse.co/blob/master/conf/cloudtrail/.envrc#L2
You could, but would the extra logic be worth it? Keep it simple
Too much magic, and the balance tips to tools like terragrunt
We clone each individual root module pinned to a version every time we need it
Otherwise is forced upgrades to other projects
A single clone for all projects is not a stable approach
A single clone for each project pinned to a version is the way to achieve stability
Right, agree that is needed
Given the current approach, could a git clone in https://github.com/cloudposse/geodesic/blob/master/rootfs/etc/direnv/rc.d/terraform (or akin) get the job of init `-from-module` done? This gets run in the context of a single root module pinned to a release version. Not saying git clone terraform-root-modules/$VERSION for all your root modules at that version.
git clone X && cd X
but direnv doesn’t support cd
isn't that solved with `source_up`?
What do you mean "direnv doesn't support `cd`"? Of course it does.
Move a terraform-root-modules style module into the env containers?
That makes sense
Still use direnv and ‘use terraform’ for setting backend etc
But not init from module
Instead have another top level almost stub module…would need to pass vars etc down it, bleurgh
Or something like https://github.com/coretech/terrafile/blob/master/README.md
A binary written in Go to systematically manage external modules from Github for use in Terraform - coretech/terrafile
Which is already in geodesic iirc
I tried that the other day, perhaps I didn’t do it right though
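For reference, a Terrafile is just a YAML map of module name to git source and version, something like this (module names and versions are illustrative):
terraform-aws-vpc:
  source:  "git@github.com:cloudposse/terraform-aws-vpc"
  version: "0.8.1"
terraform-aws-tfstate-backend:
  source:  "git@github.com:cloudposse/terraform-aws-tfstate-backend"
  version: "0.16.0"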
Here is a question for the channel: after successfully completing the deployment of the `reference-architecture`, I'm on to the next steps to commit the `/repos` for the accounts created. Should I be committing each to their individual git repository, for example `/repos/audit.mycompany.com` to github.com/mycompany/audit.mycompany.com?
yep!
that’s how we do it
we treat each account as an application
no 2 accounts are ever the same
They may run the same software pinned at versions
But usually many things vary
@Erik Osterman (Cloud Posse) Today's office hours really opened my eyes on this. I was just talking about it to my girlfriend. I know she has no idea what I am talking about or even cares LOL
I have so many questions!
Haha, probably not a “captive” audience
lol
She does a pretty good job listening to me ramble on. :slightly_smiling_face: So get this, I am looking at creating some policies at the org level (SCP) and I started writing a few using the aggregator module, but you blew that out of the water today. Now I think it can be done much more easily. So I'm going to start playing around with geodesic, the documentation, and that video you shared in the office-hours channel to do just that. I know the reference architecture does not yet support the `identity` account but I still went ahead and created it because it is a great idea to separate the user authN and authZ there.
Then I want to start writing other policies for the IAM groups, etc.
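As a flavor of what an org-level SCP looks like in Terraform (a sketch; the policy content and name are made up, and it would still need to be attached to an OU or account):
resource "aws_organizations_policy" "deny_leave_org" {
  name = "DenyLeaveOrganization"
  # type defaults to SERVICE_CONTROL_POLICY
  content = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Deny"
      Action   = "organizations:LeaveOrganization"
      Resource = "*"
    }]
  })
}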
2019-10-24
@tamsky perhaps, but i’m personally more interested in optimizing for terraform cloud
I’m getting started with Geodesic and I created my CLUSTER_NAME env var
export CLUSTER_NAME=identity.mycompany.co
cd $CLUSTER_NAME
make init
make docker/build
After the `make docker/build` I got a build error trying to copy the rootfs:
Removing intermediate container 9a31e8a4ae01
---> e926f71ba169
Step 25/26 : COPY rootfs/ /
COPY failed: stat /var/lib/docker/tmp/docker-builder620832852/rootfs: no such file or directory
make: *** [docker/build] Error 1
I created the `rootfs` directory in my /path/to/identity.mycompany.co/ dir path, executed `make docker/build`, and moved on to run
docker run cloudposse/identity.mycompany.co | bash -s latest
Then I was able to execute `identity.mycompany.co` from the CLI and got into the Geodesic shell.
My question is: what should be in the `rootfs` dir path?
Figured it out from the history.
On to the next: I'm trying to get `aws-vault` setup
Just run “aws-setup”
2019-10-25
@Erik Osterman (Cloud Posse) Should `aws-setup` be done in the geodesic shell or locally?
in the shell
aws-vault inside the shell actually stores the file store in /localhost/ which will map to your home directory. i use the same aws-vault on my localhost for legacy stuff that’s not run inside geodesic yet. works well
My team is currently talking about building a container similar to Geodesic. Starting out it will just be a toolbox of the tools we need for dev and CI. It will likely be used by other teams within my company eventually. I am arguing for giving it a fun name, a la Geodesic, but they want something descriptive, and admittedly boring, like “devsecops-build-tools”.
Thoughts? Any ammunition I can take back and say, see; here is why we should give it a good name.
we’re mostly debating in good fun, but I am in the camp that the name is gonna matter someday soon
Giving something a name makes it matter. It’s important. You want it to stay. You name your dog Sir Licksalot because of that one hilarious moment when he trampled your aunt Marge and licked her face. You DON’T name your servers because they are cattle, and you will kill them at a moment’s notice
Giving it a descriptive name is pigeonholing it into being one thing, and that one thing will morph
You don't know what that will be today
A descriptive name would be terraform-kops-helmfile-kubectl-etc and get out of date
Build tools only conveys one usage
They won’t use it for triage?
We call geodesic a “cloud automation shell”
Maybe you can have some fun with names of clouds, weather systems, Greek gods of weather (if any?), etc
`loki` - god of mischief (which is the fun part). Does almost everything (which is useful). The name is short and easy to remember and pronounce, and sounds like lock (which is good for security)
the name of my team is Team Vulcan
ok, so you already have one god
I agree with @Erik Osterman (Cloud Posse) that a descriptive name is a terrible idea. (So does the Buddha.) When you label something, you deny everything else that it is. That is bad enough with a fixed thing, but really, really bad with an evolving thing.
2019-10-26
FYI: The issue I experienced with Keybase not being able to decrypt the IAM user password (which resulted in "ERROR decrypt error: unable to find a PGP decryption key for this message") was fixed by copying the following files from the initial MacBook Pro to the other, then restarting Keybase with `keybase ctl restart`:
~/Library/Application Support/Keybase/secretkeys.<username>.mpack
~/Library/Application Support/Keybase/config.json
I've seen this error when the private key is not imported into keybase's local store. For reference, if you want to import an existing key from the GPG keyring, use `keybase pgp select --import`
Thanks for reporting
2019-10-27
I just realized that, working with the geodesic shell, the files and directories I created in the `/conf` dir path are not persistent. Did I miss a step?
No, `/conf` is meant not to be persistent. It should match its own git repo, and it stays built and up to date for use in your CI/CD stuff. I assume you're trying to figure out local development?
@Alex Siegman Yes, I am.
Your host home dir is mounted into geodesic at /localhost/
Yes. So, I should work within the geodesic shell within that dir path.
For local development
But not as a workflow with staging and production, for example
2019-10-28
@Erik Osterman (Cloud Posse) so all my development would be done via the geodesic shell within the dir path /localhost, then I can commit those tf files to git for the automated workflow?
So say I have my local development in dir path ~/Workspace/my-company/identity.my-company.com/; from within the geodesic shell I could just work within dir path /localhost/Workspace/my-company/identity.my-company.com/conf/, make sense?
I think you might find your workflow is made easier by adjusting the last profile script to `cd` to your dev dir:
# echo "cd /localhost/Workspace/my-company/identity.my-company.com/conf" > rootfs/etc/profile.d/zzz-cd-to-workspace.sh
# chmod +x rootfs/etc/profile.d/zzz-cd-to-workspace.sh
And then rebuild your geodesic image with that new file.
That way when you execute the `geodesic` wrapper, you hop directly to a shell with CWD set to where you want to be.
I think you can also get away with adjusting the Dockerfile
-WORKDIR /conf
+WORKDIR /localhost/Workspace/my-company/identity.my-company.com/conf
Geodesic is highly run-time customizable. I recommend you use the customization hooks rather than editing the Dockerfile.
Also note that by default Geodesic creates a symbolic link to /localhost that matches the host path to that directory, so you can use host pathnames inside Geodesic. Tilde doesn't work because you are a different user, but fully qualified path names work. More info here:
what: In addition to some small cleanups and additions, provide a capability for users to customize Geodesic at runtime. why: Because people vary, their computers vary, what they are trying to accomp…
2019-10-29
@Erik Osterman (Cloud Posse) – fyi - yesterday I built `geodesic` without any python2.
Admittedly I had to chop out support for `crudini` in a few places, but other than that it looks good.
I tried and failed to remove Python 2 a while ago.
We rely on `crudini` quite a bit in managing AWS credentials. Also, as explained in `requirements.txt`, we are stuck with PyYAML 3.13 until `awsebcli` upgrades. So we're keeping Python 2 for a while.
Yes, I saw those comments – in my case, with `crudini` out of the picture (my images currently rely on `AWS_DEFAULT_PROFILE` being set in the `Dockerfile`) it worked.
I also think the `boto==2.49.0` requirement was pulling in Python 2.
was this to reduce the size?
(i wish there was a robust go alternative to crudini)
&& this quirk
FROM python:3.7-alpine as python
[stuff]
RUN pip install -r /requirements.txt --install-option="--prefix=/dist" --no-build-isolation --no-cache-dir
# For some reason dateutil/python_dateutil ignores our --prefix and gets
# installed into /usr/local/lib/python3.7/site-packages.
RUN mv /usr/local/lib/python3.7/site-packages/dateutil/ /dist/lib/python3.7/site-packages/
RUN mv /usr/local/lib/python3.7/site-packages/python_dateutil-*.dist-info/ /dist/lib/python3.7/site-packages/
what about the `aws` cli?
I think I unpinned `awscli` and `boto` in `requirements.txt`
I did it for sanity… one python should be enough for anyone
oh yes, one more later on
COPY --from=python /dist/ /usr/local/
RUN ln -s ../../bin/python3 /usr/local/bin/python
2019-10-31
Has anyone gotten SAML auth with google apps working?
I've tried and failed. When testing to see if the role is assumed, it succeeds, but when running a terraform plan, it says that it cannot find the credentials.
I added saml2aws by adding this to rootfs/etc/profile.d/aws-saml2aws.sh
I’ve also added this to the dockerfile to provide the binary.
Not sure how to completely un-rig aws-vault successfully, as it doesn’t work with saml.
What are your interactive test results of the above? How are you testing/invoking `assume-role`?
Once the container is started, typing `assume-role` returns:
* Assumed role arn:aws:iam::1123456789:role/admin-role
which is otherwise correct…. It also sets the env var for TF about the assumed role that we want
but when running terraform apply, terraform itself says it can’t get the credentials.
The authentication flow is: the saml2aws script gets the temporary credentials and session tokens and returns them to ENV
when running terraform apply, I get authorization response codes that make it fail, as it can’t assume the role.
in my ~/.aws/config I have a profile which has the admin role in it, along with the credential source saml (used by saml2aws in ~/.aws/credentials)
if that answers your question
after you run `assume-role` successfully – are the expected environment vars set?
I’m suspecting your functions don’t exec a new $SHELL
you may be 100% correct. Wow let me test that
it seemed like something along those lines, as it worked from my native shell but failed only within the shell of the terraform run
`assume-role $AWS_DEFAULT_PROFILE /bin/bash`?
following up in a few with result
Initializing the backend...
Error configuring the backend "s3": No valid credential sources found for AWS Provider.
Please see <https://terraform.io/docs/providers/aws/index.html> for more information on
providing credentials for the AWS Provider
Please update the configuration in your Terraform files to fix this error
then run this command again.
ran env
We’re getting closer
I added an env var that should have picked up the contents of /localhost/.saml2aws/config.json by default. Aside from that, I don't mind manually doing this inside of the container every time. It just doesn't seem to work but is close.
definitely seems related to the shell.
now assume-role is not found, leading me to believe the profile.d needs to be sourced
last update for a while. I exited the container shell entirely. Didn't rebuild the container; noticed on container startup it is able to assume the role, but when we switch shells to the workspace, that is when it fails
adding @Tega McKinney as he may have run into this problem before. Good exercise to troubleshoot this @tamsky I’ll look more into this tomorrow, but not going to let this stop me for right now.
Thank you!
Hrmm… maybe a bug? I know we had this working since we support Okta
we didn’t implement GSuite b/c there wasn’t a nifty aws-vault, aws-okta, esque cli tool (in go!)
I haven’t figured out the root cause on this yet. There is some conflict when running saml2aws (v2.18.0) in the container vs on my laptop.
Same command works on my laptop but fails in the container: saml2aws exec --exec-profile <profile> aws sts get-caller-identity
Fails with error: An error occurred (InvalidClientTokenId) when calling the GetCallerIdentity operation: The security token included in the request is invalid.
exit status 255
Interestingly enough, I can use `aws --profile <profile> sts get-caller-identity` in the container and it works.
does saml2aws open up a webserver?
a lot of these tools do, so that the callback url can be <http://localhost:...>
And since the container is not listening on that port, it fails
@Erik Osterman (Cloud Posse) I do not believe this particular cli uses a local server, however I believe I have found that the issue within the Geodesic shell is that this tool exports AWS_SECURITY_TOKEN= and for some reason in the shell, it fails. If I unset the token or make it the same value as AWS_SESSION_TOKEN then the commands work.
Are you aware of anything in the Geodesic shell that would cause this? I haven’t been able to replicate that behavior in any other Docker container.
There should always be a pair of session tokens with security tokens when doing STS
Probably easier to screenshare and see if we can get to the bottom of it
Does `$HOME/.saml2aws` need to be a symlink to `/localhost/.saml2aws`?
The maintainers are resolving the AWS_SECURITY_TOKEN issue and adding it to the exports; that was a regression.
I did mount `.saml2aws`.
I believe this should allow us to continue, however I was just wondering why that missing ENV would cause an issue in Geodesic. I created a simplified Docker image with the rootfs scripts but all the Geodesic packages, and it worked without the AWS_SECURITY_TOKEN ENV.
I upgraded to the latest Geodesic shell and it works (without `AWS_SECURITY_TOKEN` set).
The only problem I am getting now: when I run `assume-role profileName /bin/bash`, I get the following in the terminal:
bash: _kube_ps1_update_cache: command not found
bash: _direnv_hook: command not found
bash: prompter: command not found
Any thoughts?
Appears that none of the `tfenv` functions are running
The issue seems to happen when I run `assume-role profileName /bin/bash`. If I run `assume-role profileName /bin/bash -l` then it works as expected.
@Erik Osterman (Cloud Posse) quick question about https://github.com/cloudposse/geodesic/blob/master/rootfs/etc/make.conf
After poking at it for a bit, I've come to believe the file is 100% unused – the installed `gnu-make` does not use it.
all these refs can go, imho:
But you missed this: https://github.com/cloudposse/geodesic/blob/65b76a9958a558a1241d0a766ceabd4f79832a75/rootfs/usr/local/bin/make
/me shakes fist at github search
that was my search
I was incredibly frustrated that gnu-make didn't support a global makefile config
it does. env var MAKEFILES
GNU make: MAKEFILES Variable
the problem is it doesn’t support wildcards
hrm
well, I guess it would work this way:
MAKEFILES=/etc/make.conf Makefile
kind’a scratching my head though. i feel like i must have tried that and abandoned it for some reason. But hey, if that works
I didn't include `rootfs/usr/local/bin/make` in my setup – so this hasn't come up yet, or bitten me at all; I'm just pointing out it was "odd".
My setup also doesn't need the `build-harness`, so any dependency that it has on this custom make setup hasn't bit me yet either.
yea, zero deps on build-harness inside geodesic
are you attempting to create a smaller base image?
kind of. I’m trying to strip it down for simplicity’s sake – hopefully giving it more curbside appeal when introducing it to other team members.
cool
“minimalist”
interested to see what you come up with
It's working for me. I fell instantly in love with the s3 mount yesterday when I discovered it.
nice!
also works with mount -a
good to know
we hook into the fstab
so it works the linux-way
Where I’m at now has a dozen or so “blueprint repos” that get deployed by a Jenkins CI server. So the version of each blueprint deployed to an environment sits in a json file in S3. By being able to easily grab and parse the file, I’m able to setup terragrunt.hcl to spec out the correct
terraform { source = "private-git-repo//modules?ref=vXX.YY.ZZ" }
The sanity this provokes is very calming. Thank you & the rest of the CP gang.
are you pushing the `terragrunt.hcl` to S3 as an artifact?
I build all of them locally, on the fly, via a rootfs/conf/Makefile (I'm slightly sheepish to admit I chose that path)
Am also doing something similar to this, but have avoided terragrunt so far :D
It’s really cool to see
@rms1000watt is looking to do something with this. @oscar has his way of doing it. @rohit.verma switched it to Ubuntu.
Wish there was a way to reconcile all of it.
one of the hardest things I've still got is: having split your TF into multiple states (sometimes 20+ for a single environment), how do you sanely manage the dependencies for these? And don't say `terragrunt apply-all` with deps, cause it don't work
Does terraform enterprise help with this at all?
Is `make` up to the task of determining the order? You'd have to specify the dependencies manually, but you'd need to know those in advance when deploying from scratch… So it's probably something you already know.
But that implies all in the same run, as terragrunt apply-all works
You could mix plan/apply logic in make but ….. will get nasty
I don't believe TFE helps with the chain of dependencies.
(there's also `astro` by Uber)
Yeah, but looks unmaintained and have already seen your use case/issue
Could be worthwhile investing in though
Interestingly, the astro examples give you a clue to how they have laid out their tf runs, and that's important
> But that implies all in the same run, as terragrunt apply-all works
I'm not seeing it quite the same way. If I'm running `apply` in any particular state, `make` should be verifying that all dependent state dirs have a clean `plan` output… And if not, halt.
Interesting
First of all, I love the direction of this utility. It's a generalized approach to orchestrated complex, multi-phased applies for terraform. It's a nice alternative to terragrunt that's…
astro looks like it could work… but it’s same essential problem, solved the same way:
The operator of the environment is required to explicitly list all dependencies:
- in astro: modules.deps:
- in make: recipe: prerequisite (see the sketch below)
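A minimal make sketch of that "recipe: prerequisite" idea (directory names are made up; -detailed-exitcode makes plan exit non-zero when changes are pending, which halts make):
.PHONY: vpc/plan eks/apply
vpc/plan:
	cd vpc && terraform plan -detailed-exitcode
# only apply eks once the vpc state is clean
eks/apply: vpc/plan
	cd eks && terraform apply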
@joshmyers have you tried the terragrunt dependency features in the latest versions? they’ve done a lot of work around that use case recently, so it might be better than before if you haven’t looked at it in a while
Not in a long time, thanks. Will have another look!
Looks like terragrunt has added some cool new features, though not quite ones that fix this problem, which is an inherent terraform/state machine thing. Some interesting approaches to help with dependencies though