SweetOps #airship for April, 2019

Home of Airship ECS Modules ( https://github.com/blinkist/terraform-aws-airship-ecs-service / https://github.com/blinkist/terraform-aws-airship-ecs-cluster )

Archive: https://archive.sweetops.com/airship/

2019-04-03

Mads Hvelplund

11:23:23 AM

Hi guys

I have a problem with the service shown in the snippet. I deployed the service successfully earlier, but ran into problems after destroying my environment overnight and reapplying it today.

After the first terraform apply I get the following error:

Error: Error applying plan:

1 error(s) occurred:

* module.linkmobility.module.live_task_lookup.data.aws_lambda_invocation.lambda_lookup: data.aws_lambda_invocation.lambda_lookup: AccessDeniedException: The role defined for the function cannot be assumed by Lambda.
        status code: 403, request id: 280f95ac-e6a2-4738-935e-5b9c013e9ceb

It seems related to the timing of resource creation, but when I rerun the apply I get a different error instead, and it recurs on every rerun:

Error: Error applying plan:

1 error(s) occurred:

* module.linkmobility.module.ecs_service.aws_ecs_service.app_with_lb: 1 error(s) occurred:

* aws_ecs_service.app_with_lb: ClientException: TaskDefinition is inactive
        status code: 400, request id: ab7ef1d7-5601-11e9-8828-d365d379c104 "linkmobility"

I think it’s because the old task definition never gets deleted, just deactivated, and that trips up the new creation.

Mads Hvelplund

11:25:32 AM

@maarten & @Maciek Strömich: Any suggestions?

maarten

11:26:15 AM

Will try to get back to you this evening, have to work on work stuff now.

maarten

11:26:53 AM

“* module.linkmobility.module.live_task_lookup.data.aws_lambda_invocation.lambda_lookup: data.aws_lambda_invocation.lambda_lookup: AccessDeniedException: The role defined for the function cannot be assumed by Lambda.”

maarten

11:27:41 AM

interesting, let me get back to you later

Mads Hvelplund

11:33:06 AM

While I’ve been banging on the module for the last few days, I’ve seen a couple of similar situations, where something is a dependency, but isn’t available quickly enough. Re-running fixes most of them, but I’ve hit an impasse here.

Mads Hvelplund

11:34:03 AM

The thing is that deleting a taskdef doesn’t remove it, it merely deactivates it. When you deactivate the last taskdef version, the entire taskdef becomes inactive, but it doesn’t disappear.

Mads Hvelplund

11:34:15 AM

looking at the docs i think it never disappears

maarten

11:34:23 AM

what you can always do for now

maarten

11:34:27 AM

is to add an env var

maarten

11:35:17 AM

I’ve run into issues where I created an ECS service with the module and then deleted it again, but the old task definitions are more or less kept in AWS. This creates a weird irregularity.

maarten

11:36:03 AM

I’m really looking fwd to implementing this the moment tf supports it: https://aws.amazon.com/about-aws/whats-new/2019/03/aws-fargate-and-amazon-ecs-support-external-deployment-controlle/

Mads Hvelplund

11:36:24 AM

i tried to force a new taskdef with force_bootstrap_container_image but I’ll try the env var now

maarten

11:36:44 AM

so for now

container_envvars {
  FIX = "1"
}

Mads Hvelplund

11:37:01 AM

well, you could support it now with a lambda, like you handle the live task check

maarten

11:37:36 AM

• module.linkmobility.module.live_task_lookup.data.aws_lambda_invocation.lambda_lookup: data.aws_lambda_invocation.lambda_lookup: AccessDeniedException: The role defined for the function cannot be assumed by Lambda.

this however doesn’t explain the task definition issue you have

Mads Hvelplund

11:37:41 AM

Thanks!

container_envvars = {
  stamp = 1554291327
}

… worked like a charm.

Mads Hvelplund

11:38:10 AM

like i wrote, the first error disappears when i rerun. i think it’s timing related.

Mads Hvelplund

11:38:30 AM

like maybe it takes a second for a new policy to propagate, and terraform is already trying to use it
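
A minimal sketch of that timing theory, assuming the AccessDeniedException is transient because a freshly created IAM role is not yet assumable. The helper name `retryOnError` and its options are hypothetical and not part of the Airship module; the sketch only shows the general retry-with-backoff pattern that rerunning the apply does by hand.

```javascript
// Hypothetical helper: retry an async operation while it fails with one
// specific error name, waiting longer before each new attempt.
async function retryOnError(operation, { retryOn, attempts = 5, baseDelayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Only retry the error assumed to be transient; rethrow anything else.
      if (err.name !== retryOn) throw err;
      lastError = err;
      if (attempt < attempts - 1) {
        // Exponential backoff: baseDelayMs, then 2x, 4x, ...
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed with the retryable error.
  throw lastError;
}
```

Called as `retryOnError(invokeLookup, { retryOn: "AccessDeniedException" })`, where `invokeLookup` stands for whatever async call hits the error, this gives up after five attempts and surfaces the last error unchanged.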

maarten

11:38:54 AM

hm shouldn’t be!

Mads Hvelplund

11:39:39 AM

btw, i also submitted a PR to fix a bug in the lambdas. there was some python cut and paste in the error reporting that made all errors have the message “NaN”

Mads Hvelplund

11:40:01 AM

i assume it’s python, since it looked like python string building

Mads Hvelplund

11:40:36 AM

anywho, thanks for the workaround

maarten

11:41:18 AM

ah nice, thanks.

maarten

11:41:24 AM

np, any time.

2019-04-10

Mads Hvelplund

09:56:59 AM

@maarten how do you feel about https://github.com/blinkist/terraform-aws-airship-ecs-service/pull/59 ?

Minor bugfixing by mhvelplund · Pull Request #59 · blinkist/terraform-aws-airship-ecs-service

Added stack to custom error. Without it, error origin is lost. Changed Python string building to JS concatenation. String modulo string gives a NaN result every time. Respect the health check grace…

maarten

09:10:36 PM

Looks good, very clean, thanks!

Release notes from terraform-aws-airship-ecs-service

09:12:59 PM

0.9.3: Minor bugfixing (#59) Fixed bug in custom exception and error reporting

Added stack to custom error. Without it, error origin is lost.

Changed Python string building to JS concatenation. string modulo string gives a NaN result every time.

Cleanup.

Respect the health check grace period variable

Statement actions MUST be lists in Terraform.
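
The two lambda fixes in these notes can be illustrated with a short sketch. The message text and the `LookupError` class are hypothetical and are not the module's actual code; the point is only that `%` is the numeric remainder operator in JavaScript, and that a wrapping error can carry the original stack along.

```javascript
// Python-style formatting does not exist in JavaScript. Both operands are
// coerced to numbers, and neither string parses as one, so the result is NaN.
const pythonStyle = "Lookup failed: %s" % "timeout";
console.log(pythonStyle); // NaN

// Plain concatenation produces the intended message.
const concatenated = "Lookup failed: " + "timeout";
console.log(concatenated); // Lookup failed: timeout

// Hypothetical custom error that appends the stack of the error it wraps,
// so the origin of the failure is still visible in the logs.
class LookupError extends Error {
  constructor(message, cause) {
    super(message);
    this.name = "LookupError";
    if (cause && cause.stack) {
      this.stack = this.stack + "\nCaused by: " + cause.stack;
    }
  }
}
```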

blinkist/terraform-aws-airship-ecs-service

Terraform module which creates an ECS Service, IAM roles, Scaling, ALB listener rules.. Fargate & AWSVPC compatible - blinkist/terraform-aws-airship-ecs-service

Mads Hvelplund

04:55:38 AM

Is there a way to make environment vars for containers “valueFrom” instead of “value”?

Mads Hvelplund

06:09:36 AM

~~NVM. I’ll add it and submit a PR.~~ https://github.com/blinkist/terraform-aws-airship-ecs-service/pull/61

2019-04-25

Mads Hvelplund

09:16:52 AM

Hi @maarten. I got around to looking at the drift detection you requested in the PR above. Looking at the code, I’m uncertain about how to proceed. There don’t seem to be any special precautions for normal environment variables. What am I missing?

Mads Hvelplund

10:39:04 AM

Looking at the lookup lambda, I can see where you get the env vars, but it doesn’t look like JavaScript’s describeTaskDefinition returns the secrets as part of the container …

Mads Hvelplund

11:56:46 AM

I use ecs-deploy (https://github.com/silinternational/ecs-deploy) on my CI server to update the container image when the code changes. The script fetches the running taskdef and replaces the image before uploading a new task def. However, if I run terraform to update something else, it detects a “change” and wants to downgrade to the last task definition created with Terraform. I’m not sure if this is related to secrets, or to ecs-deploy “scrubbing off” something that Airship uses to detect the newest image.

Any ideas?

silinternational/ecs-deploy

Simple shell script for initiating blue-green deployments on Amazon EC2 Container Service (ECS) - silinternational/ecs-deploy
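
The fetch-and-replace step described above can be sketched as a pure function. This is not ecs-deploy’s implementation (that is a shell script); `withNewImage` and its arguments are hypothetical names, and matching on the container name is an assumption made for the illustration.

```javascript
// Return a copy of a task definition in which the container called
// `containerName` runs `newImage`. Every other field, including any
// `secrets` entries, is carried over untouched.
function withNewImage(taskDefinition, containerName, newImage) {
  return {
    ...taskDefinition,
    containerDefinitions: taskDefinition.containerDefinitions.map((container) =>
      container.name === containerName ? { ...container, image: newImage } : container
    ),
  };
}
```

Because the input is copied rather than mutated, the old and new definitions can be compared afterwards to see exactly which fields a deploy changed.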

2019-04-26

maarten

09:52:56 AM

Hi @Mads Hvelplund, since we cannot retrieve that from the data source, we need to do something else. You can store a hash of the combined secret names in a label and compare the label for drift detection.
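
A minimal sketch of that suggestion, assuming the hash is kept in the container’s `dockerLabels`. The label name `_airship_secrets_hash` and both function names are hypothetical; only the idea of hashing the secret names and comparing against a stored label comes from the message above.

```javascript
const { createHash } = require("crypto");

// Hash the secret names only (not their values). Sorting first makes the
// result independent of the order in which the secrets were declared.
function secretsHash(secrets) {
  const names = Object.keys(secrets).sort();
  return createHash("sha256").update(names.join(",")).digest("hex");
}

// Drift exists when the label on the live container definition is missing
// or does not match the hash of the secrets we want to deploy.
function hasSecretsDrift(wantedSecrets, dockerLabels, labelName = "_airship_secrets_hash") {
  return (dockerLabels || {})[labelName] !== secretsHash(wantedSecrets);
}
```

Hashing names rather than values keeps secret material out of the label, at the cost of not noticing when a name stays the same but points at a different parameter.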

Mads Hvelplund

09:54:27 AM

when your buildserver builds a new docker image, how do you deploy it? by running terraform, or using aws cli/api?

2019-04-29

maarten

08:23:12 PM

@Mads Hvelplund using ecs-deploy as well, and the drift detection always made sure that ecs-deploy could do its work

2019-04-30

Mads Hvelplund

07:34:44 AM

FYI: ecs-deploy doesn’t support secrets unless you run with the changes from https://github.com/silinternational/ecs-deploy/pull/179. Or rather, without the fix it only supports Fargate containers that use secrets.

add executionRoleArn into NEW_DEF_JQ_FILTER by yu-orz · Pull Request #179 · silinternational/ecs-deploy

If "executionRoleArn" is specified for Task, ecs-deploy will result in an error and a filter will be added because it failed. An error occurred (ClientException) when calling the Register…
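
Why a missing `executionRoleArn` matters can be sketched as follows, assuming a deploy tool that copies only an allow-list of fields from the described task definition into the new registration. The field list and the name `toRegisterInput` are illustrative and are not ecs-deploy’s actual filter.

```javascript
// Illustrative allow-list of fields copied into the new registration.
// If "executionRoleArn" were missing here, the new revision would lose the
// execution role that ECS uses to fetch the values behind `secrets`.
const COPIED_FIELDS = [
  "family",
  "containerDefinitions",
  "volumes",
  "placementConstraints",
  "networkMode",
  "taskRoleArn",
  "executionRoleArn",
  "requiresCompatibilities",
  "cpu",
  "memory",
];

// Build the input for a new registration from a described task definition,
// dropping everything that is not on the allow-list (for example the
// read-only "revision" and "status" fields).
function toRegisterInput(described, fields = COPIED_FIELDS) {
  const input = {};
  for (const field of fields) {
    if (described[field] !== undefined) input[field] = described[field];
  }
  return input;
}
```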