SweetOps #sre for January, 2021

Archive: https://archive.sweetops.com/monitoring/

2021-01-05

zeid.derhally

Anyone have any experience with https://snyk.io/ ? I’m looking at security/monitoring products that we can use throughout the dev lifecycle.

Snyk | Developer security | Develop fast. Stay secure.

Snyk helps software-driven businesses develop fast and stay secure. Continuously find and fix vulnerabilities for npm, Maven, NuGet, RubyGems, PyPI and more.

slack1270

04:47:33 AM

(Caveat: my opinion only, doesn’t represent the views of my employer, etc.) We evaluated Snyk and ended up going with Aquasec instead because it has extensive runtime controls. Snyk was fine for just scanning images, but not as comprehensive a feature set. See also Twistlock - it was probably even better than Aquasec when we compared them, but Aquasec was a much more responsive company to deal with.

zeid.derhally

07:02:36 PM

I’d like to be able to scan a registry on new images, approve them, and also configure admission controllers to only allow scanned images
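A minimal sketch of the admission-controller half of that request, assuming a Kubernetes validating webhook that receives an `AdmissionReview` (v1) for a Pod. The function name `review_pod` and the `approved_images` set are hypothetical; in practice the approved set would come from whatever the scanner or registry records as scanned and approved.

```python
from typing import Iterable


def review_pod(admission_review: dict, approved_images: Iterable[str]) -> dict:
    """Build an AdmissionReview response that rejects Pods using unapproved images.

    `approved_images` is a hypothetical allow list of image references that
    have already been scanned and approved.
    """
    request = admission_review["request"]
    pod_spec = request["object"]["spec"]

    # Check regular containers and init containers alike.
    containers = pod_spec.get("containers", []) + pod_spec.get("initContainers", [])
    approved = set(approved_images)
    rejected = [c["image"] for c in containers if c["image"] not in approved]

    response = {"uid": request["uid"], "allowed": not rejected}
    if rejected:
        response["status"] = {
            "message": "images not scanned/approved: " + ", ".join(rejected)
        }

    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

Matching on exact image references (ideally digests rather than mutable tags) keeps the check simple; a real webhook would also need TLS and a `ValidatingWebhookConfiguration`, which are out of scope here.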

Alex Jurkiewicz

09:30:52 PM

Do you use AWS? Upload your images to an ECR registry and use its built-in security scan. (The registry can be unused otherwise.)

zeid.derhally

07:44:49 AM

We use Azure container registry.

Babar Baig

11:43:41 AM

Yes, I use ECR and its built-in security scan. But I haven’t studied the scan results for ECR images.

Steven

01:49:22 PM

ECR scan is built on Clair. Results should be the same. The scan only runs after the image is uploaded and available. I’d recommend using trivy in your CI and only uploading if the scan has passed.
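A minimal sketch of that CI gate, assuming the `trivy` CLI is on the PATH. It relies on `trivy image --exit-code 1 --severity ...`, which makes trivy exit non-zero when findings at those severities exist. The helper names and the `push` callback are hypothetical; the severity threshold is an assumption to adjust.

```python
import subprocess
from typing import Callable, List, Sequence


def trivy_command(image: str, severities: Sequence[str] = ("HIGH", "CRITICAL")) -> List[str]:
    """Build a trivy invocation that exits 1 if matching vulnerabilities are found."""
    return [
        "trivy", "image",
        "--exit-code", "1",
        "--severity", ",".join(severities),
        image,
    ]


def scan_passes(image: str, run=subprocess.run) -> bool:
    """Return True when trivy exits 0, i.e. no findings at the chosen severities."""
    result = run(trivy_command(image), check=False)
    return result.returncode == 0


def push_if_clean(image: str, push: Callable[[str], None], run=subprocess.run) -> bool:
    """Call `push` (a hypothetical upload step) only when the scan passed."""
    if not scan_passes(image, run):
        return False
    push(image)
    return True
```

In a pipeline, `push` would wrap something like `docker push`; passing `run` as a parameter just makes the gate easy to exercise without trivy installed.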

Erik Osterman (Cloud Posse)

02:29:55 AM

One thing with ECR scanning, though, is that it’s only at build time. What about vulnerabilities that are discovered after that?

Erik Osterman (Cloud Posse)

02:31:06 AM

I don’t think it offers continuous scanning. I haven’t researched the solutions for this. Presumably there’s a lambda, but the naive approach of re-scanning daily would generate a lot of noise and be infeasible for repos with thousands and thousands of images (e.g. from CI).

Alex Jurkiewicz

03:38:00 AM

You can trigger a scan of an arbitrary image in ECR at any time. To get “diff” results you’d need to store the entire result for all images yourself, rescan everything daily, compare to your “last scan” result, and alert people if they differ. A bit ugly for sure.
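The compare step can be sketched as a set difference over stored findings. This assumes each scan has already been reduced to a mapping of image reference to vulnerability IDs; fetching those from ECR and persisting the previous result are left out, and the function names are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# A finding is identified by (image reference, vulnerability id).
Finding = Tuple[str, str]


def flatten(scan: Dict[str, Iterable[str]]) -> Set[Finding]:
    """Turn {image: [vuln ids]} into a set of (image, vuln id) pairs."""
    return {(image, vuln) for image, vulns in scan.items() for vuln in vulns}


def diff_scans(
    previous: Dict[str, Iterable[str]],
    current: Dict[str, Iterable[str]],
) -> Dict[str, Set[Finding]]:
    """Compare the last stored scan with today's scan.

    "new" findings are the ones worth alerting on; "resolved" ones
    disappeared since the last scan.
    """
    before, after = flatten(previous), flatten(current)
    return {"new": after - before, "resolved": before - after}
```

Alerting only on the `"new"` set is what keeps a daily rescan from re-reporting every known finding.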

mrwacky

08:04:03 PM

Last I checked ECR scanner can’t be tuned. For instance, you can’t say “we don’t care about this check, and here’s why”, so it’s kind of all or nothing

Erik Osterman (Cloud Posse)

08:47:55 PM

Another way of looking at this is to offload that to the escalation platform, ignoring the alerts you don’t care about and creating incidents for the ones you do.

zeid.derhally

08:48:32 PM

We ended up going with snyk

Erik Osterman (Cloud Posse)

08:48:47 PM

That way you can still track the MTTR of incidents but don’t lose visibility into the alerts, and you don’t have to expect the generator of the alerts to support allow-listing.
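A rough sketch of that idea, with the allow list living in the routing layer rather than in the scanner. The `Alert` shape, the `triage` function, and the suppression map of check ID to documented reason are all hypothetical; a real escalation platform would express this in its own rule syntax.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Alert:
    check_id: str   # e.g. a CVE or scanner rule identifier
    resource: str   # e.g. the image the alert was raised for


def triage(
    alerts: Iterable[Alert],
    suppressions: Dict[str, str],
) -> Tuple[List[Alert], List[Tuple[Alert, str]]]:
    """Split alerts into incidents and ignored alerts.

    `suppressions` maps a check id to the reason it is ignored
    ("we don't care about this check, and here's why"). Ignored alerts
    are returned with their reason so they stay visible.
    """
    incidents: List[Alert] = []
    ignored: List[Tuple[Alert, str]] = []
    for alert in alerts:
        reason = suppressions.get(alert.check_id)
        if reason is None:
            incidents.append(alert)
        else:
            ignored.append((alert, reason))
    return incidents, ignored
```

Because ignored alerts are kept alongside their reasons instead of being dropped at the source, the scanner itself never needs to support tuning.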

Erik Osterman (Cloud Posse)

08:49:26 PM

I see snyk come up all the time, and a few of our customers are using them.

mrwacky

11:41:49 PM

We demoed Snyk last summer, the Docker bits were nice, the “dependabot” features had some rough edges, some other usability rough edges throughout

2021-01-22

joshmyers

03:58:20 PM

Anyone tried AWS Grafana/Prometheus services? Thoughts?

btai

06:48:35 PM

Anyone have suggestions for a good postgres monitoring system (inefficient sql, debugging iops spikes) that can run completely on prem?

johntellsall

09:49:07 PM

I’ve used https://pganalyze.com/postgres-analyze-query-performance with great luck. You can very quickly see slow queries and zoom in/out based on time; the default install hits 3 different databases. I used the SaaS version but they have an on-prem flavor too.

Postgres Query Analysis & Postgres Explain Plans · pganalyze

pganalyze discovers the root cause of critical issues, helps finding missing indices, and lets you optimize slow queries. Learn more!

Vugar

04:59:58 AM

This one maybe: https://github.com/darold/pgbadger

darold/pgbadger

A fast PostgreSQL Log Analyzer. Contribute to darold/pgbadger development by creating an account on GitHub.
