#monitoring (2021-01)

Prometheus, Prometheus Operator, Grafana, Kubernetes

Archive: https://archive.sweetops.com/monitoring/

2021-01-23

2021-01-22

joshmyers

Anyone tried AWS Grafana/Prometheus services? Thoughts?

btai

Anyone have suggestions for a good Postgres monitoring system (inefficient SQL, debugging IOPS spikes) that can run completely on-prem?

johntellsall

I’ve used https://pganalyze.com/postgres-analyze-query-performance with great success. You can very quickly see slow queries and zoom in/out based on time, and the default install hits 3 different databases. I used the SaaS version, but they have an on-prem flavor too.

Postgres Query Analysis & Postgres Explain Plans · pganalyze

pganalyze discovers the root cause of critical issues, helps finding missing indices, and lets you optimize slow queries. Learn more!

Vugar
darold/pgbadger

A fast PostgreSQL Log Analyzer. Contribute to darold/pgbadger development by creating an account on GitHub.

2021-01-07

2021-01-06

2021-01-05

zeid.derhally

Anyone have any experience with https://snyk.io/ ? I’m looking at security/monitoring products that we can use through the dev lifecycle.

Snyk | Developer security | Develop fast. Stay secure.

Snyk helps software-driven businesses develop fast and stay secure. Continuously find and fix vulnerabilities for npm, Maven, NuGet, RubyGems, PyPI and more.

slack1270

(Caveat: my opinion only, doesn’t represent the views of my employer, etc.) We evaluated Snyk and ended up going with Aquasec instead because it has extensive runtime controls. Snyk was fine for just scanning images, but not as comprehensive a feature set. See also Twistlock - it was probably even better than Aquasec when we compared them, but Aquasec was a much more responsive company to deal with.

zeid.derhally

I’d like to be able to scan a registry on new images, approve them, and also configure admission controllers to only allow scanned images

Alex Jurkiewicz

Do you use AWS? Upload your images to an ECR registry and use its built-in security scan. (The registry can be unused otherwise.)

zeid.derhally

We use Azure container registry.

Babar Baig

Yes, I use ECR and its built-in security scan. But I’ve not studied the scan results of ECR images.

Steven

ECR scan is built on Clair. Results should be the same. The scan runs after the image is uploaded and available. I’d recommend using trivy in your CI and only uploading if the scan has passed.
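
A minimal sketch of the "scan in CI, push only if the scan passes" gate suggested above. It assumes the `trivy` and `docker` CLIs are on the CI runner's PATH. The severity threshold, the helper names, and the example image reference are illustrative choices, not something from this thread.

```python
import subprocess
from typing import Callable, List, Sequence


def trivy_command(image: str, severities: Sequence[str] = ("HIGH", "CRITICAL")) -> List[str]:
    """Build a `trivy image` call that exits with status 1 when findings match."""
    return [
        "trivy", "image",
        "--exit-code", "1",                   # non-zero exit if matching findings exist
        "--severity", ",".join(severities),   # assumed threshold; tune to taste
        image,
    ]


def _run(cmd: List[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.run(cmd).returncode


def scan_then_push(image: str, run: Callable[[List[str]], int] = _run) -> bool:
    """Push the image only if the scan exits cleanly. Returns True if pushed."""
    if run(trivy_command(image)) != 0:
        return False                          # scan failed or found issues: do not push
    return run(["docker", "push", image]) == 0


# Example (hypothetical image name):
#   scan_then_push("registry.example.com/app:1.0")
```

The `run` parameter exists only so the gate can be exercised without the real CLIs; in CI the default is enough.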

Erik Osterman (Cloud Posse)

One thing with ECR scanning, though, is that it’s only at build time. What about vulnerabilities that are discovered after that?

Erik Osterman (Cloud Posse)

I don’t think it offers continuous scanning. I haven’t researched the solutions for this. Presumably there’s a lambda, but the naive approach of re-scanning daily would generate a lot of noise and be infeasible for repos with thousands and thousands of images (e.g. from CI).

Alex Jurkiewicz

You can trigger a scan of an arbitrary image in ECR at any time. To get “diff” results you’d need to store the entire result for all images yourself, rescan everything daily, compare to your “last scan” result, and alert people if they differ. A bit ugly for sure
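
The compare-to-last-scan step described above could look roughly like this. It is a sketch only: it assumes you have already fetched and stored findings per image (for example through the ECR scan APIs), and the data shape, function name, digests, and CVE IDs are all made up for illustration.

```python
from typing import Dict, Set

# Assumed shape: image digest -> set of vulnerability IDs from one scan run.
Findings = Dict[str, Set[str]]


def new_findings(previous: Findings, current: Findings) -> Findings:
    """Return, per image, the vulnerability IDs present now but not in the last scan."""
    added: Findings = {}
    for image, vulns in current.items():
        fresh = vulns - previous.get(image, set())
        if fresh:                      # only report images that gained findings
            added[image] = fresh
    return added


# Placeholder data, not real scan output.
last_scan = {"sha256:aaa": {"CVE-0000-0001"}}
todays_scan = {
    "sha256:aaa": {"CVE-0000-0001", "CVE-0000-0002"},
    "sha256:bbb": set(),
}
to_alert = new_findings(last_scan, todays_scan)
```

Alerting only on `to_alert` keeps the daily rescan from re-announcing findings people have already seen, though you still have to store every image's results yourself.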

mrwacky

Last I checked, the ECR scanner can’t be tuned. For instance, you can’t say “we don’t care about this check, and here’s why”, so it’s kind of all or nothing.

Erik Osterman (Cloud Posse)

Another way of looking at this is to offload that to the escalation platform, ignoring the alerts you don’t care about and creating incidents for the ones you do.

zeid.derhally

We ended up going with Snyk.

Erik Osterman (Cloud Posse)

That way you can still track the MTTR of incidents without losing visibility into the alerts, and you don’t have to expect the generator of the alerts to support allow-listing.
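
A rough sketch of doing the filtering at the escalation layer instead of in the scanner. Everything here is hypothetical: the alert fields, the ignore list with its reasons, and the function name. A real escalation platform would express this through its own routing rules.

```python
from typing import Dict, List, Tuple

Alert = Dict[str, str]


def triage(alerts: List[Alert], ignored: Dict[str, str]) -> Tuple[List[Alert], List[Alert]]:
    """Split alerts into incidents and suppressed alerts (kept, with the reason attached)."""
    incidents: List[Alert] = []
    suppressed: List[Alert] = []
    for alert in alerts:
        reason = ignored.get(alert["id"])
        if reason is None:
            incidents.append(alert)    # becomes an incident, so MTTR can be tracked
        else:
            # Suppressed alerts stay visible, annotated with why they were ignored.
            suppressed.append({**alert, "ignored_because": reason})
    return incidents, suppressed


# Placeholder alerts and ignore list for illustration only.
alerts = [
    {"id": "CVE-0000-0001", "image": "app:1.0"},
    {"id": "CVE-0000-0002", "image": "app:1.0"},
]
ignored = {"CVE-0000-0002": "affected code path not used"}
incidents, suppressed = triage(alerts, ignored)
```

The ignore list carries the "here's why" alongside each entry, which is the part the scanner itself reportedly cannot record.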

Erik Osterman (Cloud Posse)

I see Snyk come up all the time, and a few of our customers are using them.

mrwacky

We demoed Snyk last summer. The Docker bits were nice, but the “dependabot” features had some rough edges, and there were some other usability rough edges throughout.
