#sre (2021-01)
Prometheus, Prometheus Operator, Grafana, Kubernetes
Archive: https://archive.sweetops.com/monitoring/
2021-01-05
Anyone have any experience with https://snyk.io/ ? I’m looking at security/monitoring products that we can use through the dev lifecycle.
Snyk helps software-driven businesses develop fast and stay secure. Continuously find and fix vulnerabilities for npm, Maven, NuGet, RubyGems, PyPI and more.
(Caveat: my opinion only, doesn’t represent the views of my employer, etc.) We evaluated Snyk and ended up going with Aquasec instead because it has extensive runtime controls. Snyk was fine for just scanning images, but not as comprehensive a feature set. See also Twistlock - it was probably even better than Aquasec when we compared them, but Aquasec was a much more responsive company to deal with.
I’d like to be able to scan a registry on new images, approve them, and also configure admission controllers to only allow scanned images
Do you use AWS? Upload your images to an ECR registry and use its built-in security scan. (The registry can be unused otherwise.)
We use Azure container registry.
Yes, I use ECR and its built-in security scan. But I’ve not studied the scan results of ECR images.
ECR scan is built on Clair. Results should be the same. The scan runs after the image is uploaded and available. I’d recommend using trivy in your CI and only uploading if the scan has passed.
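A minimal sketch of the "scan in CI, only upload if it passed" gate, assuming the CI job has already written a Trivy JSON report (for example with `trivy image --format json`) and that the report follows the `Results` / `Vulnerabilities` / `Severity` layout. The blocking severities and the function names `blocking_findings` and `should_push` are hypothetical choices for illustration, not part of trivy.

```python
# Assumed policy: block the push on HIGH or CRITICAL findings only.
BLOCKING = frozenset({"HIGH", "CRITICAL"})


def blocking_findings(report, blocking=BLOCKING):
    """Return sorted, de-duplicated vulnerability IDs at a blocking severity.

    `report` is the parsed JSON report as a dict. Targets with no findings
    may carry a missing or null "Vulnerabilities" key, so both are tolerated.
    """
    found = set()
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in blocking:
                found.add(vuln.get("VulnerabilityID"))
    return sorted(found)


def should_push(report, blocking=BLOCKING):
    """True when the image has no blocking findings and may be uploaded."""
    return not blocking_findings(report, blocking)
```

In a pipeline, the step would load the report with `json.load`, call `should_push`, and exit non-zero to stop the upload when it returns `False`.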
One thing with ECR scanning, though, is that it’s only at build time. What about vulnerabilities that happen after that?
I don’t think it offers continuous scanning. Haven’t researched the solutions for this. Presumably there’s a lambda, but the naive approach of re-scanning daily would generate a lot of noise and be infeasible for repos with thousands and thousands of images (e.g. from CI).
You can trigger a scan of an arbitrary image in ECR at any time. To get “diff” results you’d need to store the entire result for all images yourself, rescan everything daily, compare to your “last scan” result, and alert people if they differ. A bit ugly for sure
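The "compare to your last scan" step can be sketched roughly as below. This assumes the stored and fresh results have already been reduced to a mapping of image reference to finding IDs; fetching the findings from ECR and persisting them is left out, and `new_findings` is a hypothetical helper name.

```python
def new_findings(previous, current):
    """Return {image: [finding IDs]} for findings present now but not before.

    `previous` and `current` map an image reference to an iterable of
    finding IDs. Images that are new in `current` report all their findings;
    images with nothing new are omitted, which keeps daily alerts quiet.
    """
    diff = {}
    for image, findings in current.items():
        added = set(findings) - set(previous.get(image, ()))
        if added:
            diff[image] = sorted(added)
    return diff
```

Only a non-empty result would need to alert anyone, which addresses the noise concern of re-reporting every known finding each day.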
Last I checked ECR scanner can’t be tuned. For instance, you can’t say “we don’t care about this check, and here’s why”, so it’s kind of all or nothing
Another way of looking at this is to offload that to the escalation platform, ignoring the alerts you don’t care about and creating incidents for the ones you do.
We ended up going with snyk
That way you can still track the MTTR of incidents without losing visibility into the alerts, and you don’t have to expect the generator of the alerts to support allow-listing.
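A rough sketch of that idea, with the allow list living on the routing side instead of in the scanner. Everything here is hypothetical: the `Suppression` record, the `finding_id` key on alerts, and the `route` function are illustrative names, not the API of any particular escalation platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suppression:
    """One allow-list entry: which finding to ignore, and why."""
    finding_id: str
    reason: str


def route(alerts, suppressions):
    """Split alerts into (incidents, ignored).

    Suppressed alerts are kept, annotated with the recorded reason, so
    visibility is preserved; only the rest become incidents.
    """
    reasons = {s.finding_id: s.reason for s in suppressions}
    incidents, ignored = [], []
    for alert in alerts:
        reason = reasons.get(alert["finding_id"])
        if reason is None:
            incidents.append(alert)
        else:
            ignored.append({**alert, "ignored_because": reason})
    return incidents, ignored
```

Keeping the reason next to each entry covers the "we don’t care about this check, and here’s why" case even when the scanner itself cannot be tuned.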
I see snyk come up all the time, and a few of our customers are using them.
We demoed Snyk last summer. The Docker bits were nice, but the “dependabot” features had some rough edges, and there were other usability rough edges throughout.
2021-01-06
2021-01-07
2021-01-22
Anyone have suggestions for a good postgres monitoring system (inefficient sql, debugging iops spikes) that can run completely on prem?
I’ve used https://pganalyze.com/postgres-analyze-query-performance with great luck. Very quickly see slow queries, zoom in/out based on time, default install hits 3 different databases. I used the SaaS version but they have an on-prem flavor too.
pganalyze discovers the root cause of critical issues, helps finding missing indices, and lets you optimize slow queries. Learn more!
This one maybe: https://github.com/darold/pgbadger
A fast PostgreSQL Log Analyzer. Contribute to darold/pgbadger development by creating an account on GitHub.