11 Feb 2017
The challenge this week was to find out why authentication appeared to be broken on the automated MongoDB build.
Several weeks ago I had written a Puppet module to build a MongoDB cluster using a number of arguments,
like number of nodes, nodenames, certificates, etc.
Despite the certificates being generated from a CA (Certificate Authority), and the client presenting its certificate to log on,
this user could do anything, and .auth() was not needed.
mongo admin --ssl --sslCAFile /etc/mongodb/ssl/mongoCA.pem \
--sslPEMKeyFile /etc/mongodb/ssl/mongo1.pem \
-u mongoReadony -p mongotest --host mongo1
In the /etc/mongod.conf file, security.clusterAuthMode: x509 was set, but security.authorization was disabled.
It was assumed that specifying net.ssl.mode was enough and the security.authorization setting would be ignored.
Sorry, false assumption.
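As a minimal sketch of what the corrected /etc/mongod.conf might look like, the fragment below turns on authorization alongside the SSL and x509 settings. The security.clusterAuthMode and security.authorization keys come from the post; the requireSSL mode and the server-side file paths are assumptions for illustration (the paths are borrowed from the client command above), not the actual values from the Puppet module.

```yaml
# /etc/mongod.conf (fragment, illustrative only)
net:
  ssl:
    mode: requireSSL                          # assumed value; the post only says net.ssl.mode was set
    PEMKeyFile: /etc/mongodb/ssl/mongo1.pem   # assumed server certificate path
    CAFile: /etc/mongodb/ssl/mongoCA.pem      # assumed CA path
security:
  clusterAuthMode: x509      # member-to-member authentication, as in the post
  authorization: enabled     # without this, role-based access control is not enforced
```

With authorization enabled, a read-only user should be refused when attempting writes, which is a quick way to confirm the setting has taken effect after restarting mongod.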
(Read more...)
02 Feb 2017
Another incident of a tired admin fixing an outage only to cause a bigger outage isn't news as such; however, I have to hand it to GitLab for their open honesty about this week's incident.
After a spam storm created serious (4GB) replication lag on the firm's PostgreSQL database cluster, a very tired on-call team member trying to fix the replication deleted the data folder on the active server rather than the replicating one.
The full incident is documented here
I embrace the honesty that they have shown as this enables the whole community to learn from this and offer better services to our clients. This is very much the message in Black Box Thinking by Matthew Syed.
Matthew describes the difference between closed cultures, where mistakes are hidden, and an open, honest culture, where mistakes are shared and much learning and prevention occurs as a result.
As shown by the support on Twitter, the DevOps and cloud reliability engineers agree.
Lessons so far? Test your backups; you never know when you will really need them.
With my ethos about servers being disposable, I love destroying and rebuilding servers to prove that, in any Disaster Recovery situation, the service can be restored.
This relies on well-designed recovery processes and code, shifting the focus away from avoiding failure and towards embracing failure and reducing the mean time to recovery.
(Read more...)
30 Jan 2017
Publishing a blog with Jekyll
I had an idea. Why not publish the blog using Jekyll and host it on AWS S3?
Working with Puppet and Ruby, I’m already very familiar with gems, and getting Jekyll working on my Windows 10 workstation with RubyMine was relatively easy.
(Read more...)
14 Dec 2016
Our Christmas present has arrived! Amazon Web Services have announced general availability of the EU London Region.
This has most of the AWS services available, at a slightly higher price than Ireland but cheaper than Frankfurt.
There are other differences: only 2 Availability Zones are currently in use for London. I expect that to increase to 3 soon, along with demand and growth.
A few services available in Ireland are not yet available in the London region, so you may need to wait before deploying apps that require one of these: EFS (Elastic File System), Lambda, ElasticSearch or SES (Simple Email Service).
ECS (Elastic Container Service) is available, so we can continue to grow our Docker workloads.
(Read more...)
12 Dec 2016
This week AWS have announced availability for a new Canada region. This is the 15th region now in service and builds further anticipation for the new London region due online in the next few months.
Amazon Web Services continue to grow in capacity, not only in the individual Availability Zones but also across the world, so your servers can be as near to your users as possible and latency (network delays) is kept to a minimum.
In Montreal, Canada, the basic building block services of EC2 (Elastic Compute Cloud), S3 (Simple Storage Service) and RDS (Relational Database Service) are already online, as well as the other services we expect as standard, like SQS and SNS.
(Read more...)