AWS News – re:Invent Recap
Another re:Invent is in the books, and there were dozens of major announcements—too many to summarize effectively in a single post. This recap focuses on the ones most relevant to developers and engineers, which means IoT, HPC and machine learning don’t get a lot of coverage here. Some of the more significant announcements like Control Tower and Security Hub will be expanded upon much further in future blog posts, once we’ve finished wrapping our arms around them.
AWS Control Tower was one of the most exciting announcements of the week. Control Tower gives you a single interface to manage governance and compliance across all of your AWS accounts. Think of this as your way to enforce sameness across each of your accounts, from single sign-on to AWS Config rules and IAM password policies, and a whole bunch of other things.
If you’re running production workloads in AWS, you should have at least 2 and ideally 4 accounts no matter how big or small your project is. Segregating operational, production and dev/test across accounts adds minimal cost and contains the damage in the event of a configuration mistake or security breach in any environment. Control Tower takes away the burdens of ensuring your accounts are all set up as you expect, and it ensures that your policies are actually enforced. This gives you the freedom to grant wider access and empower developers.
Security Hub was a big announcement, with a lot of precious keynote minutes dedicated to it. Security Hub aggregates GuardDuty, CloudWatch, Config and CloudTrail alerts from multiple accounts into a single viewpoint, allowing you to both view and route alerts centrally. As you’d expect from AWS, it supports a variety of inputs and outputs, with a lot of flexibility.
I’m a little on the fence about Security Hub right now. Ideally, each organization should have a single system where security events are aggregated, correlated and triaged. For many security teams, that system is a SIEM. But, SIEMs are woefully bad at monitoring and alerting on event-driven systems like a modern cloud environment. Their comfort zone is in extracting data from logs, and SIEM cloud add-on modules have done little to improve the situation. Security Hub does a much better job at this, but it doesn’t solve the problems a SIEM solves well—leaving you with two “hubs” instead of one. The future features and open source tooling that build up around Security Hub will greatly determine whether this can improve effectiveness rather than just adding one more “single” pane of glass to deal with.
Outposts are AWS’ long-delayed answer to Azure Stack. They look fun, but are probably not useful for you. Outposts are pitched as a way to bridge your migration to AWS, bringing features like EC2 and RDS into your data center. But Outposts are more likely to be used by teams that need to run certain workloads on-prem and want to manage them using the AWS APIs that drive their overall automation. Neat, but the price tag will make it difficult to take advantage of.
AWS finally has an answer to the “you can’t prove you’re compliant with our licensing terms” argument that Microsoft, Oracle and SAP license compliance auditors love to use. License Manager can not only report on usage (including based on underlying hardware capabilities, a common licensing requirement) but also enforce license restrictions that you configure. For example, you can configure rules to prevent your software from starting on a server that has more sockets than you’re licensed for. This all integrates nicely with other AWS services like Service Catalog.
This is useful for anyone running enterprise software in the cloud. It will likely also prove to be useful for other use cases, as the rules could allow you to restrict the number of instances of a computationally expensive resource a dev team can stand up.
AWS has a few changes to the Marketplace that will be valuable to larger teams. The first is a private marketplace, allowing you to effectively create a private App Store in AWS, spanning multiple accounts in an organization. The second is the rollout of a new container registry. While the world may not have needed another public container registry, you should expect this to become considerably more useful over time, both in terms of available containers and ease of utilization in the various container-based AWS products.
Similar in principle to the Trusted Advisor tool, but much more flexible, the Well-Architected Tool is a best practices analyzer for the cloud. Using a combination of question-asking and resource scanning, the tool will create a detailed report of how compliant your application is with the AWS Well-Architected Framework.
If you’re not familiar with this framework, you need to be. Apps need to be built differently to run reliably, cost-effectively and securely in AWS, and the framework lays those differences out clearly. The tool gives you an opportunity to quickly assess your application and get notified when it drifts out of compliance.
Compute / Serverless Announcements
Lambda Custom Runtimes and Layers
The changes to Lambda might be my favorite announcement. They’re cool for a lot of reasons. The first change is custom runtimes, allowing you to build your own execution engine. Immediately available options include Ruby, C++ and Rust, with Elixir, PHP and others on the way. You can also write your own custom runtimes, though with the number of mainstream languages currently under development, you probably won’t need to roll your own.
Related to custom runtimes are layers, which allow you to separate out common services and version them independently of your application. For example, if you have a set of custom libraries you deploy with each and every Lambda, you can make that a layer and link it into each of your Lambdas. This allows you to warm your functions faster, roll out changes more quickly, and maintain better awareness of your dependencies, particularly for Lambdas that change infrequently. There are also already a number of layers on GitHub from AWS and others, which will speed up the process of creating new functions.
A1 Arm Instances
AWS released its new Graviton Arm processor, which offers extremely low-cost EC2 instances. While it may not seem immediately useful, you might want to give it a close look. Many languages like Python, PHP and Java can run natively on both Intel and Arm architectures. For small apps that need to run on instances, it is worth taking a look at this new instance type.
VPCs don’t support transitive routing—routing your traffic from VPC A to C via VPC B. As a result, you’ve either needed to have tons of peering connections and complex routing tables or go down the transit VPC rabbit hole. Commercial providers like Cisco offered solutions to simplify the process of building a Transit VPC, and they charged handsomely for it. With the new Transit Gateway, it is now possible to easily configure hub-and-spoke routing across multiple accounts, data centers and offices via both Direct Connect and VPN.
Transit Gateway won’t put those products out of business completely, but it does nicely handle the most common use case, which shouldn’t have required a complex and expensive service.
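To put a number on the "tons of peering connections" problem, here is a minimal sketch comparing a full mesh of peerings against a hub-and-spoke layout. It is plain arithmetic rather than an AWS API call, and the function names are made up for illustration.

```python
def full_mesh_peerings(vpc_count: int) -> int:
    """Peering connections needed so every VPC can reach every other VPC directly."""
    return vpc_count * (vpc_count - 1) // 2


def hub_and_spoke_attachments(vpc_count: int) -> int:
    """Attachments needed when each VPC connects once to a central hub."""
    return vpc_count


# Compare the two layouts as the number of VPCs grows.
for vpc_count in (3, 10, 50):
    print(vpc_count, full_mesh_peerings(vpc_count), hub_and_spoke_attachments(vpc_count))
```

The mesh grows quadratically while the hub grows linearly, which is why the peering approach gets painful long before you reach dozens of VPCs.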
AWS Global Accelerator
Accelerator lets you easily distribute your multi-region application to customers, with significantly more control than the basic dynamic routing policies that Route 53 gives you. Accelerator fronts your ELBs and other resources with Anycast IPs that get customer traffic on AWS’s global network as quickly as possible and routes it to the optimal region based on complex routing policies that you can define.
App Mesh is a new service to manage and provide visibility into microservices platforms running on EC2, ECS or EKS. As companies adopt microservices architectures, monitoring, debugging, traceability and even deployment become much more difficult, requiring the microservices themselves to adopt cumbersome frameworks. Service meshes are a different approach that creates an internal proxy between services, which provides entry points for observation and telemetry and also allows for routing policies to be defined.
There are several service meshes available now, commercial and open source. One that is gaining a lot of popularity right now is Istio. While this announcement steps on the toes of some of those solutions, I expect that they’ll continue to thrive even with the release of App Mesh. AWS is likely to focus on ease of deployment for basic use cases, solving the low value problem that really shouldn’t be a problem in the first place. Projects like Istio will do well as they continue to offer a more robust solution.
Though not the most brilliantly named service, Cloud Map provides a service discovery framework similar to Consul. Not nearly as featureful as Consul, it is also not nearly as difficult to work with. Cloud Map is a quick win for service discovery in a basic multiservices design.
DynamoDB got two big updates from AWS, transactions and on-demand utilization. Both are unique in the non-relational database world. Transactions will significantly expand the use cases for DynamoDB, allowing true multi-table ACID transactions.
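As a rough sketch of what a multi-table transaction looks like, the snippet below builds a request in the shape the TransactWriteItems API expects. The table names (Accounts, Transfers), attribute names and the helper function are all hypothetical. The code only builds the dictionary; with boto3 you would pass it to the DynamoDB client as `client.transact_write_items(**request)`.

```python
def build_transfer_request(transfer_id: str, from_account: str, to_account: str, amount: int) -> dict:
    """Build a TransactWriteItems request that moves money and records the transfer.

    All three writes succeed or fail together. Table and attribute names are
    hypothetical and only serve the illustration.
    """
    amount_value = {"N": str(amount)}
    return {
        "TransactItems": [
            {
                # Debit the source account, but only if it has enough balance.
                "Update": {
                    "TableName": "Accounts",
                    "Key": {"AccountId": {"S": from_account}},
                    "UpdateExpression": "SET Balance = Balance - :amount",
                    "ConditionExpression": "Balance >= :amount",
                    "ExpressionAttributeValues": {":amount": amount_value},
                }
            },
            {
                # Credit the destination account.
                "Update": {
                    "TableName": "Accounts",
                    "Key": {"AccountId": {"S": to_account}},
                    "UpdateExpression": "SET Balance = Balance + :amount",
                    "ExpressionAttributeValues": {":amount": amount_value},
                }
            },
            {
                # Record the transfer in a second table, refusing duplicates.
                "Put": {
                    "TableName": "Transfers",
                    "Item": {"TransferId": {"S": transfer_id}, "Amount": amount_value},
                    "ConditionExpression": "attribute_not_exists(TransferId)",
                }
            },
        ]
    }


request = build_transfer_request("t-001", "acct-a", "acct-b", 25)
print(len(request["TransactItems"]))
```

If any condition fails, such as an insufficient balance or a duplicate transfer ID, none of the writes are applied.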
On-demand also simplifies the scaling and pricing for DynamoDB, allowing for per-request pricing. This is useful for dynamic or difficult-to-predict workloads. However, as with most AWS services that offer multiple pricing models, you will pay a premium for this flexibility. If your workload is predictable over time, you should still use the provisioned model to minimize cost.
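A quick way to sanity check which model fits is to compare the two monthly bills. The sketch below does that for writes only. The prices in the example are placeholders, not quoted AWS rates, so plug in the current numbers for your region. The function names are made up for illustration.

```python
def on_demand_monthly_cost(requests_per_month: int, price_per_million_requests: float) -> float:
    """Cost when you pay per request."""
    return requests_per_month / 1_000_000 * price_per_million_requests


def provisioned_monthly_cost(capacity_units: int, price_per_unit_hour: float, hours_per_month: int = 730) -> float:
    """Cost when you pay for reserved capacity every hour, used or not."""
    return capacity_units * price_per_unit_hour * hours_per_month


def cheaper_model(on_demand_cost: float, provisioned_cost: float) -> str:
    """Name the cheaper pricing model, preferring provisioned on a tie."""
    return "on-demand" if on_demand_cost < provisioned_cost else "provisioned"


# Placeholder prices (assumptions): 1.25 per million writes, 0.00065 per capacity unit per hour.
spiky = on_demand_monthly_cost(10_000_000, 1.25)
steady = provisioned_monthly_cost(100, 0.00065)
print(cheaper_model(spiky, steady))
```

The comparison only holds if the provisioned capacity is sized for the peak you actually need, so a bursty workload can favor on-demand even when its average request rate looks low.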
Quantum Ledger, or QLDB, is AWS’ answer to the most common blockchain use case: a cryptographically secured ledger. Ledgers are different from traditional database tables in that existing entries cannot be modified over time. When properly designed, a ledger allows a company to assert that data, such as financial transactions, was not tampered with. Reversals or corrections need to come in as subsequent entries. While blockchain ledgers go a step further by making the cryptographic checking distributed across entities, few organizations actually need that. When used with proper encryption key handling and separation of duties controls, organizations can achieve extremely high levels of financial integrity without the pain and drama that come with blockchain.
AWS gets extra points here for taking away one of the most significant arguments for Blockchain. The best part is that QLDB is not new tech—AWS has been using it internally for years.
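To make the "cryptographically secured ledger" idea concrete, here is a minimal sketch of an append-only, hash-chained ledger. This is a generic illustration of the technique and not how QLDB is implemented internally. The entry layout and function names are assumptions.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(previous_hash: str, payload: dict) -> str:
    """Hash the payload together with the previous entry's hash."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(previous_hash.encode("utf-8") + body).hexdigest()


def append_entry(ledger: list, payload: dict) -> None:
    """Append a new entry that is chained to the one before it."""
    previous_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    ledger.append({
        "payload": payload,
        "previous_hash": previous_hash,
        "hash": entry_hash(previous_hash, payload),
    })


def verify(ledger: list) -> bool:
    """Recompute every hash and confirm the chain is unbroken."""
    previous_hash = GENESIS_HASH
    for entry in ledger:
        if entry["previous_hash"] != previous_hash:
            return False
        if entry["hash"] != entry_hash(previous_hash, entry["payload"]):
            return False
        previous_hash = entry["hash"]
    return True


ledger = []
append_entry(ledger, {"account": "A-100", "amount": 250})
# A correction arrives as a new entry instead of an edit to the old one.
append_entry(ledger, {"account": "A-100", "amount": -250, "note": "reversal of entry 0"})
print(verify(ledger))  # True

ledger[0]["payload"]["amount"] = 9999  # tamper with history
print(verify(ledger))  # False
```

Because each hash covers the previous one, changing any historical entry invalidates everything after it, which is what lets you assert the data was not altered.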
Not that we needed yet another time series database with Graphite, InfluxDB, Prometheus and others crowding and confusing this space, but AWS has released their own anyway. While it doesn’t help simplify this unnecessarily confusing space, it does have some benefits. Most people don’t know how to properly love and care for their TSDB, leaving it on autopilot with poor (or no!) backups, little performance optimization, and an ineffective data retention strategy. Timestream is a fully managed system, giving casual TSDB users a safe, cheap way to get started.
Of course, if you want to hook this to Grafana or other tools, expect a slow and bumpy process. Community-supplied dashboards tend to be highly specific to the syntax of the TSDB, difficult to port from one to another. Many also rely on constructs such as group by and tagging queries that may not be supported by Timestream right away. Early on, it will be most beneficial to people building custom applications that require time series capabilities. Expect it to take 6-12 months before it is a useful replacement for an InfluxDB/Grafana monitoring stack.
Aurora Global Databases
Maybe not too useful for the average AWS user, but Aurora Global Database is super cool, letting you run a multi-region database cluster with asynchronous replication. It isn’t multi-master, so your standby regions will serve read traffic only, but the design allows for reads with a <1s average replication lag. This also allows failover to occur within a minute, which is a feature that may be relevant to smaller teams with very narrow recovery objectives.
I would expect this to result in other trickle-down functionality that makes Aurora better for all use cases over the next year.
S3 saw a number of major enhancements, unsurprising given the way companies are using S3 as part of their big data applications, both as a data lake and as a warm storage repository to reduce EMR costs. S3 is a unique differentiator for AWS that companies like Cloudera and Hortonworks can’t easily replicate. There are a number of S3 enhancements, but we are focusing on the three that we think are most immediately relevant.
Beyond a certain size, a bucket with no catalog becomes unmanageable. You can no longer browse the contents of the bucket in any meaningful way. This makes it impossible to find what you’re looking for if you don’t already know it’s there. And it makes correcting ACLs (or identifying bad ACLs), applying tags, and other grooming operations impossible. Even the mere act of deleting a large bucket can take anywhere from dozens of hours to months. Batch Operations is a new way to apply basic object operations at scale on buckets of any size, but it was designed with buckets containing billions of objects in mind.
There’s no getting around it. If you are dealing with large buckets, basic things that were previously impossible are now possible. This is a big deal.
S3 Intelligent Tiering (S3-INT)
S3 can now automatically move your data to the Infrequent Access storage class based on object access. Generally speaking, objects will automatically move into the IA class after 30 days of inactivity, moving back to standard once they’re accessed.
Intelligent Tiering is determined by the object, not by the bucket. This makes it possible to introduce tiered objects to a bucket after the fact. Using new Batch Operations, you can also introduce tiering to existing buckets.
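The per-object behavior described above can be modeled in a few lines. This is a toy model of the rule as stated here (30 days of inactivity, back to standard on access), not S3's actual implementation, and the names are made up for illustration.

```python
from datetime import date, timedelta

INACTIVITY_THRESHOLD = timedelta(days=30)  # the 30-day figure from the text above


def storage_tier(last_accessed: date, today: date) -> str:
    """Return the tier an object would sit in, based only on its own last access."""
    if today - last_accessed >= INACTIVITY_THRESHOLD:
        return "infrequent-access"
    return "standard"


today = date(2019, 1, 31)
print(storage_tier(date(2019, 1, 25), today))   # recently read object
print(storage_tier(date(2018, 12, 1), today))   # object untouched for two months
```

Each object is evaluated on its own access history, so hot and cold objects can live side by side in the same bucket.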
When Glacier was first announced, it was a cheap way to park content you didn’t need anymore but had to keep. People found all sorts of interesting creative uses for it, and AWS rapidly moved to improve the restore times to fuel demand. At this point, a Glacier restore often takes as little as 15 minutes, and that’s without paying for their expedited restore service. But it also means the slow retrieval service it started as no longer exists. Deep Glacier brings that back. And it’s cheap. Super cheap. At $1/TB/month, it is about 4x cheaper than Glacier. And Glacier wasn’t exactly expensive.
The use cases for this are considerable, but one of the most obvious is to manage your backup retention strategy more cost effectively. While you certainly can’t keep your most recent backups in a storage class that takes 12 hours to begin restoring, you can put your older backups there. S3 lifecycle policies that put nice-to-have-but-not-strictly-needed files into Glacier should be changed to use Deep Glacier. Not right this second, since it is not yet in preview. But it should be GA this summer.
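For reference, a lifecycle rule along those lines might look like the sketch below. It only builds the configuration dictionary. With boto3 you would pass it to `put_bucket_lifecycle_configuration` as the `LifecycleConfiguration` argument. The `backups/` prefix, the 90-day cutoff and the helper name are assumptions for illustration, and `DEEP_ARCHIVE` is the storage class value the S3 API uses for this tier.

```python
def deep_archive_rule(prefix: str, days_until_archive: int) -> dict:
    """Build one lifecycle rule that moves objects under a prefix to the deep archive tier."""
    return {
        "ID": f"archive-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            # Objects older than the cutoff are moved to the cheapest, slowest tier.
            {"Days": days_until_archive, "StorageClass": "DEEP_ARCHIVE"},
        ],
    }


# Hypothetical policy: backups older than 90 days go to the deep archive tier.
lifecycle_configuration = {"Rules": [deep_archive_rule("backups/", 90)]}
print(lifecycle_configuration["Rules"][0]["ID"])
```

Keep the cutoff comfortably longer than the window in which you might realistically need a fast restore.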