3. December 2017 18:42
by Aaron Medacco
0 Comments

AWS re:Invent 2017 - Day 2 Experience

The following is my Day 2 re:Invent 2017 experience. Missed Day 1? Check it out here.

I started off Day 2 by waking up around 11:30am. Needed the rest from the all-nighter drive Monday morning. Cleaning lady actually woke me up when she was making her rounds on the 4th floor. This meant that I missed the breakout session I reserved, Deploying Business Analytics at Enterprise Scale with Amazon QuickSight (ABD311). I'm currently in the process of recording a Pluralsight course on Amazon QuickSight, so I felt this information could be helpful as I wrap up that course. Guess I'll have to check it out later once the sessions are uploaded to YouTube. Just another reason to never drive through the night before re:Invent again.

After getting ready, I exposed my nerd skin to sunlight and walked over to the Venetian. This is where I'd be the majority of the day. I kind of lucked out because all of my sessions for the day besides the one I slept through were in the same hotel, and pretty much back-to-back, so I didn't have to get creative with my downtime.

First session of the day was Deep Dive into the New Network Load Balancer (NET304). I had been curious about the Network Load Balancer since its recent announcement, but never had a use case or a reason to go and implement one myself.

I have to admit, I didn't know it could route to IP addresses.

Should have picked a better seat.

Putting it all together.

The takeaway I got was that the NLB is essentially your go-to option for TCP traffic at scale, but for web applications you'd still mostly be using the Application Load Balancer or the Classic Load Balancer. The fact that it's 25% cheaper than ALB seems significant, and it uses the same kinds of components as ALB: targets, target groups, and listeners. Additionally, it supports routing to ECS, EC2, and external IPs, and allows a static IP per availability zone.

As I was walking out of the session, there was a massive line hugging the wall of the hall and around the corner for the next session, which I had a reserved seat for (thank god). That session was Running Lean Architectures: How to Optimize for Cost Efficiency (ARC303). Nerds would have to squeeze in and cuddle for this one; this session was full.

Before the madness.

Wasn't totally full, but pretty full.

People still filing in.

Obvious, but relevant slide.

I had some mixed feelings about this session, but thought it was overall solid. On one hand, much of the information was definitely important for AWS users to save money on their monthly bill, but at the same time, I felt a lot of it was fairly obvious to anyone using AWS. For instance, I have to imagine everybody knows they should be using Reserved Instances. I feel like any potential or current user of AWS would have read about pricing thoroughly before even considering moving to Amazon Web Services as a platform, but perhaps I'm biased. There were a fair number of managers in the session and at re:Invent in general, so maybe they're not aware of obvious ways to save money. 

Aside from covering Spot and Reserved Instance use cases, there was some time covering Convertible Reserved Instances, which are still fairly new. I did enjoy the tips and tricks given on reducing Lambda costs by looking for ways to cut down on function idle time, using Step Functions instead of manual sleep calls, and migrating smaller applications into ECS instead of running each application on its own instance. The Lambda example highlighted that many customer functions involve several calls to APIs which occur sequentially, where each call waits on the prior one to finish. This can rack up billed run time even though Lambda isn't actually performing work during those waiting periods. The trick they suggested was essentially to shotgun the requests all at once, instead of one-by-one, but as I thought about it, that would only work if those API calls didn't depend on the result of the prior one. When they brought up Step Functions it was kind of obvious you could just use that service if that was the case, though.
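To make the Lambda trick concrete, here's a minimal Python sketch of the idea. It is not code from the session. `call_api` is a made-up stand-in that just sleeps to imitate a slow API, the names and delays in `CALLS` are invented, and the whole thing assumes the calls are independent of each other.

```python
import asyncio
import time


async def call_api(name: str, delay: float) -> str:
    """Hypothetical stand-in for a slow API call; it only sleeps."""
    await asyncio.sleep(delay)
    return f"{name}: ok"


async def sequential(calls):
    # Each call waits for the previous one, so run time is roughly the sum of delays.
    return [await call_api(name, delay) for name, delay in calls]


async def concurrent(calls):
    # All calls start at once, so run time is roughly the slowest single delay.
    return await asyncio.gather(*(call_api(name, delay) for name, delay in calls))


def timed(coro_fn, calls):
    """Run one of the strategies and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(calls))
    return results, time.perf_counter() - start


# Illustrative, made-up workload: three independent calls of 0.2 seconds each.
CALLS = [("users", 0.2), ("orders", 0.2), ("inventory", 0.2)]
```

Calling `timed(sequential, CALLS)` and `timed(concurrent, CALLS)` returns the same results, but the concurrent version finishes in about a third of the time, which is the billed duration you'd be trying to cut. If one call needs the output of another, this doesn't apply and you're back to chaining them.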

The presenters did a great job, and kept highlighting the need to move to a "cattle" mentality instead of "pet" mentality when thinking about your cloud infrastructure. Essentially, they encouraged moving away from manual pushing, RDPing, naming instances things like "Smeagol" and the like. Honestly, a lot of no-brainers and elementary information but still a good session.

Had some downtime after to go get something to eat. Grabbed the Baked Rigatoni from the Grand Lux Cafe in the Venetian. The woman who directed me to it probably thought I was crazy. I was in a bit of a rush, and basically attacked her with, "OMG, where is food!? Help me!"

Overall, 7/10. Wasn't very expensive and now that I think about it, my first actual meal (some cold pizza slices don't count) since getting to Vegas. 

Next up was ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT305) inside the Venetian Theatre. I was excited about this session since I haven't done much of anything with ElastiCache in practice, but I know some projects running on AWS that leverage it heavily. 

About 10 minutes before getting mind-pwned.

Random guy playing some Hearthstone while waiting for the session to begin.

Sick seat.

Definitely felt a little bit out of my depth on this one. I'm not familiar with Redis outside of just knowing what it does, so I was clueless during some of the session. I didn't take notes, but I recall a lot of talk about re-sharding clusters, the old backup and recovery method vs. the new online managed re-sharding available, pros of enabling cluster mode (was clueless, made sense at the time, but couldn't explain it to someone else), security, and best practices. My favorite part of the session involved use cases and how ElastiCache can benefit them: IoT, gaming apps, chat apps, rate limiting services, big data, geolocation or recommendation apps, and Ad Tech. Again, I was out of my depth, but I'll be taking a closer look at this service during 2018 to fix that.

After some more Frogger in the hallways, headed over to the Expo, grabbed some swag and walked around all the vendor booths. Food and drinks were provided by AWS and the place was a lot bigger than I expected. There's a similar event occurring at the Aria (the Quad), which I'll check out later in the week. 

Wall in front of the Expo by Registration.

There were AWS experts and team members involved with just about everything AWS scattered around to answer attendee questions, which I thought was freaking awesome. Valuable face-time with the actual people that work on and know about the stuff being developed.

Favorite section of the Venetian Expo.

More madness.

Talked to the guys at the new PrivateLink booth to ask if QuickSight was getting an "endpoint" or a method for connecting private AWS databases soon. Ended up getting the definitive answer from the Analytics booth, which had Quinton Alsbury at it. Once I saw him there, I'm like "Oh here we go, this guy is the guy!" Apparently, the feature's been recently put in public preview, which I somehow missed. Visited a few other AWS booths like Trusted Advisor and the Partner Network one, and then walked around all the vendor booths for a bit.

Unfortunately, I didn't have much time to chit chat with a lot of them since the Expo was closing soon. I'll have to do so more at the Quad. Walked over to the interactive search map they had towards the sides of the room to look for a certain company I thought might be there. Sure enough, found something familiar:

A wild technology learning company appears.

Spoke with Dan Anderegg, who's the Curriculum Manager for AWS within Pluralsight. After some talk about Pluralsight path development, I finished my beer and got out, only to find I had actually stayed too long and was already super late to my final session for the day, Deep Dive on Amazon Elastic Block Store (Amazon EBS) (STG306). Did I mention how hard it is to try to do everything you want to at re:Invent?

Ended up walking home and wanting to just chill out, which is how this post is getting done. 

Cheers!

21. March 2017 00:28
by Aaron Medacco
0 Comments

Moving Load From Your Master Database: ElastiCache or RDS Read Replica?

You have a database managed through the AWS RDS service. Your application's a massive success and your database's workload is now heavy enough that users are experiencing long response times. You decide that instead of scaling vertically by upgrading the instance type of your RDS database, you'd like to explore implementing ElastiCache or an RDS read replica into your architecture to remove some of the load from your master database.

But which one should you choose?

Like always, it depends.

RDS read replicas and ElastiCache nodes both enhance the performance of your application by handling requests for data instead of the master database. However, which one you choose will depend on your application's requirements. Before I dive too deep into how these requirements will shape your decision, let's talk about what we are comparing first.

ElastiCache Or Read Replica

Note: For those already familiar with ElastiCache and RDS features, feel free to skip down.

ElastiCache Clusters

ElastiCache is a managed service provided by AWS that allows you to provision in-memory data stores (caches) that let your applications fetch information with blazing speed. When you use ElastiCache, you create a cluster of nodes, which are blocks of network-attached RAM. This cluster can then sit in between your application tier and your data tier. When requests that require data come in, they are sent to the cluster first. If the information exists in the cache and the cluster can service the request, it is returned to the requester without requiring any database engine steps or disk reads. If the information does not exist in the cache, it must be fetched from the database like usual. Obviously, there is no contest in performance between fetching data from memory vs. disk, which is why the ElastiCache service can really give your applications some wheels while also reducing the load on your database.
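The request flow described above is often called the cache-aside pattern. Here's a minimal Python sketch of it under some loud assumptions: a plain dict stands in for the ElastiCache cluster, and `FakeDatabase` is a hypothetical class that stands in for the RDS database and just counts how often it gets hit. Neither is a real AWS client.

```python
class FakeDatabase:
    """Hypothetical stand-in for the database; counts how often it is queried."""

    def __init__(self, rows):
        self.rows = rows
        self.reads = 0

    def query(self, key):
        self.reads += 1
        return self.rows.get(key)


class CacheAside:
    """Check the cache first and only fall through to the database on a miss."""

    def __init__(self, db):
        self.db = db
        self.cache = {}  # assumption: a dict plays the role of the cache cluster

    def get(self, key):
        if key in self.cache:
            # Cache hit: answered from memory, no database work at all.
            return self.cache[key]
        # Cache miss: fetch from the database like usual.
        value = self.db.query(key)
        if value is not None:
            # Store the result so the next request for this key is a hit.
            self.cache[key] = value
        return value
```

Asking for the same key twice only touches the database once, which is exactly the load you're trying to take off the master.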

RDS Read Replicas

RDS read replicas offer an alternative to vertical database scaling by allowing you to use additional database instances to serve read-only requests. Read replicas are essentially copies of your master database where write operations are prohibited except through asynchronous calls made from the master after the master has completed a write. This means that read replicas may return slightly stale data when serving requests, but will eventually catch up as write propagations invoked from the master complete. Read replicas have additional benefits as well. Since they can be promoted to serve as the master database should the master fail, you can increase your data's availability. The database engine you choose also determines available features. For instance, databases utilizing the MySQL database engine can take advantage of custom read replica indexes which only apply to the replicas. Want to take advantage of covering indexes for expensive queries without forcing your master database to maintain additional writes? With this feature, you'd be able to.
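If the "slightly stale" part is hard to picture, here's a toy Python model of asynchronous replication. It's only a sketch: `ReplicatedDatabase`, `Node`, and the explicit `replicate()` step are all invented for illustration, and real RDS replication happens on its own without you calling anything.

```python
import itertools


class Node:
    """One database instance holding a simple key/value data set."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ReplicatedDatabase:
    """Toy model: writes go to the master, reads rotate across the replicas."""

    def __init__(self, replica_count=2):
        self.master = Node("master")
        self.replicas = [Node(f"replica-{i}") for i in range(replica_count)]
        self._pending = []  # writes the master has completed but not yet propagated
        self._next_replica = itertools.cycle(self.replicas)

    def write(self, key, value):
        # The master completes the write first; replicas hear about it later.
        self.master.data[key] = value
        self._pending.append((key, value))

    def read(self, key):
        # Reads never touch the master, which is the whole point.
        return next(self._next_replica).data.get(key)

    def replicate(self):
        # Assumption: this stands in for the asynchronous propagation step.
        for key, value in self._pending:
            for replica in self.replicas:
                replica.data[key] = value
        self._pending.clear()
```

A read issued right after a write, but before `replicate()` runs, returns the old value. That window is the replica lag your application has to be able to tolerate.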

Great. So which is better?

Clearly, both of these services can reduce your master database's workload, while also boosting performance. In order to choose which service makes most sense for your situation, you'll need to answer these kinds of questions:

Can my application tolerate stale data? How stale? 5 minutes? 5 hours?

If you need almost exactly current data, you can't leave results sitting in ElastiCache nodes for long periods of time, so read replicas are likely the better option. Read replicas will lag behind the master database slightly, but typically only by seconds. On the other hand, if you can tolerate data that is more stale, ElastiCache will outperform read replicas performance-wise (if the data exists in the cache) while also preventing requests from hitting the master database.
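One way to think about "how stale?" is as a time-to-live (TTL) on each cached item. Below is a small Python sketch of that idea. `TTLCache` is a made-up class for illustration, not an ElastiCache API, and the clock is injectable only so the behavior is easy to test.

```python
import time


class TTLCache:
    """Cache whose entries expire after a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = {}  # key -> (value, expires_at)

    def set(self, key, value):
        self._items[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            # Too old to trust: drop it so the caller goes back to the database.
            del self._items[key]
            return None
        return value
```

With `TTLCache(ttl_seconds=300)`, a cached result can be up to 5 minutes out of date. The longer the TTL, the more requests the cache absorbs and the staler the data your users might see.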

Are the queries generated by my application static? That is, are the same queries run over and over all day? Are they dynamically constructed, using items like GETDATE()?

If the queries being run by your application are ever-changing, your in-memory cache won't be very useful since it will have to continue to update its data store to serve requests it otherwise cannot satisfy. Remember, the cache can only act on queries it recognizes. Not only does this not prevent calls to the database, but it can actually degrade performance because you are effectively maintaining a useless middle man between your application and data tiers. However, ElastiCache clusters will shine over read replicas in cases where queries do not change and request the same data sets over and over.
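Here's a quick Python sketch of why ever-changing queries hurt. It assumes the cache is keyed on the exact query text, which is a common approach but not the only one, and the table names and queries are invented for the example.

```python
def hit_rate(queries):
    """Fraction of queries answered from a cache keyed on the exact query text."""
    seen = set()
    hits = 0
    for sql in queries:
        if sql in seen:
            hits += 1
        else:
            # A miss: the result would be fetched from the database and cached.
            seen.add(sql)
    return hits / len(queries) if queries else 0.0


# Hypothetical workload 1: the same query run over and over.
static_queries = ["SELECT * FROM products WHERE featured = 1"] * 100

# Hypothetical workload 2: each query embeds a different timestamp,
# the way a dynamically constructed query would.
dynamic_queries = [
    f"SELECT * FROM orders WHERE created_at > '2017-03-21 00:00:{second:02d}'"
    for second in range(60)
]
```

`hit_rate(static_queries)` comes out to 0.99, while `hit_rate(dynamic_queries)` is 0.0 because no two queries are ever identical. In the second case every request pays for the cache lookup and the database call.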

How much data is my application asking for?

The volume of data your ElastiCache clusters can store will be limited by the amount of memory you allocate to the nodes in your cluster. If your queries return huge result sets, your cache is going to be occupied very quickly. This can create a scenario where queries constantly compete for the available memory by evicting what's already there. This won't be very helpful, especially if you aren't willing to dedicate (and pay for) additional memory. Alternatively, read replicas will be able to serve the request regardless of the returned data size, won't incur the penalty of a wasted middle man trip, and will still prevent the request from hitting the master.
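The memory competition problem is easy to reproduce in a few lines of Python. This sketch assumes least-recently-used eviction and measures capacity in number of entries rather than bytes, both simplifications on my part. `LRUCache` and `replay` are hypothetical helpers, not part of any AWS SDK.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, load):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = load(key)  # fall back to the database
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry
        return value


def replay(capacity, keys, rounds=10):
    """Request the same keys in order for several rounds; return (hits, misses)."""
    cache = LRUCache(capacity)
    for _ in range(rounds):
        for key in keys:
            cache.get(key, load=lambda k: f"rows for {k}")
    return cache.hits, cache.misses
```

With room for all three result sets, `replay(3, ["a", "b", "c"])` gives 27 hits and 3 misses. Shrink the cache by one entry and `replay(2, ["a", "b", "c"])` gives 0 hits and 30 misses, because each new result pushes out the one that's about to be asked for next.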

What would I do?

My approach would be to consider how effectively you'll be able to use an in-memory cache. If, given all you know about your application, you decide the cache will be able to catch a large portion of the read requests coming in, go with the ElastiCache cluster. If you're right, you should notice a difference right away. If you're wrong, you can always fall back to implementing read replicas. I've avoided pricing in this post, which is obviously important when making architectural decisions. ElastiCache pricing and RDS pricing are available on Amazon Web Services' site. Readers will need to do their own analysis of the kinds of instances and node types they'd provision and how their costs compare to make a decision.
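For a rough feel of what "catching a large portion of the read requests" buys you, here's some back-of-the-envelope Python. The formula assumes every read checks the cache first and that a miss pays for both the cache lookup and the database query. The function name and any latency numbers you plug in are your own assumptions, not measurements from AWS.

```python
def expected_read_latency_ms(hit_rate, cache_ms, db_ms):
    """Average read latency when every request checks the cache first."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Hits cost one cache lookup; misses cost the lookup plus the database query.
    return hit_rate * cache_ms + (1.0 - hit_rate) * (cache_ms + db_ms)
```

Using made-up numbers of 1 ms for the cache and 20 ms for the database, a 50% hit rate averages 11 ms per read, and a 0% hit rate averages 21 ms, which is worse than having no cache at all. That's the useless middle man scenario in numbers.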

If anyone knows of some additional considerations I should include, leave a comment or reach out to me via e-mail: acmedacco@gmail.com. Shout out to the last guy who caught my miscalculation of pricing in the Infinite Lambda Loop post.

Cheers!

Copyright © 2016-2017 Aaron Medacco