Things I learned today (04/08/2021)

AWS Lambda functions can now mount an Amazon Elastic File System (Amazon EFS)

AWS announcement

What is AWS Lambda?

AWS Lambda is a FaaS (function as a service) offering. It is an event-driven, serverless computing platform that integrates with many other AWS services. For example, you can trigger a Lambda function from API Gateway, an S3 event notification, etc.

AWS Lambda runtimes include Python, Node.js, Ruby, Java, Go and C#.

It is very useful and cost-effective when you have infrequent and relatively short executions, since you don’t need to provision any infrastructure. Lambda has its limitations, mainly its running time, which is capped at 15 minutes. Storage was also a limitation up to this announcement, so this is a breakthrough.

What is Amazon EFS?

Amazon Elastic File System (EFS) is cost-optimized file storage (no setup costs, you just pay for what you use) that can automatically scale from gigabytes to petabytes of data without needing to provision storage. It also allows multiple instances to connect to it simultaneously.

EFS is accessible from EC2 instances, ECS containers, EKS, AWS Fargate and AWS Lambda.

Compared to EBS, EFS is usually more expensive. However, the use case is different. EFS is an NFS file system (which means that it is not supported on Windows instances), while EBS is block storage and is usually not multi-attached (there are some EC2 + EBS configurations which allow multi-attach, but that’s not the main use case).

Why does it matter?

By default, Lambda has /tmp storage of up to 512 MB; mounting EFS enables working with larger files. This means that you can import large machine learning models or packages. It also means that you can use an up-to-date version of files, since they are easy to share.

Additionally, you can share information or state across invocations since EFS is a shared drive. I would not say it is optimal, and generally I would rather decouple it, but it is possible and it is faster than S3.
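To make the shared-state idea concrete, here is a minimal sketch of a handler that keeps an invocation counter in a file on the mounted file system. The mount path `/mnt/efs`, the `EFS_MOUNT_PATH` environment variable, the file name and the function names are all my own assumptions for illustration, and the advisory lock is just one way to guard against concurrent invocations.

```python
import fcntl
import os
from pathlib import Path

# Hypothetical mount path; it is whatever local path you configure for the function.
MOUNT_PATH = Path(os.environ.get("EFS_MOUNT_PATH", "/mnt/efs"))


def bump_counter(mount_path: Path = MOUNT_PATH) -> int:
    """Increment a counter file on the shared file system and return the new value."""
    counter_file = mount_path / "invocations.txt"  # hypothetical file name
    # Open for reading and writing, creating the file on first use.
    fd = os.open(counter_file, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        # Advisory lock so two concurrent invocations do not overwrite each other.
        fcntl.flock(f, fcntl.LOCK_EX)
        raw = f.read().strip()
        value = int(raw) + 1 if raw else 1
        f.seek(0)
        f.truncate()
        f.write(str(value))
        f.flush()
        fcntl.flock(f, fcntl.LOCK_UN)
    return value


def handler(event, context):
    # Every invocation, on any execution environment, sees the same file.
    return {"invocations": bump_counter()}
```

Because the path is a parameter, the same function can be exercised locally against a temporary directory before pointing it at a real mount.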

In some cases it can also enable moving data-intensive workloads (in AWS or on-premises) to AWS Lambda and save costs.

See more here

AWS Certified Solutions Architect – Associate

That’s me 🙂

Took the exam yesterday and passed. I learned many new things on the way (e.g. here, here, here and here) and I believe this knowledge (combined with the hands-on experience) will be very handy for me in the future.

Badge available here – https://www.credly.com/badges/8f7a4dec-70c7-407f-a2fc-775918f0cd64/public_url

Things I learned today (26/07/2021)

ElastiCache for Redis is HIPAA compliant while ElastiCache for Memcached is not


What is ElastiCache?

ElastiCache is “Fully managed in-memory data store, compatible with Redis or Memcached. Power real-time applications with sub-millisecond latency” (here).

The most common use cases for ElastiCache are session stores, general caching to increase throughput and decrease the load on other services or databases, deployment of machine learning models, and real-time analytics.

AWS offers two flavours of ElastiCache – ElastiCache for Redis and ElastiCache for Memcached. To better understand the differences, and for recommendations on how to choose an engine, see here.


What is HIPAA?

“The Healthcare Insurance Portability and Accountability Act (HIPAA) is an act of legislation passed in 1996 which originally had the objective of enabling workers to carry forward healthcare insurance and healthcare rights between jobs. “

https://www.hipaajournal.com/hipaa-explained/


Over the years, and specifically after 2013, the HIPAA rules were updated to keep up with technological developments and expanded to cover business associates, where previously only covered entities were required to uphold the HIPAA restrictions.


Why does it matter?

Better safe than sorry – if you develop a product that needs to be HIPAA compliant, it is better to choose the right, compliant services in advance rather than replacing them later.

To read more – 

Things I learned today (23/07/2021)

S3 event notifications support standard SNS topics and standard SQS queues as destinations, but they don’t support SNS FIFO topics or SQS FIFO queues.

S3 event notifications enable you to be notified whenever a specific event happens in your bucket. To receive the notifications you must define the events you are interested in and the destination. The notifications are usually delivered within seconds but can sometimes take longer.

The events are –

  • New object creation
  • Object removal (versioned and non-versioned objects)
  • Object restore (e.g. from Glacier)
  • Object lost in Reduced Redundancy Storage (RRS)
  • Object Replication

The possible destinations include –

  • SQS – as mentioned above, standard queues only and not FIFO queues
  • SNS – as mentioned above, standard topics only and not FIFO topics
  • Lambda
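As a rough sketch of how a destination is wired up, the helper below builds the dictionary that boto3's `put_bucket_notification_configuration` accepts as its `NotificationConfiguration` argument. The helper name and the queue ARN are hypothetical; the `.fifo` check simply encodes the restriction mentioned above.

```python
def queue_notification_config(queue_arn, events=("s3:ObjectCreated:*",)):
    """Build an S3 notification configuration that targets a standard SQS queue."""
    if queue_arn.endswith(".fifo"):
        # FIFO queues are not accepted as S3 event notification destinations.
        raise ValueError("S3 event notifications require a standard queue")
    return {
        "QueueConfigurations": [
            {"QueueArn": queue_arn, "Events": list(events)},
        ]
    }


# Hypothetical ARN, for illustration only.
config = queue_notification_config("arn:aws:sqs:eu-west-1:123456789012:uploads")
```

With boto3 installed and credentials configured, `config` would be passed as `NotificationConfiguration=config` together with the bucket name.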

If you write back to S3 while processing the events, be careful not to create an execution loop.
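One common way to avoid such a loop is to write results under a dedicated prefix and skip any event whose key already lives there. The sketch below assumes a hypothetical `processed/` output prefix and a helper name of my own; the record layout follows the S3 event message structure, in which object keys arrive URL-encoded.

```python
from urllib.parse import unquote_plus

OUTPUT_PREFIX = "processed/"  # hypothetical prefix the function writes its results to


def objects_to_process(event):
    """Return (bucket, key) pairs from an S3 event, skipping our own output."""
    targets = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys are URL-encoded in the event, e.g. spaces arrive as '+'.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.startswith(OUTPUT_PREFIX):
            continue  # written by this function; processing it again would loop
        targets.append((bucket, key))
    return targets
```

Scoping the notification itself to an input prefix, or writing to a different bucket, achieves the same thing without any code in the handler.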

See more here – https://docs.aws.amazon.com/AmazonS3/latest/userguide/NotificationHowTo.html

Bonus for the weekend – bumped into this and I cannot deny (of service) that it can happen to me too

Things I learned today (21/07/2021)

You can use Amazon SNS FIFO (first in, first out) topics and Amazon Simple Queue Service (Amazon SQS) FIFO queues together to provide strict message ordering and message deduplication

AWS documentation


While SQS FIFO queues were introduced in 2016, SNS FIFO capabilities were introduced only in October 2020.

This capability is important for cases in which the order matters, e.g. bank transactions where you commit a transaction only if the balance remains non-negative.

Messages are grouped and ordered according to the message group ID. When sending a message you must specify a message group ID, otherwise the action fails. If all the messages have the same message group ID, then all the messages are sent and received in strict order. The message group ID can be any value, e.g. 12, “hello”, “user_id-123”, etc.
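Here is a small sketch of what publishing to a FIFO topic looks like in terms of parameters. The helper only builds the keyword arguments that boto3's SNS `publish` call takes (`TopicArn`, `Message`, `MessageGroupId`, `MessageDeduplicationId`); the helper name and the ARN are hypothetical, and deriving the deduplication ID from a hash of the body is my own choice, which means two messages with identical bodies would be treated as duplicates.

```python
import hashlib


def fifo_publish_kwargs(topic_arn, message, group_id):
    """Build the arguments for publishing one message to an SNS FIFO topic."""
    if not topic_arn.endswith(".fifo"):
        raise ValueError("FIFO topic names must end with .fifo")
    if group_id is None or str(group_id) == "":
        # Without a message group ID the publish action fails.
        raise ValueError("a message group ID is required")
    return {
        "TopicArn": topic_arn,
        "Message": message,
        "MessageGroupId": str(group_id),
        # Assumption: identical bodies count as duplicates in this sketch.
        "MessageDeduplicationId": hashlib.sha256(message.encode("utf-8")).hexdigest(),
    }


# Hypothetical ARN; all messages of one user share a group and stay ordered.
kwargs = fifo_publish_kwargs(
    "arn:aws:sns:eu-west-1:123456789012:payments.fifo", "debit 10", "user_id-123"
)
```

With boto3 available this would be sent as `sns_client.publish(**kwargs)`.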


Note that, as in the SQS case, the topic name must end with .fifo, and the suffix counts toward the name length limit (80 characters for SQS queue names).

For further reading –

Things I learned today (20/07/2021)

Delay queues let you postpone the delivery of new messages to a queue for a number of seconds

AWS documentation

This means that all messages pushed to this queue become visible to the consumer only after the delay period. The minimum delay, which is also the default, is 0 seconds and the maximum is 15 minutes.

Note that when changing the delay of a queue the behaviour of FIFO queues and standard queues is different – 


For standard queues, the per-queue delay setting is not retroactive—changing the setting doesn’t affect the delay of messages already in the queue.

For FIFO queues, the per-queue delay setting is retroactive—changing the setting affects the delay of messages already in the queue.

AWS documentation

If you need to delay the visibility of specific messages rather than all messages in the queue, you can use message timers and add an initial invisibility period for a message. This is only supported by standard queues. Note that setting a message timer for an individual message overrides the delay period of the delay queue.
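The rules above can be sketched as a small helper that builds the arguments for boto3's SQS `send_message` call (`QueueUrl`, `MessageBody` and the optional `DelaySeconds`). The helper name and the queue URL are hypothetical, and detecting a FIFO queue by its `.fifo` suffix is a simplification that relies on the naming rule.

```python
MAX_DELAY_SECONDS = 900  # 15 minutes, the maximum delay


def send_message_kwargs(queue_url, body, delay_seconds=None):
    """Build send_message arguments, optionally with a per-message timer."""
    kwargs = {"QueueUrl": queue_url, "MessageBody": body}
    if delay_seconds is None:
        return kwargs  # no timer: the queue-level delay applies
    if queue_url.endswith(".fifo"):
        raise ValueError("message timers are only supported by standard queues")
    if not 0 <= delay_seconds <= MAX_DELAY_SECONDS:
        raise ValueError("delay must be between 0 and 900 seconds")
    # A per-message timer overrides the delay period of the queue.
    kwargs["DelaySeconds"] = delay_seconds
    return kwargs


# Hypothetical queue URL.
url = "https://sqs.eu-west-1.amazonaws.com/123456789012/jobs"
delayed = send_message_kwargs(url, "resize image", delay_seconds=60)
```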

See the image below to understand message timeline in a queue –

See more here –
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-delay-queues.html
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-message-timers.html

Things I learned today (19/07/2021)

[AWS] independently maps Availability Zones to names for each account

AWS documentation


This means that eu-west-1a in my account is not necessarily the same as eu-west-1a in your account.

Why does this matter? For example, if you want to share subnets across accounts. Or maybe you want to ensure that services in different accounts are not in the same availability zone.

So how can you achieve this? Use availability zone IDs, which are unique and consistent identifiers for availability zones.
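A short sketch of comparing zones across accounts: the EC2 `describe_availability_zones` response lists both a `ZoneName` and a `ZoneId` for each zone, so a name-to-ID mapping per account is enough. The two responses below are made-up samples (trimmed to the two relevant fields), and the helper names are my own.

```python
def zone_ids_by_name(response):
    """Map zone names to zone IDs from a describe_availability_zones response."""
    return {z["ZoneName"]: z["ZoneId"] for z in response["AvailabilityZones"]}


def same_physical_zone(name_a, zones_a, name_b, zones_b):
    """True if the two zone names, each from its own account, share a zone ID."""
    return zones_a[name_a] == zones_b[name_b]


# Made-up sample responses for two accounts, for illustration only.
my_account = zone_ids_by_name({"AvailabilityZones": [
    {"ZoneName": "eu-west-1a", "ZoneId": "euw1-az1"},
    {"ZoneName": "eu-west-1b", "ZoneId": "euw1-az2"},
]})
your_account = zone_ids_by_name({"AvailabilityZones": [
    {"ZoneName": "eu-west-1a", "ZoneId": "euw1-az2"},
    {"ZoneName": "eu-west-1b", "ZoneId": "euw1-az1"},
]})
```

In this sample, `eu-west-1a` in one account corresponds to `eu-west-1b` in the other, which is exactly the situation the names alone would hide.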

See more here – https://docs.aws.amazon.com/ram/latest/userguide/working-with-az-ids.html

5 interesting things (02/07/2021)

Conducting a Successful Onboarding Plan and Onboarding Process – I believe that onboarding is important for the entire employment period. It helps set expectations, gets new hires into the code and contributing meaningfully faster, and assures both sides they made the right choice (and if not, lets them know at an early stage). One thing I miss in this plan is the social part, which I think is also important – having lunch / coffee / etc. with more than just the mentor.
I look forward to the next part, “Conducting a Successful Offboarding Plan and Offboarding Process”. It might sound like a joke, but it is not. A good offboarding process can help the organization learn and grow, and leave the employee with a good taste so she might come back in the future or recommend that her friends join / use the product.

https://blog.usejournal.com/conducting-a-successful-onboarding-plan-and-onboarding-process-6ec1b01ec2ae

The challenges of AWS Lambda in production – serverless has been gaining popularity in recent years, and AWS Lambda specifically. While it often sounds like a magic solution for scalability and isolation, it also has issues worth knowing about. In this post Lucas De Mitri from Sinch presents problems they ran into and possible solutions. For a high-level view of Lambda functions, just read the conclusion part.

https://medium.com/wearesinch/the-challenges-of-aws-lambda-in-production-fc9f14b182be

My Arsenal Of AWS Security Tools – In a previous post I pointed to ElectricEye, a tool to continuously monitor your AWS services for configurations that can lead to degradation of confidentiality, integrity or availability. This GitHub repo aggregates open source tools for AWS security: defensive, offensive, auditing, DFIR, etc.

https://github.com/toniblyx/my-arsenal-of-aws-security-tools

3 Problems to Stop Looking For in Code Reviews – I find the post title inaccurate but I like the attitude. As a reviewer you should not be bothered by tiny issues that can be enforced by tooling. A few tools are mentioned in the post, and I would also add git hooks, which I find very powerful. I also agree with the insight that code reviews usually happen too late in the development process, and I am constantly looking for the balance between letting developers progress and move forward on the one hand, and giving feedback at the right time on the other.

https://medium.com/swlh/3-problems-to-stop-looking-for-in-code-reviews-981bb169ba8b

The Power of Product Thinking – In a previous post I mentioned that understanding the cost structure and trade-offs between different architectures (cost-wise but also performance- and feature-wise) is a way to become a more valuable team member. Product thinking is another skill that can make you a more valuable and influential team member. This post explains what product thinking is (and isn’t) and completes it by suggesting several practices for developing product thinking. I totally liked it and am going to adopt some of the suggested practices.

https://future.a16z.com/product-thinking/

AWS tagging best practices – 5 things to know

I read AWS tagging best practices whitepaper which was published in December 2018 and distilled 5 takeaways.

1. Use cases – tags have several use-cases including:

  • Cost allocation – using AWS Cost Explorer you can break down AWS costs by tag
  • Access Control – IAM policies support tag-based conditions
  • Automation – for example, tags can be used to opt into or out of automated tasks
  • AWS Console Organization and Resource Groups – e.g. create a custom console that organizes and consolidates AWS resources based on one or more tags
  • Security Risk Management – use tags to identify resources that require heightened security risk management practices
  • Operations Support – I find this use case tightly related to the automation use case

2. Standardized tag names and tag values

There are only two hard things in Computer Science: cache invalidation and naming things.

Phil Karlton (check here)

A good practice, as suggested in the whitepaper, is to gather tagging requirements from all stakeholders and only then start implementing. A minimal step, however, can be to define a convention for tag names and values that everyone can follow; see the example from the document below.

tag names example


3. Cost allocation tags delay – this is something I experienced personally – “Cost allocation tags appear in your billing data only after you have (1) specified them in the Billing and Cost Management Console and (2) tagged resources with them”. And even then it can take around 24 hours for them to appear, so take it into account.


4. Tag everything – it sounds trivial, but sometimes organizations tag only some of their resources. Tag everything you can to get more comprehensive and accurate data about your expenses. A nice feature in the Billing and Cost Management Console is the ability to find resources that don’t have a specific tag, so you can easily find out what you missed.
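The same kind of check is easy to script once you have an inventory of resources and their tags. The sketch below assumes a simplified, hypothetical inventory shape (a list of dicts with `id` and `tags` keys) rather than any specific AWS API response, and the resource IDs and tag key are made up.

```python
def resources_missing_tag(resources, tag_key):
    """Return the IDs of resources that do not carry the given tag key."""
    return [r["id"] for r in resources if tag_key not in r.get("tags", {})]


# Hypothetical inventory, for illustration only.
inventory = [
    {"id": "i-0aaa", "tags": {"anycompany:cost-center": "1234"}},
    {"id": "vol-0bbb", "tags": {}},
    {"id": "bucket-ccc"},
]
untagged = resources_missing_tag(inventory, "anycompany:cost-center")
```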


5. Tags limitations – until 2016 AWS allowed up to 10 tags for a given resource. The current limit is 50. It definitely allows much more but it is still a limit to bear in mind when creating a tagging strategy. One way to avoid it is by using compound values, e.g. “anycompany:technical-contact = Susan Jones;sue.jones@anycompany.com; +12015551213” rather than a tag for each attribute (e.g. “anycompany:technical-contact-name = Susan Jones”).
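A minimal sketch of the compound-value approach, assuming `;` as the separator as in the example above. The helper names are my own, and the limit constant is the 50 tags per resource mentioned in this item.

```python
MAX_TAGS_PER_RESOURCE = 50  # the current per-resource limit mentioned above


def pack_tag_value(parts):
    """Join several attributes into one compound tag value."""
    return ";".join(p.strip() for p in parts)


def unpack_tag_value(value):
    """Split a compound tag value back into its attributes."""
    return [p.strip() for p in value.split(";")]


def fits_tag_limit(tags):
    """True if the tag dictionary stays within the per-resource limit."""
    return len(tags) <= MAX_TAGS_PER_RESOURCE


contact = unpack_tag_value("Susan Jones;sue.jones@anycompany.com; +12015551213")
```

The trade-off is that a compound value is harder to filter on than separate tags, so it fits attributes that are always read together.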

5 interesting things – AWS edition (18/06/21)

As I collect items for my posts and wait until I have time to write about them I noticed I have many items related to AWS and decided to have a special edition.


12 Common Misconceptions about DynamoDB – many times our beliefs about a certain tool or technology are based on hearing more than doing, or on doing without getting into the depth of things, and when we run into a problem we solve it with a solution we already know. This post describes features and qualities of DynamoDB that are sometimes ignored.

https://dynobase.dev/dynamodb-11-common-misconceptions/

Related Bonus – I really liked the link to Alex DeBrie’s post about single table design with DynamoDB

https://www.alexdebrie.com/posts/dynamodb-single-table/

AWS Chalice – it is not an official offering but rather a Python package for writing serverless applications. The syntax is very similar to Flask, and there is native support for local testing, AWS SAM and Terraform integration, etc. Disclaimer – if you are on multi-cloud I would not move from Flask or FastAPI to Chalice. Also note the limits of the services it uses (AWS Lambda, AWS API Gateway, etc.) and make sure they don’t limit your app.

https://aws.github.io/chalice/index

Related Bonus – auth0 tutorial on How to Create CRUD REST API with AWS Chalice
https://auth0.com/blog/how-to-create-crud-rest-api-with-aws-chalice/


ElectricEye – “ElectricEye is a set of Python scripts (affectionately called Auditors) that continuously monitor your AWS infrastructure looking for configurations related to confidentiality, integrity and availability that do not align with AWS best practices.”. It is hard to know and follow all AWS best practices and this bundle of scripts is supposed to help uncover those. I have not tried it myself yet but it seems promising.
https://github.com/jonrau1/ElectricEye


My Comprehensive Guide to AWS Cost Control – computing and cloud costs take up a big portion of every tech organization’s budget these days. Being a more valuable team member also means being aware of the costs and choosing wisely between the different alternatives.

https://corey.tech/aws-cost/


The Best Way To Browse 6K+ Quality AWS GitHub Repositories – most of the time we are not reinventing the wheel, and someone has probably already done something very similar to what we are doing. Let’s browse GitHub to find it and accelerate our process.

https://app.polymersearch.com/discover/aws

Bonus – AWS snowball – I found out that this service exists only this week and it blew my mind – https://aws.amazon.com/snowball/