Jan 27, 2023

AWS Lambda Cold Start

The Lambda function runs in an ephemeral environment. It spins up on demand, lasts a brief time, and is then taken down. The Lambda service creates and tears down these environments for your function; you don't have control over them.

Invocation Request

        => AWS Lambda Service

                => 1. Create an execution environment to run the function

                      2. Download the code into the environment and initialize the runtime

                      3. Download packages and dependencies

                      4. Initialize global variables

                      5. Initialize temp space

                                => Lambda runs your function starting from the handler method


When an invocation completes, Lambda keeps the initialized environment around for a while and can reuse it to fulfill the next request. If the next request comes in close behind the first one, it skips all of the initialization steps and goes directly to running the handler method.

If you look at the CloudWatch logs, you will notice a big difference in duration between a cold start and a warm start. Refer to the following, which is for a simple .NET 6 function that returns a list of strings.

First Invocation

INIT_START Runtime Version: dotnet:6.v13 Runtime Version

Duration: 24293.23 ms Billed Duration: 24294 ms Memory Size: 500 MB Max Memory Used: 90 MB

Second Invocation

Duration: 9.91 ms Billed Duration: 10 ms Memory Size: 500 MB Max Memory Used: 90 MB
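If you would rather pull these numbers from the command line than from the console, something like the following AWS CLI call lists the recent REPORT lines that carry the duration and memory figures shown above (the function name my-fn and its standard /aws/lambda/<function> log group are assumptions for illustration).

# list recent REPORT lines (duration, billed duration, memory) from the function's log group
aws logs filter-log-events \
    --log-group-name /aws/lambda/my-fn \
    --filter-pattern "REPORT" \
    --limit 10 \
    --query "events[].message"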


Here are some of the steps you can take as a developer to mitigate cold starts (a few CLI sketches follow the list):

  • Provisioned concurrency - Setting this keeps the desired number of environments always warm. Requests beyond the provisioned concurrency (spillover invocations) use the on-demand pool, which has to go through the cold start steps outlined above. This has a cost implication, so you may want to analyze the calling pattern and adjust the provisioned concurrency accordingly to minimize cost.
  • Deployment package - Trim your deployment package to its runtime necessities. This reduces the time it takes to download and unpack the package ahead of invocation, and it is particularly important for functions authored in compiled languages. Frameworks/languages that support AOT compilation and tree shaking can reduce the deployment package automatically.
  • AOT - In .NET, a language-specific compiler converts the source code to intermediate language, which the JIT compiler then converts to machine code specific to the environment it runs on. The JIT compiler uses less memory because only the methods required at run time are compiled, and it can optimize code based on statistical analysis while the code is running. On the other hand, the JIT compiler adds startup time when the application first executes. To minimize this you can take advantage of AOT support in .NET 7, publishing a self-contained app AOT-compiled for a specific environment such as Linux x64 or Windows x64. This can help reduce the cold start.
  • SnapStart - When you publish a function version, Lambda takes a snapshot of the memory and disk state of the initialized execution environment, encrypts the snapshot, and caches it for low-latency access, so later invocations can resume from the snapshot instead of repeating initialization.
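As a rough sketch of what these options look like from the command line, assuming a function named my-fn with an alias named prod (both hypothetical), and leaving the Lambda-specific packaging of the AOT build aside:

# Provisioned concurrency: keep 5 environments warm for the prod alias
aws lambda put-provisioned-concurrency-config \
    --function-name my-fn \
    --qualifier prod \
    --provisioned-concurrent-executions 5

# .NET 7 native AOT: publish a self-contained, AOT-compiled build for Linux x64
dotnet publish -c Release -r linux-x64 -p:PublishAot=true

# SnapStart (Java runtimes only at the time of writing): enable on published versions
aws lambda update-function-configuration --function-name my-fn --snap-start ApplyOn=PublishedVersions
aws lambda publish-version --function-name my-fn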

Dec 6, 2022

serverless vs container

Serverless is a development approach where your cloud service provider manages the servers. It replaces long-running virtual machines with computing power that comes into existence on demand and disappears immediately after use. The most common AWS serverless services are: Compute - Lambda, Fargate; Data Storage - S3, EFS, DynamoDB; Integration - API Gateway, SQS, SNS, EventBridge, and Step Functions.

Pros of serverless

  • Pay only for the time your code is executing.
  • Allows the app to be elastic; it can automatically scale up/down.
  • Much less time spent managing servers: provisioning infrastructure, managing capacity, patching, monitoring, etc. For example, in the case of Lambda you only configure memory and storage, which also determine what you pay.
  • Helps reduce development time as you focus less on infrastructure and more on business logic. In my opinion, this is one of the key benefits.
  • Fits well with microservices to build loosely coupled components.
  • You only need to take care of your own code in terms of testing, security, and vulnerabilities.


Cons of serverless

  • One of the biggest challenges is the cold start. A few techniques that can help are SnapStart (currently only supported in the Java runtime), AOT with .NET 7, and provisioned concurrency (extra cost).
  • Vendor lock-in: the same code will not run in Azure or GCP. You will also have to find a way to run it on the local developer machine. A Visual Studio template (like serverless.AspNetCoreWebAPI) helps you create a project with both a local entry point (for debugging on the dev machine) and a Lambda entry point. This also adds a code separation which can be helpful if you ever need to move to a different cloud provider.
  • The maximum timeout is 15 minutes, so a long-running process may be a challenge. Leveraging Step Functions to break up long-running tasks may be an option.
  • You do not have much control over the server, which in most cases should be fine, but in special cases where you want a GPU for processing large videos or some machine-learning workloads, this may not be the right choice.
  • For complex apps, you may have to perform a lot of coordination and manage dependencies between all of the serverless functions.
  • Can't scale beyond the hard limits. You may have to use a different account/region.


Pros of EKS

  • Portability - you can run it anywhere, so cloud agnostic. You can run the same code on the developer's machine using minikube or Kubernetes in Docker Desktop. Easy to replicate the environment.
  • You have better control over the underlying infrastructure. You define the entire container ecosystem and the server they run. You can manage all the resources, set all of the policies, oversee security, and determine how your application is deployed and behaves. You can debug and monitor in a standard way.
  • No timeout restriction.
  • You have greater control to optimize by instance type/size, by defining affinity/tolerations. You can make use of spot instances to control cost. By optimizing the resources you can achieve the biggest savings, but it will definitely come at the cost of DevOps work.


Cons of EKS

  • A lot more time spent building and managing infrastructure.
  • You need to keep up to date with the container base image and any packages you use. If you don't keep up with the short release cycle, the application can become difficult to maintain.
  • You need to manage scaling up and down yourself. Technologies like autoscaling (Horizontal Pod Autoscaler) and Karpenter (node-based autoscaling) can help.
  • Containers are created/destroyed, so this adds complexity in terms of monitoring, data storage, etc. compared to running applications in a VM. You need to account for this during application design.

As with everything, it all depends on the use case. Here are some of the guidelines I use. For any new project, my first preference is serverless, for the reasons above. For a long-running application that I am not willing to re-architect, my preference is a container. If cost is the deciding factor, you need to weigh the extra DevOps time required to develop/maintain the k8s solution, and for serverless whether the application is in continuous use (like a web service) or invoked only a few times a day. If you have a use case for multiple cloud providers, give it some thought: EKS has better portability, but other cloud providers offer serverless support as well, and if you maintain separation of concerns this may not be a challenge.

Sep 23, 2022

k8s Services

A Kubernetes service enables network access to a set of pods. A service listens on a port (services.spec.ports.port) and forwards traffic to the pods matched by the selector (services.spec.selector) on the target port (services.spec.ports.targetPort). Since pods are ephemeral, a service allows a group of pods that provide a specific function, like a web service, to be assigned a stable name and a unique IP address (clusterIP).

ClusterIP – The service is only accessible from within the Kubernetes cluster.

NodePort – This makes the service accessible on a static port (high port) on each Node in the cluster. Each cluster node opens a port on the node itself and redirects traffic received on that port to the underlying service. 

LoadBalancer – The service becomes accessible externally through the cloud provider's load balancer. If you inspect the load balancer, you will notice that its targets are the cluster nodes, with traffic forwarded to a specific instance port (the node port), so you should also be able to access the application directly via the node port.

Get NodePort (the instance port on the load balancer) - kubectl get svc <svc name> -o jsonpath="{.spec.ports[0].nodePort}"

Get Node IP - kubectl get node -o wide
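As a quick sketch, assuming a Deployment named web whose containers listen on port 8080 (hypothetical names), you can create a NodePort service imperatively and then look up the assigned high port with the command above:

# expose the deployment as a NodePort service: port 80 -> targetPort 8080 on the pods
kubectl expose deployment web --name=web-svc --port=80 --target-port=8080 --type=NodePort

# the node port assigned from the default 30000-32767 range
kubectl get svc web-svc -o jsonpath="{.spec.ports[0].nodePort}"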

Ingress

LoadBalancer is the default method for many k8s installations in the cloud, but it adds cost and complexity: every service gets its own cloud-native load balancer, which increases cost. Along with this, you may need to handle SSL for each application, which can be configured at different levels (application level, load balancer level, etc.), and also configure firewall rules. This is where an ingress helps. You can think of it as a layer 7 load balancer built into the k8s cluster, configured as a k8s object using YAML just like any other object. Even with this you still need to expose the ingress controller to the outside world through a load balancer (or maybe a node port), but that is just a single cloud-native load balancer, and all the routing is configured through the ingress controller.

A k8s cluster does not come with an ingress controller by default. There are multiple ingress controllers available; AWS Load Balancer Controller, GLBC, and Nginx are currently supported and maintained by the Kubernetes project. Istio is also a popular option that provides a lot of service mesh capabilities. An ingress resource is a set of rules and configurations to be applied to the ingress controller.
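A minimal sketch, assuming an Nginx ingress controller is installed and the web-svc service from the section above exists (both assumptions): the imperative command below creates an ingress resource that routes example.com traffic to the service, and the controller's single load balancer handles everything else.

# route example.com/* to web-svc:80 through the nginx ingress class
kubectl create ingress web-ingress --class=nginx --rule="example.com/*=web-svc:80"

# confirm the rule and the address assigned by the controller
kubectl get ingress web-ingress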


Jun 21, 2022

Argo CD

Argo CD is a declarative, GitOps continuous delivery tool for Kubernetes. It is based on a pull mechanism. You can configure Argo CD using the following steps, after which the Argo CD agent pulls manifest changes and applies them:

  • Deploy Argo CD to the k8s cluster. Refer this for more details.
  • Create a new Application in Argo CD, either through the command-line tool, the UI, or YAML of the custom type Application (see the CLI sketch below). Here you connect the source repository to the destination k8s server.
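For example, using the argocd CLI against Argo CD's public example-apps repository (any repo/path of your own works the same way), you can register an application and trigger the first sync; --dest-server here points at the in-cluster API server.

# register an application: source repo/path -> destination cluster/namespace
argocd app create guestbook \
    --repo https://github.com/argoproj/argocd-example-apps.git \
    --path guestbook \
    --dest-server https://kubernetes.default.svc \
    --dest-namespace default

# switch to automated sync and kick off the first deployment
argocd app set guestbook --sync-policy automated
argocd app sync guestbook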

In a typical CI/CD pipeline, when you make a code change the pipeline will test, build, and create the image, push the image to the registry, and update the manifest file (e.g. update the deployment with a new image tag). At this stage, the CD step applies the new manifest to the k8s cluster using tools like kubectl or helm on the CD runner. For this to work, the runner needs access to the k8s cluster and also to the cloud, like AWS. This may raise security concerns, as you need to give cluster credentials to an external tool. It gets even more complicated if you have multiple applications from different repositories being deployed to different clusters. Also, once you apply the manifest, you do not have direct visibility into the status of the configuration change. These are some of the challenges that GitOps tools like Argo CD address, since the CD component is part of the k8s cluster. Here are some of the benefits of using a GitOps tool like Argo CD:
  • The source repository is the single source of truth. Even if someone makes manual changes, Argo CD will detect that the actual state (in the k8s cluster) differs from the desired state (the application config git repo). You can always disable automatic sync by setting it to manual.
  • Rollback is just a matter of reverting code in the repository.
  • Disaster recovery is easy, as you can apply the same config to any destination.
  • No cluster credentials live outside of k8s.
  • It is an extension of the k8s API: it uses Kubernetes building blocks itself, like etcd to store data and controllers to reconcile the actual state with the desired state. It can be configured using YAML as a custom resource.


It's a good idea to have separate repos for application code and application config (manifest files). This way you can make configuration changes, like updates to config maps, without involving the application pipeline. In this case your application pipeline should test, build, and push the image, and then update the manifest file in the configuration repository, after which Argo CD will detect that the source and destination are not in sync and apply the changes from the source.
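As a sketch of that last pipeline step, assuming a kustomize-based config repo (the repo layout, image name, and tag below are hypothetical), the CI job only has to push a commit; Argo CD takes it from there.

# in the application-config repo: bump the image tag and push
cd k8s-config/overlays/prod
kustomize edit set image my-app=123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:build-42
git commit -am "Deploy my-app build-42"
git push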

At the time of writing, I ran into some challenges while running Argo CD on an EKS Fargate profile, but had no issues with an EKS node group.

Sep 22, 2021

kubernetes high level overview

Kubernetes is an open-source system for automating the deployment, scaling, healing, networking, persistent storage, and management of containerized applications. It provides features like service discovery, load balancing, storage orchestration, automated rollout/rollback, self-healing, secret and configuration management, horizontal scaling, zero-downtime deployments, blue-green and canary deployments, and the ability to run a production-like setup on the development machine.

It consists of one or more master nodes that manage one or more worker nodes, which together work as a cluster. The master node starts something called a pod on a worker node, which is the way containers are hosted; you typically use a Deployment/ReplicaSet to deploy pods. Kubernetes does not deploy containers directly onto a worker node; it encapsulates them within a Kubernetes object called a pod. A pod will usually contain only one container of a given application, but containers of different applications can run in the same pod, like a front-end and a middle-tier container. In that case, the two containers can talk over localhost since they share the network namespace, and they can also share storage.

Master Node Components (collectively known as the control plane)

API Server - Acts as the front end of Kubernetes; users, management devices, and the command-line interface talk to the API server to interact with the cluster.

ETCD store - A key-value database that stores information about the cluster.

Scheduler - Responsible for distributing work (containers) across nodes. It looks for newly created pods and assigns them to a node.

Controller - Responsible for noticing and responding when a node or container goes down, and deciding to bring up a new container. Controllers are the brains behind the orchestration.

Worker node components (collectively known as the data plane)

Container Runtime - The underlying software used to run containers, like Docker, rkt, etc.

Kubelet - An agent that runs on each node in the cluster and is responsible for making sure containers are running as expected. It registers the node with the cluster and reports back and forth with the master node.
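On a cluster where the control plane is visible (e.g. minikube or a kubeadm cluster; on managed offerings like EKS the control plane is hidden from you), you can see most of these components running in the kube-system namespace:

# control plane components (kube-apiserver, etcd, kube-scheduler, kube-controller-manager)
# and per-node agents such as kube-proxy show up here
kubectl get pods -n kube-system -o wide

# the worker nodes that the kubelets have registered with the cluster
kubectl get nodes -o wide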


Kubectl Explain

kubectl explain is a handy command to list the fields of supported resources and get detailed documentation for them. It's a good way to explore any kind of k8s object. Refer to the following for a few examples. Also refer to kubectl-commands and the cheatsheet.

kubectl explain deployment.spec.strategy.rollingUpdate

kubectl explain deployment.spec.minReadySeconds

kubectl explain pod.spec.tolerations

kubectl explain pod.spec.serviceAccountName

kubectl explain pod.spec.securityContext

kubectl explain pod.spec.nodeSelector

kubectl explain pod.spec.nodeName

kubectl explain pod.spec.affinity

kubectl explain pod.spec.activeDeadlineSeconds

kubectl explain job.spec

kubectl explain job.spec.ttlSecondsAfterFinished

kubectl explain job.spec.parallelism

kubectl explain job.spec.completions

kubectl explain job.spec.backoffLimit

kubectl explain job.spec.activeDeadlineSeconds


If you want to get more familiar with the imperative way of creating k8s resources, refer to the CKAD Exercises, as they can be useful during the CKAD exam. The imperative way of creating resources sometimes saves time, which is critical in the exam. You can always patch or edit the resource afterwards if that's allowed.


Apr 12, 2021

Snowflake Introduction

Snowflake is a fully managed cloud data platform. This means it provides everything you need to build your data solution, such as a full-featured data warehouse. It is cloud-agnostic and, most importantly, you can even replicate between clouds.

The architecture comprises a hybrid of traditional shared-disk and shared-nothing architectures to offer the best of both. 

The storage layer organizes the data into multiple micro-partitions that are internally optimized and compressed. Data is stored in cloud storage (storage is elastic) and works on a shared-disk model, providing simplicity in data management. The data objects stored by Snowflake are not directly visible or accessible by customers; they are only accessible through SQL query operations run via Snowflake. As the storage layer is independent, you only pay for the average monthly storage used.

Compute nodes (Virtual Warehouse) connect with the storage layer to fetch the data for query processing. These are Massively Parallel Processing (MPP) compute clusters consisting of multiple nodes with CPU and Memory provisioned on the cloud by Snowflake. These can be started, stopped, or scaled at any time and can be set to auto-suspend or auto-resume for cost-saving. 
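For example, a warehouse with auto-suspend/auto-resume can be created and resized with a couple of SQL statements, run here through SnowSQL (the warehouse name and sizes are hypothetical, and a configured SnowSQL connection is assumed):

# create an extra-small warehouse that suspends after 60s idle and resumes on the next query
snowsql -q "CREATE WAREHOUSE IF NOT EXISTS demo_wh WITH WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 60 AUTO_RESUME = TRUE INITIALLY_SUSPENDED = TRUE;"

# scale it up or down on demand
snowsql -q "ALTER WAREHOUSE demo_wh SET WAREHOUSE_SIZE = 'MEDIUM';"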

The cloud services layer handles activities like authentication, security, metadata management of the loaded data, and query optimization.

Data is automatically divided into micro-partitions, and each micro-partition contains between 50 MB and 500 MB of uncompressed data. These do not need to be defined upfront. Snowflake stores metadata about all rows stored in a micro-partition. Columns are stored independently within micro-partitions, often referred to as columnar storage. Refer this for more details.

In addition to this, you can manually sort rows on key table columns; however, performing these tasks can be cumbersome and expensive. This is mostly useful for very large tables.
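A sketch of what that manual sort can look like (table and column names are hypothetical): rewriting the table ordered by the key column is exactly what makes this cumbersome and expensive on very large tables.

# rewrite the table with rows physically ordered by the key column
snowsql -q "INSERT OVERWRITE INTO events SELECT * FROM events ORDER BY event_date;"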

May 6, 2020

SNS

SNS is the notification service provided by AWS, which manages the delivery of a message to any number of subscribers. It uses the publisher/subscriber model for push delivery of messages. Subscribers are notified using the following supported protocols: SQS, Lambda, HTTP/S, email, and SMS.

To use SNS, you create a topic and define a policy that dictates who can publish and subscribe to it. In the policy you can configure conditions, for example to give cross-account access. An SNS publish request has the topic you want to publish to, Subject, Message, MessageAttributes, and MessageStructure.

The subscriber can define a subscription filter policy and a dead-letter queue. By configuring the subscription filter policy, you can filter which messages are sent to the subscriber based on rules defined on the message attributes. You can assign a redrive policy to Amazon SNS subscriptions by specifying the Amazon SQS queue that captures messages that can't be delivered to subscribers successfully. You can test this by deleting the endpoint, e.g. the Lambda function.

When you configure a dead-letter queue, you need to make sure SNS has the necessary permission to publish messages to the queue by adding a permission policy on the SQS queue with the SNS topic ARN. Once a message is in the dead-letter queue, you can have a Lambda configured to process it, and you can use CloudWatch metrics to monitor the dead-letter queue.
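A couple of hedged CLI sketches (the subscription and queue ARNs below are placeholders): the first attaches a filter policy keyed on a message attribute, the second attaches a redrive policy that sends undeliverable messages to an SQS dead-letter queue.

# only deliver messages whose eventType attribute is order_placed
aws sns set-subscription-attributes \
    --subscription-arn arn:aws:sns:us-east-1:123456789012:orders:subscription-id \
    --attribute-name FilterPolicy \
    --attribute-value '{"eventType":["order_placed"]}'

# capture messages that can't be delivered to the subscriber in a dead-letter queue
aws sns set-subscription-attributes \
    --subscription-arn arn:aws:sns:us-east-1:123456789012:orders:subscription-id \
    --attribute-name RedrivePolicy \
    --attribute-value '{"deadLetterTargetArn":"arn:aws:sqs:us-east-1:123456789012:orders-dlq"}'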

SNS -> Lambda vs SNS -> SQS -> Lambda

If you have SQS between SNS and Lambda, it gives you the flexibility of reprocessing. You can set a redrive policy on the SQS queue with Maximum Receives, which essentially means the message will be delivered to Lambda that many times before being sent to the dead-letter queue. If no redrive policy is set, the message will be re-delivered to Lambda after every visibility timeout until the message retention period expires. When SNS sends the message directly to Lambda, it is delivered only once, and if it fails it goes to the dead-letter queue if a redrive policy is set. With SQS you can have a retention period of up to 14 days.

An SQS retry happens after the visibility timeout expires, and the visibility timeout should be longer than the Lambda timeout. This ensures the message only becomes visible again after Lambda processing has completely finished, which prevents duplicate processing of the message.
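A sketch of those two queue settings (queue URL and DLQ ARN are placeholders): maxReceiveCount is the "Maximum Receives" mentioned above, and the visibility timeout is set comfortably above the Lambda timeout.

# retry up to 3 times before moving the message to the DLQ; hide in-flight messages for 90s
aws sqs set-queue-attributes \
    --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/orders-queue \
    --attributes '{"VisibilityTimeout":"90","RedrivePolicy":"{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:123456789012:orders-dlq\",\"maxReceiveCount\":\"3\"}"}'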

In SQS (pull mechanism), messages are persisted for a configurable duration if no consumer is available, whereas in SNS (push mechanism), messages are delivered only to the subscribers that exist at the time the message arrives.