Sunday, March 30, 2025

GCP - Introduction

March 30, 2025

 

GCP is a public cloud vendor like its competitors Azure and AWS. Customers can access server resources housed in Google's data centers around the world on a pay-per-use basis.

GCP offers a suite of computing services covering everything from cost management and data management to web and video delivery, AI, and machine learning tools.

Google's global infrastructure provides 24x7 services around the world with high speed and reliability. GCP is organized into regions, and within each region are availability zones. These zones are isolated from one another to avoid a single point of failure. Some resources, such as the global HTTP(S) load balancer, are global and can receive requests from any of the Google Cloud edge locations and regions. Other resources, like storage, can be regional; regional storage is distributed across multiple zones within a region for redundancy.
Select locations depending on the performance, reliability, scalability, and security needs of your organization.
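For example, the gcloud CLI can list the available locations (a quick sketch; the region name in the filter is illustrative):

gcloud compute regions list
gcloud compute zones list --filter="region:us-central1"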

Plan to create a GCP setup:





Policies are inherited from the organization root node, which acts as the parent of all policies within the organization.



Setting up the billing account is very important before starting a project. A Billing Administrator role is needed to perform this task. A budget can be set at the project level or at the billing account level.
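For example, a project can be linked to a billing account and given a budget with the gcloud CLI (a hedged sketch; the project ID, billing account ID, and amount are placeholders):

gcloud billing projects link my-project-id --billing-account=0X0X0X-0X0X0X-0X0X0X
gcloud billing budgets create --billing-account=0X0X0X-0X0X0X-0X0X0X \
    --display-name="monthly-budget" --budget-amount=100USD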

Cloud Shell:
    GCP includes command-line tools for Google Cloud products and services:
        gcloud - main CLI for Google Cloud
        gsutil - Cloud Storage
        bq - BigQuery

Syntax of gcloud:
gcloud + component + entity + operation + positional args + flags
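For example (a sketch; the instance name, zone, and machine type are placeholders):

gcloud compute instances create demo-vm --zone=us-central1-a --machine-type=e2-micro

Here compute is the component, instances is the entity, create is the operation, demo-vm is the positional argument, and --zone and --machine-type are flags.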


Cloud Identity:














Friday, March 28, 2025

Large Language Model [LLM] - Introduction

March 28, 2025

 


LLM stands for Large Language Model. It is a deep learning model trained on massive amounts of text data to understand and generate human language, enabling tasks like text generation and translation. It is often built on "Transformer" models, which are neural networks that can process relationships within language.


Reasoning LLMs


Traditional LLM workflow



In the traditional LLM workflow, a refined dataset feeds a pretraining stage. The pretrained model is then passed through fine-tuning to produce more precise outputs, and human feedback is used to correct any mismatch with the fine-tuned model.

Traditional LLMs
  • Direct pattern-based prediction
  • Quick but less reliable on complex tasks
  • No explicit reasoning steps

Reasoning LLMs:
  • Language models designed for complex, multi-step problems
  • Break down tasks into logical sub-tasks
  • Generate intermediate reasoning steps ("thought processes")
Key Capabilities of Reasoning LLMs:
1) Chain-of-Thought Reasoning
        Internal dialogue approach
        step-by-step problem solving
2) Self-Consistency
        Verifies its own answers
        Revisits problematic solutions
3) Structured Outputs
        Organized reasoning steps

Practical Applications of Reasoning LLMs
1) Data Analysis
        Medical diagnostics
        Complex data interpretation
        Anomaly detection
2) Background Processing
        Batch processing workflows
        Overnight analysis jobs
3) Evaluation Tasks
        LLM as judge
        Quality assessment
        Verification workflows
Limitations of Reasoning LLMs
Performance Trade-offs
* Increased latency: the extended thinking process leads to significantly longer response times
* Higher resource requirements: reasoning models often require more computational resources
* Cost implications: more tokens and processing time translate to higher operational costs
DeepSeek:

    DeepSeek applied supervised fine-tuning to refine the models' capabilities. This involved training on datasets containing reasoning and non-reasoning tasks. Notably, reasoning data was generated by specialized "expert models" trained for specific domains such as mathematics, programming, and logic. These expert models were developed through supervised fine-tuning on both original responses and synthetic data generated by internal models like DeepSeek-R1-Lite. The use of expert models allowed DeepSeek to generate high-quality synthetic reasoning data to enhance the primary model's performance.






Sunday, March 9, 2025

Terraform - Part 1

March 09, 2025

Terraform Installation

yum install -y yum-utils shadow-utils
yum-config-manager --add-repo https://rpm.releases.hashicorp.com/AmazonLinux/hashicorp.repo
yum -y install terraform
terraform version
terraform -help
terraform -help plan

Create AWS user for the terraform setup

Create a user in AWS:

1) Login into Aws console

2) Navigate into IAM

 

3) Click on create user button



4) Set a permission of new user

 


5) Click on the created user and we can see the option to create an access key for that specific user.

 



6) It will prompt for the use case of your requirement.

 


7) Choose the Command Line Interface option 

8) Take a note of the Access Key and Secret Access Key. We need to provide these parameters to Terraform while automating the infrastructure.
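Rather than hard-coding the keys in configuration files, the AWS provider can read them from environment variables (a sketch; the key values are placeholders):

export AWS_ACCESS_KEY_ID="<your-access-key-id>"
export AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"
export AWS_DEFAULT_REGION="us-east-2"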

 



Friday, March 7, 2025

Terraform Introduction

March 07, 2025


Terraform Introduction:
Terraform helps users build, change, and manage infrastructure through code.

Terraform Vs Ansible

IAC [Infrastructure as Code]
  • Manage infrastructure with the help of code
  • It's the code used to provision resources, including virtual machines such as instances on AWS, network infrastructure such as gateways, etc.
  • You write and execute the code to define, deploy, update and destroy your infrastructure
  • Code is tracked in a SCM repository
  • Automation makes the provisioning process consistent, repeatable and updates fast and reliable.
  • Ability to programmatically deploy and configure resources
  • IAC standardize your deployment workflow
  • IAC can be shared, reused and versioned.
IAC Tools:
    1.Terraform
    2.CloudFormation
    3.Azure Resource Manager
    4.Google Cloud Deployment Manager  
Terraform Overview:
Terraform is an Infrastructure Building Tool (Provisioning Infrastructure)
Written in Go Language
Integrates with configuration management and provisioning tools like Chef, Puppet and Ansible.
Extension of the file is .tf or .tf.json (JSON-based)
Terraform maintains state in a file with the .tfstate extension
Deployment of infrastructure happens with a push-based approach (no agent to be installed on remote machines)
Terraform is immutable: infrastructure can't be changed after it's created; destroying and recreating it is the only option.
Terraform is using a Declarative method, Declarative Language is Describing what you're trying to achieve without instructing how to do it.
Terraform is idempotent: if whatever you're asking for is already present, it applies nothing and exits without any changes.
Providers are services or systems that Terraform interacts with to build infrastructure on.
Current Terraform Version is 1.11
Terraform is cloud-agnostic but requires a specific provider for the cloud platform
Single Terraform configuration file can be used to manage multiple providers
Terraform can simplify both management and orchestration of deploying large-scale, multi-cloud infrastructure
Terraform is designed to work with both public cloud platforms and on-premises infrastructure (private cloud)
Terraform Workflow
        1.Scope 2.Author 3.Initialize 4.Plan 5.Apply
 
Configuration file of Terraform:
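A minimal example that provisions a single EC2 instance (a sketch assuming the AWS provider; the AMI ID and region are placeholders):

cat > main.tf <<'EOF'
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-2"
}

# A single EC2 instance; the AMI ID below is a placeholder
resource "aws_instance" "demo" {
  ami           = "ami-0abcdef1234567890"
  instance_type = "t2.micro"
}
EOF

terraform init    # download the AWS provider
terraform plan    # preview the changes
terraform apply   # provision the instance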










Monday, March 3, 2025

Troubleshooting of sendmail issues in Linux machine

March 03, 2025

 


Troubleshooting of Sendmail issues:

Sendmail servers can produce the same wide range of problems that any Unix server can generate, but most daily Sendmail issues fall into just a few categories related to mail connectivity, Sendmail relay configuration, and SMTP auth issues.

1) Email not deliverable:

We can validate whether a user email ID or local domain address is deliverable from the system.

#sendmail -bv usernameEmailID
#sendmail -bv root@localdomain

2) Check the status of the sendmail service. The mail server may go down if the server has a high workload.
#systemctl status sendmail
Start sendmail if the service is down:
#systemctl start sendmail

3) Check the sendmail relay server details in the sendmail.cf configuration file.
#grep ^DS /etc/mail/sendmail.cf
The relay server should be resolvable from the host; otherwise the mail request cannot be resolved by DNS and Sendmail throws a "Transient parse error -- message queued for future delivery" error.
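To confirm the smart host resolves from this host (a sketch assuming the DS value is a plain hostname, without brackets):
#host $(grep '^DS' /etc/mail/sendmail.cf | sed 's/^DS//')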

4) The mqueue directory gets filled when mail threads cannot deliver and messages pile up in the mail queue.
We used to face the /var file system reaching 100% utilization due to this problem.
    i) Check the current status of the mail queue
        #mailq
    ii) Try to deliver the pending mail queue 
        #sendmail -v -q
    iii) Stop the sendmail service
        #systemctl stop sendmail
    iv) Move or delete the queued mail files
        #mv /var/spool/mqueue/* /temporary_location
      v) Start the sendmail service
        #systemctl start sendmail

5) Validate the sendmail functions
       i) Send a test mail from the system
            # echo "This is test email" | mailx -v -s "Test mail subject" -S smtp="smtpserver:port" "usermail_ID"
      ii) Open another terminal and monitor the mail log
            #tail -f /var/log/maillog
    
Sendmail Log location : /var/log/maillog

    iii) Check sendmail connectivity from the system
            #nc -vz sendmailsever 25
    iv) Check that sendmail is accepting connections
            #ps auxw| grep [a]ccepting
    v) Check that the system is listening for sendmail
            #netstat  -antp| grep sendmail








Datadog Introduction

March 03, 2025

 


Datadog Monitoring Tool:
    Observability is essential for managing modern infrastructure and applications. Datadog brings together real-time metrics from servers, containers, databases, and applications, combined with end-to-end tracing. On top of that, it provides helpful alerts and rich visualizations, offering full-stack observability.

Observing three metrics [Rate, Errors, and Duration] provides a well-rounded view of service performance.
Rate: monitor the HTTP and API calls of your services.
Monitoring the rate of HTTP and API calls shows the load on a service. We take action on any spikes or sudden drops in the request rate; they can indicate issues such as sudden traffic surges, DDoS attacks, or failures upstream.
Errors: track how many of those requests fail.
Errors may be server errors, database errors, or failed API calls. Tracking these errors lets us quickly identify issues with our application or backend systems.
Duration: measure how long those requests take (latency).
High latency degrades the user experience, particularly for real-time or interactive applications. Monitoring duration allows us to detect performance bottlenecks before they affect our users.

Below are a few of the Datadog monitor types we can create and use; a sketch of creating one through the API follows the list.

APM: Monitor application performance monitoring (APM) metrics or trace queries.
Metric: Compare values of a metric with a user-defined threshold.
Logs: Alert when a specified type of log exceeds a user-defined threshold over a given period of time.
Database Monitoring: Monitor query execution and explain plan data gathered by Datadog.
Error Tracking: Monitor issues in your applications gathered by Datadog.
Real User Monitoring: Observe user behavior and monitor frontend performance.
Synthetic Monitoring: Simulate user actions to test API endpoints or website functionality.
Anomaly: Detect anomalous behavior for a metric based on historical data.
Cloud Network Monitoring: Monitor cloud-specific network configurations and traffic.
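As referenced above, monitors can also be created programmatically. A hedged sketch using the v1 monitors API (the API/application keys, query, and threshold are placeholders):

curl -X POST "https://api.datadoghq.com/api/v1/monitor" \
  -H "Content-Type: application/json" \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" \
  -d '{
        "name": "High CPU on {{host.name}}",
        "type": "metric alert",
        "query": "avg(last_5m):avg:system.cpu.user{*} > 90",
        "message": "CPU usage is above 90%."
      }'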

Infrastructure Monitoring: 
Datadog can monitor the performance and health of our entire infrastructure, including servers, containers, databases, and cloud services. It provides the following (a sketch of pushing a custom metric follows the list):
  • Metrics collection
  • Visualizations and dashboards
  • Alerting
  • Anomaly [behaves differently than usual] monitoring
  • Infrastructure maps
  • Logs and traces integration
  • Automation
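As referenced above, custom metrics can be pushed directly to the v1 series endpoint (a sketch; the metric name, value, and tags are illustrative):

curl -X POST "https://api.datadoghq.com/api/v1/series" \
  -H "Content-Type: application/json" \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -d "{
        \"series\": [{
          \"metric\": \"custom.app.queue_depth\",
          \"points\": [[$(date +%s), 42]],
          \"type\": \"gauge\",
          \"tags\": [\"env:test\"]
        }]
      }"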
Application Performance Monitoring (APM)
Datadog offers APM functionality to monitor and optimize the performance of your applications. It provides detailed visibility into application code, dependencies, and performance bottlenecks. With it, you can track response times, error rates, and throughput. What’s more, you can gain visibility into the performance of individual requests.

Distributed Tracing
With distributed tracing, Datadog allows teams to trace requests as they flow through complex, distributed systems. It helps you to:
  • Identify latency issues
  • Understand dependencies between services
  • Troubleshoot performance problems across microservices architectures
  • Identify the root causes of application performance issues easily, since it collects data moving between services
Log Management
Datadog enables centralized log management. This allows you to collect, index, search, and analyze logs from various sources. You can aggregate Datadog logs from multiple systems and applications. You can set up alerts based on log events. You can also collect customized logs from the servers.
Real-time Metrics and Dashboards
 Datadog provides real-time metrics and customizable dashboards. A Datadog dashboard helps to visualize and monitor the health and performance of our systems. You can create visualizations, charts, and graphs to:
  • Track key metrics
  • Set up alerts based on thresholds
  • Share dashboards with your team
Collaboration and Notifications
Datadog offers collaboration features that allow teams to work together effectively. You can annotate and share graphs, dashboards, and alerts. You can even set up notifications via email, SMS, or third-party integrations. Want to integrate incident management tools like Slack, Jira, and PagerDuty? No problem. You can even collaborate on troubleshooting and resolving issues.
Integration and Extensibility
Datadog integrates with a wide range of tools and services. As you can imagine, this makes it easy to collect data from various sources. You can integrate Datadog with all the popular cloud platforms. Don’t stop there, you can integrate it with all your existing workflows.

Saturday, March 1, 2025

Creating AWS Load Balancer Controller under EKS in the AWS environment

March 01, 2025


 AWS Load Balancer Controller:

Architecture diagram


Associates an OIDC provider with your EKS cluster:

eksctl is a CLI tool for managing EKS clusters in AWS. We can associate the cluster's OIDC provider with IAM through the CLI command below.

#eksctl utils associate-iam-oidc-provider --cluster test-demo-cluster  --approve --region us-east-2
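To verify the association, inspect the cluster's OIDC issuer URL (a sketch; the cluster name matches the command above):

#aws eks describe-cluster --name test-demo-cluster --region us-east-2 \
    --query "cluster.identity.oidc.issuer" --output text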

Create an IAM role for the EKS cluster:

An Amazon EKS cluster IAM role is required for each cluster. Kubernetes clusters managed by Amazon EKS use this role to manage nodes and the legacy Cloud Provider uses this role to create load balancers with Elastic Load Balancing for services.

Creating the Amazon EKS cluster role:

You can use the AWS Management Console or the AWS CLI to create the cluster role.
AWS Management Console
Open the IAM console at https://console.aws.amazon.com/iam/.
Choose Roles, then Create role.
Under Trusted entity type, select AWS service.
From the Use cases for other AWS services dropdown list, choose EKS.
Choose EKS - Cluster for your use case, and then choose Next.
On the Add permissions tab, choose Next.
For Role name, enter a unique name for your role, such as eksClusterRole.
For Description, enter descriptive text such as Amazon EKS - Cluster role.
Choose Create role.

AWS CLI
a) Copy the following contents to a file named EKS-loadbalancer-policy.json.
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Service": "eks.amazonaws.com" }, "Action": "sts:AssumeRole" } ] }

b) Create the IAM role using the trust policy:

#aws iam create-role \
  --role-name eksClusterRole \
  --assume-role-policy-document file://"EKS-loadbalancer-policy.json"
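The AWSLoadBalancerControllerIAMPolicy attached in the next step must already exist. It is typically created from the controller's published policy document (a sketch; the URL follows the aws-load-balancer-controller install docs and may change between releases):

#curl -o iam_policy.json https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/install/iam_policy.json
#aws iam create-policy \
  --policy-name AWSLoadBalancerControllerIAMPolicy \
  --policy-document file://iam_policy.json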

Set up an IAM service account in an EKS cluster, allowing the AWS Load Balancer Controller to manage AWS Load Balancers on behalf of the Kubernetes cluster.

  • Creates a Kubernetes ServiceAccount named aws-load-balancer-controller.
  • Associates it with an IAM Role (AmazonEKSLoadBalancerControllerRole).
  • Attaches the AWSLoadBalancerControllerIAMPolicy.
  • Allows Kubernetes to use AWS IAM for authentication.

eksctl create iamserviceaccount \
  --cluster=alb-demo-cluster \
  --namespace=kube-system \
  --name=aws-load-balancer-controller \
  --role-name AmazonEKSLoadBalancerControllerRole \
  --attach-policy-arn=arn:aws:iam::<aws-account-id>:policy/AWSLoadBalancerControllerIAMPolicy \
  --region us-east-2 \
  --approve

Validate the controller:
#kubectl get deployment -n kube-system aws-load-balancer-controller

Step 2: Install the AWS Load Balancer Controller:

The Helm command below installs the AWS Load Balancer Controller in the kube-system namespace and links it to the existing aws-load-balancer-controller service account.
#helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  -n kube-system \
  --set clusterName=alb-demo-cluster \
  --set serviceAccount.create=false \
  --set serviceAccount.name=aws-load-balancer-controller
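
If the eks chart repository is not registered yet, add it before running the install above (repo URL assumed from the eks-charts project):

#helm repo add eks https://aws.github.io/eks-charts
#helm repo update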

Step 3: Validate the load balancer controller:

#kubectl get deployment -n kube-system aws-load-balancer-controller
