Error Handling – Core Design Decision

Error handling in software is critical.
We often under-engineer our implementations around it.
Handling a few generic error messages is the easy part.

But,
1. How can the software recover gracefully from these errors?
2. How do we keep the customer experience from degrading after an error occurs?
3. How is the error logged, and how do we iterate on it with an intelligent fix?

These are the core questions that come to my mind when aiming for a clean error-handling implementation in software development.

#software #design #errorhandling #builditbetter

AWS – Extending EBS Block linked to EC2 Instance

Say you are working on an EC2 instance with an EBS block provisioned.
But later you find that the storage already provisioned is insufficient. You might need to increase the volume size.

It’s pretty straightforward on the AWS console. With the click of a button you can tell AWS to increase your EBS volume size.
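If you prefer the command line, the same resize can also be requested with the AWS CLI. A minimal sketch (the volume ID and the 16 GiB target size below are placeholders, not values from the original setup):

aws ec2 modify-volume --volume-id vol-0123456789abcdef0 --size 16

# Optionally, watch the modification until it reaches the optimizing/completed state
aws ec2 describe-volumes-modifications --volume-ids vol-0123456789abcdef0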

After this change, AWS takes a few minutes to extend the volume, but the extra space does not automatically reflect on the EC2 instance.

A few manual commands need to be run on the instance to extend the existing partition and filesystem from the old size to the newly assigned, bigger one.

Here are the steps that got it working for me, after connecting to the EC2 instance attached to the volume.



1. df -hT -> confirm the existing storage size.

2. lsblk -> list all volumes and their partitions.

NAME          MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme1n1       259:0    0  30G  0 disk /data
nvme0n1       259:1    0  16G  0 disk
├─nvme0n1p1   259:2    0   8G  0 part /
└─nvme0n1p128 259:3    0   1M  0 part

3. sudo growpart /dev/nvme0n1 1 -> grow partition 1 so it uses the newly added space.

4. sudo resize2fs /dev/nvme0n1p1 -> grow the ext4 filesystem to fill the resized partition (note that resize2fs targets the partition holding the filesystem, not the whole disk).

This results in the EC2 instance now having access to the whole upgraded EBS volume.
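One caveat worth noting: resize2fs applies to ext2/3/4 filesystems. If the volume is formatted with XFS instead (the default on recent Amazon Linux AMIs), the growpart step stays the same, but the filesystem is grown with xfs_growfs, and df -hT can be re-run to confirm the new size:

sudo xfs_growfs -d /     # grow an XFS root filesystem to fill its partition
df -hT                   # confirm the filesystem now reports the larger size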

Signing out,
VJ

3 Sneaky Cyber Security Threats to watch out for in 2022.

2022 seems to be an interesting time in the Cyber Security landscape, as the number of cyber crimes is increasing at an alarming rate. Three sneaky threats to watch out for are :-

Magecart Attack

Magecart is a type of data skimming used by attackers to capture sensitive information. Attackers are termed ‘Threat Actors’ in the Cyber Security domain and, from here on in this article, we will refer to them that way.

In Magecart attacks, threat actors capture sensitive information such as email addresses, passwords, and credit card details through malicious code they implant in websites. They sell this stolen data on the dark web. These attacks mostly target consumer-facing browsers and apps.

Credential Stuffing Attack

In this type of attack, threat actors use a list of compromised user credentials to breach multiple systems. Many users reuse usernames and passwords across multiple platforms, so their accounts can potentially be compromised with this method. The attacks are usually carried out with the help of a well-automated system of software bots. Statistically, about 0.1% of breached credentials result in a successful login on a new service. Sadly, even now, many users keep the same password on multiple platforms, making them easy victims for these sophisticated threat actors.

Password Spraying Attack

Password spraying, as the name suggests, ‘sprays’ a single password across multiple usernames on a platform to gain unauthorized access to it. In contrast to brute-force attacks, which try out multiple passwords on a single username, this attack uses a password only once with a username before moving on to the next one. This neatly avoids an account getting locked out due to multiple failed login attempts, so the threat actor remains undetected by the system and continues to be on the prowl, searching for vulnerable accounts.

AWS Elastic IP Pricing: A tricky affair

Elastic IP Pricing is tricky 🙂 Contrary to AWS’s usual pay-as-you-go model, Elastic IP charges follow a pay-as-you-don’t-use model.

As per the AWS documentation, Elastic IP addresses are NOT CHARGED when all of these conditions are met :-

  • The Elastic IP address is associated with an EC2 instance.
  • The instance associated with the Elastic IP address is running.
  • The instance has only one Elastic IP address attached to it.
  • The Elastic IP address is associated with an attached network interface, such as a Network Load Balancer or NAT gateway.

To summarise, this means Elastic IPs are only charged when they are idle, i.e. not attached to a running AWS resource. So just ensure that every Elastic IP you provision is being actively used 🙂
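As a quick way to audit this, the AWS CLI can list Elastic IPs that currently have no association. A rough sketch (the JMESPath filter here is my own and may need adjusting):

aws ec2 describe-addresses \
  --query 'Addresses[?AssociationId==`null`].[PublicIp,AllocationId]' \
  --output table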

Problem Solved!

Signing Out,
VJ


This article was previously published on Medium.

Use of Correlation IDs to track various downstream service interactions

With a large number of services interacting to provide any given end-user capability, a single initiating call can end up generating multiple more downstream service calls. For example, consider the example of a customer being registered. The customer fills in all her details in a form and clicks submit. Behind the scenes, we check validity of the credit card details with our payment service, talk to our postal service to send out a welcome pack in the post, and send a welcome email using our email service. Now what happens if the call to the payment service ends up generating an odd error? We’ll talk at length about handling the failure in Chapter 11, but consider the difficulty of diagnosing what happened.

If we look at the logs, the only service registering an error is our payment service. If we are lucky, we can work out what request caused the problem, and we may even be able to look at the parameters of the call. Now consider that this is a simple example, and that one initiating request could generate a chain of downstream calls and maybe events being fired off that are handled in an asynchronous manner. How can we reconstruct the flow of calls in order to reproduce and fix the problem? Often what we need is to see that error in the wider context of the initiating call; in other words, we’d like to trace the call chain upstream, just like we do with a stack trace.

One approach that can be useful here is to use correlation IDs. When the first call is made, you generate a GUID for the call. This is then passed along to all subsequent calls, as seen in Figure 8-5, and can be put into your logs in a structured way, much as you’ll already do with components like the log level or date. With the right log aggregation tooling, you’ll then be able to trace that event all the way through your system:

15-02-2014 16:01:01 Web-Frontend INFO [abc-123] Register
15-02-2014 16:01:02 RegisterService INFO [abc-123] RegisterCustomer …
15-02-2014 16:01:03 PostalSystem INFO [abc-123] SendWelcomePack …
15-02-2014 16:01:03 EmailSystem INFO [abc-123] SendWelcomeEmail …
15-02-2014 16:01:03 PaymentGateway ERROR [abc-123] ValidatePayment …

Book: Building Microservices by Sam Newman
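To make the idea concrete, here is a minimal Ruby sketch of my own (the service names, the X-Correlation-Id header, and the endpoint URL are assumptions, not from the book): the initiating service generates the GUID once, logs it, and forwards it with every downstream call so that each service can write it into its own log lines.

require 'securerandom'
require 'logger'
require 'net/http'

logger = Logger.new($stdout)

# Generate the correlation ID once, at the first (initiating) call
correlation_id = SecureRandom.uuid
logger.info("[#{correlation_id}] RegisterCustomer received")

# Pass the same ID along with every downstream call, e.g. to the payment service
uri = URI('http://payment-service.internal/validate')   # hypothetical endpoint
request = Net::HTTP::Post.new(uri)
request['X-Correlation-Id'] = correlation_id

response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
logger.info("[#{correlation_id}] ValidatePayment responded with #{response.code}")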

Synthetic Monitoring: An effective way of monitoring a Microservices Application

While monitoring a Microservices application, most of the time we look at the CPU and memory values of various instances to understand whether the system is doing well. Given below is another approach: monitoring the overall health of the application without checking low-level machine stats like the CPU and memory usage of individual instances.

“I first did this back in 2005. I was part of a small ThoughtWorks team that was building a system for an investment bank. Throughout the trading day, lots of events came in representing changes in the market. Our job was to react to these changes, and look at the impact on the bank’s portfolio. We were working under some fairly tight deadlines, as the goal was to have done all our calculations in less than 10 seconds after the event arrived. The system itself consisted of around five discrete services, at least one of which was running on a computing grid that, among other things, was scavenging unused CPU cycles on around 250 desktop hosts in the bank’s disaster recovery center.

The number of moving parts in the system meant a lot of noise was being generated from many of the lower-level metrics we were gathering. We didn’t have the benefit of scaling gradually or having the system run for a few months to understand what good looked like for metrics like our CPU rate or even the latencies of some of the individual components. Our approach was to generate fake events to price part of the portfolio that was not booked into the downstream systems. Every minute or so, we had Nagios run a command-line job that inserted a fake event into one of our queues. Our system picked it up and ran all the various calculations just like any other job, except the results appeared in the junk book, which was used only for testing. If a re-pricing wasn’t seen within a given time, Nagios reported this as an issue.”

Book: Building Microservices by Sam Newman
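A rough Ruby sketch of the same idea (every endpoint and field name below is an assumption for illustration): inject a fake, clearly tagged event, then fail the check if the expected result does not appear within a deadline.

require 'securerandom'
require 'net/http'
require 'json'
require 'uri'

EVENT_URL      = URI('http://pricing-service.internal/events')    # assumed endpoint
RESULT_URL_FMT = 'http://pricing-service.internal/results/%s'     # assumed endpoint
DEADLINE       = 10  # seconds, mirroring the 10-second goal in the excerpt

# Insert a fake event, tagged so that its results land in a junk/test book
event_id = SecureRandom.uuid
Net::HTTP.post(EVENT_URL, { id: event_id, synthetic: true }.to_json,
               'Content-Type' => 'application/json')

# Poll for the re-pricing result until the deadline passes
deadline = Time.now + DEADLINE
until Time.now > deadline
  response = Net::HTTP.get_response(URI(format(RESULT_URL_FMT, event_id)))
  exit 0 if response.is_a?(Net::HTTPSuccess)   # result showed up in time: check passes
  sleep 1
end

warn "synthetic event #{event_id} was not re-priced within #{DEADLINE}s"
exit 1   # a scheduler such as Nagios treats the non-zero exit as a failed check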

How much Virtualization is too much?

This is one of the best explanations of virtualization I’ve read:-


“Virtualization allows us to slice up a physical server into separate hosts, each of which can run different things. So if we want one service per host, can’t we just slice up our physical infrastructure into smaller and smaller pieces? Well, for some people, you can. However, slicing up the machine into ever increasing VMs isn’t free. Think of our physical machine as a sock drawer. If we put lots of wooden dividers into our drawer, can we store more socks or fewer? The answer is fewer: the dividers themselves take up room too! Our drawer might be easier to deal with and organize, and perhaps we could decide to put T-shirts in one of the spaces now rather than just socks, but more dividers means less overall space.”

Book: Building Microservices by Sam Newman

Solr: Using Analysers & Filters to Analyse Queries

A filter may also do more complex analysis by looking ahead to consider multiple tokens at once, although this is less common. One hypothetical use for such a filter might be to normalize state names that would be tokenized as two words. For example, the single token “california” would be replaced with “CA”, while the token pair “rhode” followed by “island” would become the single token “RI”.

Because filters consume one TokenStream and produce a new TokenStream, they can be chained one after another indefinitely. Each filter in the chain in turn processes the tokens produced by its predecessor. The order in which you specify the filters is therefore significant. Typically, the most general filtering is done first, and later filtering stages are more specialized.

<fieldType name="text" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"/>
  </analyzer>
</fieldType>

This example starts with Solr’s standard tokenizer, which breaks the field’s text into tokens. Those tokens then pass through Solr’s standard filter, which removes dots from acronyms, and performs a few other common operations. All the tokens are then set to lowercase, which will facilitate case-insensitive matching at query time.
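As a rough illustration of the chain (my own example, not taken from the Solr documentation), an input like "I.B.M. Hugging" might move through the analyzer as follows:

StandardTokenizerFactory    ->  "I.B.M.", "Hugging"
StandardFilterFactory       ->  "IBM", "Hugging"       (dots removed from the acronym)
LowerCaseFilterFactory      ->  "ibm", "hugging"
EnglishPorterFilterFactory  ->  "ibm", "hug"           (reduced to the stem)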

 

The last filter in the above example is a stemmer filter that uses the Porter stemming algorithm. A stemmer is basically a set of mapping rules that maps the various forms of a word back to the base, or stem, word from which they derive. For example, in English the words “hugs”, “hugging” and “hugged” are all forms of the stem word “hug”. The stemmer will replace all of these terms with “hug”, which is what will be indexed. This means that a query for “hug” will match the term “hugged”, but not “huge”.

Conversely, applying a stemmer to your query terms will allow queries containing non stem terms, like “hugging”, to match documents with different variations of the same stem word, such as “hugged”. This works because both the indexer and the query will map to the same stem (“hug”).

Word stemming is, obviously, very language specific. Solr includes several language-specific stemmers created by the Snowball generator that are based on the Porter stemming algorithm. The generic Snowball Porter Stemmer Filter can be used to configure any of these language stemmers. Solr also includes a convenience wrapper for the English Snowball stemmer. There are also several purpose-built stemmers for non-English languages. These stemmers are described in Language Analysis.

 

Courtesy: lucene.apache.org

Method: ActiveRecord::Base.import

Defined in:
lib/activerecord-import/import.rb

.import(*args) ⇒ Object

Imports a collection of values to the database.

This is more efficient than using ActiveRecord::Base#create or ActiveRecord::Base#save multiple times. This method works well if you want to create more than one record at a time and do not care about having ActiveRecord objects returned for each record inserted.

This can be used with or without validations. It does not utilize the ActiveRecord::Callbacks during creation/modification while performing the import.

Usage

Model.import array_of_models
Model.import column_names, array_of_values
Model.import column_names, array_of_values, options

Model.import array_of_models

With this form you can call import passing in an array of model objects that you want updated.

Model.import column_names, array_of_values

The first parameter column_names is an array of symbols or strings which specify the columns that you want to update.

The second parameter, array_of_values, is an array of arrays. Each subarray is a single set of values for a new record. The order of values in each subarray should match up to the order of the column_names.

Model.import column_names, array_of_values, options

The first two parameters are the same as the above form. The third parameter, options, is a hash. This is optional. Please see below for what options are available.
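A quick sketch of the second and third forms (the Book model and its columns are purely illustrative, not part of the gem's documentation):

columns = [:title, :author]
values  = [
  ['Building Microservices', 'Sam Newman'],
  ['The Well-Grounded Rubyist', 'David A. Black']
]

# Plain import of two rows
Book.import columns, values

# Same import, but skipping ActiveRecord validations via the options hash
Book.import columns, values, validate: false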

Options

  • validate – true|false, tells import whether or not to use ActiveRecord validations. Validations are enforced by default.
  • on_duplicate_key_update – an Array or Hash, tells import to use MySQL's ON DUPLICATE KEY UPDATE ability. See On Duplicate Key Update below (a short sketch also follows this list).
  • synchronize – an array of ActiveRecord instances for the model that you are currently importing data into. This synchronizes existing model instances in memory with updates from the import.
  • timestamps – true|false, tells import to not add timestamps (if false) even if record timestamps is disabled in ActiveRecord::Base.
  • recursive – true|false, tells import to import all autosave associations if the adapter supports setting the primary keys of the newly imported objects.
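For example, a hedged sketch of the on_duplicate_key_update option on MySQL (again assuming an illustrative Book model with a unique index on isbn):

columns = [:isbn, :title, :author]
values  = [['9781491950357', 'Building Microservices', 'Sam Newman']]

# If a row with the same unique key already exists, only :title is updated
Book.import columns, values, on_duplicate_key_update: [:title]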