In many CRM platforms, customers can connect multiple sources of data for different analytics and ML use cases such as Recommendations, Churn Prediction, Identity Resolution etc.

In this post we look at one of such use case which is Identity Resolution i.e. identifying which profile are the same. For e.g. users of a customer can come through multiple channels such as app, website, mobile browser, whatsapp, api etc. with different profiles.

Doing membership tests is one of the common use-cases required by many applications.

- In web crawling we want to know whether an URL has been crawled or not.
- In recommendations systems we want to know whether an item has already been recommended to a user or not.
- To prevent DDOS attacks and IP Spoofing we want to know whether an IP address has been blacklisted or not.
- In ad-words, we want to know whether there are any ads corresponding to the words and n-grams present in a query.
- In search we want to know whether a query term is indexed…

Collection of interesting and challenging Dynamic Programming problems and solutions (in Python).

**Problem 1:**

Given a string containing just the characters `'('`

and `')'`

, find the length of the longest valid (well-formed) parentheses substring. (https://leetcode.com/problems/longest-valid-parentheses/)

**Solution:**

Let dp[i] be the length of longest valid parenthesis ending at index i.

Note that dp[i] = 0 if s[i] = ‘(‘ because parenthesis cannot end with ‘(‘.

If s[i] = ‘)’ and s[i-1] = ‘(‘, then dp[i] = dp[i-2]+2, because valid parenthesis is of the form ‘A()’ where A is a valid parenthesis of length ≥ 0.

Else if s[i] = ‘)’ and…

Combinatorics problems are one of the most fascinating category of DS & Algo. Although the problems looks daunting at first but generally they have an elegant solution at the end.

But very few companies ask problems related to combinatorics mostly due to the fact that some of these problems could be very hard for a candidate to come up with a solution within 30 minutes if the candidates have not seen similar problems before. In this post I have tried to compile some of the ‘easier’ combinatorics problems.

The solution for most of these problems comes under the category of…

In this post we are going to solve problems related to one of my favourite data structures-Tries. This is probably going to be the longest post in this series as the variety of problems that can be solved using Tries are quite a lot. Apart from standard Trie implementation I will demonstrate Radix Tree, which is a compressed Trie (memory optimized Trie).

I will not get into explaining what a Trie data structure is, as that can be found on several websites. This post will mainly deal with some variety of problems that can be solved using Tries.

**Problem 1:**

…

In this post we are going to look at data structures for storing stream of numbers in a way such that the following operations can be efficiently performed (although the data structures we are going to visit here can do a lot more):

- Current minimum and maximum numbers.
- Median of all numbers seen so far.
- Smallest (Largest) number greater (smaller) than a given number.
- Count of all smaller (or larger) numbers given a number.
- Binary search on numbers seen so far.

For 1 and 2, we can use a Min-Heap and a Max-Heap. 3 and 4 are special cases of…

In the last post on binary search optimization, we found out that we can solve optimization problems using binary search under certain conditions. In this post we are going to look at another class of optimization problems that can again be solved using binary search technique. This class of problem is called the Longest Increasing Subsequence. I fondly call it the Russian Doll problem.

In its most simplest form, we are given an array A of size N. Each element of the array A is a real number. …

Binary Search is probably one of the most ‘interesting’ algorithm from our high school and sophomore college computer science course. But most of us have encountered binary search either in the context of searching in a sorted array or some tweaked versions of similar problem:

- Find an element in a rotated sorted array (with and without duplicates allowed).
- Find an element in the 2D sorted array, only rows sorted, only columns sorted and both row and column sorted versions.

While these problems are interesting enough to puzzle even the most well prepared interview candidates. But in this post I will…

In this post we are going to look at a problem that is very relevant from a network analysis point of view. In a circuit network of wires, we would like to identify vulnerable connections that could be a single point of failure. Single point of failures like these are known as Articulation Points.

In a social network we would like to identify people who sits at the vulnerable points and hold together multiple sub-networks. Targeting these people for marketing offers and promotions and preventing them from churning will ensure that the network is still connected. …

In this post we are going to look at one of the ‘simplest’ Dynamic Programming variation — Bitmasking. When I say ‘simplest’ I mean that ‘Bitmasking + DP’ problems are easy to identify and also most bit-masking problems have a more or less similar approach.

Some of the common traits of Bitmasking problems are:

- Path finding problems — Enumerate all possible paths or paths based on certain conditions.
- For a given input array A, depending in which order we go through the elements of the array A our objective function will change.
- They require exponential runtime.
- For folks into competitive…

Machine Learning Engineer, Algorithms and Programming