My journey to becoming a Data Scientist


Data Scientist

Embedding in ML:

Have you ever been in a situation where your dataset has a lot of categorical variables and you want to build a model? One solution is One-Hot Encoding. In One-Hot Encoding, each categorical column is split into several columns, one for each category present in the column. Each new column holds only 0s and 1s. Normally, 1 means the category applies to that row (for example, an action happened) and 0 means it does not (the action did not happen).
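To make the idea concrete, here is a minimal sketch of One-Hot Encoding in plain Python. The `one_hot_encode` helper and the `colors` column are made up for illustration; in a real project a library encoder would normally do this job.

```python
def one_hot_encode(values):
    """Split one categorical column into one 0/1 column per category.

    Returns the sorted category names and, for each input value,
    a row with a 1 in the position of its category and 0 elsewhere.
    """
    categories = sorted(set(values))
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, rows


# Hypothetical categorical column, for illustration only.
colors = ["red", "green", "blue", "green"]

categories, encoded = one_hot_encode(colors)
print(categories)
for color, row in zip(colors, encoded):
    print(color, row)
```

Each row contains exactly one 1, so the number of new columns grows with the number of distinct categories in the original column.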


Knowledge Tracing

In this blog post I am going to talk about how we can create a model to track a student’s knowledge. Before I talk about the model, let’s briefly look at what knowledge tracing is, why it is important, and how it can help students or anyone who wants to learn new things.


Ordinal Encoding vs. One-Hot Encoding

Normally our dataset is a combination of numerical and categorical variables (columns). Since most machine learning models can only work with numbers, we need a way to use the categorical variables in our models. To solve this problem, we convert the categorical variables to numbers. This process is called categorical encoding. There are several categorical encoding techniques. Here I am going to briefly talk about which encoding techniques are appropriate when we have ordinal variables and when we have nominal variables.
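As a rough illustration of the difference, the sketch below encodes an ordinal column with integer ranks and a nominal column with one 0/1 column per category. The helper names and the sample `sizes` and `cities` columns are assumptions made up for this example.

```python
def ordinal_encode(values, ordered_categories):
    """Map each value to its rank in an explicitly given order."""
    rank = {category: index for index, category in enumerate(ordered_categories)}
    return [rank[value] for value in values]


def one_hot_columns(values, prefix):
    """Build one 0/1 column per category, keyed as 'prefix=category'."""
    return {
        f"{prefix}={category}": [1 if value == category else 0 for value in values]
        for category in sorted(set(values))
    }


# Ordinal variable: the categories have a natural order (hypothetical data).
sizes = ["small", "large", "medium", "small"]
size_codes = ordinal_encode(sizes, ["small", "medium", "large"])
print(size_codes)

# Nominal variable: no natural order, so integer ranks would be misleading.
cities = ["Paris", "Tokyo", "Lima", "Tokyo"]
city_columns = one_hot_columns(cities, "city")
for name, column in city_columns.items():
    print(name, column)
```

The integer codes keep the order small < medium < large, which is meaningful for sizes. Giving cities the same treatment would suggest an ordering between them that does not exist, which is why separate 0/1 columns are the safer choice for nominal variables.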


HeatMap & Correlogram:

Introduction:

In this post I am going to briefly talk about correlograms. A correlogram is a convenient way to show the relationships between numeric variables. Using different shades of color in the diagram helps readers get a better understanding of the relationships among the variables. Before I talk about correlograms, let’s talk about heatmaps. A correlogram is a variant of the heatmap that replaces each of the variables on the two axes with a list of the numeric variables in the dataset. [1]
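The numbers behind a correlogram are usually the pairwise correlation coefficients of the numeric variables. Below is a minimal sketch that computes such a matrix with the Pearson coefficient in plain Python. The `pearson` and `correlation_matrix` helpers and the sample data are assumptions for illustration, and the coloring step of a real correlogram is left to a plotting library.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


def correlation_matrix(columns):
    """Return variable names and the matrix of all pairwise correlations."""
    names = list(columns)
    matrix = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    return names, matrix


# Hypothetical numeric variables, for illustration only.
data = {
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 58, 65, 70, 79],
    "hours_gaming": [5, 4, 3, 2, 1],
}

names, matrix = correlation_matrix(data)
for name, row in zip(names, matrix):
    print(name, [round(value, 2) for value in row])
```

The same list of variables appears on both axes, so the matrix is symmetric and its diagonal is 1. A correlogram maps each cell to a color shade, typically one color for positive and another for negative correlations.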


Introduction to Supervised Learning in Machine Learning:

When I started learning Machine Learning, I found it confusing, especially when I wanted to use the many algorithms, techniques, and methods that are useful for different types of datasets. How should I use them? So, I decided to put everything in order. First, let’s talk about what exactly ML does.