machine-learning – Data Insights

Okay, so I think that it has been two weeks or a little more since the last post of my 100 Days of Machine Learning. I made at least 100 posts, one for each day. This post is meant to finally close this part of my journey and give my thoughts on this 100 days journey. I took some time before writing this post to make sure that I got to think about the questions I will try to answer below and things I will reflect on.

Why the 100 Days of Machine Learning?

When I got started with Machine Learning late last year, I had to go through a whole bunch of drivel before I found the substance of things that I would need to learn. I started off by watching most of the videos of Andrew Ng’s Intro to Machine Learning course and I could create simple machine learning solutions by the new year 2018. I spent another six months watching a lot of tutorials, reading books about machine learning, then neural networks, deep learning.
However, I was clearly lacking the practice to gain the confidence needed in creating deployable machine learning and deep learning models. I had gone so far down the theory rabbit hole that I had to do something drastic to pull myself out of it. This was when I watched this video of Siraj Raval’s where he talked of the 100 Days of Machine Learning and its focus on learning/implementation. The idea was simple; spend an hour a day, at least, for 100 days to learn and implement machine learning. I had nothing to lose, so I took on the challenge. I started the 100 Days of Machine Learning in July 2018 and by the end of July, I had spent upwards to 150 hours on just implementation. This was when I started to gain some confidence in my skills and I have only built on top of that.

What areas am I looking forward to working in?

When I started with 100 Days of Machine Learning, I wasn’t quite sure what areas I wanted to work on. I attribute it to the vast amount of things that machine learning and deep learning are applied to. If you take a casual look at Kaggle, you would see a whole wide array of projects; ranging from housing prices to stock movements to predicting cancer by looking at slides. It was about 60 or so days into my 100 Days journey that I found what I was looking for. I implemented a 2-D Convolutional Neural Network to predict patients who had pneumonia. This Neural Network reached accuracy a high as 90%. This was the most satisfying exercise I had done in implementing machine learning and artificial intelligence. I had finally found the sense of purpose when it came to ML/AI which I had spent almost a year looking for. I come from a medical background and maybe it is just how my mind is wired but I felt the most fulfilled when I was either implementing or developing Machine Learning/Deep Learning solutions to problems in the domain of medicine.
I will be focusing on using Deep Learning to automate features and the process surrounding disease detection and diagnosis.

What will I be doing from here on?

Now that I know for a fact that I have some form of a grasp on the theory of Deep Learning, I am focused on developing the skills that are needed to develop end-to-end deep learning solutions. I have worked with AWS EC2 and SQL. I will be spending the next few weeks to develop such skills and create a sample project that I hope to have by the end of the year, if not very early next year. I would try to make it a relatively simple project to get it done and then move on to more complex projects. Next year is all going to be creating more Deep Learning based projects in the domain of medicine and helping alleviate some of the pressures around the bureaucracy of Health Care.

Things I learned from this experience.

Machine Learning worflow form data cleaning to deploying machine learning/deep learning algorithms.
Creating Kaggle/data science workflows.
Learn that most of data science/machine learning is just data pre-processing.
When to take a break after being stuck on a problem for a while.
Mindset needed to develop Deep Learning solutions.
The roadmap to applying Deep Learning to medical and health care problems.

Projects I developed/am proud of.

I am proud of every project that I developed in this time. This is the list of the ones that I developed with short intros:

THIS BLOG!!!!
Breaking down the viewer ratings of movies based on their genre.
- https://github.com/sanster9292/100-days-of-ML-MovieLens-data-
Exploring the link between a country’s population and its fertility rate.
- https://www.kaggle.com/sanwal092/fertility-rate-vs-population
Predicting if Jack would have survived in the movie Titanic using Machine Learning.
- https://www.kaggle.com/sanwal092/would-jack-have-lived
Using Neural Networks to predict pneumonia in patients by looking at their x-ray scans.
- https://www.kaggle.com/sanwal092/intro-to-cnn-using-keras-to-predict-pneumonia
Data Analysis of a survey conducted by Kaggle. I broke the data down by gender, educational backgrounds and the salaries Data Scientists make.
- https://www.kaggle.com/sanwal092/gender-education-and-salaries
FINAL PROJECT: Training a Neural Network to predict the brand of cars by looking at their images.
- https://www.kaggle.com/sanwal092/identifying-audi-bmw-and-mercedes-with-a-cnn

Soooo….

I have loved sharing details about what I am working on. It has served as a way to keep me honest. I will be starting another blog to document my next phase of the journey. I am not sure if it will be on WordPress though. I am working on some very neat things and I would love everyone to join me in this. I will be sharing a lot of my work on these platforms:
LinkedIn: https://www.linkedin.com/in/sanwal-yousaf/
GitHub: https://github.com/sanster9292
Kaggle: https://www.kaggle.com/sanwal092
Join me as I share the details of my adventures down this data wormhole!!!
Until then, it is a wrap for me on this blog!!!!

giphy

November 11th, 2018,

So I think developing the intuition behind Regularization has been the hardest thing I have done since understanding backpropagation. I had to really use the Feynman method for learning this concept. I had to write down over and over again what I understood, then look for things that I couldn’t quite explain in the simplest language without throwing tech lingo.

Regularization, as I understand it in Neural Networks so far, is penalizing the network when it is overfitting the data. This is done by introducing a regularization operator and if that is too high, then the weights of the layers are reduced though not quite zeroed out. This reduces the complexity of the Neural Network and fixes the overfitting problem.

I had to repeat this process probably around 10 times and I think I understand most of it. I am going to repeat this process a few more times and when I can explain this process in the simplest way without resorting to convoluted words, I will be satisfied. Once I am satisfied with my understanding of Regularization, I am going to write an explainer about it, sometime this week.

Category: machine-learning

100 Days of Machine Learning: A retrospective and a look ahead

Why the 100 Days of Machine Learning?

What areas am I looking forward to working in?

What will I be doing from here on?

Things I learned from this experience.

Projects I developed/am proud of.

Soooo….

Day 98 of 100: I FINALLY UNDERSTAND REGULARIZATION….WELL ALMOST!!

Day 97 of 100: Understanding Regularization

Day 96 of 100: Predicting car’s make and model from an image.

Day 95 of 100: Getting ready for the final project for the 100 Days of ML