I am a data scientist with a background in the pharmaceutical industry. This portfolio compiles repositories for data analysis and for exploring machine learning algorithms.
Projects
Patient Selection for Drug Testing using EHR Data
Built a machine learning model to identify diabetes patients who are likely to be treated with a novel diabetes drug, so that medical representatives can target their outreach to the corresponding physicians.
Sentiment Analysis of Patient Review of Drugs
Performed NLP sentiment analysis using a Random Forest classifier and an AWD-LSTM deep learning model to classify online drug reviews as positive, negative, or neutral.
Pneumonia Detection
In this project, a Convolutional Neural Network (CNN) was trained on data from the NIH Chest X-ray Dataset to classify a given chest x-ray for the presence or absence of pneumonia. This project culminates in a model that can predict the presence of pneumonia with human radiologist-level accuracy.
Covid-19 Quarantine Emissions Impact Model
Created a chart to visualize the impact of reduced mobility during the Covid-19 quarantine on emissions in the USA.
Plagiarism Checker
A plagiarism checker that uses MinHash and Jaccard similarity to find similar items.
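As a rough illustration of the idea (not the repository's actual code), the sketch below compares two texts by exact Jaccard similarity over word shingles and then approximates that value with MinHash signatures. The function names, the shingle size `k`, the number of hash functions, and the use of seeded BLAKE2b hashes are all assumptions made for this example.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles in a text (lowercased, whitespace-split)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def jaccard(a, b):
    """Exact Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def _hash(item, seed):
    """Hypothetical hash family: one 64-bit BLAKE2b hash per integer seed."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=256):
    """For each hash function, keep the minimum hash value over the set."""
    return [min(_hash(item, seed) for item in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates the Jaccard similarity."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


doc_a = shingles("the quick brown fox jumps over the lazy dog", k=2)
doc_b = shingles("the quick brown fox leaps over the lazy dog", k=2)
print("exact:   ", jaccard(doc_a, doc_b))
print("estimate:", estimate_jaccard(minhash_signature(doc_a), minhash_signature(doc_b)))
```

The signatures have a fixed length regardless of document size, which is what makes MinHash attractive when many documents must be compared pairwise. More hash functions give a tighter estimate at a higher cost.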
Network and Communities Clustering
This project showcases algorithms for network community analysis and clustering, specifically the Girvan-Newman community detection algorithm and the k-medoids algorithm.
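To give a sense of how Girvan-Newman works, here is a minimal pure-Python sketch of a single split. It repeatedly removes the edge with the highest betweenness until the graph breaks into one more connected component. It assumes an unweighted, undirected graph with no self-loops; the function names are illustrative and do not come from the repository.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted, undirected graph."""
    score = {}
    for source in adj:
        # BFS from the source, counting shortest paths to every reachable node.
        dist = {source: 0}
        paths = {source: 1}
        parents = {node: [] for node in adj}
        order = []
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    paths[nbr] = 0
                    queue.append(nbr)
                if dist[nbr] == dist[node] + 1:
                    paths[nbr] += paths[node]
                    parents[nbr].append(node)
        # Walk back from the farthest nodes, pushing path credit onto edges.
        credit = {node: 0.0 for node in order}
        for node in reversed(order):
            for parent in parents[node]:
                share = paths[parent] / paths[node] * (1.0 + credit[node])
                edge = frozenset((parent, node))
                score[edge] = score.get(edge, 0.0) + share
                credit[parent] += share
    return score


def components(adj):
    """Return the connected components as a list of node sets."""
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adj[node] - group)
        seen |= group
        result.append(group)
    return result


def girvan_newman_split(edges):
    """Remove highest-betweenness edges until the graph gains a component."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    start_count = len(components(adj))
    while len(components(adj)) == start_count:
        score = edge_betweenness(adj)
        if not score:  # no edges left to remove
            break
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Two triangles joined by a single bridge edge (2, 3).
bridge_graph = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
print(girvan_newman_split(bridge_graph))
```

Betweenness is recomputed after every removal, which is the defining (and most expensive) step of the algorithm. Applying the split recursively to each resulting component would produce the full hierarchy of communities.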
Probabilistic Latent Semantic Analysis with the EM Algorithm
This project used PLSA with the EM algorithm to determine topics from a corpus of news articles.
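As a hedged sketch of the technique rather than the project's implementation, the code below fits PLSA to a small document-word count matrix. The E-step computes the posterior P(z|d,w) ∝ P(z|d)·P(w|z), and the M-step re-estimates P(z|d) and P(w|z) from the count-weighted posteriors. The function names, the random initialisation, the fixed iteration count, and the toy matrix are assumptions chosen for illustration.

```python
import random


def normalize(values):
    """Scale non-negative values to sum to 1 (uniform if they are all zero)."""
    total = sum(values)
    if total <= 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def plsa(counts, num_topics, iterations=50, seed=0):
    """Fit PLSA with EM. counts[d][w] is the count of word w in document d.

    Returns (p_z_d, p_w_z), where p_z_d[d][z] = P(z|d) and p_w_z[z][w] = P(w|z).
    """
    rng = random.Random(seed)
    num_docs, num_words = len(counts), len(counts[0])
    p_z_d = [normalize([rng.random() for _ in range(num_topics)])
             for _ in range(num_docs)]
    p_w_z = [normalize([rng.random() for _ in range(num_words)])
             for _ in range(num_topics)]
    for _ in range(iterations):
        new_z_d = [[0.0] * num_topics for _ in range(num_docs)]
        new_w_z = [[0.0] * num_words for _ in range(num_topics)]
        for d in range(num_docs):
            for w in range(num_words):
                if counts[d][w] == 0:
                    continue
                # E-step: posterior over topics for this (document, word) pair.
                post = normalize([p_z_d[d][z] * p_w_z[z][w]
                                  for z in range(num_topics)])
                # M-step accumulation: expected counts per topic.
                for z in range(num_topics):
                    weight = counts[d][w] * post[z]
                    new_z_d[d][z] += weight
                    new_w_z[z][w] += weight
        # M-step: renormalise expected counts into probabilities.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z


# Toy corpus: 4 documents over a 4-word vocabulary (hypothetical counts).
toy_counts = [
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [0, 0, 5, 4],
    [0, 0, 4, 5],
]
doc_topics, topic_words = plsa(toy_counts, num_topics=2)
for d, row in enumerate(doc_topics):
    print(f"doc {d}: P(z|d) = {[round(p, 3) for p in row]}")
```

EM only guarantees convergence to a local optimum, so results depend on the initialisation; running several seeds and keeping the fit with the highest likelihood is a common precaution.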