Data Science & ML Projects 

My Research Assistant

An end-to-end Retrieval-Augmented Generation (RAG) web app that interactively answers questions about my peer-reviewed publications. It works as an AI research assistant: it retrieves relevant passages from dense academic papers on supermassive black holes, active galactic nuclei (AGNs), blazars, and quasi-periodic oscillation (QPO) analysis of blazars, and uses them to answer the question.

Python, LangChain, RAG, NLP, Streamlit
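
The retrieval step of a RAG pipeline can be illustrated with a minimal, standard-library sketch. It does not use LangChain or a real embedding model: word-count cosine similarity stands in for dense embeddings, and the function names (`retrieve`, `build_prompt`) and the placeholder passages in `CHUNKS` are assumptions made up for this example, not taken from the app.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, k=2):
    """Return up to k chunks most similar to the question, best first."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks that share no tokens with the question.
    return [c for score, c in scored[:k] if score > 0]


def build_prompt(question, chunks, k=2):
    """Assemble the text that would be sent to the language model."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical placeholder passages, not excerpts from the actual papers.
CHUNKS = [
    "Placeholder passage about quasi-periodic oscillations in blazar light curves.",
    "Placeholder passage about accretion onto supermassive black holes.",
    "Placeholder passage about stock market volatility.",
]
```

In a real deployment the word-count vectors would be replaced by embeddings stored in a vector index, and the prompt returned by `build_prompt` would be passed to the language model.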

Stock Market Forecasting

The S&P 500 index tracks roughly 500 of the largest publicly traded companies in the US stock market. In this project, I use multiple statistical and machine learning models to forecast the market trend. See for yourself how the different models stack up against each other and how effectively they forecast an impending market crash.

Python, Pandas, HMMLearn, Time Series Analysis

ML Classifications of Fermi-LAT Blazars

Raw data from the Fermi-LAT telescope are analyzed to classify blazars as BL Lac objects (BLL) or flat-spectrum radio quasars (FSRQ). Three classifiers were trained: Decision Tree (DT), XGBoost gradient-boosted decision trees (GBDT), and Random Forest (RF). The GBDT classifier was the most accurate, with an accuracy above 90%.

Python, XGBoost, Scikit-learn, Random Forest

Image style transfer using TensorFlow

Transfer the artistic style of a 'style' image to a 'content' image. In this project, I use a pretrained VGG19 convolutional network to transform the style of an image.

Python, TensorFlow, VGG19, Deep Learning
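
In the standard neural style transfer formulation, the "style" of an image is summarized by Gram matrices of feature maps taken from network layers such as those of VGG19. The sketch below shows that computation on plain Python lists rather than TensorFlow tensors. It assumes the project follows this standard formulation, and the helper names `gram_matrix` and `style_loss` are hypothetical.

```python
def gram_matrix(features):
    """Channel-by-channel correlations of a feature map.

    `features` is a list of spatial positions, each a list of C channel
    activations. The result is a C x C matrix averaged over positions,
    so it ignores where in the image a feature occurred.
    """
    n = len(features)
    c = len(features[0])
    return [
        [sum(f[i] * f[j] for f in features) / n for j in range(c)]
        for i in range(c)
    ]


def style_loss(generated, style):
    """Mean squared difference between the two Gram matrices."""
    g = gram_matrix(generated)
    s = gram_matrix(style)
    c = len(g)
    return sum((g[i][j] - s[i][j]) ** 2 for i in range(c) for j in range(c)) / (c * c)
```

Because the Gram matrix averages over spatial positions, two feature maps with the same activations in a different spatial arrangement give a style loss of zero, which is why this loss captures texture rather than layout.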

ArcGIS Pro Mapping of Tigerbird Habitat in Saluda Basin

Using ArcGIS Pro, I created a habitat map of the Tigerbird in the Saluda Basin. The map uses intersection, buffer, and other spatial analysis tools. The data for this project was provided by Clemson University.

ArcGIS Pro, Data Analysis, Spatial Analysis, Data Visualization

ArcGIS Pro Mapping of population density in the state of Georgia

Using ArcGIS Pro, I created a map of population density across the counties of the state of Georgia. The map uses a gradient color scale for counties with a population below 100,000, and it also shows the distribution of the total population in each county. The data for this project were sourced from the US Census Bureau.

ArcGIS Pro, Data Analysis, Data Visualization

Hangman ML Solver

This project implements an N-Gram model that learns word patterns from a dictionary to solve the Hangman word game. The N-Gram model is currently ~66% accurate when tested against the training dictionary.

Python, N-Gram, NLP, Jupyter
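
The idea of guessing Hangman letters from n-gram statistics can be sketched as follows. This is a simplified illustration, not the project's model: it uses only character bigrams (n = 2), and the names `train` and `next_guess` and the toy word list are assumptions introduced here.

```python
from collections import Counter


def train(words):
    """Count letters and adjacent-letter pairs, with ^ and $ as word boundaries."""
    unigrams, bigrams = Counter(), Counter()
    for word in words:
        word = word.lower()
        padded = f"^{word}$"
        unigrams.update(word)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def next_guess(pattern, guessed, unigrams, bigrams):
    """Pick the unguessed letter best supported by the neighbours of each blank.

    `pattern` uses "_" for hidden letters, e.g. "ca_".
    """
    pattern = pattern.lower()
    padded = f"^{pattern}$"
    # Letters already revealed cannot fill a blank, so exclude them too.
    excluded = set(guessed) | (set(pattern) - {"_"})
    candidates = [letter for letter in unigrams if letter not in excluded]
    if not candidates:
        return None

    scores = Counter()
    for i, ch in enumerate(padded):
        if ch != "_":
            continue
        left, right = padded[i - 1], padded[i + 1]
        for letter in candidates:
            if left != "_":
                scores[letter] += bigrams[(left, letter)]
            if right != "_":
                scores[letter] += bigrams[(letter, right)]

    # Break ties by overall letter frequency, then alphabetically.
    return max(candidates, key=lambda letter: (scores[letter], unigrams[letter], letter))


# Toy dictionary for illustration only.
WORDS = ["cat", "car", "can", "cab", "bat"]
```

With this toy dictionary, `next_guess("ca_", {"c", "a"}, *train(WORDS))` returns `"t"`, because "at" and a word-final "t" are the most frequent patterns next to the blank. Longer n-grams would use more context per blank at the cost of sparser counts.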

Enhancing Periodicity Analysis Accuracy Through Phase Fold Amplitude Minimization (PFAM) Technique

In time-series astronomy, periodicity analysis is one of the most useful tools for understanding an astrophysical system. Research papers on periodicity often show phase-folded plots to emphasize the presence of a period. Depending on the quality of the data points, the phase-folded light curves (LCs) can also be used to refine the observed period. I discuss an amplitude-minimization technique applied to the phase-folded LC, which can improve the accuracy of the periods measured in astrophysical LCs.

Python, Time Series Analysis, Astronomy, Statistics
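
The general idea of refining a period with a phase-folded light curve can be sketched as below. The exact PFAM statistic is not given here, so this example swaps in a binned-scatter criterion in the spirit of phase dispersion minimization: it folds the light curve at trial periods near an initial guess and keeps the period whose folded curve is tightest. The names `fold`, `folded_scatter`, and `refine_period`, and all parameter defaults, are assumptions for illustration.

```python
import math


def fold(times, period):
    """Map each time stamp to a phase in [0, 1) for the given period."""
    return [(t % period) / period for t in times]


def folded_scatter(times, fluxes, period, n_bins=10):
    """Mean within-bin variance of the phase-folded light curve.

    A correct period stacks cycles on top of each other, so points that
    share a phase bin have similar fluxes and the scatter is small.
    """
    bins = [[] for _ in range(n_bins)]
    for phase, flux in zip(fold(times, period), fluxes):
        bins[min(int(phase * n_bins), n_bins - 1)].append(flux)

    total, count = 0.0, 0
    for members in bins:
        if len(members) < 2:
            continue
        mean = sum(members) / len(members)
        total += sum((f - mean) ** 2 for f in members)
        count += len(members)
    return total / count if count else math.inf


def refine_period(times, fluxes, guess, half_width=0.05, steps=201):
    """Scan trial periods within +/- half_width (fractional) of the guess."""
    trials = [
        guess * (1 - half_width + 2 * half_width * i / (steps - 1))
        for i in range(steps)
    ]
    return min(trials, key=lambda p: folded_scatter(times, fluxes, p))
```

A typical use would take the peak of a periodogram as `guess` and call `refine_period(times, fluxes, guess)` to search a narrow window around it. The achievable precision depends on the number of cycles covered and on the noise level of the data.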

Can Blazar flares in gamma-ray LCs be explained by jet angle and geometry?

In Prep...

Python, Astrophysics, Supermassive Black Holes, Data Analysis