Projects
Automated Content Curation and Recommendation Platform
Python, MongoDB, Kafka, Kubernetes, Redis, Google Cloud
-
Designed and implemented a scalable content recommendation system built around user-to-topic mapping, using MongoDB to store user profiles and topic vectors generated with topic modeling (e.g., BERTopic).
-
Developed real-time data ingestion pipelines using Kafka and Kubernetes, integrating data from external APIs (e.g., News, Reddit, YouTube) and employing microservices to generate personalized feeds and summaries.
-
Reduced response times by caching pre-generated feeds in Redis, and built Prometheus and Grafana monitoring dashboards to track metrics such as cache hit ratios, query performance, and resource utilization.
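The feed caching and cache-hit-ratio tracking described above can be sketched roughly as follows. This is an illustration only, not the project's code: a plain dict stands in for Redis, and `FeedCache`, `demo_build_feed`, and the TTL value are hypothetical names and settings chosen for the example.

```python
import time
from typing import Callable, Dict, List, Tuple


class FeedCache:
    """Cache-aside store for pre-generated feeds with hit/miss counters.

    Assumption: an in-memory dict replaces Redis so the sketch is self-contained.
    """

    def __init__(self, build_feed: Callable[[str], List[str]], ttl_seconds: float = 300.0):
        self._build_feed = build_feed  # hypothetical feed generator
        self._ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, List[str]]] = {}  # user_id -> (expiry, feed)
        self.hits = 0
        self.misses = 0

    def get_feed(self, user_id: str) -> List[str]:
        now = time.monotonic()
        entry = self._store.get(user_id)
        if entry is not None and entry[0] > now:
            # Fresh entry found: serve it straight from the cache.
            self.hits += 1
            return entry[1]
        # Missing or expired: rebuild the feed and store it with a new expiry.
        self.misses += 1
        feed = self._build_feed(user_id)
        self._store[user_id] = (now + self._ttl, feed)
        return feed

    def hit_ratio(self) -> float:
        """Fraction of lookups served from the cache (0.0 before any lookup)."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def demo_build_feed(user_id: str) -> List[str]:
    # Placeholder for the real feed-generation service (hypothetical).
    return [f"{user_id}:item-{i}" for i in range(3)]


cache = FeedCache(demo_build_feed, ttl_seconds=60.0)
cache.get_feed("alice")  # first lookup for alice: miss
cache.get_feed("alice")  # second lookup for alice: hit
cache.get_feed("bob")    # first lookup for bob: miss
ratio = cache.hit_ratio()
```

In a deployed system the `hits` and `misses` counters would typically be exported as metrics so a dashboard can plot the ratio over time.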
Clinical Natural Language Technology for Health Care
NLP, OCR, Computer Vision, LLMs, Flask
-
Developed an automated medical document analysis tool using Flask, integrating OCR (Tesseract) for text extraction and NLP (spaCy, Hugging Face) for entity recognition and summarization.
-
Built a deep learning model (ResNet) for classifying medical images such as X-rays and MRIs using PyTorch, achieving 90%+ accuracy and enhancing diagnostic capabilities.
-
Developed a user-friendly Flask interface with integrated error handling, including real-time logging and exception tracking, so that healthcare professionals could process documents reliably and with minimal downtime.
Stock Market Prediction System
Data Mining, Statistics, Machine Learning, ETL, Regression, GANs
-
Built predictive models using regression and time series analysis, resulting in a 20% improvement in trading accuracy.
-
Implemented advanced data mining techniques and machine learning algorithms (regression, GANs) to enhance prediction performance.
-
Ensured data integrity and accuracy in trade and transaction reporting through robust ETL processes, improving overall reliability.
-
Demonstrated strong analytical and problem-solving skills in a fast-paced, data-driven environment, enabling more accurate market trend predictions and decision-making insights. The project reflects my approach to combining ML algorithms with large datasets for advanced analytics.
Benzene Level Analysis in North Carolina
Data Mining, Machine Learning, ETL
-
Collected air quality data via RESTful APIs and cleaned it, addressing missing values and outliers to ensure data integrity.
-
Applied machine learning models (linear regression, random forests) to predict benzene concentration levels, achieving a model accuracy of 85%.
-
Created interactive visualizations using Tableau and Matplotlib, allowing stakeholders to explore real-time spatiotemporal patterns of benzene.
Analysis of Football Match Performance
Power BI, Python (Pandas, NumPy), Statistical Analysis, Data Visualization
-
Conducted an in-depth statistical analysis of football match data, examining the correlation between various factors to derive performance insights.
-
Utilized Power BI to develop interactive visualizations, highlighting key performance metrics and trends that impacted the outcome of the match.
-
Employed Python for data cleaning and transformation, ensuring the integrity and accuracy of the dataset.
-
Performed statistical modeling to draw actionable conclusions and presented the findings in a comprehensive report with data-driven recommendations to improve performance.
Asteroid Detection System
Data Mining, Machine Learning, Convolutional Neural Networks (CNN), Statistics
-
Developed a CNN model to detect hazardous celestial bodies, achieving 92% detection accuracy through transfer learning and hyperparameter tuning.
-
Leveraged statistical methods and data mining techniques to identify critical features, improving the model’s precision and recall.
-
Integrated the model into a real-time analysis platform for continuous monitoring and threat detection, enhancing preparedness for astronomical events.
-
Collaborated with stakeholders to deliver a scalable solution.
Covid Prediction System
Data Mining, Machine Learning, Convolutional Neural Networks (CNN), Symptom Analysis
-
Developed a CNN-based prediction engine that estimates COVID-19 infection likelihood from patient symptoms, enabling early diagnosis and symptom-based recommendations that help local health management teams optimize resource allocation and manage daily reported cases.
-
Implemented advanced data preprocessing techniques, including outlier detection, imputation for missing values using k-nearest neighbors (KNN), and normalization of patient health metrics, resulting in an 18% increase in model accuracy.
-
Collaborated with healthcare authorities to integrate the model into existing infrastructure, enhancing patient care and operational efficiency.
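The KNN imputation and normalization steps mentioned above can be illustrated with a small from-scratch sketch. It is not the project's code: the function names, the choice of `k`, and the sample patient metrics (temperature, oxygen saturation) are all hypothetical, and a library routine such as scikit-learn's `KNNImputer` would likely be used in practice.

```python
import math
from typing import List, Optional

Row = List[Optional[float]]


def _distance(a: Row, b: Row) -> float:
    """Root-mean-square difference over the features present in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(rows: List[Row], k: int = 2) -> List[List[float]]:
    """Replace each missing value with the mean of that feature over the
    k nearest rows (by _distance) that have the feature observed."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = sorted(
                (_distance(row, other), other[j])
                for idx, other in enumerate(rows)
                if idx != i and other[j] is not None
            )
            nearest = [v for d, v in candidates[:k] if d != math.inf]
            if not nearest:
                raise ValueError(f"no neighbours available for row {i}, column {j}")
            filled[i][j] = sum(nearest) / len(nearest)
    return filled


def min_max_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Scale every column to the [0, 1] range; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


# Hypothetical sample: [body temperature in Celsius, oxygen saturation in %]
patients: List[Row] = [
    [36.5, 98.0],
    [37.0, 97.0],
    [39.0, None],  # missing oxygen saturation
    [39.5, 91.0],
    [38.5, 93.0],
]

imputed = knn_impute(patients, k=2)
normalized = min_max_normalize(imputed)
```

Here the missing saturation is filled from the two patients with the closest temperature, and both columns are then rescaled so that neither dominates a downstream model purely because of its units.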
Restaurant Management System
HTML/CSS, JavaScript, MySQL, Node.js
-
Developed a full-stack web application for managing online food orders, integrating a user-friendly front-end interface with a robust MySQL database for efficient order tracking and management.
-
Implemented real-time order status updates and seamless payment processing, reducing order processing time by 25% and improving operational efficiency.
-
Enhanced the user experience through responsive design and streamlined workflows, leading to a 30% increase in user satisfaction.
Interactive Sales Performance Insights Dashboard: Airline Industry
Tableau, Power BI, SQL, Python
● Designed and developed an interactive dashboard to monitor key sales metrics and customer behavior, enabling real-time decision-making.
● Integrated data from multiple sources to create a unified, dynamic view of sales performance.
● Implemented interactive filters and drill-down capabilities, leading to a 40% improvement in decision-making speed.
● Reduced reporting time by 30% by automating data updates and visualization refreshes.
● Collaborated with cross-functional teams to ensure the dashboard met business requirements and supported strategic initiatives.