Hello & Welcome

An eloquent tale of my journey
Hi 👋, I'm Saikrishna Paila, a data science professional with a Master's degree in Data Science from The George Washington University. I am adept at applying advanced analytical techniques, including predictive modeling, machine learning, and natural language processing, to uncover meaningful patterns and trends in complex datasets. My project work spans credit card churn analysis, supply chain management database optimization, and retail analytics for subscription insights, where I achieved high prediction accuracy through robust data preparation and model tuning.
- Website: www.saikrishnapaila.com
- Phone: +1 (571)-353-9563
- City: Arlington, VA
- Age: 22
- Degree: Master's
Skills I've Developed and Mastered
With a solid foundation in Data Science and Computer Science, I specialize in merging technical expertise with business insights to develop solutions that drive significant competitive advantages across various sectors.
- Advanced Data Analytics and Visualization
- Machine Learning and Predictive Modeling
- Deep Learning Algorithms
- Large Language Models


Work Experience
Jan 2025 – Present

AI Engineer (Capstone Fellow), The World Bank, Washington, D.C., USA
I built an AI-powered legal chatbot using Retrieval-Augmented Generation (RAG) by integrating LangChain, Streamlit, Pinecone, and LLaMA via the Groq API, enabling users to ask legal questions and receive accurate, document-grounded responses. As part of this, I collected and processed 478 legal documents from Ghana and Sierra Leone using web scraping and Google Vision OCR (90% accuracy), chunked the content, generated embeddings, and indexed them in Pinecone for semantic search. The full-stack LLM application was architected and deployed on Google Cloud Platform (GCP) using Docker, FastAPI, and CI/CD pipelines, ensuring scalable access and automated backend updates for internal stakeholders.
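The chunk, embed, and retrieve steps described above can be illustrated with a minimal, standard-library-only sketch. It is not the project's code: the function names (`chunk_text`, `embed`, `cosine`, `retrieve`) are hypothetical, the bag-of-words `embed` is a toy stand-in for a real embedding model, and the in-memory ranking stands in for a vector index such as Pinecone.

```python
import math
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping windows of words (hypothetical helper)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


def embed(text):
    """Toy embedding: lowercase bag-of-words counts.

    Assumption: a real pipeline would call an embedding model here.
    """
    return Counter(word.strip(".,;:?!").lower() for word in text.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]
```

In a RAG setup, the retrieved chunks would then be placed into the LLM prompt as context so that answers stay grounded in the source documents.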
March 2025 – Present

Student Technical Support Specialist, Washington, D.C., USA
Guided students on software setup, debugging code, managing Git workflows, and resolving Python, R, and SQL issues across Jupyter and VS Code environments.
Jan 2025 – Present

Graduate Instructional Assistant – Data Visualization, The George Washington University, Washington, D.C., USA
Guided undergraduate students in mastering data visualization techniques using tools such as Python and RStudio.
Assisted in designing assignments and grading projects that emphasized effective storytelling through visual analytics.
Delivered hands-on sessions to enhance students' skills in creating interactive dashboards and data-driven narratives.
Supported students in leveraging statistical methods and visualization principles to analyze and present data effectively.
Aug 2024 – Dec 2024

Graduate Instructional Assistant – Data Science Capstone, The George Washington University, Washington, D.C., USA
Guided undergraduate students in the Data Science Capstone course, assisting with data extraction, preprocessing, and analysis to develop comprehensive final-year projects.
Led discussions on project strategy, model evaluation, and data visualization, enhancing students' abilities to communicate findings effectively and adhere to data science best practices.
Education
Anticipated May 2025

Master of Science, Data Science, The George Washington University, Washington, D.C.
Relevant Coursework: Data Mining, Data Warehousing, Visualization of Complex Data, Machine Learning, Cloud Computing, Algorithm Design, NLP for Data Science, Linux for DevOps, Foundational Pedagogy GradAsst.
Jun 2019 - Jul 2023

Bachelor of Technology, Computer Science, Amrita Vishwa Vidyapeetham, India.
Relevant Coursework: Natural Language Processing, Machine Learning, Distributed Systems, DBMS, Time Series Analysis, Mining of Massive Datasets, Social Network Analysis.
Projects

AI Meets Law: Transforming Legal Research
Integrated Retrieval-Augmented Generation (RAG) and fine-tuned LLMs for efficient legal research, featuring a user-friendly Streamlit interface.

Text Classification Using SciBERT
Implemented multi-label classification with SciBERT, leveraging LSTM, CNN, and Fully Connected layers for scientific text categorization.
Data-Driven Insights for Apple Stock Market Predictions
Developed robust time-series forecasting models, including LSTM with XGBoost and Voting Regressor, and enhanced ARIMA models.

Advanced Data Visualization of Los Angeles Crime Patterns
Developed and optimized a data visualization platform for analyzing Los Angeles crime trends using Tableau and Python.
Credit Card Churn Analysis: Predictive Modeling for Customer Retention
Developed a churn prediction model using Random Forest, Logistic Regression, and Decision Tree algorithms.
Supply Chain Management Database Efficiency Analysis
Investigated the efficiency of MySQL, MongoDB, and Neo4j in supply chain management by benchmarking their performance.
Multi-Agent Financial Analytics Engine
Abstract: Built a multi-agent AI bot for real-time financial insights using OpenAI, Groq, and Yahoo Finance APIs.
Tools & Technologies Used: Python, FastAPI, OpenAI API, Groq API, Yahoo Finance API.
Conclusion: Delivered a web-based solution that automates financial analysis and insight generation.

AI Meets Law: Transforming Legal Research with RAG and Fine-Tuned LLM
Abstract: Integrated RAG and fine-tuned LLMs for efficient and precise legal research.
Tools & Technologies Used: Python, Streamlit, FAISS, GPT fine-tuning, SentenceTransformer
Conclusion: Delivered an AI solution for precise legal research with user-friendly interaction.

Text Classification Using SciBERT
Abstract: Multi-label text classification with SciBERT and a multi-head architecture for scientific abstracts.
Tools & Technologies Used: Python, SciBERT, PyTorch, LSTM, CNN
Conclusion: Delivered a high-performing text classifier leveraging SciBERT and innovative deep learning techniques.

Data-Driven Insights for Apple Stock Market Predictions
Abstract: Developed time-series forecasting models to analyze Apple Inc.'s stock data spanning nearly four decades.
Tools & Algorithms Used: Python, LSTM, XGBoost, Voting Regressor, ARIMA
Conclusion: Improved accuracy in stock predictions through feature engineering, hyperparameter optimization, and ensemble techniques.

Advanced Data Visualization of Los Angeles Crime Patterns
Abstract: Created a data visualization platform for analyzing Los Angeles crime trends using a dataset of 900K records.
Tools Used: Tableau, Python
Conclusion: Enhanced law enforcement strategies through interactive dashboards highlighting key geographic and temporal patterns.

Credit Card Churn Analysis
Abstract: Developed a churn prediction model to enhance customer retention in the credit card industry.
Tools & Algorithms Used: R, Random Forest, Logistic Regression, Decision Tree
Conclusion: Significantly improved customer loyalty strategies through analysis of transaction data.

Supply Chain Management Database Efficiency Analysis
Abstract: Investigated the efficiency of various databases in supply chain management using DataCo's dataset.
Tools Used: MySQL, MongoDB, Neo4j
Conclusion: Optimized database selection for supply chain applications based on data retrieval speed and network pathfinding performance.
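A comparison like this usually rests on a small timing harness. The sketch below is an assumption about how such a benchmark could be structured, not the project's actual code: `benchmark` and `compare` are hypothetical helpers, and each candidate is any zero-argument callable that runs one query against a given database.

```python
import statistics
import time


def benchmark(query_fn, repeats=5):
    """Run query_fn several times and return timing statistics in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        query_fn()
        timings.append(time.perf_counter() - start)
    return {
        "median": statistics.median(timings),
        "min": min(timings),
        "max": max(timings),
    }


def compare(candidates, repeats=5):
    """Benchmark each named callable and sort results by median time (fastest first)."""
    results = {name: benchmark(fn, repeats) for name, fn in candidates.items()}
    return sorted(results.items(), key=lambda item: item[1]["median"])


if __name__ == "__main__":
    # Stand-in workloads; real callables would issue the same logical query
    # through each database's client library.
    candidates = {
        "workload_a": lambda: sum(range(10_000)),
        "workload_b": lambda: sorted(range(10_000), reverse=True),
    }
    for name, stats in compare(candidates):
        print(f"{name}: median {stats['median']:.6f}s")
```

Reporting the median over several runs reduces the influence of one-off effects such as cold caches or connection setup.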

Retail Analytics: Utilizing ML for Subscription Insights
Abstract: Predictive analytics framework using Random Forest and Neural Network models for Customer Shopping Preference analysis.
Tools & Algorithms Used: Python, Random Forest, Neural Network
Conclusion: Achieved 97% prediction accuracy in customer subscription behavior through advanced data normalization, categorical encoding, and hyperparameter optimization techniques.

Indoor Multi-Camera Human Tracking
Abstract: Developed a state-of-the-art multi-camera people detection and tracking system for indoor environments.
Tools Used: Python, OpenCV, TensorFlow, Keras
Conclusion: Achieved substantial performance improvements in open environments by successfully addressing occlusion challenges.
Achievement Hub
Graduate Instructional Assistant – Data Science Capstone
Guided undergraduates through their end-to-end capstone projects—from idea generation to final presentations. Grateful to Professor Sushovan Majhi for his mentorship and trust throughout the semester.
Fine-Tune Your LLM
Through this course, I honed advanced skills in fine-tuning large language models to suit specific business and technical needs.
RAG and Fine-Tuning Explained
This course enhanced my understanding of Retrieval-Augmented Generation (RAG) and fine-tuning techniques for LLMs, showcasing their practical applications.
TensorFlow: Working with NLP
In this course, I acquired hands-on skills in TensorFlow for building natural language processing (NLP) models efficiently.
GPT-4 Foundations: Building AI-Powered Apps
This course provided me with foundational knowledge to build AI-powered applications using GPT-4, focusing on software development and generative AI principles.
Introduction to Generative AI with GPT
This achievement highlights my foundational knowledge in the principles of generative AI using GPT models, focusing on their capabilities and application scenarios.
Generative AI: Working with Large Language Models
I gained advanced skills in large language models (LLMs), natural language processing (NLP), and generative AI.
Introduction to Prompt Engineering for Generative AI
I gained essential skills in natural language processing, generative AI, and prompt engineering.
Generative AI: Introduction to Large Language Models
I gained valuable insights into the architecture and applications of large language models and generative AI.
Introduction to Large Language Models
I gained foundational knowledge in the architecture and functionality of large language models (LLMs).
Customer Service Role at GW Commencement Day
I worked at the GW Commencement Day on June 19, 2024, where I played a key role in organizing and facilitating the event. My responsibilities included providing excellent customer service to graduates and guests, ensuring a smooth and memorable experience for everyone involved.
Learning Design Thinking
I acquired advanced competencies in design thinking methodologies for the effective visualization and interpretation of complex datasets.
Design Thinking: Prototyping
I acquired advanced competencies in design thinking methodologies and prototyping techniques.
Student Athletic Assistant at George Washington University
This picture reflects my role as a Student Athletic Assistant at George Washington University, where I ensured excellent guest services and resolved attendee concerns at basketball events.
DataCamp Intermediate SQL
This certification from DataCamp affirms my intermediate skills in SQL, focusing on complex queries and database manipulation for data analysis.