Welcome to the portfolio of Virendra Singh Kaira (M.Tech, Data Science)

Deployed Projects

• English to Hindi Translator Anuvadak

Welcome to our English to Hindi translator application! This tool uses "Helsinki-NLP/opus-mt-en-hi", a pre-trained neural machine translation model published on Hugging Face. Built on the Langchain-HuggingFace integration, the application lets you paste English text and receive its Hindi translation with minimal effort.


• Synergizing RAG and LLMs for Enhanced Text Generation - Quantum Drug Discovery RAG-LLM-Synergy

RAG (Retrieval-Augmented Generation) combines a retrieval system with a generative model. By drawing on external knowledge sources, this architecture helps the model produce more accurate and contextually relevant outputs. The pipeline starts by loading PDF documents from a local database of papers related to Quantum Drug Discovery. It then splits the loaded documents into manageable text chunks and creates embeddings from these chunks using the "BAAI/bge-base-en-v1.5" model. The embeddings are stored in a "FAISS" index, which enables efficient similarity search for relevant documents. After setting up a retriever that fetches the most pertinent chunks for a user query, the code generates responses using the "BLOOM" model, which is known for its multilingual capabilities.
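The chunk, embed, store, and retrieve steps described above can be sketched in plain Python. This is only an illustration under loud assumptions: the bag-of-words `embed` function stands in for the "BAAI/bge-base-en-v1.5" embeddings, the `ToyVectorStore` class stands in for FAISS, and the sample sentences replace the PDF papers. None of these names come from the project itself, and the final generation step with BLOOM is left out.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=8, overlap=2):
    """Split text into overlapping chunks of at most `chunk_size` words."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Toy embedding: word counts (an assumption, not a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a FAISS index (brute-force search)."""

    def __init__(self, chunks):
        self.entries = [(chunk, embed(chunk)) for chunk in chunks]

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.entries,
                        key=lambda entry: cosine_similarity(q, entry[1]),
                        reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, retrieved):
    """Assemble the text that would be passed to the generative model."""
    context = "\n".join(f"- {chunk}" for chunk in retrieved)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Made-up sentences standing in for the loaded PDF documents.
documents = [
    "Quantum computing can simulate molecular energy levels for drug candidates.",
    "FAISS indexes dense vectors so that nearest neighbours are found quickly.",
    "Flask is a lightweight web framework for Python.",
]
chunks = [c for doc in documents for c in split_into_chunks(doc, chunk_size=12)]
store = ToyVectorStore(chunks)

query = "How does quantum computing help find drug candidates?"
top = store.search(query, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

In a real pipeline the brute-force `search` would be replaced by an approximate or exact nearest-neighbour index, which is the role FAISS plays in the project.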


• Data Statistics DataVoyager

To analyze your data and extract key statistical insights, you first need to upload your CSV file. Once the file is uploaded, statistical measures such as the mean (average value of the dataset), median (the middle value when data is sorted), and mode (the value that appears most frequently) will be calculated to give you a clear understanding of the central tendencies in your data. Additionally, the variance (a measure of data spread indicating how much the values deviate from the mean) and standard deviation (a related measure that indicates how dispersed the data is relative to the mean) will also be computed. These statistics provide a comprehensive overview of your dataset, helping you understand both the distribution and variability of the data points.
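As a rough illustration of the measures listed above, the following sketch computes them for one numeric column using only Python's standard library. The function name `summarize_column`, the column name `age`, and the sample values are assumptions made for the example, and the sketch reports the sample variance (dividing by n - 1), which may differ from the convention the deployed app uses.

```python
import csv
import io
import statistics


def summarize_column(csv_text, column):
    """Return central-tendency and spread statistics for one numeric column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in reader if row[column]]
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": statistics.stdev(values),      # square root of the above
    }


# Made-up CSV content standing in for an uploaded file.
sample_csv = "age\n2\n4\n4\n4\n5\n5\n7\n9\n"
summary = summarize_column(sample_csv, "age")
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Swapping `statistics.variance` and `statistics.stdev` for `statistics.pvariance` and `statistics.pstdev` gives the population versions instead.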

• Email Spam Detection SpamShield

The Email Spam Classifier is deployed on Render, a Platform-as-a-Service (PaaS) provider, using Python's Flask framework. The classifier is a logistic regression model trained to identify and filter spam emails. Flask serves as the backend, so users can submit an email to the cloud-hosted application and have it classified in real time. The project demonstrates the integration of machine learning with cloud-based web applications for spam detection.
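To make the modelling step concrete, here is a minimal sketch of logistic regression on bag-of-words features, written in plain Python so it runs without extra packages. It is not the project's training code: the tiny dataset, the function names, and the hyperparameters (`epochs`, `lr`) are all assumptions for illustration, and the Flask and Render deployment is not shown.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(texts, labels, epochs=200, lr=0.5):
    """Fit weights with stochastic gradient descent on the log loss."""
    vocab = sorted({token for text in texts for token in tokenize(text)})
    index = {token: i for i, token in enumerate(vocab)}
    weights = [0.0] * len(vocab)
    bias = 0.0
    for _ in range(epochs):
        for text, label in zip(texts, labels):
            # Binary features: which vocabulary words occur in this email.
            active = [index[token] for token in set(tokenize(text))]
            prediction = sigmoid(bias + sum(weights[i] for i in active))
            error = prediction - label  # gradient of the log loss w.r.t. z
            bias -= lr * error
            for i in active:
                weights[i] -= lr * error
    return index, weights, bias


def spam_probability(text, model):
    """Probability that `text` is spam; unknown words are ignored."""
    index, weights, bias = model
    active = [index[t] for t in set(tokenize(text)) if t in index]
    return sigmoid(bias + sum(weights[i] for i in active))


# Made-up training emails: label 1 = spam, 0 = not spam.
texts = [
    "win a free prize now",
    "free money click now",
    "claim your free prize",
    "meeting at noon tomorrow",
    "please review the attached report",
    "lunch tomorrow with the team",
]
labels = [1, 1, 1, 0, 0, 0]

model = train_logistic_regression(texts, labels)
for email in ["Free prize, click now", "Report for the meeting tomorrow"]:
    verdict = "spam" if spam_probability(email, model) > 0.5 else "not spam"
    print(f"{email!r}: {verdict}")
```

A web backend would typically load a trained model once at startup and call a function like `spam_probability` inside its request handler.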

• Machine Learning Model Premium Predictor

The Random Forest Regressor is an ensemble learning method used for regression tasks, leveraging the power of multiple decision trees to enhance predictive performance and robustness. By constructing a "forest" of decision trees during training, it aggregates their predictions to produce a more accurate and stable output. Each tree is built using a random subset of the data and features, which helps to reduce overfitting and improve generalization. This approach enables the model to capture complex patterns in the data while minimizing the impact of noisy or irrelevant features.
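The idea described above (bootstrap samples, random feature subsets, averaged predictions) can be sketched from scratch as follows. This is a simplified teaching version, not the project's model and not a substitute for a library implementation: the feature layout (age, smoker flag), the premium formula, and all function names are hypothetical.

```python
import random
import statistics


def sse(values):
    """Sum of squared errors around the mean, used to score a split."""
    mean = statistics.fmean(values)
    return sum((v - mean) ** 2 for v in values)


def build_tree(rows, targets, n_features, rng, depth=0, max_depth=4, min_leaf=2):
    """Grow a regression tree; each split looks at a random subset of features."""
    if depth >= max_depth or len(set(targets)) == 1:
        return statistics.fmean(targets)  # leaf: predict the mean target
    best = None
    for f in rng.sample(range(len(rows[0])), n_features):
        for threshold in sorted({row[f] for row in rows}):
            left = [i for i, row in enumerate(rows) if row[f] <= threshold]
            right = [i for i, row in enumerate(rows) if row[f] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse([targets[i] for i in left]) + sse([targets[i] for i in right])
            if best is None or score < best[0]:
                best = (score, f, threshold, left, right)
    if best is None:  # no split satisfies the minimum leaf size
        return statistics.fmean(targets)
    _, f, threshold, left, right = best
    children = [
        build_tree([rows[i] for i in side], [targets[i] for i in side],
                   n_features, rng, depth + 1, max_depth, min_leaf)
        for side in (left, right)
    ]
    return (f, threshold, children[0], children[1])


def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf's value."""
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if row[f] <= threshold else right
    return tree


def fit_forest(rows, targets, n_trees=25, n_features=1, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in sample],
                                 [targets[i] for i in sample], n_features, rng))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return statistics.fmean([predict_tree(tree, row) for tree in forest])


# Hypothetical data: features are (age, smoker flag); target is a made-up premium.
rows = [(float(age), float(age % 2)) for age in range(20, 60)]
targets = [100 + 5 * age + 300 * smoker for age, smoker in rows]

forest = fit_forest(rows, targets)
young_non_smoker = predict_forest(forest, (25.0, 0.0))
older_smoker = predict_forest(forest, (55.0, 1.0))
print(f"young non-smoker: {young_non_smoker:.1f}")
print(f"older smoker:     {older_smoker:.1f}")
```

Because every leaf stores the mean of some training targets and the forest averages leaves, predictions always stay inside the range of the training targets, which is one reason tree ensembles do not extrapolate beyond the data they have seen.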

• Deep Learning Model Image Classifier

The MobileNet model, pre-trained with weights from the ImageNet dataset, is a lightweight and efficient convolutional neural network designed for mobile and embedded vision applications. Developed by Google, MobileNet is known for its use of depthwise separable convolutions, which reduce computational complexity and model size with only a small loss in accuracy. Because it is pre-trained on the ImageNet classification benchmark, a large-scale collection of images labeled across 1,000 categories, MobileNet learns robust, general-purpose image features. This makes it well-suited for various image classification tasks.
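The saving from depthwise separable convolutions can be checked with a little arithmetic. The sketch below counts multiply-accumulate operations for one layer, assuming stride 1 and an output feature map of the same size as the input; the layer dimensions are example values, not taken from the project.

```python
def standard_conv_cost(kernel, in_ch, out_ch, fmap):
    """Multiply-accumulates for a standard convolution layer."""
    return kernel * kernel * in_ch * out_ch * fmap * fmap


def depthwise_separable_cost(kernel, in_ch, out_ch, fmap):
    """Multiply-accumulates for a depthwise plus a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * in_ch * fmap * fmap  # one filter per input channel
    pointwise = in_ch * out_ch * fmap * fmap           # 1x1 conv mixes the channels
    return depthwise + pointwise


# Example layer: 3x3 kernel, 512 input and output channels, 14x14 feature map.
kernel, in_ch, out_ch, fmap = 3, 512, 512, 14
standard = standard_conv_cost(kernel, in_ch, out_ch, fmap)
separable = depthwise_separable_cost(kernel, in_ch, out_ch, fmap)
ratio = separable / standard

print(f"standard:  {standard:,} multiply-accumulates")
print(f"separable: {separable:,} multiply-accumulates")
print(f"ratio:     {ratio:.4f}")
```

The ratio simplifies to 1/out_ch + 1/kernel², so with a 3x3 kernel the separable layer needs roughly 8 to 9 times fewer operations than the standard one.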
