ayansk11/ayansk11


MS Computer Science @ Indiana University, Bloomington


LinkedIn  HuggingFace  Email  AWS Certified


What I'm Working On

Research Assistant - Cybersecurity AI @ Indiana University, Kelley School of Business

  • Building a hierarchical LLM + RL red team adversary for autonomous penetration testing in the CybORG CAGE Challenge 4 enterprise network (9 subnets, 1 red vs 5 blue defenders + 48 green agents)
  • Designed a 3-layer reward shaping system (negated opponent rewards + milestone bonuses + prerequisite penalties) to enable RL training from zero environment signal
  • Evaluated 13 LLMs (0.6B–70B) across 4 inference backends on NVIDIA H100 GPUs: no LLM achieved meaningful attack success zero-shot, and model size did not correlate with performance
  • PPO with action masking | vLLM | Curriculum learning | SLURM/HPC on Big Red 200
  • Presenting at SPIE Defense + Security 2026 - April 2026, National Harbor, MD
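
The 3-layer reward shaping described above (negated opponent rewards, milestone bonuses, prerequisite penalties) could be combined roughly as follows. This is a minimal sketch, not the project's code: the function name, the `ShapingConfig` weights, and the milestone label are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass(frozen=True)
class ShapingConfig:
    # Illustrative weights only; the real values are not given in this README.
    milestone_bonus: float = 1.0
    prerequisite_penalty: float = 0.5


def shaped_red_reward(
    blue_rewards: Sequence[float],
    new_milestones: Iterable[str],
    prerequisites_met: bool,
    cfg: ShapingConfig = ShapingConfig(),
) -> float:
    """Combine three shaping layers into one scalar reward for the red agent."""
    # Layer 1: whatever hurts the blue defenders counts as progress for red.
    opponent_term = -sum(blue_rewards)
    # Layer 2: fixed bonus for each distinct milestone first reached this step.
    milestone_term = cfg.milestone_bonus * len(set(new_milestones))
    # Layer 3: penalize actions attempted before their prerequisites hold.
    penalty_term = 0.0 if prerequisites_met else -cfg.prerequisite_penalty
    return opponent_term + milestone_term + penalty_term


# Hypothetical step: two blue agents lose reward, one milestone is reached.
print(shaped_red_reward([-1.0, -0.5], ["host_discovered"], True))
```

Keeping the three terms separate makes it easy to log each one and to disable a layer when running ablations.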

Featured Projects


Agentic Cybersec Threat Analyst

3-agent LangGraph pipeline that transforms raw CVEs into defense playbooks - ingests NVD, CISA KEV & OTX feeds, maps threats via hybrid RAG over 19K+ MITRE ATT&CK techniques, and auto-generates Sigma detection rules. FastAPI + React + Qdrant.

LangGraph RAG MITRE ATT&CK FastAPI Qdrant


FinSight

Four specialized LangGraph agents (Document, Quantitative, Risk, Synthesis) dissect SEC filings (10-K, 10-Q, 8-K) with hierarchical PageIndex navigation. Local Ollama + Groq cloud fallback. FRED & Finnhub market data. 129 unit + 10 E2E tests.

LangGraph Ollama SEC Filings Groq Finnhub


FinSent-CoT

16,944-sample balanced dataset pairing financial texts with expert Chain-of-Thought reasoning distilled from Qwen3-235B on H100 GPUs. SFT + GRPO formats for on-device LLMs (0.5B–8B). Multi-stage QA with 15.5% rejection rate.

Qwen3-235B SFT GRPO H100 Chain-of-Thought


FinistralAI

Mistral-7B fine-tuned for financial sentiment with LoRA adapters (rank 16, alpha 32). BF16 precision, gradient checkpointing & DeepSpeed multi-GPU training. Published on HuggingFace Hub.

Mistral-7B LoRA DeepSpeed BF16 HuggingFace
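
To make the adapter settings above concrete, here is a small pure-Python sketch of what rank 16 and alpha 32 imply under the standard LoRA formulation: the low-rank update is scaled by alpha / rank, and each adapted weight adds rank × (d_in + d_out) trainable parameters. The helper names are hypothetical, and applying the adapter to a 4096 × 4096 projection is an assumption for illustration; which modules the project actually adapted is not stated here.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


def lora_forward(x: Vector, w: Matrix, a: Matrix, b: Matrix,
                 rank: int, alpha: float) -> Vector:
    """Compute W x + (alpha / rank) * B (A x) with the base weight W frozen."""
    base = matvec(w, x)
    low_rank = matvec(b, matvec(a, x))
    scale = alpha / rank
    return [y + scale * d for y, d in zip(base, low_rank)]


rank, alpha = 16, 32
print("scaling factor:", alpha / rank)
# Assumed example: a square 4096 x 4096 projection matrix.
added = lora_param_count(4096, 4096, rank)
print("adapter params:", added, "vs full:", 4096 * 4096)
```

With B initialized to zero, as is conventional for LoRA, `lora_forward` returns exactly the frozen model's output at the start of training.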


DDoS Mitigation

Three-tier defense-in-depth: XDP/eBPF kernel-level rate limiting (10M pps), P4 BMv2 in-network per-flow detection (1024 flows), and BGP FlowSpec/RTBH upstream blackholing. Tested on Jetstream2 with Mininet + FRR.

XDP/eBPF P4 BGP FlowSpec Mininet Jetstream2


Brain Tumor Classification ★ 8

CNN classifying brain tumor MRI scans into 4 categories (Glioma, Meningioma, Pituitary, No Tumor) across 7K+ images. End-to-end pipeline from Kaggle download to prediction.

CNN TensorFlow Medical Imaging Kaggle


Drug Typology Classification ★ 8

Comparative ML study - Logistic Regression, Random Forest, SVM & Voting Ensemble with precision-recall trade-off analysis for minimizing clinical false positives.

Scikit-learn Random Forest SVM Clinical ML


FinEdu.AI ★ 13

Conversational financial education assistant powered by LLaMA-2-7B with RAG over curated financial terminology. Gradio web UI with beginner-friendly explanations and real-world examples.

LLaMA-2 RAG Gradio Financial NLP


Experience

Research Assistant - Cybersecurity AI
@ Indiana University, Kelley School of Business
Hierarchical LLM+RL red team adversary for CybORG CAGE Challenge 4 - RL controller reached 8x the heuristic baseline via custom 3-layer reward shaping; benchmarked 13 LLMs (0.6B–70B) across 4 backends on H100s; presenting the paper at SPIE Defense + Security 2026
Research Assistant - Software Developer
@ Indiana University, Kelley School of Business
LLM annotation pipeline with Tree-of-Thoughts + Self-Consistency - labeled 167K+ patent abstracts & 22K+ AV disengagement incidents with 7-way classification using OpenAI & Gemini APIs
Software Developer - AI
@ OCG Technologies, Singapore
Fine-tuned a Llama 2 chatbot with LoRA/PEFT on SageMaker for an 83% accuracy boost; added RAG with OpenSearch, reducing support overhead by 75%
Software Developer - Data
@ Visual Labs, Mumbai
3 TB drone imagery ETL pipeline - PySpark + OpenCV preprocessing, AWS Lambda orchestration, DynamoDB storage, Power BI dashboards; 45% stakeholder outcome improvement
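
The Self-Consistency step mentioned for the annotation pipeline above usually means sampling several independent reasoning paths and keeping the most frequent final answer. The sketch below shows only that voting step, under stated assumptions: the function name and the label strings are hypothetical, and the actual pipeline's prompts and API calls are not reproduced.

```python
from collections import Counter
from typing import Sequence, Tuple


def self_consistency_vote(labels: Sequence[str]) -> Tuple[str, float]:
    """Return the majority label and the fraction of samples that agree with it."""
    if not labels:
        raise ValueError("need at least one sampled label")
    counts = Counter(labels)
    # most_common breaks ties in favor of the label seen first.
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)


# Hypothetical labels from five sampled reasoning paths for one abstract.
sampled = ["class_2", "class_5", "class_2", "class_2", "class_7"]
label, agreement = self_consistency_vote(sampled)
print(label, agreement)
```

The agreement ratio can double as a cheap confidence signal, for example to route low-agreement items to manual review.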

Tech Stack

Languages: Python, Java, C, SQL, JavaScript, P4, HTML/CSS
ML / DL: PyTorch, TensorFlow, Scikit-learn, CNNs, XGBoost
RL: PPO, GRPO, GSPO, ORPO, Curriculum Learning
LLMs: vLLM, SFT, PEFT, LoRA/QLoRA, RAG, LangGraph, Transformers, Ollama, llama.cpp, OpenAI API, Gemini API
Infra: AWS, SageMaker, Lambda, S3, DynamoDB, OpenSearch, SLURM/HPC (H100), Docker
Web / APIs: FastAPI, React, Streamlit, Gradio
Data: PySpark, Pandas, NumPy, OpenCV, BeautifulSoup
Visualization: Power BI, Matplotlib, Seaborn, Altair
Tools: Git, Weights & Biases, DeepSpeed, pytest, Unsloth, HuggingFace Hub


If it can learn, I can build it.
