Open to work · Dublin, Ireland

Data Engineer & Analytics Expert


Who I am

About Me

Aniket Ghadge

Data Engineer with 3+ years across financial services and education technology in India and Ireland. Saved analysts ~8–10 hours/week through pipeline automation. Contributed to an estimated €20,000+ in annual operational savings for an SME banking client through data lifecycle and storage migration work.


Verified credentials

Certifications


Core competencies

Skills & Tools

ETL & Data Engineering
Azure Data Factory · Databricks · Apache Kafka · Snowflake · Delta Lake (DLT) · SSIS · Hadoop · Teradata · Unity Catalog
☁️
Cloud Platforms
Azure Synapse · Azure DevOps · ADLS · Azure Monitor · AWS S3 · AWS EMR · AWS Glue · EC2 · Lambda · ECS
💻
Programming & Databases
Python · PySpark · SQL / T-SQL · NoSQL · MSSQL · PostgreSQL · Feature Engineering · DDL & DML · CTEs · Window Functions
🔧
DevOps & Engineering
CI/CD · GitHub Actions · Jenkins · Docker · Git · Agile / Scrum · Jira · RESTful APIs
🧠
ML & AI
PyTorch · TensorFlow · Scikit-learn · Hugging Face · LangChain · RAG · LLMs · Embeddings · NLP · HDBSCAN / SOM · SPSS
📊
BI & Visualization
Power BI · Tableau · SSRS · DAX · Excel (Power Query) · KPI Tracking · GDPR Compliance · Data Lineage

Career timeline

Work Experience

Jan 2025 – Present
Starlite Infotech
Ireland
Data & Analytics Engineer
  • Improved reporting accuracy by 25% by designing production-grade Power BI dashboards with DAX-driven KPIs and optimised semantic models, translating ambiguous business requirements into structured analytics across 50M+ records.
  • Prototyped a real-time Retrieval-Augmented Generation (RAG) solution using LLMs with document chunking, NLP-driven embedding generation, and semantic retrieval to automate case routing and improve internal query resolution accuracy by ~25%.
  • Identified and documented pipeline inefficiencies and automation opportunities across 3+ concurrent reporting workstreams, producing functional specifications, data mapping documents, and process workflows using Jira and Confluence.
  • Acted as business partner to delivery, finance, and commercial teams, translating data centre reporting complexities into structured functional and technical requirements aligned with commercial priorities.
  • Applied Power Query (M) and DAX to build complex financial and operational calculations, partnering with senior stakeholders to validate scope and maintain data accuracy and business clarity throughout the delivery lifecycle.
Feb 2024 – Jun 2024
Brickfield Education Labs
Dublin, Ireland
Data Science Intern
  • Built cloud-based ETL pipelines processing 100GB+ of NoSQL DynamoDB data for real-time accessibility via AWS S3 and EC2.
  • Integrated ML techniques (HDBSCAN, SOM, anomaly detection) to enhance accessibility issue identification by 40%, driving actionable business insights.
  • Delivered GDPR-compliant Power BI dashboards visualizing KPIs for accessibility; presented findings to product and legal stakeholders.
  • Applied SQL optimization (CTEs, window functions, indexing) so that analytical queries ran within the interactive response times analysts expected.
Mar 2023 – May 2023
Capita
Ireland
Progress Developer
  • Maintained and tuned complex SQL logic supporting pension management systems, ensuring queries met regulatory reporting timelines.
  • Partnered with business analysts to validate data correctness and resolve production issues within agreed SLAs.
  • Contributed to consistent system availability through cross-team technical collaboration and support.
Aug 2021 – Feb 2023
Consisty System
Pune, India
Data Engineer
  • Delivered production-grade ETL/ELT pipelines using Python, SQL, Azure Data Factory, and Databricks, handling 12TB+ of monthly transactional data from REST APIs, flat files, and enterprise databases.
  • Led on-prem Hadoop migration to Databricks; implemented DLT and Unity Catalog for unified governance, schema enforcement, and lineage tracking.
  • Stabilized PySpark-based distributed processing, reducing pipeline runtimes from multiple hours to within SLA-defined batch windows.
  • Implemented Kafka-based streaming pipelines for event ingestion, maintaining minute-level processing latency aligned with business monitoring needs.
  • Resolved 95%+ of critical data quality issues within SLAs using Azure Monitor, GitHub Actions, and detailed logging for root cause analysis and auditing.
  • Maintained CI/CD pipelines in Azure DevOps with Git-based reviews and Dev/Test/Prod promotion, contributing to stable releases with minimal rollbacks.
  • Collaborated in Agile/Scrum teams via Jira; presented pipeline designs, data-quality findings, and operational trade-offs to cross-functional stakeholders.
Mar 2020 – Jun 2020
Wisdom Sprouts Solutions
Pune, India
Data Engineer Intern
  • Automated integration of web-scraped data into MSSQL databases, reducing manual intervention in data processing workflows.
  • Improved SQL query performance for reporting dashboards, reducing execution time significantly.

Selected work

Featured Projects

🤖
📄 Publication in progress
Deepfake Detection in AV1 Compressed Videos
Deep Learning · Computer Vision · PyTorch · EfficientNet-B0 · Bi-LSTM

Built a deepfake detection model using EfficientNet-B0 for spatial feature extraction and 3-layer Bi-LSTM for sequential pattern recognition across compressed video frames. Preprocessed data by extracting frames, detecting/cropping faces, and analyzing noise and color distribution. Evaluated using Precision, Recall, F1-score, Confusion Matrix and ROC-AUC — achieving 90%+ accuracy on FaceForensics++ across raw and AV1-compressed formats. Applied SPSS statistical modelling to enhance detection accuracy.

Personalised Healthcare Chatbot using LLM & RAG
LLMs · RAG · LangChain · GPT-4 · Pinecone

Intelligent Q&A system combining LLMs with RAG over 20K+ MedQuAD medical records. Applied BERT embeddings with Pinecone vector DB for semantic search, and BM25 ranking to improve document scoring — increasing chatbot precision by 20%. Integrated LangChain and OpenAI GPT-4 for dynamic, context-aware patient responses. Deployed via Azure ML pipelines for experimentation and performance monitoring.

End-to-End Data Engineering — Adventure Works
Azure Data Factory · Synapse · PySpark · Power BI

Scalable ETL pipeline using Azure Data Factory and Synapse Analytics to ingest, cleanse, and transform raw sales, customer, and inventory data into a structured warehouse. Designed interactive Power BI dashboards to visualize KPIs — revenue trends, regional sales, product profitability. Implemented data governance with Azure RBAC and Key Vault.

Parcel Delivery System Modelling & Optimisation
Optimisation · Linear Programming · Python · PuLP

Mathematical models and optimization algorithms (linear programming, genetic algorithms) to simulate and optimize logistics and supply chain systems. Built Python-based simulations with NumPy, SciPy, and PuLP to validate model accuracy and predict system behavior under dynamic conditions — reducing manual intervention by 40%.

View all on GitHub ↗

Academic background

Education

First Class Honours · Top 10% · Dean's List 2024
MSc in Data Analytics
National College of Ireland
Sep 2023 – Oct 2024 · Dublin, Ireland
Advanced Data Modelling, Data Governance & Ethics, Data Engineering, Machine Learning, Statistics. Ranked in top 10% academically and honoured on the Dean's List.
Grade: 7.07 / 10
BEng in Computer Science
AISSMS IOIT · Savitribai Phule Pune University
Aug 2017 – Jun 2021 · Pune, India
Software engineering, algorithms, databases, and systems design. Volunteered and programmed for technical events at Alacrity (2017–2021).
Achievements
Dean's List 2024 — Ranked in the top 10% academically at National College of Ireland.
Research Publication — Collaborating to publish deepfake detection research in a peer-reviewed journal.
Technical Volunteer — Programmed and volunteered for technical events at Alacrity (Aug 2017 – Jun 2021).

Let's connect

Open to New Roles

Actively seeking Data, AI and Analytics roles across Ireland. If you have a position, a collaboration, or just want to connect — I'd love to hear from you.

📧
Email
[email protected]
📱
Phone
(+353) 899 827 371
📍
Location
Ireland
// Currently open to · Ireland
Data Engineering
Cloud Data Architecture
Analytics Engineering
Business Analytics
Machine Learning Engineering
Data Science
AI / Generative AI Engineering
MLOps & AI Platform Engineering