Vinh X. Nguyen

AI Data Engineer Leader
📧 nxv.can@gmail.com 📞 (+1) 548.994.4264 📍 Waterloo, Ontario, Canada

AI Data Engineer Leader with 9+ years of experience engineering big data platforms, from infrastructure to semantic layers. Expert in AWS, Snowflake, Databricks, and LLM-powered multi-agent systems.

About Me

Modeling: OLTP, OLAP, Star schema, Schema-on-read, Delta Lake, SCD, Data Vault 1.0 & 2.0, Medallion.
Engineering: Led big data batch, streaming, and ETL workloads at 200B+ events and 30TB of text on AWS, Snowflake, Spark, and Databricks.
Agentic LLM: Accelerated time-to-insight with LLM-powered multi-agent systems using MCP, LangGraph, LangChain, and RAG, analyzing $1.5B in financial transactions.
Architecture: Architect Community Lead at TymeBank (GOTyme – 1M users); defined integration & data patterns across 5 engineering teams.
Cloud-First: Full-stack AWS (security, network, compute, messaging, analytics); Databricks; Snowflake.
Financial Impact: Detected multi-million-dollar revenue leakage. Improved financial reconciliation accuracy to 99.97%.
Mentorship: University lecturer & engineering mentor — ML, algorithms, blockchain, performance & cost optimization.
Sectors: Banking (GOTyme, UBS), Insurance (Manulife, Prudential), Automotive (Cox), Analytics (KPMG, Ryte).

Skills & Expertise

🤖 AI Agentic LLM

  • Multi-agent workflows (LangGraph, LangChain)
  • Model Context Protocol (MCP)
  • GPT-4, Claude, Qwen; RAG pipelines
  • Vector Databases, Dynamic Tool Calls
  • Financial Anomaly Detection Agents

🗄️ Data Modeling

  • OLTP / OLAP at 200B+ records
  • Star Schema, SCD Types
  • Data Vault 2.0
  • Medallion Architecture (500M+ events)
  • Schema-on-read, Delta Lake

⚙️ Data Engineering

  • Spark (EMR/Glue), Databricks, Delta Lake
  • Kafka, Kinesis, SQS/SNS, DynamoDB
  • Batch, micro-batch & streaming pipelines
  • S3 + Glue Catalog, Snowflake, MySQL
  • AWS Lambda, SageMaker, Jupyter

☁️ Cloud & Infrastructure

  • AWS: ECS, Kinesis, VPC, WAF, Route53
  • API Gateway, CloudFront, ELB, NAT
  • Snowflake: Snowpipe, Materialized Views
  • Databricks: Delta, Cost-optimization
  • IaC: Terraform, CloudFormation

📊 Analytics & Visualization

  • Tableau, PowerBI, QuickSight
  • Grafana, Datadog, CloudWatch, ELK
  • NLP: TF-IDF, Lemmatization, POS tagging
  • 30TB text processing, 30K req/day
  • Google Analytics (2M/mo)

🔐 Security & Systems

  • PKI, Certificate Pinning (1M+ users)
  • Blockchain, Cryptography
  • Event-driven, Backpressure, Async
  • QoS, MPLS, Load Balancing
  • WAF, VPC Peering, IAM

Work Experience

Data Engineering Lead | Data Product Lead | Data Integrity Lead
FPT Canada — Ontario, Canada
Apr 2021 – Present
  • Architected LangGraph multi-agent workflows for financial anomaly detection.
  • Built a custom VS Code extension (MCP) to integrate Copilot and databases into dev workflows.
  • Engineered 200B+ row pipelines on AWS/Snowflake with 99.97% enrichment quality.
  • Reduced query time 73% (30s → 8s); improved mapping accuracy from 60% to 99.97%.
  • Audited million-dollar revenue leakage; resumed 2 products by re-engineering ETL.
  • Built a self-service data platform used by 7+ cross-functional teams (DS, Analytics, Finance).
LangGraph, AWS, Snowflake, MCP, DynamoDB, Spark
Data Product Owner | Data Solution Architect | Architect Community Lead
Tyme Global (GOTyme Digital Bank) — Vietnam
Jul 2019 – Apr 2021
  • Chaired 100+ architecture reviews with up to 20 attendees weekly.
  • Standardized integration patterns used by 5 development teams.
  • Championed Certificate Pinning for 1M+ users against MitM attacks.
  • Launched 3 products: Personal Lending, ID Payment, Bancassurance.
  • Managed 500M+ transaction pipelines; Snowflake DW (2B records), Databricks Delta (50M).
Databricks, Snowflake, Google Analytics, AWS EMR, Certificate Pinning
Technical Architect | Java Recruitment Lead
NFQ Asia — Ho Chi Minh City, Vietnam
Apr 2017 – Jul 2019
  • Engineered NLP pipeline processing 30TB of text using TF-IDF, Lemmatization, POS tagging.
  • Built serverless keyword analytics on AWS handling 20K–50K requests/day.
  • Slashed processing latency 95% (60s → 3s).
  • Investigated and resolved $10K Datadog logging issue.
NLP, AWS Lambda, TF-IDF, Serverless, Datadog
Principal Software Engineer | Technical Architect
NashTech Global — Vietnam
May 2016 – Apr 2017
  • Led a 20-member engineering team delivering Banking, Insurance, and KYC projects.
  • Upskilled 10 developers via daily 30-min training sessions.
  • Re-engineered workflows (10+ iterations) to reduce production incidents.
Java, Banking, Insurance, KYC
Technical Training Manager | Solution Architect
FPT Software — Philippines & Vietnam
Jun 2014 – May 2016
  • Established Cebu Dev Center by recruiting 22 Java and .NET engineers.
  • Created training programs for 60+ developers; launched new-graduate program.
  • Achieved 90% recruitment success rate through refined training and hiring processes.
Java, .NET, Team Building, Training
University Lecturer
Dong Nai Technology University & Nong Lam University — Vietnam
2009 – Present
  • Courses: Data Structures & Algorithms, ML, Blockchain, Cryptography, PKI, Discrete Maths.
  • Network Multiplex Communication at University of Bordeaux (Vietnam Branch).
Machine Learning, Blockchain, Cryptography, DSA

Architecture & Journey

Career Timeline

From training engineers in Vietnam & the Philippines to leading AI data engineering in Canada.

timeline
    title 17+ Years in Engineering Leadership
    2009 : University Lecturer (DNTU / NLU)
    2014 : FPT Software - Training Manager / Solution Architect (PH and VN)
    2016 : NashTech Global - Principal Engineer / Architect
    2017 : NFQ Asia - Technical Architect (NLP at 30TB)
    2019 : Tyme / GOTyme Bank - Architect Community Lead (1M+ users)
    2021 : FPT Canada - Data Engineering Lead (200B+ events, LangGraph agents)
      

Skills Mind-Map

Six pillars across AI, data, cloud, and security.

mindmap
  root((Vinh X. Nguyen))
    AI Agentic LLM
      LangGraph / LangChain
      MCP
      RAG and Vector DBs
      Anomaly Detection
    Data Modeling
      OLTP / OLAP 200B+
      Star / SCD
      Data Vault 2.0
      Medallion
    Data Engineering
      Spark / Databricks
      Kafka / Kinesis
      Snowflake / Delta Lake
      Lambda / SageMaker
    Cloud
      AWS Full-Stack
      Snowflake
      Databricks
      Terraform / CFN
    Analytics
      Tableau / PowerBI
      Datadog / Grafana
      NLP TF-IDF
    Security
      PKI / Cert Pinning
      Cryptography
      WAF / IAM
      Event-driven
      

Signature Architecture #1 — LangGraph Multi-Agent Anomaly Detection

FPT Canada — LLM agents auditing $1.5B in financial transactions.

flowchart LR
    Tx[(Transactions DW)] --> Orchestrator{LangGraph Orchestrator}
    Orchestrator --> Rules[Rules Agent]
    Orchestrator --> Stats[Statistical Agent]
    Orchestrator --> LLM[LLM Reasoning Agent]
    LLM --> RAG[(Policy RAG / Vector DB)]
    LLM --> MCP[MCP Tools to DBs]
    Rules --> Reviewer[Reviewer Agent]
    Stats --> Reviewer
    LLM --> Reviewer
    Reviewer --> Human[Human-in-the-Loop]
    Reviewer --> Cases[(Case Store)]
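
The fan-out and review flow in the diagram above can be sketched in plain Python. This is an illustrative sketch only, not the production system: it deliberately avoids the LangGraph API, the LLM reasoning agent is replaced by a simple stub, and every name, threshold, and allow-list in it (Transaction, Finding, the 10,000 limit, the z-score cutoff of 2, the CA/US allow-list, min_agents) is a hypothetical assumption.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Callable


@dataclass(frozen=True)
class Transaction:
    tx_id: str
    amount: float
    country: str


@dataclass(frozen=True)
class Finding:
    agent: str
    tx_id: str
    reason: str


# An agent takes the batch of transactions and returns its findings.
Agent = Callable[[list[Transaction]], list[Finding]]


def rules_agent(txs: list[Transaction]) -> list[Finding]:
    # Hypothetical hard rule: flag any transaction above a fixed limit.
    return [Finding("rules", t.tx_id, "amount over 10,000 limit")
            for t in txs if t.amount > 10_000]


def statistical_agent(txs: list[Transaction]) -> list[Finding]:
    # Flag amounts more than 2 population standard deviations above the mean.
    amounts = [t.amount for t in txs]
    mu, sigma = mean(amounts), pstdev(amounts)
    if sigma == 0:
        return []
    return [Finding("stats", t.tx_id, "z-score above 2")
            for t in txs if (t.amount - mu) / sigma > 2]


def llm_agent_stub(txs: list[Transaction]) -> list[Finding]:
    # Stand-in for the LLM reasoning agent. A real agent would call a model
    # with policy context (RAG) and database tools (MCP); this stub only
    # checks a hypothetical country allow-list.
    allowed = {"CA", "US"}
    return [Finding("llm_stub", t.tx_id, "country not on allow-list")
            for t in txs if t.country not in allowed]


def orchestrate(txs: list[Transaction], agents: list[Agent]) -> list[Finding]:
    # Orchestrator: send the same batch to every agent and collect findings.
    findings: list[Finding] = []
    for agent in agents:
        findings.extend(agent(txs))
    return findings


def review(findings: list[Finding], min_agents: int = 2) -> dict[str, str]:
    # Reviewer: escalate to a human when at least `min_agents` distinct
    # agents flagged the same transaction; otherwise just record a case.
    agents_by_tx: dict[str, set[str]] = {}
    for f in findings:
        agents_by_tx.setdefault(f.tx_id, set()).add(f.agent)
    return {
        tx_id: "escalate_to_human" if len(agents) >= min_agents else "log_case"
        for tx_id, agents in agents_by_tx.items()
    }
```

Calling review(orchestrate(txs, [rules_agent, statistical_agent, llm_agent_stub])) returns one decision per flagged transaction. Requiring agreement between independent agents before escalating is one simple way to keep the human review queue short.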
      

Signature Architecture #2 — 200B+ Event ETL Platform

FPT Canada — 99.97% enrichment quality across AWS, Snowflake, Databricks.

flowchart LR
    Sources[(Sources)] --> Stream[Kinesis / Kafka]
    Stream --> Bronze[S3 Bronze]
    Bronze --> Spark[Spark on EMR / Databricks]
    Spark --> Silver[S3 Silver / Delta]
    Silver --> Gold[S3 Gold / Delta]
    Gold --> SF[(Snowflake DW)]
    Gold --> Glue[Glue Catalog]
    SF --> BI[Tableau / PowerBI / QuickSight]
    Spark --> QA[Quality and Reconciliation]
    QA --> Alerts[CloudWatch / Datadog]
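
The Bronze, Silver, Gold, and reconciliation stages in the diagram above can be illustrated with a small in-memory sketch. It is not the actual platform code, which runs on Spark, Delta, and Snowflake: the event fields (event_id, account, amount), the rejection rules, and the function names are all hypothetical assumptions chosen to show the shape of a quality gate.

```python
from collections import defaultdict
from decimal import Decimal, InvalidOperation


def to_silver(bronze: list[dict]) -> tuple[list[dict], list[dict]]:
    """Bronze -> Silver: validate, type, and deduplicate raw events."""
    seen: set[str] = set()
    silver: list[dict] = []
    rejected: list[dict] = []
    for event in bronze:
        event_id = event.get("event_id")
        try:
            # Parse via str() so floats do not introduce binary rounding noise.
            amount = Decimal(str(event["amount"]))
        except (KeyError, InvalidOperation):
            rejected.append(event)
            continue
        if event_id is None or event_id in seen or "account" not in event:
            rejected.append(event)
            continue
        seen.add(event_id)
        silver.append({"event_id": event_id,
                       "account": event["account"],
                       "amount": amount})
    return silver, rejected


def to_gold(silver: list[dict]) -> dict[str, Decimal]:
    """Silver -> Gold: aggregate amounts per account."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for row in silver:
        totals[row["account"]] += row["amount"]
    return dict(totals)


def reconcile(bronze: list[dict], silver: list[dict],
              rejected: list[dict], gold: dict[str, Decimal]) -> dict:
    """Quality gate: every input is accounted for and totals agree."""
    silver_total = sum((r["amount"] for r in silver), Decimal(0))
    gold_total = sum(gold.values(), Decimal(0))
    return {
        # No event may vanish between layers: kept + rejected == received.
        "counts_ok": len(bronze) == len(silver) + len(rejected),
        # Aggregation must not change the overall amount.
        "totals_ok": silver_total == gold_total,
        # Share of raw events that survived validation.
        "pass_rate": len(silver) / len(bronze) if bronze else 1.0,
    }
```

A pipeline built this way would run reconcile after each batch and raise an alert when counts_ok or totals_ok is false, or when pass_rate falls below an agreed threshold.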
      

Signature Architecture #3 — Certificate Pinning for 1M+ Users

GOTyme Digital Bank — mitigating MitM at mobile-banking scale.

flowchart LR
    CA[Internal CA / PKI] --> Cert[Server Cert + Backup Pin]
    Cert --> SDK[Mobile SDK]
    SDK --> App[Banking App 1M+ users]
    App --> API[(Banking APIs)]
    Rotation[Rotation Pipeline] --> Cert
    App --> Telem[Telemetry / Crash Analytics]
    Telem --> Flags[Feature Flags / Staged Rollout]
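
The "Server Cert + Backup Pin" step in the diagram above can be illustrated with a generic public-key pinning check. This is a sketch of the common SPKI-hash technique, not GOTyme's implementation, which the source does not describe. It assumes the caller has already extracted the certificate's DER-encoded SubjectPublicKeyInfo bytes (that step needs a TLS or X.509 library and is omitted), and the function names are hypothetical.

```python
import base64
import hashlib
import hmac


def spki_pin(spki_der: bytes) -> str:
    """Return the base64-encoded SHA-256 digest of a public key's SPKI bytes."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def is_pinned(spki_der: bytes, pinned: set[str]) -> bool:
    """Accept the connection only if the presented key matches a shipped pin.

    The pin set should hold the current key's pin plus at least one backup
    pin, so the server key can be rotated without locking out installed apps.
    """
    presented = spki_pin(spki_der)
    # compare_digest gives a constant-time comparison for each candidate pin.
    return any(hmac.compare_digest(presented, pin) for pin in pinned)
```

With a pin set containing both the current and the backup key, rotating the server to the backup key keeps already-installed apps working, while a key presented by an intercepting proxy is rejected. A later app release can then drop the retired pin and add a fresh backup.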
      

Education & Certifications

Master in Computer Science
Université Pierre et Marie Curie (Paris 6) — France
2011–2013
AWS Certified Data Analytics Specialty
AWS — Canada
2023
Statistics for Data Analysis
McMaster University — Ontario, Canada
2025
BSc in Computer Science
Nong Lam University — Vietnam
2025
Advanced Training for Banking Architect Leads
AWS Training Center — Vietnam
2020