
Hi there!

👋 I’m Thoai. I work in the cloud and data platform space, mostly around Kubernetes, Kafka, Spark, and the tools that make modern data systems run smoothly.

I enjoy digging into how data actually flows through systems, how storage works under the hood (pages, blocks, execution…), and how to turn a bunch of scattered services into a clean, maintainable pipeline.

Recently, I’ve been focusing more on Data Engineering, including:

  • designing reliable and scalable data platforms

  • lakehouse stacks such as Iceberg, Trino, Spark, and related tooling

  • data modeling techniques

  • building ETL/ELT pipelines

  • and more

I created this blog/docs site to capture what I learn, what I experiment with, and the mistakes I run into along the way — hopefully it helps someone else, or at least helps future me.

If you’re into data, distributed systems, or just want to debug a burning pipeline together, feel free to reach out.


My Experiences

Ho Chi Minh City, Vietnam

🧰 Responsibilities
  • Designed and integrated a unified feature store based on Feast into a legacy data platform, supporting batch and real-time feature engineering and API-based online serving, with GitOps-style governance and versioning; operated at scale with 14M MAUs, ~2M streaming events/day, and <200 ms feature retrieval latency.

  • Built batch and streaming ETL / feature engineering pipelines using Spark and Airflow; developed internal frameworks and libraries to define & automate streaming pipeline deployment and management, allowing ML Engineers and Analysts to focus on business logic instead of infrastructure complexity.

  • Owned and improved the Risk data & ML platform (Spark, Airflow, HDFS, and related systems), ensuring stable daily operation of 50–60 Spark applications, each processing up to 500M–1B records per run, through performance tuning, monitoring, and operational best practices.

🚀 Stacks

Ho Chi Minh City, Vietnam

🧰 Responsibilities

Contributed as a core member of the Data Platform team at a cloud service provider, delivering a platform-as-a-service (PaaS) for real-time ingestion, distributed processing, governed access, and self-service analytics to enterprise clients:

  • Engineered a full-fledged lakehouse platform with comprehensive data governance for Spark and Trino, integrating OAuth2-based identity propagation, fine-grained access control and dynamic data masking through Apache Ranger, automated lineage tracking via OpenMetadata, and standardized encryption at rest with S3 SSE-C.

  • Developed high-throughput CDC pipelines (100GB/day, 5K TPS) using Kafka Connect & Debezium, migrating 500+ PostgreSQL tables to ClickHouse, Iceberg, and S3 to enable low-latency analytics and historical storage.

  • Built a self-service Spark environment on JupyterHub by developing a custom Profile Manager with secure session provisioning, lakehouse integration (Iceberg on S3), and dynamic environment configuration.

  • Enhanced Spark orchestration experiences by creating custom Airflow plugins integrated with Spark Operator, allowing modular job submission, runtime tracking, and real-time log streaming via Airflow UI.

  • Built unified monitoring dashboards using Prometheus and Grafana to track pipeline SLAs (latency, throughput, error rate) and detect anomalies across Spark, Kafka, and Airflow.

  • Collaborated on provisioning and operating key data platform components (Airflow, Trino, Superset, Kafka Connect) on Kubernetes, and supported deployment automation via FastAPI-based internal tools.

🚀 Stacks

Hanoi, Vietnam

🧰 Responsibilities
  • Researched Kafka architecture and deployment feasibility, and designed Kafka-as-a-Service solutions on both VMs and Kubernetes.

  • Deployed Kafka on Kubernetes using Strimzi and implemented end-to-end monitoring with JMX, Telegraf, Prometheus, and Grafana, with alerting through Telegram.

  • Built Kong plugins and integrated the API gateway into microservices running on Kubernetes.

🚀 Stacks

My Education

Hanoi, Vietnam


The program was a five-year engineering track (Russian system), internationally mapped to a B.Sc. but locally recognized as an engineer's degree.


My Contributions

1

Feast

  • Optimized MySQL Online Store write performance by implementing batch insert and transaction grouping, significantly reducing write latency. #5699

  • Introduced an HDFS Registry backend, allowing teams to manage Feast feature definitions on Hadoop-compatible file systems. #5655

  • Added HDFS staging support for the Spark Offline Store, enabling distributed materialization and more efficient large-scale feature computation. #5635

2

  • Refactored Hikari connection pool logic to prevent NPEs, avoid memory leaks, and improve thread safety using ConcurrentHashMap. #1048

3

  • Revived and upgraded an abandoned Helm chart to fully support NiFi 2.x, redesigning its clustering, state management, and configuration system so it runs natively and reliably on Kubernetes without ZooKeeper.

Feel free to contact me at

Or just download my resume
