Data & Platform Engineering

Building data solutions tailored to your needs

I think as a Data Engineer

I turn open problems into systems people rely on — digging out the real need, designing the solution end to end, and shaping it around how people actually work.

LinkedIn Email

My real stack: storage and query engines feed one backend that powers the tools people use.

01 · Approach

The problem defines the path.
Finding it is the work.

The obvious first pass

Ingest the data, learn the schemas, ship the dashboards. Necessary groundwork, but not yet the capability teams would truly reach for.

Embed & discover

Worked beside the people closest to the data to find where it actually hurt.

Decide and ship it, solo

Architecture, stack, deployment, roadmap, vision: every call, mine.

Give me a direction, not a spec.

I turn ambiguity into infrastructure, and own every decision in between.

02 · Impact

What changed because the work exists.

Faster to root-cause

No more downloading archives, unzipping them, and guessing where the fault might be. The tool surfaces the exact failing cases and the sessions behind a complaint, so you inspect the right ones instead of a random sample.

Sources, one interface

Postgres, MongoDB, Elasticsearch, cloud storage, SharePoint and Jira, all reachable through one interface.

Roles rely on it

Adopted across engineering, product, and customer-facing teams.

Modules, one tool

Search, extraction, analytics, and reporting, under a single interface.

One tool across 5 product families · shipped solo in ~6 months.

03 · The platform

A debugging & analytics platform, from scratch.

Telemetry that used to be locked away is now a self-serve tool teams open every day.

Zero-build frontend

Hand-bundled vanilla JS with no framework, so there is almost nothing to maintain.

Feature "fences"

Each feature is walled off behind its own markers, so any one can be removed in a single pass without touching the rest.
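
As an illustration of the fence idea, here is a minimal sketch of a script that strips one feature from a source file. The marker syntax (`/* FEATURE:<name>:start */` and `/* FEATURE:<name>:end */`) and the function name are assumptions made for this example, not the platform's actual convention.

```python
import re


def remove_feature(source: str, name: str) -> str:
    """Delete everything between one feature's start and end markers.

    The marker format is hypothetical; any unambiguous pair of
    comments would work the same way.
    """
    fence = re.compile(
        r"[ \t]*/\* FEATURE:" + re.escape(name) + r":start \*/"
        r".*?"
        r"/\* FEATURE:" + re.escape(name) + r":end \*/[ \t]*\n?",
        re.DOTALL,  # let ".*?" span multiple lines
    )
    return fence.sub("", source)


page = (
    "initSearch();\n"
    "/* FEATURE:sampler:start */\n"
    "initSampler();\n"
    "/* FEATURE:sampler:end */\n"
    "initAnalytics();\n"
)
print(remove_feature(page, "sampler"))
```

Because the fence is matched by name, removing one feature leaves every other fenced block untouched.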

Graceful degradation

If a dependency or module fails to load, the rest of the page keeps working.

Federation to ClickHouse

Moved hot-path queries off live federation into denormalized ClickHouse tables, so common lookups stay fast.

Cache and warmer

Disk-backed result cache, pre-warmed on the pipeline's refresh cadence, so we recompute only when the data changes.
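
A rough sketch of the pattern, assuming results are JSON-serializable and that each pipeline refresh produces a new `data_version` string. Both are assumptions of this example, as are the class and function names.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, Callable, Iterable


class DiskCache:
    """Result cache keyed by (data version, query), stored as JSON files."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, query: str, data_version: str) -> Path:
        digest = hashlib.sha256(f"{data_version}\n{query}".encode()).hexdigest()
        return self.root / f"{digest}.json"

    def get_or_compute(
        self, query: str, data_version: str, compute: Callable[[str], Any]
    ) -> Any:
        path = self._path(query, data_version)
        if path.exists():
            return json.loads(path.read_text())
        result = compute(query)
        # Write to a temp file first so readers never see a partial entry.
        tmp = path.with_suffix(".tmp")
        tmp.write_text(json.dumps(result))
        tmp.replace(path)
        return result


def warm(
    cache: DiskCache,
    queries: Iterable[str],
    data_version: str,
    compute: Callable[[str], Any],
) -> None:
    """Pre-compute common queries right after a pipeline refresh."""
    for query in queries:
        cache.get_or_compute(query, data_version, compute)
```

Putting the data version in the key means a refresh naturally invalidates old entries: the next lookup misses, recomputes once, and every later request is served from disk.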

Adaptive queries

Search fan-out splits into chunks and reclassifies heavy vs light at runtime.
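
One way such a scheme could work, sketched under assumptions: the search range is an integer interval (for example, epoch seconds), a chunk counts as "heavy" when it returns more rows than a threshold, and the chunk size shrinks after a heavy chunk and grows after a light one. The names and the halve/double policy are illustrative only.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Chunk = Tuple[int, int, str, List[dict]]


def adaptive_chunks(
    start: int,
    end: int,
    initial_size: int,
    run: Callable[[int, int], List[dict]],
    heavy_threshold: int,
    min_size: int = 1,
    max_size: Optional[int] = None,
) -> Iterator[Chunk]:
    """Walk [start, end) in chunks, resizing based on what each chunk returned."""
    size = initial_size
    cursor = start
    while cursor < end:
        upper = min(cursor + size, end)
        rows = run(cursor, upper)
        # Classify at runtime, from the observed result size.
        label = "heavy" if len(rows) > heavy_threshold else "light"
        yield cursor, upper, label, rows
        cursor = upper
        if label == "heavy":
            size = max(min_size, size // 2)  # back off on dense regions
        else:
            size = size * 2 if max_size is None else min(max_size, size * 2)
```

Here `run` stands in for whatever executes one sub-query. Yielding each chunk as it completes also fits naturally with streaming results to the client.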

Shipped into one tool

Archive extraction · Media transcoding · Cross-source search · SSE / NDJSON streaming · Interactive analytics · AI-assisted insights · Self-serve sampling · RBAC & audit

04 · Inside the platform

Four engines under one roof.

On-demand extraction, cross-source search, live analytics, and self-serve sampling, all on one FastAPI backend.

Archive extraction

Production .tar / .tar.gz / .zip pulled from cloud storage and unpacked on demand.

ffmpeg · NumPy · SciPy · matplotlib · base64 inline
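
A minimal sketch of the unpacking step only, using the Python standard library and assuming the archive has already been downloaded to a local path. The function names and the path-safety check are additions for this example.

```python
import tarfile
import zipfile
from pathlib import Path
from typing import List


def _is_safe(dest: Path, member_name: str) -> bool:
    """Reject members that would land outside the destination directory."""
    return (dest / member_name).resolve().is_relative_to(dest.resolve())


def unpack(archive: Path, dest: Path) -> List[str]:
    """Unpack a .tar, .tar.gz, or .zip archive and return the extracted file names."""
    name = archive.name.lower()
    if name.endswith((".tar", ".tar.gz")):
        dest.mkdir(parents=True, exist_ok=True)
        with tarfile.open(archive, "r:*") as tar:  # "r:*" auto-detects compression
            members = [
                m for m in tar.getmembers()
                if (m.isfile() or m.isdir()) and _is_safe(dest, m.name)
            ]
            tar.extractall(dest, members=members)
            return sorted(m.name for m in members if m.isfile())
    if name.endswith(".zip"):
        dest.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            names = [n for n in zf.namelist() if _is_safe(dest, n)]
            zf.extractall(dest, members=names)
            return sorted(n for n in names if not n.endswith("/"))
    raise ValueError(f"unsupported archive type: {archive.name}")
```

Links and special files inside tar archives are skipped here on purpose, since production archives should only need regular files and directories.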

Session & entity search

Look up a user, org, or product, then stream its sessions from across every source.

ClickHouse · Trino · NDJSON / SSE · adaptive fan-out
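
To make the streaming formats concrete, here is a small sketch of how session rows might be serialized as NDJSON (one JSON object per line) and wrapped as Server-Sent Events. The row shape and the `source` field are assumptions for the example.

```python
import json
from typing import Dict, Iterable, Iterator


def ndjson_stream(sources: Dict[str, Iterable[dict]]) -> Iterator[str]:
    """Yield one JSON line per row, tagged with the source it came from."""
    for source, rows in sources.items():
        for row in rows:
            yield json.dumps({"source": source, **row}, separators=(",", ":")) + "\n"


def to_sse(ndjson_line: str) -> str:
    """Wrap one NDJSON line as a Server-Sent Events 'data:' frame."""
    return "data: " + ndjson_line.rstrip("\n") + "\n\n"


for line in ndjson_stream({"clickhouse": [{"id": 1}], "trino": [{"id": 2}]}):
    print(line, end="")
```

On a FastAPI backend, a generator like this can be passed to `StreamingResponse` so the client starts rendering sessions before the slowest source has finished.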

Analytics

Interactive dashboards for data rate, quality tiers, comparisons, and AI-written insights.

ECharts · μPlot · pandas · Claude insights

Sampler

Filter, count, and pull every matching archive with one generated command.

Python · gsutil / GCS · aws / S3
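
A sketch of the "filter, count, generate one command" flow for the GCS case. The record fields (`url`, `product`) and the function name are hypothetical; the generated command uses `gsutil -m cp`, which copies multiple sources in parallel into a destination directory.

```python
import shlex
from typing import Dict, List, Tuple


def sample_command(
    records: List[Dict[str, str]], product: str, dest: str
) -> Tuple[int, str]:
    """Filter archive records by product and build a single gsutil copy command."""
    urls = [r["url"] for r in records if r["product"] == product]
    if not urls:
        raise ValueError(f"no archives match product {product!r}")
    if any(not u.startswith("gs://") for u in urls):
        raise ValueError("this sketch only handles gs:// URLs")
    # shlex.join quotes each argument so the command is safe to paste into a shell.
    command = shlex.join(["gsutil", "-m", "cp", *urls, dest])
    return len(urls), command


records = [
    {"url": "gs://archives/a.tar.gz", "product": "alpha"},
    {"url": "gs://archives/run 2.zip", "product": "alpha"},
    {"url": "gs://archives/b.tar.gz", "product": "beta"},
]
count, command = sample_command(records, "alpha", "./samples")
print(count, command)
```

Returning the count alongside the command lets the user check how much they are about to download before running it.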

Structured · pipeline

A scheduled ETL pipeline

Trino federates the sources; a daily, Dockerized batch job lands rollups in ClickHouse for the dashboards.

Unstructured · on demand

Media & signal

Image frames · video · radar / IQ signal · run logs · nested JSON blobs.

05 · Process automation

Results that report themselves.

A pipeline I built so quality reaches the team with no human in the loop.

When a test run finishes, the results land in ClickHouse, the dashboards refresh, and a Teams message posts a link to the session straight into the channel.
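
The notification step could look roughly like this, assuming a Teams incoming webhook that accepts a simple JSON body with a `text` field. The message wording, function names, and session URL are placeholders for the example.

```python
import json
import urllib.request
from typing import Dict


def build_message(run_id: str, passed: int, failed: int, session_url: str) -> Dict[str, str]:
    """Compose the channel message for one finished test run."""
    status = "passed" if failed == 0 else "FAILED"
    return {
        "text": (
            f"Test run {run_id} {status}: {passed} passed, {failed} failed. "
            f"[Open session]({session_url})"
        )
    }


def post_to_teams(webhook_url: str, message: Dict[str, str]) -> int:
    """POST the message to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Hypothetical run; the URL is a placeholder, not a real endpoint.
print(build_message("run-42", 118, 2, "https://example.invalid/sessions/run-42"))
```

Keeping message construction separate from the HTTP call makes the wording easy to test without touching the network.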

06 · Roles as lenses

A problem-solver first.

I start with the problem, not a job title. Solving it well usually means owning the whole thing: the data, the platform, the interface, the deployment, and what it costs to run. Each part is sized to what the company needs now and built to scale later.