Self-hosted game analytics for studios that want control and transparency 🕹️

Run your game analytics on your own infrastructure.

Own your raw event data, reduce analytics costs, and avoid recurring SaaS pricing. DataQuery is a self-hosted analytics pipeline for game studios looking for a practical alternative to tools such as Firebase Analytics, Amplitude, or devtodev.

No subscriptions. No per-user fees. No per-event pricing.

Default supported setup: VM + GCS + BigQuery + Dataform + Metabase
👾 Producers (game)
Backend endpoint, JS SDK; game SDK (soon)
→
⚙️ Collector + NATS + Raw Writer
Docker containers on VM
→
🧱 Parquet in GCS
→
🔎 BigQuery + Dataform
→
📊 Metabase

When your game grows, analytics costs grow with it

Many studios start with Firebase Analytics, Amplitude, or devtodev because setup is fast. The problem appears later: more players means more in-app events produced, more storage, more querying, and more cost pressure.

💸

Usage-based pricing becomes expensive

Once your game starts generating serious event volume, analytics pricing stops feeling lightweight. High daily event counts can quickly turn a convenient tool into a recurring cost problem.

🔒

Your data lives inside someone else's system

Raw event access, exports, and downstream flexibility are often limited by the platform you chose early. That creates lock-in exactly when you need more control.

🧰

Building everything yourself takes too much time

A custom analytics pipeline gives you freedom, but most studios do not want to spend weeks assembling ingestion, storage, modeling, and BI from scratch.

Why I built DataQuery

Built from a real game analytics cost problem

I work in data engineering and I have also spent time around game development teams where analytics volume grows fast and platform limits become real. I've seen what happens when Firebase Analytics hits the 1 million events per day limit: either you pay for the enterprise version or you start losing data.

DataQuery comes from that exact problem. Teams need a simpler way to run analytics on infrastructure they control, without committing to recurring SaaS pricing or building the whole pipeline from zero.

More about me: dataengineer.fi

A practical self-hosted analytics setup for game studios

DataQuery gives you a deployable analytics pipeline that runs on your infrastructure and keeps raw event data portable from the beginning.

🛠️ One-time setup, customer-owned deployment

I deploy the system into your infrastructure for a one-time 500 EUR payment. No recurring service fee from me. If your team prefers, you can also install it yourself for free.

  • One-time setup instead of recurring vendor pricing
  • Runs on infrastructure you control
  • Self-serve option for technical teams
🖥️ 1 VM to start
🐳 Docker containers
📈 Scale later when needed

📦 Portable raw event layer

Events are written to blob storage in Parquet format (Google Cloud Storage by default). That means your raw event layer stays portable and can be moved, replayed, or connected to other systems later without locking your tracking model to one vendor.

  • Parquet raw storage for portability
  • BigQuery as the current default query layer
  • Support for more downstream databases coming soon
Events → Parquet → BigQuery → OBT via Dataform → BI

Default pipeline and deployment model

The default DataQuery setup is designed to be simple to understand, practical to run, and easy to evolve later.

🚀

Producer layer

The backend event endpoint is supported now, and a JavaScript SDK is available. Game SDKs are coming soon.

  • Backend producers supported
  • JS SDK available (for Chrome extensions)
  • Game SDKs coming soon
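
A backend producer could build and send an event roughly as follows. This is a minimal sketch, not the documented DataQuery API: the collector URL, the `/events` path, and the helper names `build_event` and `send_event` are assumptions made for illustration, and sending custom parameters as a JSON-encoded string is assumed from the stored event example on this page.

```python
import json
import urllib.request
import uuid
from datetime import datetime, timezone

# Hypothetical collector address; replace with your own deployment's URL.
COLLECTOR_URL = "http://localhost:8080/events"


def build_event(app_id, event_name, user_pseudo_id, params=None):
    """Build an event dict using the envelope field names from this page."""
    now = datetime.now(timezone.utc)
    return {
        "event_id": str(uuid.uuid4()),
        "app_id": app_id,
        "environment": "prod",
        "event_name": event_name,
        "event_timestamp": now.isoformat(timespec="milliseconds"),
        "user": {
            "user_id": None,
            "user_pseudo_id": user_pseudo_id,
            "session_id": None,
        },
        # Assumption: custom parameters travel as a JSON-encoded string.
        "event_params_json": json.dumps(params or {}),
    }


def send_event(event, url=COLLECTOR_URL, timeout=5):
    """POST one event as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

With a collector running, `send_event(build_event("mygame.prod", "level_completed", "anon_abc", {"level": 12}))` would post a single event to the assumed URL.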
⚙️

Ingestion layer

The default setup runs on one VM with Docker containers for collector-api, raw-writer, and NATS JetStream.

  • Simple starting point
  • NATS JetStream buffers events
  • Scale components later as needed
🧱

Storage and modeling

Raw events are stored in Google Cloud Storage as Parquet and connected to BigQuery as an external table. A One Big Table (OBT) data model is provided and implemented in Dataform. You can also write your own transformations in Dataform or dbt to turn the data into reporting-friendly output tables in BigQuery.

📊

BI layer

Metabase is the suggested open-source BI layer and can be deployed separately, for example on Google Cloud Run or a dedicated VM.

Event model designed for flexible analytics

DataQuery uses a simple event envelope with stable top-level fields and flexible custom payload areas.

{
  "event_id": "uuid",
  "app_id": "mygame.prod",
  "environment": "prod",
  "event_name": "level_completed",
  "event_timestamp": "2026-04-06T06:00:00.000Z",
  "received_at": "2026-04-06T06:00:00.412345+00:00",
  "user": {
    "user_id": "123",
    "user_pseudo_id": "anon_abc",
    "session_id": "sess_xyz"
  },
  "device": {
    "platform": "android",
    "app_version": "1.4.2",
    "os_version": null,
    "device_model": null,
    "locale": null,
    "timezone": null
  },
  "event_params_json": "{\"level\": 12, \"duration_sec\": 84}",
  "user_properties_json": "{\"payer\": true, \"country\": \"FI\"}",
  "traffic_source_json": "{}",
  "geo_json": "{}",
  "consent_json": "{}",
  "ingest_request_id": "5d89a0c9-8e70-4108-b5e8-49bf9b2896d8",
  "ingest_user_agent": "iOS ...",
  "ingest_ip_hash": "d8e0...",
  "nats_stream": "EVENTS",
  "nats_sequence": 12345
}

Core fields

event_id, app_id, event_name, and event_timestamp define the event envelope and make ingestion, replay, and downstream processing predictable.
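
A check for the core fields could look like the sketch below. The function name `missing_core_fields` is hypothetical, and whether the collector actually rejects events this way is an assumption; the list of required fields is taken from the paragraph above.

```python
# Core envelope fields named in the text above.
REQUIRED_FIELDS = ("event_id", "app_id", "event_name", "event_timestamp")


def missing_core_fields(event):
    """Return the core field names that are absent or empty in an event."""
    return [name for name in REQUIRED_FIELDS if not event.get(name)]


complete = {
    "event_id": "uuid",
    "app_id": "mygame.prod",
    "event_name": "level_completed",
    "event_timestamp": "2026-04-06T06:00:00.000Z",
}
incomplete = {"event_id": "uuid", "event_name": ""}

print(missing_core_fields(complete))    # []
print(missing_core_fields(incomplete))  # ['app_id', 'event_name', 'event_timestamp']
```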

Identity fields

user_id, user_pseudo_id, and session_id support both authenticated and anonymous analytics patterns.
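
One common way to combine these fields downstream is to prefer the authenticated ID and fall back to the anonymous one. The sketch below assumes that rule; the function name `analytics_user_key` and the `user:` / `anon:` prefixes are hypothetical and not part of DataQuery.

```python
def analytics_user_key(user):
    """Prefer the authenticated user_id, else fall back to user_pseudo_id."""
    if user.get("user_id"):
        return "user:" + str(user["user_id"])
    if user.get("user_pseudo_id"):
        return "anon:" + str(user["user_pseudo_id"])
    return None


logged_in = {"user_id": "123", "user_pseudo_id": "anon_abc", "session_id": "sess_xyz"}
anonymous = {"user_id": None, "user_pseudo_id": "anon_abc", "session_id": "sess_xyz"}

print(analytics_user_key(logged_in))  # user:123
print(analytics_user_key(anonymous))  # anon:anon_abc
```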

Flexible fields

event_params and user_properties (stored as the JSON strings event_params_json and user_properties_json) are designed for custom sub-fields. These can later be transformed in Dataform into explicit reporting columns and output tables.
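
In DataQuery this transformation is done in Dataform. The Python sketch below only illustrates the idea of promoting selected sub-fields to explicit columns; the function name `flatten_event` and the `param_` / `user_` column prefixes are hypothetical.

```python
import json


def flatten_event(event, param_columns, property_columns):
    """Copy selected keys from the JSON-string fields into top-level columns."""
    params = json.loads(event.get("event_params_json") or "{}")
    properties = json.loads(event.get("user_properties_json") or "{}")
    row = {
        "event_id": event.get("event_id"),
        "event_name": event.get("event_name"),
        "event_timestamp": event.get("event_timestamp"),
    }
    # Keys missing from the payload become None, like NULL columns in a table.
    for column in param_columns:
        row["param_" + column] = params.get(column)
    for column in property_columns:
        row["user_" + column] = properties.get(column)
    return row


event = {
    "event_id": "uuid",
    "event_name": "level_completed",
    "event_timestamp": "2026-04-06T06:00:00.000Z",
    "event_params_json": '{"level": 12, "duration_sec": 84}',
    "user_properties_json": '{"payer": true, "country": "FI"}',
}
row = flatten_event(event, ["level", "duration_sec"], ["payer", "country"])
print(row["param_level"], row["user_country"])  # 12 FI
```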

Two ways to use DataQuery

Choose the path that fits your studio's technical capacity and urgency.

Free self-serve
€0
For technical teams who want to deploy and operate the stack themselves.
  • Access to the self-serve repository
  • Deployment documentation
  • Core ingestion and storage pipeline
  • Suitable for technical teams comfortable with setup work

Access is shared after a short form, not as an instant anonymous download.

Who DataQuery is for, and who it is not for

This is not meant to be everything for everyone. It works best for teams with some analytics maturity who want ownership without building the whole stack from zero.

✅

Who DataQuery is for

  • Game studios with a Product Analyst or product/data owner
  • Teams with a small data team but no dedicated data engineer
  • Studios comfortable working with SQL, BigQuery, or BI tools like Metabase
  • Teams that want ownership of data without building the whole pipeline from scratch
🚫

Who it is not for

  • Teams expecting a polished plug-and-play SaaS dashboard
  • Studios with no one comfortable querying data
  • Teams that want zero operational ownership at all

Frequently asked questions

No. DataQuery is a self-hosted analytics pipeline deployed into your infrastructure.

You do. The default supported setup uses your own VM and Google Cloud services where needed.

No. The service offer is a one-time 500 EUR setup payment. The self-serve version is free.

The default setup starts simple, usually on one VM, and can be scaled later as your traffic increases. Cloud VMs are easy to scale: compute capacity can usually be increased just by changing settings in the provider's interface.

I have personally worked with Google Cloud Platform and UpCloud, and I would recommend both.

Use my referral link for UpCloud (https://signup.upcloud.com/?promo=UR4689) to get a 25 EUR bonus. VM costs start at 3 EUR/month.

Raw events are written to the blob storage in Parquet format (Google Cloud Storage by default).

Yes. Portability of the raw layer is one of the main design goals.

Yes. Backend producers are supported now. JavaScript support is included, and game SDK support is planned.

It is designed for teams that want to move away from those tools and own their analytics stack, though migration details depend on your current event setup.

Yes. I'm working on it.