Reference architectures

Reference architectures for common MooseFS deployments

MooseFS is a general-purpose POSIX file system, so the same software backs very different deployments — from a handful of nodes in one rack to multi-site clusters holding petabytes. The archetypes below sketch the shapes our customers run today; each one is grounded in the public stories on the testimonials page, not in marketing imagination.

Detailed diagrams and sizing guides are coming. Until they land, the best path is a short conversation: tell us what you’re storing, how much of it, and how it gets read, and we’ll reply with a written architecture suggestion.

Archetypes

Four shapes we see most often

Each archetype below is a starting point, not a fixed recipe. Real clusters mix and match — the same MooseFS deployment can host VM disks, archives and analytics datasets side by side on different storage classes.

Hyperconverged hosting

Compute, network and storage on the same nodes — chunkservers run beside the hypervisor on every box. Hot-swappable disks and rolling upgrades mean a node can drop without taking the service with it. Hosting providers and managed-service operators run this shape today — see the hyperconverged infrastructure stories.

Capacity-tier archive

Many chunkservers full of large, slower disks, with erasure coding (up to nine parity sums in the Pro edition) to keep the cost per usable terabyte down. Snapshots and the built-in trash bin give a soft-delete safety net. Backup, archive and bulk content-delivery workloads run this way today — see storage and CDN customers.
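
To make the cost-per-usable-terabyte point concrete, here is a minimal arithmetic sketch. It assumes a layout of 8 data parts plus a configurable number of parity parts. The 8 is an illustrative assumption; the text above only states the upper bound of nine parity sums. The function names are hypothetical, and the numbers ignore metadata, free-space headroom and any other real-world overhead.

```python
from fractions import Fraction


def ec_raw_per_usable(data_parts: int, parity_parts: int) -> Fraction:
    """Raw bytes stored per usable byte for a data+parity erasure code."""
    if data_parts < 1 or parity_parts < 0:
        raise ValueError("need at least one data part and no negative parity")
    return Fraction(data_parts + parity_parts, data_parts)


def copies_raw_per_usable(copies: int) -> Fraction:
    """Raw bytes stored per usable byte when keeping full copies."""
    if copies < 1:
        raise ValueError("need at least one copy")
    return Fraction(copies)


def raw_needed_tb(usable_tb: int, ratio: Fraction) -> float:
    """Raw capacity (TB) required to hold usable_tb at the given ratio."""
    return float(usable_tb * ratio)


if __name__ == "__main__":
    usable_tb = 1000  # illustrative target, not a sizing recommendation
    data_parts = 8    # assumed for illustration only

    # Compare schemes that survive the same number of lost parts/copies.
    for losses in (1, 2):
        ec = ec_raw_per_usable(data_parts, losses)
        rep = copies_raw_per_usable(losses + 1)
        print(
            f"survive {losses} loss(es): "
            f"EC {data_parts}+{losses} needs {raw_needed_tb(usable_tb, ec):.0f} TB raw, "
            f"{losses + 1} copies need {raw_needed_tb(usable_tb, rep):.0f} TB raw"
        )
```

Under these assumptions, tolerating two lost parts costs 1.25 raw terabytes per usable terabyte with 8+2 erasure coding, against 3 with three full copies. More parity buys more fault tolerance at a higher ratio, so the right setting depends on the failure model of the cluster.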

Streaming & media ingest

A single POSIX namespace spread across many chunkservers, so concurrent writes (recording) and parallel reads (playback) hit different disks instead of bottlenecking on one box. Webinar platforms, internet TV operators and post-production teams run this shape — see the streaming and broadcast stories.

Analytics & research data lake

Hundreds of compute clients reading and writing in parallel against a single mount, with metadata kept in two or more copies for durability. Sequencing labs, statistical-genomics groups and large-scale internet analytics platforms run this way — Gemius has been doing it since 2005: 6 PB and 300,000 events per second, 24/7. See the analytics and research stories.

Next steps

Talk to us about your deployment

Until the detailed diagrams and sizing guides are published, the fastest way to a useful answer is a short message: rough capacity, the workload pattern (sequential, random, mixed), and any constraints you already know about (network, sites, hardware in hand). We’ll come back with a written architecture suggestion — no sales calls in the meantime.
