SplitFire AI: Three Layers of Complexity
A practical look at SplitFire AI's three-layer architecture: data access, audio inference, and the application layer.
Understanding SplitFire AI's Architecture
The easiest way to explain SplitFire AI is to break it into three layers: data access, audio inference, and the app itself. Each layer solves a different problem, and keeping them separate makes the system easier to understand and improve.
       ┌─────────────────────┐
       │  SplitFire AI App   │
       └─────────────────────┘
                  ▲
                  │
    ┌───────────────────────────┐
    │    AI Audio Inference     │
    │         Platform          │
    └───────────────────────────┘
                  ▲
                  │
┌───────────────────────────────────┐
│      Data Provider Platform       │
│         api.splitfire.ai          │
└───────────────────────────────────┘
Layer 1: Data Provider Platform
At the bottom is the Data Provider Platform at api.splitfire.ai. It handles authentication and access to music sources through OAuth.
Community-powered access
This layer connects users to the services they already use:
- Spotify
- YouTube Music
- Apple Music
Why this layer matters
The data provider layer gives us a clear place to handle:
- Transparency in how access works
- Privacy through OAuth-based authentication
- Extensibility as support expands to more services
- Reliability through a focused service boundary
Its job is straightforward: authenticate users, request only the permissions it needs, and pass the resulting audio metadata or access on to the rest of the pipeline, all while respecting platform rules and user privacy.
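To make "request the right permissions" concrete, here is a minimal sketch of how an OAuth 2.0 authorization-code request with narrow scopes might be built. The endpoint URL, client ID, redirect URI, and scope names are all placeholders we made up for illustration; they are not the real api.splitfire.ai interface or any music service's actual scopes.

```python
import secrets
from urllib.parse import urlencode

# Hypothetical endpoint: real providers each publish their own authorize URL.
AUTHORIZE_URL = "https://provider.example/oauth/authorize"


def build_authorization_url(client_id, redirect_uri, scopes):
    """Return (url, state) for an OAuth 2.0 authorization-code request."""
    # Random value the app stores and checks on the callback (CSRF protection).
    state = secrets.token_urlsafe(16)
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        # OAuth 2.0 scopes are sent as one space-delimited string.
        "scope": " ".join(scopes),
        "state": state,
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}", state


# Scope names below are illustrative only.
url, state = build_authorization_url(
    "demo-client",
    "https://app.example/callback",
    ["library.read", "playlist.read"],
)
```

The user is sent to `url`, approves the listed scopes, and the service redirects back with a code and the same `state`. Asking for read-only scopes like these is one way the privacy goal above could show up in practice.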
Layer 2: AI Audio Inference Platform
The middle layer is the AI Audio Inference Platform. This is where stem separation and related audio processing happen.
Built on open-source models
We rely on pre-trained open-source models rather than rebuilding the core audio separation work from scratch.
That approach gives us a few practical benefits:
- Strong starting quality from models trained on large music datasets
- Faster iteration because we can improve the product without redoing the foundation
- Better visibility into how the models work
- Ongoing improvement through open-source research and community contributions
Using open-source models is not just a philosophical choice. It helps us move faster, inspect the system more clearly, and build on work that has already been tested in real-world audio tasks.
Layer 3: SplitFire AI Application Layer
At the top is the SplitFire AI application layer. This is the part people actually use.
Multi-platform access
We want SplitFire AI to be available wherever people work:
- Web application
- Mobile apps
- Desktop clients
- API access
Keeping the interface practical
The app layer is where the technical work becomes usable. It includes things like:
- A simple workflow for uploading, separating, and exporting audio
- Preview controls for checking stems during processing
- Mix controls for muting, soloing, or adjusting parts
- Export options based on format and quality needs
- Integrations that fit into existing music workflows
The goal here is not to show complexity. It is to hide the right amount of it.
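As an example of the kind of logic the interface hides, here is a rough sketch of how mute and solo settings could be resolved into a per-stem gain. The `StemControl` type and the rule that mute always wins over solo are our assumptions for illustration, not a description of the shipped app.

```python
from dataclasses import dataclass


@dataclass
class StemControl:
    """Hypothetical per-stem mixer state."""
    name: str
    gain: float = 1.0
    muted: bool = False
    soloed: bool = False


def effective_gains(controls):
    """Map each stem name to the gain that should actually be applied."""
    any_solo = any(c.soloed for c in controls)
    gains = {}
    for c in controls:
        # If any stem is soloed, only soloed stems play.
        # Assumption: a muted stem stays silent even if it is also soloed.
        audible = (c.soloed or not any_solo) and not c.muted
        gains[c.name] = c.gain if audible else 0.0
    return gains


controls = [
    StemControl("vocals", soloed=True),
    StemControl("drums", gain=0.8),
    StemControl("bass", muted=True),
]
print(effective_gains(controls))
# prints {'vocals': 1.0, 'drums': 0.0, 'bass': 0.0}
```

The user only sees three buttons and a slider per stem; the precedence rules stay out of the way.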
How the layers work together
When someone uses SplitFire AI, the flow looks like this:
- The Data Provider Platform authenticates with the music service and obtains the access the user has granted.
- The AI Audio Inference Platform runs stem separation using pre-trained open-source models.
- The Application Layer presents the results through controls that make sense in practice.
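The three steps above can be sketched as three functions handed to one small orchestrator. Every name here (`TrackAccess`, `SeparationResult`, `run_pipeline`, the stub functions, and the stem names) is hypothetical and exists only to show the shape of the flow; the real services are not documented in this post.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TrackAccess:
    """What Layer 1 hands over: which track, and whether access was granted."""
    service: str
    track_id: str
    granted: bool


@dataclass
class SeparationResult:
    """What Layer 2 hands over: stem name -> audio samples."""
    track_id: str
    stems: Dict[str, List[float]]


def run_pipeline(
    fetch_access: Callable[[str, str], TrackAccess],
    separate: Callable[[TrackAccess], SeparationResult],
    present: Callable[[SeparationResult], object],
    service: str,
    track_id: str,
):
    """Run the layers in order; each one only sees the previous layer's output."""
    access = fetch_access(service, track_id)   # Layer 1: data provider
    if not access.granted:
        raise PermissionError(f"no access to {track_id} on {service}")
    result = separate(access)                  # Layer 2: audio inference
    return present(result)                     # Layer 3: application


# Stub layers so the sketch runs without any network or model.
def fake_fetch_access(service, track_id):
    return TrackAccess(service, track_id, granted=True)


def fake_separate(access):
    silence = [0.0] * 4  # stand-in for real audio
    return SeparationResult(
        access.track_id, {"vocals": silence, "accompaniment": silence}
    )


def list_stems(result):
    return sorted(result.stems)


summary = run_pipeline(
    fake_fetch_access, fake_separate, list_stems, "example-service", "track-123"
)
print(summary)  # prints ['accompaniment', 'vocals']
```

Because each layer is just a function with a narrow input and output, any one of them can be swapped (a new music service, a different model, a new UI) without touching the other two.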
Why the layering matters
This structure is useful for a few reasons.
Separation of concerns
Each layer has a clear responsibility, which keeps the system easier to maintain.
Scalability
Different parts of the platform can scale independently as usage changes.
Flexibility
We can add new services, improve models, or ship new product features without having to redesign everything at once.
Transparency
The combination of a clearly defined data layer and open-source model layer makes the system easier to explain and inspect.
Community at every layer
The community shows up in different ways across the stack:
- Layer 1: a community-oriented data access layer
- Layer 2: open-source models improved by researchers and developers
- Layer 3: product decisions shaped by user feedback
What this means for users
A layered system is only useful if it improves the end result. In practice, it means:
- Access to existing music libraries
- Quality from strong open-source audio models
- Trust through a more inspectable architecture
- Convenience across devices and platforms
- Privacy through scoped authentication flows
Looking ahead
This structure also gives us room to keep improving:
- More services in the data provider layer
- Better open-source models in the inference layer
- Better workflows and features in the app layer
That is the basic idea behind SplitFire AI: separate the system into clear layers, keep each part understandable, and make the final experience easier for musicians to use.