Tools
DeepFabric uses Spin, a WebAssembly framework, to execute tools during dataset generation. Each tool runs in an isolated sandbox, so the training data is built from real execution results rather than simulated ones.
Why Real Execution Matters
Traditional synthetic data generators simulate tool outputs, typically by having an LLM invent them, so the results need not match any actual state. With Spin, tools execute against real state:
# Simulated (unrealistic)
Agent: read_file("config.json")
Result: {"setting": "value"} # LLM hallucinated this
# Real execution (accurate)
Agent: read_file("config.json")
Result: FileNotFound # Actual state
Agent: write_file("config.json", content)
Result: Written 42 bytes # Real operation
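The contrast above can be reproduced with a small in-memory stand-in. This is only an illustrative sketch: the `VirtualFS` class and its return strings are hypothetical and are not DeepFabric's VFS component. It shows results that depend on the current state rather than on a guess.

```python
class VirtualFS:
    """Hypothetical in-memory filesystem used only to illustrate stateful results."""

    def __init__(self):
        self._files = {}  # path -> file content

    def read_file(self, path):
        # The result depends on what has actually been written so far.
        if path not in self._files:
            return "FileNotFound"
        return self._files[path]

    def write_file(self, path, content):
        self._files[path] = content
        return f"Written {len(content.encode('utf-8'))} bytes"


fs = VirtualFS()
first_read = fs.read_file("config.json")    # file does not exist yet
write_result = fs.write_file("config.json", '{"setting": "value"}')
second_read = fs.read_file("config.json")   # now returns the stored content

print(first_read)
print(write_result)
print(second_read)
```

The first read fails because nothing has been written yet, which is the kind of outcome a simulated generator tends to miss.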
Architecture
┌─────────────────┐     ┌─────────────────┐
│   DeepFabric    │────▶│  Spin Service   │
│    (Python)     │     │   (WASM Host)   │
└─────────────────┘     └────────┬────────┘
                                 │
              ┌──────────────────┼──────────────────┐
              ▼                  ▼                  ▼
         ┌──────────┐       ┌──────────┐       ┌──────────┐
         │   VFS    │       │   Mock   │       │  GitHub  │
         │Component │       │Component │       │Component │
         └──────────┘       └──────────┘       └──────────┘
Components are WebAssembly modules that handle specific tool categories:
| Component | Purpose | Tools |
|---|---|---|
| VFS | Virtual filesystem | read_file, write_file, list_files, delete_file |
| Mock | Dynamic mock execution | Any tool loaded via MCP |
| GitHub | GitHub API (experimental) | Issues, PRs, commits |
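One way to picture the table is as a lookup from tool name to component. The sketch below assumes that any tool not claimed by a named component falls through to the Mock component. The `build_router` helper, the lowercase component names, and the `get_weather` tool are hypothetical, and how the Spin service actually routes requests is not described here. The GitHub component is left out because its tool names are not listed.

```python
from typing import Callable, Dict, Set


def build_router(components: Dict[str, Set[str]], fallback: str = "mock") -> Callable[[str], str]:
    """Return a function that maps a tool name to a component name (illustrative)."""
    # Invert {component: {tools}} into {tool: component} for direct lookup.
    tool_to_component = {
        tool: name for name, tools in components.items() for tool in tools
    }

    def route(tool_name: str) -> str:
        # Unknown tools are assumed to be handled by the fallback component.
        return tool_to_component.get(tool_name, fallback)

    return route


# VFS tool names are taken from the table above.
route = build_router({"vfs": {"read_file", "write_file", "list_files", "delete_file"}})

vfs_target = route("read_file")
mock_target = route("get_weather")  # hypothetical tool loaded via MCP

print(vfs_target, mock_target)
```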
Quick Start
# Install Spin (macOS)
brew install fermyon/tap/spin
# Build and run
cd tools-sdk
spin build
spin up
The service runs at http://localhost:3000.
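Before generating data, it can help to confirm that something is answering at that address. The following is a minimal sketch using only the Python standard library. The `is_service_up` helper is hypothetical, and it makes no assumption about which paths the service exposes: any HTTP response, even an error status, counts as reachable.

```python
import urllib.error
import urllib.request


def is_service_up(base_url: str = "http://localhost:3000", timeout: float = 2.0) -> bool:
    """Return True if any HTTP response comes back from base_url."""
    # Bypass any configured proxy, since the service is expected to be local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or name resolution failure.
        return False


if __name__ == "__main__":
    print("Spin service reachable:", is_service_up())
```

If this prints `False`, check that `spin up` is still running in the `tools-sdk` directory.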
Configure DeepFabric to use it:
Session Isolation
Each dataset sample gets an isolated session. Files created during one sample don't affect others:
# Session A: Creates config.json
# Session B: config.json doesn't exist
# After sample generation, session is cleaned up
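The isolation behaviour can be sketched by keying file state on a session ID. This is a simplified illustration only: the `SessionStore` class and its methods are hypothetical and do not reflect how the Spin components actually store session state.

```python
import uuid


class SessionStore:
    """Hypothetical per-session file state, used only to illustrate isolation."""

    def __init__(self):
        self._sessions = {}  # session_id -> {path: content}

    def create(self) -> str:
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = {}
        return session_id

    def write_file(self, session_id: str, path: str, content: str) -> None:
        self._sessions[session_id][path] = content

    def exists(self, session_id: str, path: str) -> bool:
        return path in self._sessions[session_id]

    def cleanup(self, session_id: str) -> None:
        # Drop all state for the session once its sample is finished.
        self._sessions.pop(session_id, None)

    def active_count(self) -> int:
        return len(self._sessions)


store = SessionStore()
session_a = store.create()
session_b = store.create()

store.write_file(session_a, "config.json", "{}")
a_has_config = store.exists(session_a, "config.json")  # written in session A
b_has_config = store.exists(session_b, "config.json")  # never written in session B

store.cleanup(session_a)
store.cleanup(session_b)
remaining = store.active_count()

print(a_has_config, b_has_config, remaining)
```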
Next Steps
- Spin Setup - Installation and running
- VFS Component - Virtual filesystem tools
- Mock Component - Dynamic tool mocking
- MCP Integration - Loading tools from MCP servers
- Custom Tools - Creating your own components