Welcome to GlassGen
GlassGen is a flexible synthetic data generation service that allows you to create realistic test data for your applications. It supports various data types and formats, making it easy to generate the data you need for testing and development.
Features
- Multiple Data Generators: Generate various types of data including personal information, business data, and more. Extend with custom generators using the custom schema example
- Flexible Configuration: Configure your data generation using simple JSON schemas
- Multiple Output Sinks: Write to CSV files, Kafka (Confluent and Aiven), webhooks, and custom sinks
- Rate Control: Fine-tune data generation with configurable records per second (RPS)
- Special Operators: Advanced features like controlled event duplication to simulate real-world scenarios
- Extensible Architecture: Create custom generators and sinks to meet your specific needs
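To make the Rate Control item more concrete, here is a minimal sketch of what pacing at a fixed RPS means. It is a generic illustration, not GlassGen's implementation: `emit_times` is a hypothetical helper, and the sketch assumes `rps` is a target rate with records spaced evenly in time.

```python
def emit_times(rps: float, num_records: int) -> list[float]:
    """Return the ideal send offset (seconds from start) for each record.

    Assumption: records are spaced evenly, one every 1/rps seconds.
    """
    if rps <= 0:
        raise ValueError("rps must be positive")
    interval = 1.0 / rps
    return [i * interval for i in range(num_records)]


# Example values: 5000 records at a target of 1000 records per second.
offsets = emit_times(rps=1000, num_records=5000)

# Under the even-spacing assumption the whole run takes about
# num_records / rps seconds.
total_seconds = 5000 / 1000
```

Under this assumption, a run of 5000 records at 1000 RPS lasts roughly five seconds. Actual throughput depends on the sink and the machine.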
Quick Start
- Install GlassGen:
pip install glassgen
- Create a configuration file:
{
  "schema": {
    "id": "$uuid",
    "name": "$name",
    "email": "$email",
    "age": "$intrange(18,65)"
  },
  "sink": {
    "type": "csv",
    "params": {
      "path": "output.csv"
    }
  },
  "generator": {
    "rps": 1000,
    "num_records": 5000
  }
}
- Run GlassGen using either the CLI or Python SDK:
Using CLI
glassgen generate --config config.json
Using Python SDK
import glassgen
import json

# Load configuration from file
with open("config.json") as f:
    config = json.load(f)

# Start the generator
glassgen.generate(config=config)
Or directly in your code:
import glassgen

config = {
    "schema": {
        "id": "$uuid",
        "name": "$name",
        "email": "$email",
        "age": "$intrange(18,65)"
    },
    "sink": {
        "type": "csv",
        "params": {
            "path": "output.csv"
        }
    },
    "generator": {
        "rps": 1000,
        "num_records": 5000
    }
}

glassgen.generate(config=config)
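After a run, it can be useful to sanity-check the generated file. The sketch below uses only the Python standard library and makes two assumptions that the text above does not confirm: that the CSV sink writes a header row named after the schema fields, and that `$intrange(18,65)` is inclusive at both ends. `check_rows` is a hypothetical helper, not part of GlassGen.

```python
import csv
import io


def check_rows(rows):
    """Return a list of problems found in the rows (empty if none).

    Assumptions: each row is a dict keyed by schema field name, and
    "age" must be an integer in the inclusive range 18..65.
    """
    problems = []
    for n, row in enumerate(rows, start=1):
        if not row.get("id"):
            problems.append(f"row {n}: missing id")
        try:
            age = int(row.get("age"))
        except (TypeError, ValueError):
            problems.append(f"row {n}: age is not an integer")
            continue
        if not 18 <= age <= 65:
            problems.append(f"row {n}: age {age} outside 18-65")
    return problems


# Against the real output, after generation has finished:
# with open("output.csv", newline="") as f:
#     print(check_rows(csv.DictReader(f)))

# Self-contained demonstration on made-up rows (not real GlassGen output).
sample = io.StringIO(
    "id,name,email,age\n"
    "abc,Ann Lee,ann@example.com,42\n"
    "def,Bob Ray,bob@example.com,17\n"
)
problems = check_rows(csv.DictReader(sample))
```

Keeping the check separate from file handling means the same function works on a file, an in-memory buffer, or any iterable of dictionaries.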
Documentation
Contributing
We welcome contributions! Please check out our GitHub repository for more information.