JSON Data Validator
Abstract
A comprehensive JSON schema validation tool that validates JSON data against custom schemas, provides detailed error reporting, supports batch validation, and includes data generation capabilities. This project demonstrates advanced data validation, schema design, and file processing techniques.
🎯 Project Overview
This project creates a powerful JSON validation system with:
- Custom JSON schema definition and validation
- File and string validation capabilities
- Batch validation for multiple files
- Detailed error reporting with path information
- Sample data generation from schemas
- Validation statistics and reporting
- Interactive command-line interface
✨ Features
Schema Management
- Custom Schemas: Define and load custom JSON schemas
- File-based Schemas: Load schemas from JSON files
- Built-in Schemas: Pre-defined schemas for common data types
- Schema Information: View detailed schema structure and requirements
Validation Capabilities
- File Validation: Validate JSON files against schemas
- String Validation: Validate JSON strings directly
- Batch Validation: Validate multiple files with pattern matching
- Real-time Validation: Immediate feedback with detailed error messages
Error Reporting
- Detailed Errors: Path-based error reporting with context
- Error Categories: Type mismatches, missing fields, constraint violations
- Validation History: Track all validation attempts and results
- Export Reports: Generate comprehensive validation reports
Data Generation
- Sample Data: Generate valid sample data from schemas
- Schema Analysis: Understand schema structure and requirements
- Testing Support: Create test data for development and testing
🛠️ Technical Implementation
Class Structure
jsondatavalidator.py
class ValidationError:
    def __init__(self, path, message, expected=None, actual=None):
        # Detailed error information with context
        # Path tracking for nested validation errors
        ...
class JSONSchema:
    def __init__(self, schema):
        # Schema definition and validation logic
        # Recursive validation for nested structures
        ...
    def validate(self, data, path="root"):
        # Main validation entry point
        # Returns validation status and error list
        ...
class JSONDataValidator:
    def __init__(self):
        # Validator management and schema storage
        # Validation history and statistics tracking
        ...
Key Components
Schema Validation
- Type Checking: Validate data types (string, number, boolean, array, object)
- Constraint Validation: Check length, range, pattern, and enum constraints
- Required Fields: Ensure all required fields are present
- Nested Validation: Recursive validation for complex data structures
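The four checks above can be combined in one recursive function. The following is a minimal sketch, not the project's actual implementation: `validate_node`, `TYPE_MAP`, and the `(path, message)` tuple format are hypothetical names chosen for illustration, and only the `type`, `required`, `properties`, and `items` keywords are handled.

```python
# Illustrative sketch; validate_node and TYPE_MAP are hypothetical names.
TYPE_MAP = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "array": list,
    "object": dict,
}


def validate_node(data, schema, path="root"):
    """Return a list of (path, message) tuples; an empty list means valid."""
    errors = []
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(data, TYPE_MAP[expected])
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("number", "integer") and isinstance(data, bool):
            ok = False
        if not ok:
            errors.append((path, f"expected {expected}, got {type(data).__name__}"))
            return errors  # deeper checks assume the right type

    if isinstance(data, dict):
        for key in schema.get("required", []):
            if key not in data:
                errors.append((f"{path}.{key}", "required field is missing"))
        for key, sub_schema in schema.get("properties", {}).items():
            if key in data:
                errors.extend(validate_node(data[key], sub_schema, f"{path}.{key}"))

    if isinstance(data, list) and "items" in schema:
        for index, item in enumerate(data):
            errors.extend(validate_node(item, schema["items"], f"{path}[{index}]"))

    return errors
```

Each recursive call extends the path string, so an error deep inside a nested structure still reports where it occurred.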
Error Handling
- Path Tracking: Precise error location in dotted path notation with array indices (for example, root.users[0].email)
- Context Information: Expected vs. actual value reporting
- Error Aggregation: Collect all validation errors in single pass
- User-friendly Messages: Clear, actionable error descriptions
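One way to carry this context is a small record type whose string form becomes the user-facing message. The sketch below reuses the constructor signature shown under Class Structure; the dataclass form and the exact `__str__` wording are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ValidationError:
    """Fields follow the constructor in Class Structure; the message format is assumed."""
    path: str
    message: str
    expected: Optional[Any] = None
    actual: Optional[Any] = None

    def __str__(self) -> str:
        text = f"{self.path}: {self.message}"
        if self.expected is not None:
            text += f" (expected {self.expected!r}"
            text += f", got {self.actual!r})" if self.actual is not None else ")"
        return text


# Aggregation: keep appending to one list instead of stopping at the first problem.
errors = []
errors.append(ValidationError("root.age", "Value below minimum", expected=0, actual=-3))
errors.append(ValidationError("root.email", "Required field is missing"))
for error in errors:
    print(error)
```

Collecting errors in a list rather than raising on the first failure is what lets a single pass report every problem in the document.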
Data Processing
- JSON Parsing: Robust JSON file and string parsing
- File Operations: Safe file handling with error recovery
- Pattern Matching: Glob pattern support for batch operations
- Data Serialization: Export validation results and reports
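As a rough illustration of how glob matching and safe parsing fit together, the sketch below loads every file that matches a pattern and records a message instead of raising when a file is unreadable or malformed. The helper name `load_json_files` and its return shape are hypothetical and are not part of the project's API.

```python
import glob
import json


def load_json_files(pattern):
    """Map each matching path to (data, error_message).

    error_message is None when the file was read and parsed successfully.
    """
    results = {}
    # recursive=True lets "**" in the pattern match nested directories.
    for file_path in sorted(glob.glob(pattern, recursive=True)):
        try:
            with open(file_path, "r", encoding="utf-8") as handle:
                results[file_path] = (json.load(handle), None)
        except json.JSONDecodeError as exc:
            results[file_path] = (
                None,
                f"Invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}",
            )
        except (OSError, UnicodeDecodeError) as exc:
            results[file_path] = (None, f"Could not read file: {exc}")
    return results
```

Because one bad file only produces an error entry, a batch run can continue through the remaining files.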
🚀 How to Run
- Install Python: Ensure Python 3.7+ is installed (the validator uses only built-in modules)
- Run the Validator:
  python jsondatavalidator.py
- Getting Started:
  - Explore pre-loaded sample schemas
  - Validate sample JSON files
  - Create custom schemas for your data
  - Generate sample data for testing
💡 Usage Examples
Basic Validation
jsondatavalidator.py
# Create validator instance
validator = JSONDataValidator()
# Add a simple schema
user_schema = {
    "type": "object",
    "required": ["name", "email"],
    "properties": {
        "name": {"type": "string", "minLength": 2},
        "email": {"type": "string", "pattern": r"^.+@.+\..+$"}
    }
}
validator.add_schema("user", user_schema)
# Validate data
user_data = {"name": "John", "email": "john@example.com"}
is_valid, errors = validator.validate_data(user_data, "user")
if is_valid:
    print("✅ Validation successful!")
else:
    for error in errors:
        print(f"❌ {error}")
File Validation
jsondatavalidator.py
# Validate JSON file
is_valid, errors = validator.validate_file("data.json", "user")
# Batch validate multiple files
results = validator.batch_validate("data/*.json", "user")
for file_path, (is_valid, errors) in results.items():
    print(f"{file_path}: {'✅' if is_valid else '❌'} ({len(errors)} errors)")
Schema Management
jsondatavalidator.py
import json
# Load schema from file
validator.load_schema_from_file("user_schema.json", "user")
# Get schema information
schema_info = validator.get_schema_info("user")
print(json.dumps(schema_info, indent=2))
# Generate sample data
sample_data = validator.create_sample_data("user")
print(json.dumps(sample_data, indent=2))
🎨 Interactive Features
Main Menu Options
- Validate JSON File - Validate single file against schema
- Validate JSON String - Validate JSON input directly
- Batch Validate Files - Validate multiple files with patterns
- Load Schema from File - Import schema from JSON file
- Add Schema Manually - Define schema interactively
- View Schema Info - Explore schema structure
- Generate Sample Data - Create valid sample data
- View Validation Statistics - Analytics dashboard
- Export Validation Report - Generate detailed reports
- List Available Schemas - View loaded schemas
Pre-loaded Sample Schemas
User Schema
{
  "type": "object",
  "required": ["name", "email", "age"],
  "properties": {
    "name": {"type": "string", "minLength": 2, "maxLength": 50},
    "email": {"type": "string", "pattern": "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$"},
    "age": {"type": "integer", "minimum": 0, "maximum": 150},
    "status": {"type": "string", "enum": ["active", "inactive", "pending"]}
  }
}
Product Schema
{
  "type": "object",
  "required": ["name", "price", "category"],
  "properties": {
    "name": {"type": "string", "minLength": 1, "maxLength": 100},
    "price": {"type": "number", "minimum": 0},
    "category": {"type": "string", "enum": ["electronics", "clothing", "books"]},
    "tags": {"type": "array", "items": {"type": "string"}, "maxItems": 10}
  }
}
🔧 Advanced Features
Constraint Types
String Constraints
- minLength/maxLength: String length validation
- pattern: Regular expression pattern matching
- enum: Allowed values from predefined list
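The string keywords above could be checked along these lines. This is a sketch under assumptions: `check_string` is a hypothetical helper rather than a function from the project, and it assumes the value has already passed the string type check.

```python
import re


def check_string(value, schema):
    """Return a list of messages for violated string constraints (hypothetical helper)."""
    problems = []
    if "minLength" in schema and len(value) < schema["minLength"]:
        problems.append(f"shorter than minLength {schema['minLength']}")
    if "maxLength" in schema and len(value) > schema["maxLength"]:
        problems.append(f"longer than maxLength {schema['maxLength']}")
    # re.search matches anywhere in the string, so ^ and $ anchors in the pattern matter.
    if "pattern" in schema and re.search(schema["pattern"], value) is None:
        problems.append("does not match pattern")
    if "enum" in schema and value not in schema["enum"]:
        problems.append(f"not one of {schema['enum']}")
    return problems
```

Returning a list instead of a boolean lets one value report several violations at once, such as a string that is both too short and outside the enum.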
Number Constraints
- minimum/maximum: Numeric range validation
- type: Distinguish between integer (whole numbers) and number (any numeric value, including floats)
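A numeric check has one Python-specific pitfall: `bool` is a subclass of `int`, so `True` would pass a naive `isinstance(value, int)` test. The sketch below shows one way to handle that; `check_number` is a hypothetical helper, not the project's code.

```python
def check_number(value, schema):
    """Return messages for violated numeric constraints (hypothetical helper)."""
    # bool is a subclass of int in Python, so True/False must be rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return [f"expected {schema.get('type', 'number')}, got {type(value).__name__}"]
    problems = []
    if schema.get("type") == "integer" and not isinstance(value, int):
        problems.append("expected integer, got float")
    if "minimum" in schema and value < schema["minimum"]:
        problems.append(f"below minimum {schema['minimum']}")
    if "maximum" in schema and value > schema["maximum"]:
        problems.append(f"above maximum {schema['maximum']}")
    return problems
```

With the age schema from the pre-loaded user schema, `check_number(200, {"type": "integer", "minimum": 0, "maximum": 150})` reports a single "above maximum" problem.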
Array Constraints
- minItems/maxItems: Array length validation
- items: Schema for array elements
- uniqueItems: Ensure array elements are unique
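The length and uniqueness keywords could be checked as follows. This is an illustrative sketch: `check_array` is a hypothetical name, per-element validation against `items` is left out, and comparing canonical JSON text is just one possible way to detect duplicates among unhashable elements.

```python
import json


def check_array(value, schema):
    """Return messages for violated array constraints (hypothetical helper)."""
    problems = []
    if "minItems" in schema and len(value) < schema["minItems"]:
        problems.append(f"fewer than minItems {schema['minItems']}")
    if "maxItems" in schema and len(value) > schema["maxItems"]:
        problems.append(f"more than maxItems {schema['maxItems']}")
    if schema.get("uniqueItems"):
        # Lists and dicts are unhashable, so compare canonical JSON text instead.
        seen = set()
        for index, item in enumerate(value):
            key = json.dumps(item, sort_keys=True)
            if key in seen:
                problems.append(f"duplicate item at index {index}")
            seen.add(key)
    return problems
```

Sorting keys before serializing means two objects with the same members in a different order count as duplicates.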
Object Constraints
- required: List of required properties
- properties: Schema for object properties
- additionalProperties: Control extra properties
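For objects, the required and additionalProperties keywords might be enforced as in the sketch below. `check_object` is a hypothetical helper. It assumes extra properties are allowed unless `additionalProperties` is explicitly `False`, which is the JSON Schema default but may differ from the project's behavior.

```python
def check_object(value, schema):
    """Return messages for violated object constraints (hypothetical helper)."""
    problems = []
    for key in schema.get("required", []):
        if key not in value:
            problems.append(f"missing required property '{key}'")
    # Only the boolean form of additionalProperties is handled in this sketch.
    if schema.get("additionalProperties", True) is False:
        declared = schema.get("properties", {})
        for key in value:
            if key not in declared:
                problems.append(f"unexpected property '{key}'")
    return problems
```

Validating the value of each declared property against its sub-schema would be a separate recursive step.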
Error Reporting
jsondatavalidator.py
# Example validation error
ValidationError(
    path="root.users[0].email",
    message="String does not match pattern",
    expected="email pattern",
    actual="invalid-email"
)
Batch Validation
jsondatavalidator.py
# Validate all JSON files in directory
results = validator.batch_validate("data/*.json", "user")
# Advanced pattern matching
results = validator.batch_validate("**/config.json", "config")
📊 Statistics and Reporting
Validation Statistics
jsondatavalidator.py
{
    'total_validations': 25,
    'successful_validations': 20,
    'failed_validations': 5,
    'success_rate': 80.0,
    'schema_usage': {
        'user': 15,
        'product': 10
    },
    'total_errors': 12,
    'loaded_schemas': ['user', 'product', 'config']
}
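A dictionary of this shape can be derived from a list of past validation results. The sketch below is an assumption about how that might be done, not the project's code: `summarize` is a hypothetical function, and each history entry is assumed to carry the `schema_name`, `is_valid`, and `error_count` fields that appear in the validation report.

```python
from collections import Counter


def summarize(history, loaded_schemas):
    """Build a statistics dict from history entries (assumed shape, see lead-in)."""
    total = len(history)
    successful = sum(1 for entry in history if entry["is_valid"])
    return {
        "total_validations": total,
        "successful_validations": successful,
        "failed_validations": total - successful,
        # Guard against division by zero before any validation has run.
        "success_rate": round(successful / total * 100, 1) if total else 0.0,
        "schema_usage": dict(Counter(entry["schema_name"] for entry in history)),
        "total_errors": sum(entry["error_count"] for entry in history),
        "loaded_schemas": list(loaded_schemas),
    }
```

The success rate is the share of valid results as a percentage, so 20 successes out of 25 validations gives 80.0.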
Validation Report
{
  "generated_at": "2024-01-01T12:00:00.000000",
  "total_validations": 25,
  "successful_validations": 20,
  "failed_validations": 5,
  "results": [
    {
      "timestamp": "2024-01-01T12:00:00.000000",
      "schema_name": "user",
      "is_valid": true,
      "error_count": 0,
      "errors": []
    }
  ]
}
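Producing a report in this format comes down to wrapping the validation history with a few totals and serializing it. The following sketch assumes the history is a list of JSON-serializable dicts shaped like the entries in "results". `build_report` and `export_report` are hypothetical names, not the project's API.

```python
import json
from datetime import datetime


def build_report(history):
    """Wrap history entries with totals and a generation timestamp."""
    successful = sum(1 for entry in history if entry["is_valid"])
    return {
        "generated_at": datetime.now().isoformat(),
        "total_validations": len(history),
        "successful_validations": successful,
        "failed_validations": len(history) - successful,
        "results": history,
    }


def export_report(history, file_path):
    """Write the report as indented JSON to file_path."""
    with open(file_path, "w", encoding="utf-8") as handle:
        json.dump(build_report(history), handle, indent=2)
```

Keeping the report construction separate from the file write makes the report easy to inspect or test without touching the disk.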
🛡️ Error Handling
- JSON Parsing: Graceful handling of malformed JSON
- File Operations: Safe file reading with error recovery
- Schema Validation: Comprehensive error checking and reporting
- Type Safety: Runtime type checking and validation
📚 Learning Objectives
- JSON Schema: Understanding JSON schema specification
- Data Validation: Implementing robust validation logic
- Error Handling: Comprehensive error reporting systems
- File Operations: Safe file handling and processing
- Pattern Matching: Working with glob patterns and regular expressions
🎯 Real-world Applications
API Development
- Validate API request/response data
- Ensure data consistency across services
- Generate API documentation from schemas
- Create test data for API testing
Configuration Management
- Validate application configuration files
- Ensure required settings are present
- Check configuration value constraints
- Generate default configuration templates
Data Processing
- Validate data imports and exports
- Check data quality before processing
- Ensure data format consistency
- Generate sample data for testing
🚀 Potential Enhancements
- JSON Schema Draft Support: Implement the full JSON Schema specification (Draft 7 or 2019-09)
- Custom Validators: Allow custom validation functions
- Web Interface: Create Flask web application
- Database Integration: Store schemas and validation results in database
- CLI Tool: Non-interactive command-line mode for scripting and automation
- Plugin System: Extensible validation framework
- Performance Optimization: Streaming validation for large files
- Schema Registry: Centralized schema management system
🔍 Schema Examples
Complex Nested Schema
jsondatavalidator.py
config_schema = {
    "type": "object",
    "required": ["app_name", "version", "database"],
    "properties": {
        "app_name": {"type": "string", "minLength": 1},
        "version": {"type": "string", "pattern": r"^\d+\.\d+\.\d+$"},
        "database": {
            "type": "object",
            "required": ["host", "port"],
            "properties": {
                "host": {"type": "string"},
                "port": {"type": "integer", "minimum": 1, "maximum": 65535},
                "ssl": {"type": "boolean"}
            }
        },
        "features": {
            "type": "array",
            "items": {"type": "string"},
            "uniqueItems": True
        }
    }
}
Sample Generated Data
{
  "app_name": "sample_string",
  "version": "1.0.0",
  "database": {
    "host": "sample_string",
    "port": 42,
    "ssl": true
  },
  "features": ["sample_string"]
}
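Sample data of this kind can be produced by walking the schema and emitting a placeholder for each type. The sketch below is a simplified illustration, not the project's generator: `generate_sample` is a hypothetical function that ignores `pattern` and `minItems`, so a pattern-constrained field such as version would not come out pattern-valid the way it does in the output above.

```python
def generate_sample(schema):
    """Return a placeholder value for the given schema (hypothetical sketch)."""
    if "enum" in schema:
        return schema["enum"][0]  # any listed value is valid; take the first
    kind = schema.get("type")
    if kind == "object":
        properties = schema.get("properties", {})
        return {name: generate_sample(sub) for name, sub in properties.items()}
    if kind == "array":
        return [generate_sample(schema.get("items", {}))]
    if kind == "string":
        # Pad or trim the placeholder to respect minLength/maxLength.
        text = "sample_string".ljust(schema.get("minLength", 0), "x")
        return text[: schema.get("maxLength", len(text))]
    if kind in ("integer", "number"):
        value = 42 if kind == "integer" else 42.0
        value = max(value, schema.get("minimum", value))
        return min(value, schema.get("maximum", value))
    if kind == "boolean":
        return True
    return None  # unknown or missing type
```

Feeding the generated value back through the validator is a quick way to check that a schema and its generator agree.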
🏆 Project Completion
This JSON Data Validator demonstrates:
- ✅ Complete JSON schema validation implementation
- ✅ Comprehensive error reporting and tracking
- ✅ Batch processing and file operations
- ✅ Data generation and schema analysis
- ✅ Interactive user interface
- ✅ Statistics and reporting capabilities
Perfect for beginners learning data validation concepts and intermediate developers working with JSON data processing and API development!