Baselinr

Open-source data quality and observability platform for SQL data warehouses

Automated Setup

Zero-touch configuration: Baselinr recommends tables and columns to monitor, suggests data quality checks, and can auto-apply configurations. Or configure everything manually for full control.

Comprehensive Monitoring

Profile your data, detect schema and statistical drift, identify anomalies, validate data quality rules, and perform root cause analysis, all in one platform.

Developer-First

Built for data engineers who want transparency and control. CLI-first with Python SDK, YAML/JSON configuration, and native integrations with dbt, Dagster, and Airflow.

Automated Profiling

Continuously profile your data warehouse with column-level metrics, distributions, and schema tracking. Intelligent table discovery reduces configuration overhead.
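
As a rough illustration of what column-level metrics can look like, the sketch below computes a handful of them with a single SQL query against SQLite, using only the Python standard library. It is not Baselinr's implementation or API: `profile_column`, `quote_identifier`, and the metric names are hypothetical, chosen for this example.

```python
import sqlite3


def quote_identifier(name: str) -> str:
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def profile_column(conn: sqlite3.Connection, table: str, column: str) -> dict:
    """Return basic column-level metrics (hypothetical metric names).

    Identifiers cannot be bound as query parameters, so they are quoted
    here; pass only trusted table and column names.
    """
    tbl = quote_identifier(table)
    col = quote_identifier(column)
    sql = (
        f"SELECT COUNT(*), COUNT(*) - COUNT({col}), COUNT(DISTINCT {col}), "
        f"MIN({col}), MAX({col}) FROM {tbl}"
    )
    row = conn.execute(sql).fetchone()
    names = ("row_count", "null_count", "distinct_count", "min", "max")
    return dict(zip(names, row))
```

Storing one such dictionary per column per run gives a history that later runs can be compared against.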

Drift Detection

Detect schema and statistical drift using multiple strategies with type-specific thresholds. Advanced statistical tests (Kolmogorov-Smirnov, Population Stability Index, chi-square) are available for rigorous detection.
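
To make one of these tests concrete, here is a minimal, standard-library sketch of the Population Stability Index (PSI) between a baseline sample and a current sample. It illustrates the general technique only and is not Baselinr's code; the function name `psi`, the equal-width binning, and the `eps` smoothing value are assumptions made for the example. Both samples are assumed to be non-empty.

```python
import math
from typing import List, Sequence


def psi(baseline: Sequence[float], current: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index using equal-width bins from the baseline."""
    lo, hi = min(baseline), max(baseline)
    # Fall back to a width of 1.0 when the baseline column is constant.
    width = (hi - lo) / bins or 1.0

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the baseline range into the edge bins.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Floor each fraction at eps so the logarithm is always defined.
        return [max(c / len(values), eps) for c in counts]

    expected = fractions(baseline)
    actual = fractions(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))
```

A common rule of thumb treats a PSI below 0.1 as stable and above 0.25 as significant drift, but suitable thresholds depend on the column and should be tuned.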

Anomaly Detection

Automatically detect outliers and seasonal anomalies using learned expectations with multiple detection methods (IQR, MAD, EWMA, trend/seasonality, regime shift).
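
Two of the named methods, IQR and MAD, are simple enough to sketch with the standard library. The functions below are generic textbook versions, not Baselinr's implementation; the names `iqr_outliers` and `mad_outliers` and the default cutoffs (1.5 and 3.5) are assumptions based on common convention. Each expects at least two data points.

```python
import statistics
from typing import List, Sequence


def iqr_outliers(values: Sequence[float], k: float = 1.5) -> List[float]:
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def mad_outliers(values: Sequence[float], threshold: float = 3.5) -> List[float]:
    """Return values whose modified z-score exceeds the threshold.

    The modified z-score is 0.6745 * (x - median) / MAD, where MAD is the
    median absolute deviation from the median.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        # More than half the values equal the median; the score is undefined.
        return []
    return [v for v in values if abs(0.6745 * (v - med) / mad) > threshold]
```

Both methods rely on medians and quartiles rather than the mean, so a single extreme value does not distort the limits used to detect it.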

Data Validation

Rule-based data quality validation with built-in validators for format, range, enum, null checks, uniqueness, and referential integrity. Custom validators supported.
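
The general shape of rule-based validation can be sketched as a set of small validator factories applied column by column. This is an illustrative design, not Baselinr's validator API: every name below (`not_null`, `in_range`, `matches`, `one_of`, `unique`, `validate`) is hypothetical. Each validator returns the indexes of failing rows; a referential integrity check can reuse `one_of` with the parent table's keys.

```python
import re
from collections import Counter
from typing import Any, Callable, Dict, Iterable, List, Tuple

Check = Callable[[List[Any]], List[int]]


def not_null() -> Check:
    return lambda values: [i for i, v in enumerate(values) if v is None]


def in_range(low: float, high: float) -> Check:
    return lambda values: [i for i, v in enumerate(values)
                           if v is not None and not (low <= v <= high)]


def matches(pattern: str) -> Check:
    regex = re.compile(pattern)
    return lambda values: [i for i, v in enumerate(values)
                           if v is not None and not regex.fullmatch(v)]


def one_of(allowed: Iterable[Any]) -> Check:
    allowed_set = set(allowed)
    return lambda values: [i for i, v in enumerate(values)
                           if v is not None and v not in allowed_set]


def unique() -> Check:
    def check(values: List[Any]) -> List[int]:
        counts = Counter(v for v in values if v is not None)
        return [i for i, v in enumerate(values)
                if v is not None and counts[v] > 1]
    return check


def validate(rows: List[dict],
             rules: List[Tuple[str, str, Check]]) -> Dict[Tuple[str, str], List[int]]:
    """Apply (column, rule_name, check) rules; map failures to row indexes."""
    failures = {}
    for column, name, check in rules:
        bad = check([row.get(column) for row in rows])
        if bad:
            failures[(column, name)] = bad
    return failures
```

Null values are skipped by every check except `not_null`, so a missing value is reported once rather than by every rule on the column.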

Root Cause Analysis

Automatically correlate anomalies with pipeline runs, code changes, and upstream data issues using temporal correlation, lineage analysis, and pattern matching.

Multi-Database Support

Works seamlessly with PostgreSQL, Snowflake, SQLite, MySQL, BigQuery, and Redshift. Unified API across all supported databases.

Production-Ready Features

Every feature is designed to meet the demands of production workloads with enterprise-grade capabilities.

Expectation Learning

Automatically learns expected metric ranges from historical profiling data, including control limits, distributions, and categorical frequencies for proactive anomaly detection.
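
One simple way to derive such expectations is sketched below: control limits as the historical mean plus or minus a multiple of the standard deviation, and categorical expectations as observed relative frequencies. This is a textbook approach shown for illustration, not necessarily how Baselinr learns expectations; the function names and the three-sigma default are assumptions.

```python
import statistics
from collections import Counter
from typing import Dict, Hashable, Sequence


def learn_control_limits(history: Sequence[float], sigmas: float = 3.0) -> Dict[str, float]:
    """Learn mean +/- sigmas * stdev limits from past metric values.

    Requires at least two historical values (sample standard deviation).
    """
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    return {"mean": mean,
            "lower": mean - sigmas * stdev,
            "upper": mean + sigmas * stdev}


def within_limits(value: float, limits: Dict[str, float]) -> bool:
    """Check whether a new metric value falls inside the learned limits."""
    return limits["lower"] <= value <= limits["upper"]


def learn_category_frequencies(values: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Learn the expected share of each category from historical values."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}
```

For example, limits learned from daily row counts can flag a new run whose count falls outside them before any downstream report is affected.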

Web Dashboard & AI Chat

Interactive web dashboard for visualizing profiling runs and drift detection. AI-powered chat interface for natural language data quality investigation.

CLI & API

Terminal:
    $ baselinr profile
    Profiling database...
    ✓ Complete
    $ baselinr drift --table users
    ✓ Drift detected

API:
    GET  /api/profiles    Get all profiling runs
    GET  /api/drift       Query drift events
    POST /api/events      Create alert events

CLI & Python SDK

Comprehensive command-line interface and powerful Python SDK for programmatic access. Perfect for automation, integration, and custom workflows.

Event & Alert Hooks

Pluggable event system for real-time alerts and notifications on drift, schema changes, anomalies, and profiling lifecycle events. Integrate with Slack, email, or custom systems.
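
The idea of a pluggable event system can be sketched as a small registry that maps event kinds to callback hooks. The code below is a generic pattern, not Baselinr's hook interface: `Event`, `EventBus`, `register`, `emit`, and the `"drift_detected"` kind used in the usage note are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Event:
    kind: str                      # hypothetical kind, e.g. "drift_detected"
    payload: dict = field(default_factory=dict)


Hook = Callable[[Event], None]


class EventBus:
    """Minimal registry that dispatches events to hooks by event kind."""

    def __init__(self) -> None:
        self._hooks: Dict[str, List[Hook]] = {}

    def register(self, kind: str, hook: Hook) -> None:
        self._hooks.setdefault(kind, []).append(hook)

    def emit(self, event: Event) -> int:
        """Call every hook registered for the event's kind; return the count."""
        hooks = self._hooks.get(event.kind, [])
        for hook in hooks:
            hook(event)
        return len(hooks)
```

A hook that posts to Slack or sends email would be registered the same way as any other callable, for instance `bus.register("drift_detected", notify)` where `notify` is a function you supply.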

Partition-Aware Profiling

Intelligent partition handling with strategies for latest, recent_n, or sample partitions. Optimize profiling for large partitioned datasets.
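
The three strategy names above suggest a selection step along the following lines. This is a sketch of the concept rather than Baselinr's code: the function `select_partitions` and its `n` and `seed` parameters are hypothetical, and partitions are assumed to be identifiers that sort chronologically (such as `2024-01`).

```python
import random
from typing import List, Optional, Sequence


def select_partitions(partitions: Sequence[str], strategy: str = "latest",
                      n: int = 3, seed: Optional[int] = None) -> List[str]:
    """Choose which partitions to profile.

    latest   -> only the newest partition
    recent_n -> the n newest partitions
    sample   -> n partitions drawn at random (reproducible with a seed)
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    ordered = sorted(partitions)
    if strategy == "latest":
        return ordered[-1:]
    if strategy == "recent_n":
        return ordered[-n:]
    if strategy == "sample":
        rng = random.Random(seed)
        return sorted(rng.sample(ordered, min(n, len(ordered))))
    raise ValueError(f"unknown strategy: {strategy!r}")
```

Profiling only the selected partitions keeps scan cost roughly constant as the table grows, at the price of not seeing changes in older partitions.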

Data Lineage

Multi-source lineage extraction from dbt, Dagster, SQL parsing, and query history. Visual lineage graphs with interactive exploration and drift impact analysis.

Use Cases

See how teams are using Baselinr to solve real-world data quality challenges.

Automated Data Quality Setup

Turn on comprehensive data quality monitoring with minimal effort. The system automatically recommends tables and columns, suggests checks, and can auto-apply configurations.

Illustration: schema change detection. Before: id: int, name: string, email: string. After: id: int, name: string, email: string, + phone: string.

Root Cause Investigation

When anomalies occur, automatically correlate with pipeline runs, code changes, and upstream data issues to identify root causes. AI-powered chat for interactive investigation.

Pipeline Integration

Integrate with Airflow, Dagster, and dbt to validate data quality in your pipelines. Fail builds when critical issues are detected. Native orchestration support.

Ready to Get Started?

Join developers building better data quality monitoring with Baselinr.