What is Cauzen?
An overview of the Cauzen causal analysis platform.
Cauzen is a causal analysis platform that guides you from raw data to causal answers in three steps. Upload a dataset, build a model of cause-and-effect relationships, then ask causal questions in natural language and get statistically grounded estimates.
How it works
Cauzen is organized as a three-phase workflow. Each phase builds on the previous one:
1. Data Wrangling
Upload a CSV file, configure metadata, and review automatic recommendations for columns that should not be modeled.
2. Causal Modeling
Build a causal graph that represents cause-and-effect relationships. Use statistical discovery with LLM refinement, draw edges manually, or combine both.
3. Causal Inference
Ask causal questions in natural language. Cauzen uses your data to translate each question into a compatible formal query, then returns estimates and AI-generated explanations.
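To make step 2 concrete, a causal graph can be pictured as a list of cause-to-effect edges that must not contain a cycle. The sketch below is not Cauzen's implementation or API. It uses Python's standard `graphlib` module, and the variable names (`severity`, `dosage`, `side_effects`) and the `causal_order` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def causal_order(edges):
    """Return the variables ordered cause-before-effect, or raise ValueError on a cycle."""
    # TopologicalSorter expects a mapping {node: set of predecessors}.
    parents = {}
    for cause, effect in edges:
        parents.setdefault(cause, set())
        parents.setdefault(effect, set()).add(cause)
    try:
        return list(TopologicalSorter(parents).static_order())
    except CycleError as exc:
        # The detected cycle is the second element of the exception args.
        raise ValueError(f"graph contains a cycle: {exc.args[1]}") from exc


# Hypothetical graph: severity influences both dosage and side effects.
edges = [
    ("severity", "dosage"),
    ("severity", "side_effects"),
    ("dosage", "side_effects"),
]
order = causal_order(edges)
print(order)
```

Adding an edge such as `("side_effects", "severity")` would close a loop, and the helper would raise `ValueError`, because a graph with a cycle is no longer a valid DAG.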
What makes Cauzen different
Most analysis tools answer correlation questions: "Is X related to Y?" Cauzen answers causal questions: "Does X cause Y, and by how much?"
This distinction matters whenever you want to make decisions, not just observations. For example:
- "If we reduce drug dosage, what happens to side effects?" (not just: are dosage and side effects correlated?)
- "If we change this policy, how does the outcome change?" (not just: are the policy and the outcome associated?)
Cauzen uses the language of directed acyclic graphs (DAGs) and do-calculus to reason about these questions rigorously.
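The gap between a correlational and a causal answer can be illustrated with a small worked example. The sketch below uses made-up patient counts and assumes a graph in which `severity` is the only common cause of dosage and side effects. Under that assumption, the standard backdoor adjustment averages the within-severity differences. This illustrates the general idea, not the estimator Cauzen runs, and all names and numbers are hypothetical.

```python
# Hypothetical counts: (severity, dosage_high, side_effect) -> number of patients.
counts = {
    (0, 0, 0): 360, (0, 0, 1): 40,
    (0, 1, 0): 70,  (0, 1, 1): 30,
    (1, 0, 0): 40,  (1, 0, 1): 60,
    (1, 1, 0): 80,  (1, 1, 1): 320,
}


def p_side_effect(x, z=None):
    """P(side_effect=1 | dosage_high=x), optionally also conditioning on severity=z."""
    rows = [(k, n) for k, n in counts.items()
            if k[1] == x and (z is None or k[0] == z)]
    total = sum(n for _, n in rows)
    return sum(n for k, n in rows if k[2] == 1) / total


def p_severity(z):
    """Marginal P(severity=z)."""
    return sum(n for k, n in counts.items() if k[0] == z) / sum(counts.values())


# Correlational answer: compare high-dosage and low-dosage patients directly.
naive = p_side_effect(1) - p_side_effect(0)

# Causal answer under the assumed graph: adjust for severity (backdoor adjustment).
adjusted = sum(
    p_severity(z) * (p_side_effect(1, z) - p_side_effect(0, z)) for z in (0, 1)
)

print(f"naive difference:    {naive:.2f}")
print(f"adjusted difference: {adjusted:.2f}")
```

In these made-up counts, sicker patients receive the high dosage more often and also have more side effects, so the naive difference (about 0.5) overstates the adjusted effect (about 0.2). The adjustment is only as trustworthy as the graph behind it, which is why the modeling step comes before inference.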
Recent capabilities
Cauzen now includes several safeguards to keep analyses closer to the data you uploaded:
- Data Wrangling recommends excluding identifier-like, constant, duplicate, sequential-date, and linearly dependent columns before modeling.
- Causal Modeling validates the kept columns before discovery and can apply recommended exclusions when the encoded data matrix would make discovery fail.
- Discover streams progress from the backend, shows a statistical partial graph when available, then applies LLM refinement and displays the refinement reasoning.
- Causal Inference sends the current dataset with natural-language questions so interpreted values match the observed variable types, ranges, and categorical shapes.
- Result explanations and AI interpretations support formatted markdown, and identifiable-but-not-estimable effects display their symbolic estimand instead of failing.
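As a rough illustration of the column recommendations in the first item above, the sketch below flags constant, duplicate, and identifier-like columns in a small CSV. The heuristics, the `screen_columns` helper, and the sample data are assumptions made for this example, not Cauzen's actual rules. The sequential-date and linear-dependence checks are left out.

```python
import csv
import io

# Hypothetical sample data.
SAMPLE = """patient_id,site,dosage_mg,dosage_copy,side_effects
P-001,A,10.0,10.0,0
P-002,A,20.0,20.0,1
P-003,A,10.0,10.0,0
P-004,A,20.0,20.0,1
"""


def is_decimal(value):
    """True for text such as "12.5" that parses as a float but not as an int."""
    try:
        int(value)
        return False
    except ValueError:
        pass
    try:
        float(value)
        return True
    except ValueError:
        return False


def screen_columns(rows):
    """Return {column: reason} for columns that look unsuitable for modeling."""
    flagged = {}
    first_seen = {}  # column values -> name of the first column holding them
    for name in rows[0]:
        values = tuple(row[name] for row in rows)
        distinct = len(set(values))
        if distinct == 1:
            flagged[name] = "constant"
        elif values in first_seen:
            flagged[name] = f"duplicate of {first_seen[values]}"
        elif distinct == len(values) and not any(is_decimal(v) for v in values):
            # Assumed heuristic: all-distinct text or integer keys look like identifiers.
            flagged[name] = "identifier-like"
        first_seen.setdefault(values, name)
    return flagged


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
recommendations = screen_columns(rows)
for column, reason in recommendations.items():
    print(f"{column}: {reason}")
```

A heuristic this simple will misfire on small files, for example by flagging an integer measurement whose values all happen to differ, so its output is best treated as a list of columns to review, not an automatic filter.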
Before you begin
To use Cauzen, you need a CSV file containing tabular data: rows of observations and columns of variables. The columns should represent quantities that plausibly have cause-and-effect relationships between them.
Cauzen works best when:
- Your dataset has at least a few hundred rows
- Your columns are numeric or categorical (not free text)
- You have some domain knowledge about which variables might cause which
Ready to get started? Head to Getting Started.