Cauzen Docs

What is Cauzen?

An overview of the Cauzen causal analysis platform.

Cauzen is a causal analysis platform that guides you from raw data to causal answers in three steps. Upload a dataset, build a model of cause-and-effect relationships, then ask causal questions in natural language and get statistically grounded estimates.

How it works

Cauzen is organized as a three-phase workflow: Data Wrangling, Causal Modeling, and Causal Inference. Each phase builds on the previous one.

What makes Cauzen different

Most analysis tools answer correlation questions: "Is X related to Y?" Cauzen answers causal questions: "Does X cause Y, and by how much?"

This distinction matters whenever you want to make decisions, not just observations. For example:

  • "If we reduce drug dosage, what happens to side effects?" (not just: are dosage and side effects correlated?)
  • "If we change this policy, how does the outcome change?" (not just: are policy and outcome associated?)

Cauzen uses the language of directed acyclic graphs (DAGs) and do-calculus to reason about these questions rigorously.
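To make the gap between correlation and causation concrete, here is a minimal Python sketch on simulated data. It is not Cauzen's implementation. The variable names, probabilities, and effect sizes are invented for illustration. A confounder Z influences both the treatment X and the outcome Y, so a naive comparison of treated and untreated rows overstates the effect of X. Averaging the within-stratum differences over Z (the backdoor adjustment for the graph Z → X, Z → Y, X → Y) should land close to the simulated effect of 1.0.

```python
import random

def simulate(n=20000, seed=0):
    """Simulate rows (z, x, y) from a hypothetical graph: Z -> X, Z -> Y, X -> Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        # Z makes treatment more likely, which is what creates the confounding.
        x = 1 if rng.random() < (0.8 if z else 0.2) else 0
        # Simulated causal effect of X on Y is 1.0; Z adds 3.0 on its own.
        y = 1.0 * x + 3.0 * z + rng.gauss(0, 1)
        rows.append((z, x, y))
    return rows

def mean(values):
    return sum(values) / len(values)

def naive_difference(rows):
    """Correlational answer: mean outcome of treated minus untreated rows."""
    treated = [y for _, x, y in rows if x == 1]
    control = [y for _, x, y in rows if x == 0]
    return mean(treated) - mean(control)

def backdoor_adjusted(rows):
    """Causal answer: within-stratum differences, weighted by P(Z = z)."""
    effect = 0.0
    for z_value in (0, 1):
        stratum = [row for row in rows if row[0] == z_value]
        treated = [y for _, x, y in stratum if x == 1]
        control = [y for _, x, y in stratum if x == 0]
        effect += (mean(treated) - mean(control)) * len(stratum) / len(rows)
    return effect

rows = simulate()
naive = naive_difference(rows)
adjusted = backdoor_adjusted(rows)
print(f"naive difference:  {naive:.2f}")
print(f"backdoor adjusted: {adjusted:.2f}")
```

With these simulated parameters the naive difference is expected to sit near 2.8, because treated rows are also more likely to have Z = 1, while the adjusted estimate stays near 1.0. The adjustment is only valid because the graph is known here; on real data that graph is exactly what has to be modeled first.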

Recent capabilities

Cauzen now includes several safeguards to keep analyses closer to the data you uploaded:

  • Data Wrangling recommends excluding identifier-like, constant, duplicate, sequential-date, and linearly dependent columns before modeling.
  • Causal Modeling validates the kept columns before discovery and can apply recommended exclusions when the encoded data matrix would make discovery fail.
  • Discover streams progress from the backend, shows a statistical partial graph when available, then applies LLM refinement and displays the refinement reasoning.
  • Causal Inference sends the current dataset with natural-language questions so interpreted values match the observed variable types, ranges, and categorical shapes.
  • Result explanations and AI interpretations support formatted markdown, and identifiable-but-not-estimable effects display their symbolic estimand instead of failing.
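As a rough illustration of the kind of column screening described in the first item above, the following Python sketch flags constant, duplicate, and identifier-like columns. It is an assumption-laden example, not Cauzen's actual logic: the function name `screen_columns`, the heuristics, and the sample data are all hypothetical, and the sequential-date and linear-dependence checks are left out.

```python
def screen_columns(columns):
    """Return {column_name: reason} for columns a screening pass might flag.

    `columns` maps a column name to its list of values. The heuristics are
    illustrative only and are not taken from Cauzen.
    """
    flagged = {}
    first_seen = {}  # tuple of values -> name of the first column holding them
    for name, values in columns.items():
        distinct = set(values)
        key = tuple(values)
        if len(distinct) <= 1:
            flagged[name] = "constant"
        elif key in first_seen:
            flagged[name] = f"duplicate of {first_seen[key]}"
        elif len(distinct) == len(values) and not any(
            isinstance(v, float) for v in values
        ):
            # Every value is unique and none is a float: looks like an ID.
            flagged[name] = "identifier-like"
        first_seen.setdefault(key, name)
    return flagged

# Hypothetical sample data.
columns = {
    "patient_id": ["p1", "p2", "p3", "p4"],
    "site": ["A", "A", "A", "A"],
    "dose_mg": [10.0, 20.0, 10.0, 40.0],
    "dose_copy": [10.0, 20.0, 10.0, 40.0],
    "side_effects": [0, 1, 0, 1],
}
flagged = screen_columns(columns)
for name, reason in flagged.items():
    print(f"{name}: {reason}")
```

Columns like these carry no usable causal signal (a constant cannot vary with anything, and a duplicate adds no information), which is one plausible reason to drop them before discovery.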

Before you begin

To use Cauzen you need a CSV file containing tabular data: rows of observations and columns of variables. The columns should represent quantities that plausibly have cause-and-effect relationships between them.

Cauzen works best when:

  • Your dataset has at least a few hundred rows
  • Your columns are numeric or categorical (not free text)
  • You have some domain knowledge about which variables might cause which
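If you want to check a file against the first two points before uploading, a small standard-library Python sketch like the one below may help. It is not part of Cauzen. The function name `precheck_csv`, the 200-row threshold (an arbitrary stand-in for "a few hundred"), and the rule for guessing free text are assumptions made for this example.

```python
import csv
import io

def is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False

def precheck_csv(text, min_rows=200, max_category_share=0.5):
    """Report the row count and a rough type guess for each column.

    A non-numeric column counts as categorical when its distinct values make
    up at most `max_category_share` of its rows; otherwise it is reported as
    possible free text. Both thresholds are illustrative assumptions.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    report = {
        "n_rows": len(rows),
        "enough_rows": len(rows) >= min_rows,
        "columns": {},
    }
    if not rows:
        return report
    for name in rows[0]:
        # Ignore missing cells when guessing the column type.
        values = [row[name] for row in rows if row[name] not in ("", None)]
        if not values:
            kind = "empty"
        elif all(is_number(v) for v in values):
            kind = "numeric"
        elif len(set(values)) <= max_category_share * len(values):
            kind = "categorical"
        else:
            kind = "possible free text"
        report["columns"][name] = kind
    return report

# Hypothetical 300-row sample built in memory; replace with your file's text.
sample = "dose,group,note\n" + "".join(
    f"{i % 5 * 10},{'AB'[i % 2]},patient said item {i}\n" for i in range(300)
)
report = precheck_csv(sample)
print(report)
```

To run it on a real file, read the file's contents into a string and pass that string to `precheck_csv`. The third point, domain knowledge about which variables might cause which, cannot be checked automatically.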

Ready to get started? Head to Getting Started.