Cauzen Docs

Causal Inference

Ask causal questions and get statistically grounded estimates.

The Causal Inference phase is where you ask questions about cause and effect and get quantitative answers. You can ask questions in natural language, or build queries directly using the Query Builder.

What is a causal query?

A causal query asks: "What would happen to an outcome if we intervened on one or more variables?"

This is different from a correlational query, which only describes how variables move together in the observed data. Cauzen expresses the intervention formally using the do-operator notation from do-calculus:

P(outcome = value | do(treatment = value))

This reads as: "What is the probability that [outcome] equals [value], given that we set [treatment] to [value]?"

You don't need to understand the notation to use Cauzen — the Query Builder handles it visually, and you can also just type a natural language question.
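
For readers who do want to see what the notation distinguishes, the sketch below contrasts conditioning with intervening on a toy model. The variables X, Y, Z and all probabilities are hypothetical and are not taken from Cauzen; Z is assumed to be an observed confounder of X and Y, so the interventional quantity can be obtained by adjusting for Z.

```python
# Hypothetical toy model (not from Cauzen): Z confounds treatment X and outcome Y.
P_Z = {0: 0.5, 1: 0.5}                      # P(Z = z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}             # P(X = 1 | Z = z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (1, 0): 0.3,  # P(Y = 1 | X = x, Z = z)
                 (0, 1): 0.5, (1, 1): 0.7}

def p_y1_given_x(x):
    """Observational P(Y=1 | X=x): conditioning on X also shifts the mix of Z."""
    num = den = 0.0
    for z, p_z in P_Z.items():
        p_x = P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]
        num += p_z * p_x * P_Y1_GIVEN_XZ[(x, z)]
        den += p_z * p_x
    return num / den

def p_y1_do_x(x):
    """Interventional P(Y=1 | do(X=x)): adjust for Z, keeping its original mix."""
    return sum(p_z * P_Y1_GIVEN_XZ[(x, z)] for z, p_z in P_Z.items())

print(round(p_y1_given_x(1), 2))  # about 0.62
print(round(p_y1_do_x(1), 2))     # about 0.5
```

In this toy model the observed association overstates the effect of setting X to 1, because units with X = 1 are more likely to have Z = 1.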

Asking a question in natural language

Type your question in the input field at the top of the page and click Interpret. For example:

What would be the probability of a patient being readmitted if we increased the medication dosage?

Cauzen sends the question to the backend together with the current dataset, causal graph, and metadata. This lets the backend interpret the question using the values and variable types actually present in your data. Cauzen then normalizes the structured result before filling in the Query Builder.

That normalization prevents common impossible query values. For example, if the dataset stores education-num as numbers from 0 to 14, Cauzen will not pre-fill a treatment value like high/low or render a comparison operator like >10 in a scalar value field. It will use a compatible scalar value when one can be determined, or leave the value empty for you to choose.
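
The kind of check described above can be sketched as follows. This is an illustration only, not Cauzen's implementation: the function name `normalize_value` and its rules are assumptions, and the real normalization may handle more cases.

```python
import re

# A plain scalar number, e.g. "12" or "-3.5". Comparison expressions do not match.
NUMBER = re.compile(r"[+-]?\d+(\.\d+)?")

def normalize_value(raw, observed_values):
    """Return a scalar compatible with the column, or None to leave the field empty.

    Hypothetical helper: the name and the rules are assumptions for illustration.
    """
    text = str(raw).strip()
    if all(isinstance(v, (int, float)) for v in observed_values):
        # Numeric column: reject labels ("high") and comparisons (">10").
        if NUMBER.fullmatch(text) is None:
            return None
        number = float(text)
        return int(number) if number.is_integer() else number
    # Non-numeric column: accept only values that actually occur in the data.
    return text if text in {str(v) for v in observed_values} else None

education_num = list(range(0, 15))  # numbers from 0 to 14, as in the example above
print(normalize_value("high", education_num))  # None
print(normalize_value(">10", education_num))   # None
print(normalize_value("12", education_num))    # 12
```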

AI-prefilled queries are still drafts. Review the outcome, treatment values, conditions, and estimand type before estimating.

Building a query manually

The Query Builder card lets you construct a causal query step by step.

Variable dropdowns include only columns that are currently kept in Data Wrangling. If a column is excluded, it will not appear as an outcome, treatment, or condition variable.

Estimand type

Choose what you want to estimate:

| Type | Symbol | Use when... |
| --- | --- | --- |
| Probability | P | Your outcome is a specific value (e.g., "probability of readmission = 1") |
| Expected Value | E | Your outcome is a numeric quantity (e.g., "expected blood pressure") |

Outcome variable

Select the variable whose value you want to predict. In Probability mode, also enter the specific value you're interested in.

Treatments

Treatments are the variables you are intervening on — the "levers" you want to pull. Click + to add a treatment, select a variable from the dropdown, and enter the value you want to set it to.

For categorical columns, enter the observed category label, such as White or Asian-Pac-Islander, rather than a numeric encoding. Matching is case-insensitive, and Cauzen will display the observed spelling before running the estimate.
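
Case-insensitive matching of a typed label against the observed categories could look like the sketch below. The helper name `resolve_category` is hypothetical, and trimming surrounding whitespace is an added assumption rather than documented behavior.

```python
def resolve_category(entered, observed_labels):
    """Return the observed spelling that matches `entered`, ignoring case.

    Hypothetical helper for illustration. Returns None when nothing matches,
    for example when a numeric encoding is typed instead of a label.
    """
    wanted = entered.strip().casefold()
    for label in observed_labels:
        if label.casefold() == wanted:
            return label
    return None

labels = ["White", "Asian-Pac-Islander"]
print(resolve_category("asian-pac-islander", labels))  # Asian-Pac-Islander
print(resolve_category("1", labels))                   # None
```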

You can add multiple treatments to model simultaneous interventions.

Each treatment variable can appear only once in the do() block. If an AI interpretation returns conflicting assignments for the same treatment variable, Cauzen keeps a single compatible assignment in the builder.
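
One way to collapse conflicting assignments is sketched below. Which assignment Cauzen actually keeps is not specified here; the sketch assumes the first compatible value wins and that a variable with no compatible value is kept with an empty value. The function name, the `dosage` and `exercise` variables, and the compatibility check are all hypothetical.

```python
def dedupe_treatments(assignments, is_compatible):
    """Keep one assignment per treatment variable.

    assignments: list of (variable, value) pairs in the order they were returned.
    is_compatible: callable (variable, value) -> bool.
    Assumption: the first compatible value wins; None means "left empty".
    """
    kept = {}
    for variable, value in assignments:
        ok = is_compatible(variable, value)
        if variable not in kept:
            kept[variable] = value if ok else None
        elif kept[variable] is None and ok:
            kept[variable] = value
    return kept

def is_number(_variable, value):
    # Stand-in compatibility check for numeric treatment columns.
    return isinstance(value, (int, float))

raw = [("dosage", "high"), ("dosage", 20), ("exercise", 1)]
print(dedupe_treatments(raw, is_number))  # {'dosage': 20, 'exercise': 1}
```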

Conditions

Conditions let you restrict the query to a subpopulation using scalar values. For example, if your data contains an age_group column, you might add a condition of age_group = 65_plus. The current Query Builder value fields are scalar fields, so write values rather than comparison expressions.

Click + to add a condition. Conditions are optional.

Running the query

Once you've selected an outcome and at least one treatment, click Estimate. Cauzen sends the query and your causal graph to the inference engine and returns results.
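
To make the pieces of a query concrete, the sketch below models the Query Builder fields as a small data structure and renders them in the notation introduced earlier. The class, field names, and the exact rendering (including where conditions appear) are assumptions for illustration, not Cauzen's internal format.

```python
from dataclasses import dataclass, field

@dataclass
class CausalQuery:
    """Hypothetical model of the Query Builder fields."""
    estimand: str                                   # "P" (Probability) or "E" (Expected Value)
    outcome: str
    outcome_value: object = None                    # only used in Probability mode
    treatments: dict = field(default_factory=dict)  # variable -> value set inside do()
    conditions: dict = field(default_factory=dict)  # variable -> scalar value

    def can_estimate(self):
        # Mirrors the rule above: an outcome and at least one treatment.
        return bool(self.outcome) and bool(self.treatments)

    def expression(self):
        target = self.outcome
        if self.estimand == "P":
            target = f"{self.outcome} = {self.outcome_value}"
        do_part = ", ".join(f"{k} = {v}" for k, v in self.treatments.items())
        given = [f"do({do_part})"]
        given += [f"{k} = {v}" for k, v in self.conditions.items()]
        return f"{self.estimand}({target} | {', '.join(given)})"

query = CausalQuery("P", "readmitted", 1, {"dosage": 20}, {"age_group": "65_plus"})
print(query.can_estimate())  # True
print(query.expression())    # P(readmitted = 1 | do(dosage = 20), age_group = 65_plus)
```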

Interpreting results

Each query produces a result card in the history below the Query Builder. Results are shown newest first.

The estimate

If the causal effect is identifiable from your graph, you'll see:

  • Causal Effect Estimate: the point estimate (e.g., 0.73 for a probability, or 142.5 for an expected value)
  • Confidence Interval: the range of values consistent with the data at the stated confidence level (e.g., [0.68, 0.78])
  • Confidence Level: the coverage level used to construct the interval (e.g., 95%)
  • Estimation Method: the statistical method used to compute the estimate
  • P-Value: the p-value reported for the estimate; smaller values indicate stronger evidence against the null hypothesis being tested
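
As a rough guide to how a point estimate, confidence level, and interval relate, the sketch below builds a normal-approximation interval. This is an assumption for illustration: the estimation method Cauzen reports may construct its interval differently (for example, by bootstrapping), and the standard error of 0.0255 is a made-up value chosen to reproduce the example interval above.

```python
from statistics import NormalDist

def normal_interval(estimate, std_error, level=0.95):
    """Two-sided normal-approximation interval: estimate +/- z * std_error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    return estimate - z * std_error, estimate + z * std_error

# Hypothetical standard error; only the estimate 0.73 comes from the example above.
low, high = normal_interval(0.73, 0.0255)
print(round(low, 2), round(high, 2))  # 0.68 0.78
```

A higher confidence level gives a wider interval for the same estimate and standard error.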

The AI explanation

Below the numerical results, Cauzen generates a plain-language explanation of what the estimate means in the context of your question and data. The explanation appears after a short delay while it is generated. Explanations and AI interpretations can include markdown formatting such as bold or italic emphasis and lists.

If explanation generation fails, the numeric estimate stays visible and Cauzen shows a notification instead of treating the whole result as failed.

Not identifiable

Sometimes the causal effect cannot be computed from the available graph. This happens when the graph structure does not allow the effect of the treatment to be separated from other sources of association, such as unobserved confounding. If this occurs, Cauzen will tell you the effect is not identifiable and display the query expression.

To address this, review your causal graph in the Causal Modeling phase — you may be missing edges that encode relevant confounding relationships.

Identifiable but not estimable

Some effects are identifiable from the graph but cannot yet be estimated automatically. In that case, Cauzen shows the query expression, explains that automated estimation is not yet supported, and renders the symbolic estimand expression. It does not request an AI explanation because there is no numeric estimate to explain.

Result states

Each result card moves through the following states:

| State | Description |
| --- | --- |
| Interpreting | Cauzen is translating your natural language question |
| Building | Waiting for you to review and confirm the query |
| Estimating | Running the causal inference calculation |
| Done | Results are ready |
| Error | Something went wrong; the error message will explain what |

If an error happens after a query has already been built, Cauzen keeps the Query Builder controls in the result card so you can adjust the query and try estimating again.
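
The states above can be read as a small state machine. The sketch below is one interpretation: the state names come from the table, but the allowed transitions (including retrying from Error back to Estimating) are inferred from the descriptions and are not a documented specification.

```python
from enum import Enum

class ResultState(Enum):
    INTERPRETING = "Interpreting"
    BUILDING = "Building"
    ESTIMATING = "Estimating"
    DONE = "Done"
    ERROR = "Error"

# Assumed transitions, inferred from the state descriptions above.
ALLOWED = {
    ResultState.INTERPRETING: {ResultState.BUILDING, ResultState.ERROR},
    ResultState.BUILDING: {ResultState.ESTIMATING, ResultState.ERROR},
    ResultState.ESTIMATING: {ResultState.DONE, ResultState.ERROR},
    ResultState.DONE: set(),
    # After an error on a built query, the query can be adjusted and re-estimated.
    ResultState.ERROR: {ResultState.ESTIMATING},
}

def advance(current, target):
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target

state = ResultState.BUILDING
state = advance(state, ResultState.ESTIMATING)
state = advance(state, ResultState.ERROR)
state = advance(state, ResultState.ESTIMATING)  # retry after adjusting the query
print(state.value)  # Estimating
```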

Query history

All queries from your current session are saved in the history below the Query Builder. You can scroll back through previous results at any time.

Causal inference results are only as reliable as your causal model. If your graph is missing important confounders or contains incorrect edges, the estimates may be misleading. Always validate results against domain knowledge.