MODEL: openai OpenAI: o3 Deep Research

Provider: openai

Source: full

modelKey: openai/o3-deep-research-2025-06-26

Overall

74.0 / 100

Updated Jan 30

A) Absolute Metrics (must exist for all models)

Model key	openai/o3-deep-research-2025-06-26
Display name	OpenAI: o3 Deep Research
Provider	openai
Canonical slug	Missing (status: missing; missing:reasons; note:missing_field:canonicalSlug)
Context length	200000
Max output tokens	Missing (status: missing; missing:reasons; note:missing_field:maxOutputTokens)
Pricing input per 1M	0.00001
Pricing output per 1M	0.00004
Modalities	text+image+file->text
Supports tools	Missing (status: missing; missing:reasons; note:missing_field:supportsTools)
Supports JSON	Missing (status: missing; missing:reasons; note:missing_field:supportsJson)
Release date	2025-10-10T20:54:21.000Z
Training cutoff	Missing (status: missing; missing:reasons; note:missing_field:trainingCutoff)

Adoption & Decision

adopted

Source: openrouter

Reasons:

meets required fields

References: None

Status

adopted

Reasons (from decisions.json):

No published decision record.

Score Summary

Overall: 74 / 100

Overall = Σ(weight × score)

Category Totals

Category	Weight	Score	Contrib
C1 Performance	0.40	0	0.00
C2 Spec	0.20	0	0.00
C3 Cost	0.12	0	0.00
C4 Speed	0.10	0	0.00
C5 Reliability	0.08	0	0.00
C6 Features	0.06	0	0.00
C7 Transparency	0.04	0	0.00
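
The formula Overall = Σ(weight × score) can be checked against the Category Totals rows with a short sketch. The weights and scores are copied from the table; the names `CATEGORY_ROWS` and `overall_score` are illustrative and are not part of the scoring implementation. As listed, every row carries a score of 0, so the rows sum to 0.00 rather than the displayed 74, which suggests the table is not populated from the same inputs as the headline number.

```python
# Rows copied from the Category Totals table: (category, weight, score).
CATEGORY_ROWS = [
    ("C1 Performance", 0.40, 0),
    ("C2 Spec", 0.20, 0),
    ("C3 Cost", 0.12, 0),
    ("C4 Speed", 0.10, 0),
    ("C5 Reliability", 0.08, 0),
    ("C6 Features", 0.06, 0),
    ("C7 Transparency", 0.04, 0),
]


def overall_score(rows):
    """Return the sum of weight * score over all category rows."""
    return sum(weight * score for _, weight, score in rows)


# The weights should sum to 1 if the result is meant to stay on a 0-100 scale.
total_weight = sum(weight for _, weight, _ in CATEGORY_ROWS)

print(f"total weight = {total_weight:.2f}")
print(f"overall = {overall_score(CATEGORY_ROWS):.2f}")
```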

Category Scores (0–100)

performance	73
safety	80
adoption	66
openness	58
cost	100

Top drivers

Missing item evidence: score withheld (missing evidence (official-page only)).
Score verified from available inputs and evidence.
Evidence links are missing, so the score is withheld until sources are provided.
Missing item evidence: score withheld (missing inputs, evidence (official-page only)).
missing:reasons — Evidence indicates missing/failed; penalty applied per policy.

C) Evidence (4 tiles/cards; reasons required if not ok)

Evidence Quality: 0/100

Based on evidence availability only. Does not affect the score.

official_page not_found +0.00
dev_activity not_found +0.00
paper not_found +0.00
audit not_found +0.00
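
Assuming the Evidence Quality figure is simply the sum of the four per-tile contributions listed above (an assumption; the page does not state the rule), the 0/100 result can be reproduced as follows. The dictionary and function names are hypothetical.

```python
# Per-tile entries copied from the list above: type -> (status, points).
EVIDENCE_TILES = {
    "official_page": ("not_found", 0.00),
    "dev_activity": ("not_found", 0.00),
    "paper": ("not_found", 0.00),
    "audit": ("not_found", 0.00),
}


def evidence_quality(tiles):
    """Sum the tile contributions and clamp to the 0-100 range (assumed rule)."""
    total = sum(points for _, points in tiles.values())
    return max(0.0, min(100.0, total))


# Tiles whose status is not "ok" are the ones that must carry reason codes.
missing = [name for name, (status, _) in EVIDENCE_TILES.items() if status != "ok"]

print(f"Evidence Quality: {evidence_quality(EVIDENCE_TILES):.0f}/100")
print("tiles needing reasons:", ", ".join(missing))
```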

Official Page

type: official_page

⚠️ not_found

No reliable source was found.

url: No link provided.

SPEC VIOLATION: Spec missing evidence: evidence URLs are required for this score.

reasons (1)

codes:

missing:type:official_page

How this affected scoring

No scoring items rely on official-page evidence.

extracted

No extracted data.

Dev Activity

type: dev_activity

⚠️ not_found

No reliable source was found.

url: No link provided.

SPEC VIOLATION: Spec missing evidence: evidence URLs are required for this score.

reasons (1)

codes:

missing:type:dev_activity

How this affected scoring

No scoring items rely on dev-activity evidence.

extracted

No extracted data.

Paper

type: paper

⚠️ not_found

No reliable source was found.

url: No link provided.

SPEC VIOLATION: Spec missing evidence: evidence URLs are required for this score.

reasons (1)

codes:

missing:type:paper

How this affected scoring

No scoring items rely on paper evidence.

extracted

No extracted data.

Independent third-party security audit

type: audit

⚠️ not_found

No reliable source was found.

url: No link provided.

SPEC VIOLATION: Spec missing evidence: evidence URLs are required for this score.

reasons (1)

codes:

missing:type:audit

How this affected scoring

No scoring items rely on audit evidence.

extracted

No extracted data.

No evidence rule configured for this item (spec config missing).

Raw Inputs

Manual (curated)

Status: missing

Missing / invalid reasons (1)

manual: missing:manual_map

From OpenRouter

Status: ok

Key	Value
context_length	200000
pricing_input_per_1m	0.00001
pricing_output_per_1m	0.00004
pricing_currency	USD
modality	text+image+file->text
release_date	2025-10-10T20:54:21.000Z
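
The pricing values are small enough that they look like per-token prices rather than per-1M-token prices, even though the keys end in `_per_1m`. That reading is an assumption; the page does not confirm it. Under that assumption, a minimal sketch of the conversion and of a request cost estimate follows. The helper names and the token counts in the usage lines are hypothetical.

```python
from decimal import Decimal

# Values copied from the OpenRouter raw inputs.
# ASSUMPTION: they are USD per single token, not per 1M tokens.
PRICE_INPUT = Decimal("0.00001")
PRICE_OUTPUT = Decimal("0.00004")
TOKENS_PER_MILLION = 1_000_000


def per_million(price_per_token):
    """Convert a per-token price into a per-1M-token price."""
    return price_per_token * TOKENS_PER_MILLION


def request_cost(input_tokens, output_tokens):
    """Estimate the USD cost of one request from its token counts."""
    return PRICE_INPUT * input_tokens + PRICE_OUTPUT * output_tokens


print(f"input per 1M:  {per_million(PRICE_INPUT):.2f} USD")
print(f"output per 1M: {per_million(PRICE_OUTPUT):.2f} USD")
# Hypothetical request: a full 200000-token context and 1000 output tokens.
print(f"example request: {request_cost(200_000, 1_000):.2f} USD")
```

Decimal is used instead of float so that prices this small are not distorted by binary rounding.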

From Hugging Face

Status: missing

missing_raw_inputs:huggingface

From GitHub

Status: missing

missing_raw_inputs:github

From arXiv

Status: missing

missing_raw_inputs:arxiv

Ops (speed/reliability)

Status: missing

missing_raw_inputs:ops

Score formulas (display only)

Overall score

Overall = weighted_sum(category_scores)

Overall is a weighted sum of category scores.
This is documentation; the UI does not recompute scores.
Exact weights are defined in the scoring implementation/spec; this panel avoids hardcoding coefficients unless they are guaranteed to match.

Category formulas

Performance

performance

Performance = weighted_sum(performance_submetrics)

Each category score is computed from its sub-metrics per the scoring spec.
Uses available benchmark inputs; missing inputs/evidence may withhold item scores per policy.

Safety

safety

Safety = weighted_sum(safety_submetrics)

Each category score is computed from its sub-metrics per the scoring spec.
Evidence requirements may cap or withhold safety-related items.

Adoption

adoption

Adoption = weighted_sum(adoption_submetrics)

Each category score is computed from its sub-metrics per the scoring spec.
Based on provider/source signals; missing data is shown explicitly in Raw Inputs.

Openness

openness

Openness = weighted_sum(openness_submetrics)

Each category score is computed from its sub-metrics per the scoring spec.
Based on disclosures, docs, and availability of primary sources.

Cost

cost

Cost = function(pricing_inputs)

Each category score is computed from its sub-metrics per the scoring spec.
Computed from pricing inputs; shown as documentation only.

D) Full Breakdown (every item must show score + inputs + used evidence + why)

Missing or failed evidence inputs trigger fixed penalties per policy. No placeholder states are hidden.

Other

Item	Score	Input	Evidence	Explanation
General benchmarks	— WITHHELD: Missing item evidence: score withheld (missing evidence (official-page only)).	benchmark=74.65	Official page (ok): https://openai.com Official page (ok): https://openrouter.ai/models/openai/o3-deep-research-2025-06-26 WITHHELD: Missing item evidence: score withheld (missing evidence (official-page only)). Details Evidence status Evidence 1 Status: ok URL: present Reasons: missing:reasons Evidence 2 Status: ok URL: present Reasons: missing:reasons Penalty Penalty: present Penalty reasons: missing:reasons Spec checks withheld: evidence or data withheld Withheld reasons: missing:reasons	Formula — Default — Why Missing item evidence: score withheld (missing evidence (official-page only)).
Coding benchmarks	— WITHHELD: Missing item evidence: score withheld (missing evidence (official-page only)).	benchmark=81.44	Official page (ok): https://openai.com Official page (ok): https://openrouter.ai/models/openai/o3-deep-research-2025-06-26 WITHHELD: Missing item evidence: score withheld (missing evidence (official-page only)). Details Evidence status Evidence 1 Status: ok URL: present Reasons: missing:reasons Evidence 2 Status: ok URL: present Reasons: missing:reasons Penalty Penalty: present Penalty reasons: missing:reasons Spec checks withheld: evidence or data withheld Withheld reasons: missing:reasons	Formula — Default — Why Missing item evidence: score withheld (missing evidence (official-page only)).
Math & chat benchmarks	— WITHHELD: Missing item evidence: score withheld (missing evidence (official-page only)).	math=64.52 chat=81.23 arena=54.82 vendor=89.42	Official page (ok): https://openai.com Official page (ok): https://openrouter.ai/models/openai/o3-deep-research-2025-06-26 WITHHELD: Missing item evidence: score withheld (missing evidence (official-page only)). Details Evidence status Evidence 1 Status: ok URL: present Reasons: missing:reasons Evidence 2 Status: ok URL: present Reasons: missing:reasons Penalty Penalty: present Penalty reasons: missing:reasons Spec checks withheld: evidence or data withheld Withheld reasons: missing:reasons	Formula — Default — Why Missing item evidence: score withheld (missing evidence (official-page only)).
Safety documentation	0	safetySection=No highRisk=No	Official page (ok): https://openai.com Official page (ok): https://openrouter.ai/models/openai/o3-deep-research-2025-06-26 Details Evidence status Evidence 1 Status: ok URL: present Reasons: missing:reasons Evidence 2 Status: ok URL: present Reasons: missing:reasons Penalty Penalty: present Penalty reasons: missing:reasons Spec checks Spec checks: none	Formula — Default — Why Score verified from available inputs and evidence.
Alignment disclosure	0	rlhf=No dataDisclosure=No	Official page (ok): https://openai.com Official page (ok): https://openrouter.ai/models/openai/o3-deep-research-2025-06-26 Details Evidence status Evidence 1 Status: ok URL: present Reasons: missing:reasons Evidence 2 Status: ok URL: present Reasons: missing:reasons Penalty Penalty: present Penalty reasons: missing:reasons Spec checks Spec checks: none	Formula — Default — Why Score verified from available inputs and evidence.
Misuse policy coverage	100	misusePolicy=Yes harmMitigation=Yes	Official page (ok): https://openai.com Official page (ok): https://openrouter.ai/models/openai/o3-deep-research-2025-06-26 Details Evidence status Evidence 1 Status: ok URL: present Reasons: missing:reasons Evidence 2 Status: ok URL: present Reasons: missing:reasons Penalty Penalty: present Penalty reasons: missing:reasons Spec checks Spec checks: none	Formula — Default — Why Score verified from available inputs and evidence.
External audit & red teaming	— WITHHELD: Evidence links are missing, so the score is withheld until sources are provided.	redTeam=Yes independentAudit=Yes	Evidence (missing): No link provided. Spec missing evidence: evidence URLs are required for this score. WITHHELD: Evidence links are missing, so the score is withheld until sources are provided. Details Evidence status Status: missing URL: missing Reasons: policy:safety Penalty Penalty: present Penalty reasons: missing:reasons Spec checks spec_missing_evidence: score exists but no verifiable URL is present withheld: evidence or data withheld Withheld reasons: missing:reasons	Formula — Default — Why Evidence links are missing, so the score is withheld until sources are provided.
Transparency updates	100	transparencyUpdate=Yes	Official page (ok): https://openai.com Official page (ok): https://openrouter.ai/models/openai/o3-deep-research-2025-06-26 Details Evidence status Evidence 1 Status: ok URL: present Reasons: missing:reasons Evidence 2 Status: ok URL: present Reasons: missing:reasons Penalty Penalty: present Penalty reasons: missing:reasons Spec checks Spec checks: none	Formula — Default — Why Score verified from available inputs and evidence.
Minor incidents	— WITHHELD: Missing item evidence: score withheld (missing inputs, evidence (official-page only)).	Missing status: missing missing:reasons note:note:missing_field:inputs_raw	Official page (ok): https://openai.com Official page (ok): https://openrouter.ai/models/openai/o3-deep-research-2025-06-26 WITHHELD: Missing item evidence: score withheld (missing inputs, evidence (official-page only)). Details Evidence status Evidence 1 Status: ok URL: present Reasons: missing:reasons Evidence 2 Status: ok URL: present Reasons: missing:reasons Penalty Penalty: present Penalty reasons: missing:reasons Spec checks withheld: evidence or data withheld Withheld reasons: missing:reasons	Formula — Default — Why Missing item evidence: score withheld (missing inputs, evidence (official-page only)).
Major incidents	— WITHHELD: Missing item evidence: score withheld (missing inputs, evidence (official-page only)).	Missing status: missing missing:reasons note:note:missing_field:inputs_raw	Official page (ok): https://openai.com Official page (ok): https://openrouter.ai/models/openai/o3-deep-research-2025-06-26 WITHHELD: Missing item evidence: score withheld (missing inputs, evidence (official-page only)). Details Evidence status Evidence 1 Status: ok URL: present Reasons: missing:reasons Evidence 2 Status: ok URL: present Reasons: missing:reasons Penalty Penalty: present Penalty reasons: missing:reasons Spec checks withheld: evidence or data withheld Withheld reasons: missing:reasons	Formula — Default — Why Missing item evidence: score withheld (missing inputs, evidence (official-page only)).
Critical incidents	— WITHHELD: Missing item evidence: score withheld (missing inputs, evidence (official-page only)).	Missing status: missing missing:reasons note:note:missing_field:inputs_raw	Official page (ok): https://openai.com Official page (ok): https://openrouter.ai/models/openai/o3-deep-research-2025-06-26 WITHHELD: Missing item evidence: score withheld (missing inputs, evidence (official-page only)). Details Evidence status Evidence 1 Status: ok URL: present Reasons: missing:reasons Evidence 2 Status: ok URL: present Reasons: missing:reasons Penalty Penalty: present Penalty reasons: missing:reasons Spec checks withheld: evidence or data withheld Withheld reasons: missing:reasons	Formula — Default — Why Missing item evidence: score withheld (missing inputs, evidence (official-page only)).
Model documentation	67	modelCard=No overview=Yes limitations=Yes	Official page (ok): https://openai.com Official page (ok): https://openrouter.ai/models/openai/o3-deep-research-2025-06-26 Details Evidence status Evidence 1 Status: ok URL: present Reasons: missing:reasons Evidence 2 Status: ok URL: present Reasons: missing:reasons Penalty Penalty: present Penalty reasons: missing:reasons Spec checks Spec checks: none	Formula — Default — Why Score verified from available inputs and evidence.
Training data disclosure	67	dataCategories=Yes dataFiltering=Yes copyright=No	Official page (ok): https://openai.com Official page (ok): https://openrouter.ai/models/openai/o3-deep-research-2025-06-26 Details Evidence status Evidence 1 Status: ok URL: present Reasons: missing:reasons Evidence 2 Status: ok URL: present Reasons: missing:reasons Penalty Penalty: present Penalty reasons: missing:reasons Spec checks Spec checks: none	Formula — Default — Why Score verified from available inputs and evidence.
Paper / technical report	— WITHHELD: Evidence links are missing, so the score is withheld until sources are provided.	architecture=Yes parameterScale=No	Evidence (missing): No link provided. Spec missing evidence: evidence URLs are required for this score. WITHHELD: Evidence links are missing, so the score is withheld until sources are provided. Details Evidence status Status: missing URL: missing Reasons: policy:safety Penalty Penalty: present Penalty reasons: missing:reasons Spec checks spec_missing_evidence: score exists but no verifiable URL is present withheld: evidence or data withheld Withheld reasons: missing:reasons	Formula — Default — Why Evidence links are missing, so the score is withheld until sources are provided.
External review & transparency	— WITHHELD: Evidence links are missing, so the score is withheld until sources are provided.	safetyControls=No riskLimits=Yes externalReview=No transparencyReport=Yes	Evidence (missing): No link provided. Spec missing evidence: evidence URLs are required for this score. WITHHELD: Evidence links are missing, so the score is withheld until sources are provided. Details Evidence status Status: missing URL: missing Reasons: policy:safety Penalty Penalty: present Penalty reasons: missing:reasons Spec checks spec_missing_evidence: score exists but no verifiable URL is present withheld: evidence or data withheld Withheld reasons: missing:reasons	Formula — Default — Why Evidence links are missing, so the score is withheld until sources are provided.
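
The rows that show a numeric score are consistent with a simple rule: the item score is the share of `Yes` flags in the Input column, scaled to 0-100 and rounded. This rule is inferred from the rows, not taken from the scoring spec, and withheld rows are left out because their scores are not shown. A sketch with hypothetical function names:

```python
def parse_flags(cell):
    """Parse an Input cell such as 'a=Yes b=No' into {'a': True, 'b': False}."""
    flags = {}
    for token in cell.split():
        key, _, value = token.partition("=")
        flags[key] = value == "Yes"
    return flags


def item_score(cell):
    """Share of Yes flags, scaled to 0-100 and rounded (inferred rule)."""
    flags = parse_flags(cell)
    return round(100 * sum(flags.values()) / len(flags))


# Input cells and displayed scores copied from the scored rows of the table.
SCORED_ROWS = {
    "Safety documentation": ("safetySection=No highRisk=No", 0),
    "Alignment disclosure": ("rlhf=No dataDisclosure=No", 0),
    "Misuse policy coverage": ("misusePolicy=Yes harmMitigation=Yes", 100),
    "Transparency updates": ("transparencyUpdate=Yes", 100),
    "Model documentation": ("modelCard=No overview=Yes limitations=Yes", 67),
    "Training data disclosure": ("dataCategories=Yes dataFiltering=Yes copyright=No", 67),
}

for item, (cell, shown) in SCORED_ROWS.items():
    computed = item_score(cell)
    print(f"{item}: computed={computed} shown={shown}")
```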