How We Score

Our methodology for evaluating the ethical practices of AI tools — what we measure, where our data comes from, and how you can hold us accountable.

Our Approach

Ai-Dex ethics scores are compiled using publicly available data, published research, regulatory filings, and documented company practices. We are not ethicists, lawyers, or AI researchers. We are practitioners who believe transparency about AI tools matters.

Our scores reflect our best assessment of available evidence and are updated as new information emerges. We show our work — every score traces back to verifiable sources, and every methodology decision is documented here. If you believe a score is inaccurate, we welcome evidence-based challenges.

We do not accept payment for higher scores. We do not allow companies to preview or influence their assessments. Our affiliate relationships (clearly disclosed on every relevant page) have no bearing on ethics scores. A tool we earn affiliate revenue from receives the same scrutiny as any other.

What We Don't Claim

We are not the final word

Ethics is complex, context-dependent, and evolving. Our scores are one input into your decision-making, not a substitute for your own judgment. A tool that scores well on our framework may still be wrong for your specific use case.

We have blind spots

Our 11-category framework captures the most significant ethical dimensions we've identified, but it doesn't cover everything. Cultural context, accessibility, labor practices in AI training, and emerging concerns may not be fully represented. We expand our framework as understanding evolves.

Our data has limits

We rely on what's publicly available. Companies with better transparency get more accurate scores. A company that publishes no safety data will score lower than one that does — which may reflect secrecy rather than poor practices. We note this where it applies.

The 11 Categories

Every tool is evaluated across 11 dimensions. Each category has specific data sources and criteria for high and low scores. Categories are weighted equally in the overall score unless documented harms in a specific area warrant additional weight.

SAFETY

AI Safety & Misinformation

How well does this tool prevent harmful outputs, misinformation, and misuse? Does it have content moderation, safety testing, and responsible deployment practices?

Data Sources

  • Published safety testing results and model cards
  • Independent red-team evaluations and adversarial testing
  • Content moderation policies and enforcement track record
  • Responsible disclosure and incident response practices
  • NIST AI Risk Management Framework alignment

High Score Indicators

Extensive safety testing, robust content moderation, published model cards, responsible scaling commitments, transparent incident response

Low Score Indicators

Minimal safety guardrails, history of generating harmful content, no published safety evaluations, resistance to safety norms

BIAS

Bias & Discrimination

Does this tool produce equitable outcomes across demographics? Has it been tested for racial, gender, cultural, and socioeconomic bias?

Data Sources

  • Published bias audits and fairness evaluations
  • Academic research on model bias (e.g., MIT Gender Shades methodology)
  • Demographic performance disparities in independent testing
  • Training data diversity and representation
  • Company commitments to bias reduction and progress reports

High Score Indicators

Published bias audits, third-party fairness testing, documented bias mitigation efforts, diverse training data, demographic parity in outputs

Low Score Indicators

Documented bias in outputs, no fairness testing, training data reflects historical prejudices without correction, demographic disparities in performance

POWER

Concentration of Power

Does this tool contribute to unhealthy concentration of power in the AI industry? Is there meaningful governance, competition, and accountability?

Data Sources

  • Corporate governance structure (independent board, oversight mechanisms)
  • Market share and competitive landscape
  • Antitrust actions or investigations
  • Open-source commitments and ecosystem contributions
  • Interoperability and data portability

High Score Indicators

Independent governance, supports competition and interoperability, open-source contributions, data portability, distributed power structure

Low Score Indicators

Single-owner control, monopolistic market position, vendor lock-in, resistance to interoperability, opaque governance

IP

Copyright & Intellectual Property

How does this tool handle training data rights, creator compensation, and intellectual property? Is the training data ethically sourced?

Data Sources

  • Training data sourcing and licensing documentation
  • Active lawsuits related to copyright infringement
  • Creator compensation programs and opt-out mechanisms
  • Content provenance and attribution features
  • IP indemnification policies for users

High Score Indicators

Licensed training data, creator compensation programs, content provenance features, IP indemnification, transparent data sourcing

Low Score Indicators

Training on copyrighted material without license, active copyright lawsuits, no creator compensation, no opt-out mechanisms

CYBER

Cybersecurity & Deepfakes

Could this tool be used to create deepfakes, enable fraud, or compromise security? What safeguards exist against misuse?

Data Sources

  • Anti-deepfake measures and watermarking
  • Known misuse incidents and company response
  • Security certifications (SOC 2, ISO 27001)
  • Vulnerability disclosure programs
  • Authentication and identity verification features

High Score Indicators

Content watermarking, deepfake detection support, security certifications, responsible disclosure program, proactive misuse prevention

Low Score Indicators

Easily used for deepfakes/fraud, no watermarking, no misuse prevention, history of security incidents without remediation

DATA

Data Privacy & Surveillance

How does this tool handle user data? Does it respect privacy, minimize data collection, and provide meaningful consent?

Data Sources

  • Privacy policy clarity and data handling practices
  • Data retention policies and deletion capabilities
  • Regulatory compliance (GDPR, CCPA, HIPAA where applicable)
  • Privacy certifications and third-party audits
  • History of data breaches or privacy violations
  • Whether user data is used for model training (opt-in vs. opt-out)

High Score Indicators

Clear privacy policy, minimal data collection, strong regulatory compliance, privacy certifications, user data not used for training without opt-in

Low Score Indicators

Opaque data practices, excessive data collection, privacy violations, user data used for training by default, regulatory non-compliance

ECO

Environmental Impact

What is the energy and environmental footprint of this tool? Is the company taking meaningful steps to reduce its climate impact?

Data Sources

  • Published environmental reports and carbon disclosures
  • Data center energy sources (renewable vs. fossil)
  • Model efficiency and computational requirements
  • Carbon offset or reduction commitments
  • Academic research on AI energy consumption

High Score Indicators

Published environmental reports, renewable energy commitments, efficient model architecture, carbon reduction programs, transparent energy usage

Low Score Indicators

No environmental reporting, high energy consumption without mitigation, reliance on fossil fuel data centers, increasing emissions without accountability

MIL

Military & Government Surveillance

Is this tool used for military applications, autonomous weapons, or mass surveillance? Does the company have policies governing government use?

Data Sources

  • Published acceptable use policies regarding military applications
  • Known government and military contracts
  • Involvement in autonomous weapons or targeting systems
  • Employee activism or internal dissent about military use
  • Participation in international weapons governance discussions

High Score Indicators

Explicit policy against weapons applications, no military targeting contracts, supports international governance, transparent about government relationships

Low Score Indicators

Active military weapons contracts, provides targeting or surveillance systems, no restrictions on military use, opaque government relationships

JOBS

Work & Economic Impact

How does this tool affect employment? Does it augment human work or replace it? Is the company investing in transition support?

Data Sources

  • Industry employment data and trend analysis
  • Company statements on workforce impact
  • Reskilling and transition support programs
  • Academic research on AI labor displacement (WEF, McKinsey, etc.)
  • Freelancer and worker impact surveys

High Score Indicators

Designed to augment rather than replace, reskilling investments, transparent about workforce impact, supports human-AI collaboration

Low Score Indicators

Directly replaces workers with no transition support, marketed as human replacement, documented job losses without mitigation

ARTS

Creative & Cultural Impact

How does this tool affect creative professionals and cultural production? Does it respect creative labor and cultural diversity?

Data Sources

  • Creative industry employment and revenue impact data
  • Creator compensation and attribution practices
  • Cultural representation and homogenization research
  • Artist and creator community surveys (AOI, Concept Art Association, SAG-AFTRA)
  • UNESCO cultural impact assessments

High Score Indicators

Augments creative work, compensates training data creators, promotes cultural diversity, supports creative industry ecosystem

Low Score Indicators

Directly displaces creative workers, trained on creator work without consent/compensation, contributes to cultural homogenization

TRUTH

Truth & Information Integrity

Does this tool help maintain a shared sense of reality, or does it undermine it? Can its outputs be distinguished from authentic content?

Data Sources

  • Hallucination and factual accuracy benchmarks
  • Content provenance and watermarking features
  • Known misinformation incidents involving the tool
  • Citation and source attribution capabilities
  • World Economic Forum Global Risks Report rankings

High Score Indicators

Strong factual accuracy, citation-based outputs, content watermarking, proactive misinformation prevention, transparent about limitations

Low Score Indicators

High hallucination rate, enables deepfakes without safeguards, used in documented misinformation campaigns, no content provenance features

MIND

Human Development & Cognition

Does this tool support human learning and critical thinking, or does it create dependency? What is the impact on cognitive development?

Data Sources

  • Academic research on AI and cognitive development
  • Studies on AI-assisted learning outcomes
  • Child safety and age-appropriate design features
  • User dependency and engagement metrics research
  • APA and educational organization guidelines

High Score Indicators

Designed to teach rather than just answer, promotes critical thinking, age-appropriate safeguards, supports learning outcomes over engagement metrics

Low Score Indicators

Encourages dependency over learning, no age-appropriate features, engagement-optimized design, documented negative effects on critical thinking

Scoring Process

1. Data Collection

We gather publicly available information for each tool across all 11 categories. This includes company documentation, published research, news coverage, regulatory filings, and independent audits. We do not rely on company self-reporting alone.

2. Category Scoring

Each tool receives a score from 0 to 100 in each category based on the weight of available evidence. Scores reflect the balance of positive practices (safety testing, transparency, creator compensation) against negative indicators (lawsuits, documented harms, opaque practices).

3. Overall Score Calculation

The overall ethics score is a weighted average across all 11 categories, with adjustments for the severity of any documented harms. A tool with excellent privacy but active copyright lawsuits will have a lower overall score than the straight average would suggest.
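The weighted-average-with-adjustment logic described above can be sketched in code. This is a minimal illustration under stated assumptions, not Ai-Dex's actual formula: the category names, the equal default weights, and the `harm_penalties` scheme are hypothetical, invented here for demonstration.

```python
def overall_score(category_scores, harm_penalties=None):
    """Equal-weight average of 0-100 category scores, adjusted downward
    when documented harms warrant extra weight.

    category_scores: dict mapping category name -> score (0-100)
    harm_penalties:  dict mapping category name -> points to subtract
                     for severe, documented harms (hypothetical scheme)
    """
    harm_penalties = harm_penalties or {}
    base = sum(category_scores.values()) / len(category_scores)
    # Severe documented harms pull the total below the straight average.
    penalty = sum(harm_penalties.values())
    return max(0.0, round(base - penalty, 1))


scores = {"privacy": 92, "copyright": 40, "safety": 75}
print(overall_score(scores))                     # 69.0 (straight average)
print(overall_score(scores, {"copyright": 10}))  # 59.0 (lawsuit penalty applied)
```

The point of the penalty term is exactly what the paragraph above states: a tool with excellent privacy but active copyright lawsuits ends up below what the straight average would suggest.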

4. Source Citation

For major tools with significant ethics implications, we attach source citations to individual category scores. When you see a score, you can click through to the evidence that supports it — the lawsuits, the research papers, the company policies.

5. Review & Update

Scores are reviewed quarterly and updated when material new information emerges — a new lawsuit, a policy change, an independent audit, or a resolved controversy. Every score change is logged in our changelog.

Verdict Scale

Overall ethics scores translate to verdicts that provide a quick signal of where a tool stands. These are guidelines — a tool with a 69 and a tool with a 70 are not meaningfully different.

Excellent (85-100)

Industry-leading practices across most categories. May have minor gaps but demonstrates genuine commitment to ethical AI development.

Good (70-84)

Solid ethical practices with some areas for improvement. Generally trustworthy with transparent practices in most dimensions.

Moderate (50-69)

Mixed ethical track record. Has both positive practices and documented concerns that users should be aware of before adoption.

Poor (25-49)

Significant ethical concerns across multiple categories. Documented harms or lack of transparency that warrant careful consideration.

Critical (0-24)

Serious ethical issues. Active harms, refusal to engage with safety norms, or documented damage to affected communities.
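The bands above translate mechanically into verdict labels. A minimal sketch using the published thresholds (the function name `verdict` is ours, not part of any Ai-Dex API):

```python
def verdict(score):
    """Map an overall ethics score (0-100) to its verdict band."""
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Moderate"
    if score >= 25:
        return "Poor"
    return "Critical"


# Adjacent scores can land in different bands, which is why the
# verdicts are guidelines rather than hard distinctions.
print(verdict(69), verdict(70))  # prints: Moderate Good
```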

Update Cadence

Quarterly Reviews

Every tool's ethics scores are reviewed at minimum once per quarter. This catches gradual changes in company practices, new certifications, and evolving industry standards.

Event-Driven Updates

Scores are updated immediately when material events occur: new lawsuits filed, data breaches disclosed, safety incidents reported, policy changes announced, or independent audit results published.

Automated Monitoring

Our agent system monitors news, regulatory filings, and industry reports for events that may affect tool scores. Flagged items are reviewed by a human before any score change is made.

Challenge a Score

We believe accountability goes both ways. If you're a tool maker, a user, a researcher, or anyone who believes an ethics score is inaccurate, we want to hear from you.

1. Submit Evidence

Email ethics@ai-dex.pro with the tool name, the category you're challenging, and links to evidence supporting a different score. We require verifiable, public sources — not assertions.

2. Review Period

We review all challenges within 14 business days. Complex cases involving ongoing litigation or emerging research may take longer. We'll acknowledge receipt within 48 hours.

3. Resolution

If the evidence supports a score change, we update the score, add the new citations, and note the change in our changelog. If we disagree, we'll explain our reasoning. Either way, the challenge and our response become part of the public record for that tool.

Independence & Conflicts of Interest

Ai-Dex earns revenue through affiliate partnerships with some of the tools we review. These relationships are always disclosed on the relevant tool and comparison pages. Affiliate partnerships have no influence on ethics scores — a tool we earn commissions from receives identical scrutiny to one we don't.

We do not accept payment for higher scores, preferential placement in ethics leaderboards, or advance notice of score changes. Companies cannot pay to suppress negative findings or remove documented controversies from their profiles.

If we ever identify a conflict of interest that could affect a score, we will disclose it on the tool's page and, if necessary, recuse ourselves from scoring that tool until the conflict is resolved.

Current Coverage

  • 524 Tools Scored
  • 11 Ethics Categories
  • 5,764 Individual Scores
  • 13 Tools with Detailed Citations

Source citations are currently available for the highest-profile tools where ethics implications are most significant. We are expanding citation coverage to all 524 tools over time. Tools without individual citations are scored using the same methodology — the evidence simply hasn't been formally documented in our system yet.

Explore the Data

See how specific tools score across all 11 dimensions, or explore our concern deep-dives with historical context and evidence.