Open Data

Last updated: October 2025

1. Our Commitment to Openness

The Public Prompt Project exists to make AI systems more transparent, accountable, and open. We believe that when AI models shape what the world sees, the underlying data — the prompts, responses, and visibility metrics — should belong to everyone.

That's why all non-personal data we collect is released under an open data license. Our goal is to support researchers, developers, educators, and journalists who want to understand and improve how AI platforms surface information.

2. What "Open Data" Means

"Open Data" means information that can be freely used, reused, and shared by anyone — provided that proper credit is given.

This mirrors how open source works for software, but applies to datasets instead of code.

You can:

  • ✅ Use the data for research, education, or commercial projects
  • ✅ Build tools or visualizations using the data
  • ✅ Share subsets or derived data
  • ✅ Combine our data with other open datasets

You must:

  • 🔗 Attribute the Public Prompt Project as the data source
  • 💬 Include a link to https://publicpromptproject.ai and the license text
  • 🚫 Not misrepresent or rebrand the dataset as your own

3. License

All data published by the Public Prompt Project is made available under the Open Data Commons Attribution License (ODC-BY 1.0).

In simple terms:

You are free to:

  • Share — copy, distribute, and use the data
  • Adapt — transform, build upon, or combine the data for any purpose, even commercially

As long as you:

  • Attribute — you must give appropriate credit, link to this page, and indicate if changes were made.

Example Attribution

Data source: Public Prompt Project (publicpromptproject.ai) — Licensed under the Open Data Commons Attribution License (ODC-BY 1.0).
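
If you generate attribution text programmatically (for example, in a chart footer or a derived dataset's README), the example above can be produced with a small helper. This is only a minimal sketch: the function name `build_attribution` and the `modified` flag are hypothetical and not part of any official tooling. It uses the full https://publicpromptproject.ai URL rather than the bare domain.

```python
SOURCE_NAME = "Public Prompt Project"
SOURCE_URL = "https://publicpromptproject.ai"
LICENSE_NAME = "Open Data Commons Attribution License (ODC-BY 1.0)"


def build_attribution(modified: bool = False) -> str:
    """Return a one-line credit for the dataset (hypothetical helper)."""
    line = (
        f"Data source: {SOURCE_NAME} ({SOURCE_URL}). "
        f"Licensed under the {LICENSE_NAME}."
    )
    if modified:
        # Flag filtered, merged, or otherwise derived data, as the
        # Attribute condition asks you to indicate changes.
        line += " Changes were made to the original data."
    return line


print(build_attribution(modified=True))
```

Pass `modified=True` whenever you publish a subset or a derived version of the data.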

4. Why We Chose ODC-BY

We wanted a license that:

  • Encourages open collaboration and research
  • Allows both academic and commercial re-use
  • Protects attribution without restricting innovation
  • Aligns with open standards recognized by the Open Knowledge Foundation

ODC-BY gives others maximum freedom to reuse the data while ensuring the Public Prompt Project remains credited as its original source.

5. What's Included

The open dataset may include:

  • Anonymized prompt text (only where consent was given)
  • Response visibility data across AI engines
  • Ranking and citation data from AI outputs
  • Metadata (model name, timestamp, query category)

What's not included:

  • No personal data (names, emails, IPs, or identifiers)
  • No individual user tracking
  • No proprietary AI model data or internal logs
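
If you combine this data with other sources, a quick check can help confirm that no personal fields slip into what you redistribute. The sketch below is an assumption-heavy illustration: the `OpenRecord` shape only mirrors the metadata listed above (model name, timestamp, query category) and is not the published schema, and the `PERSONAL_FIELDS` names are guesses based on the "not included" list.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Assumed key names that would signal personal data, based on the
# "not included" items above (names, emails, IPs, identifiers).
PERSONAL_FIELDS = {"name", "email", "ip", "ip_address", "user_id"}


@dataclass
class OpenRecord:
    """Hypothetical shape of one dataset row; not the published schema."""
    model_name: str
    timestamp: str  # assumed ISO 8601 string, e.g. "2025-10-01T12:00:00Z"
    query_category: str
    prompt_text: Optional[str] = None  # present only where consent was given


def personal_fields_in(row: dict) -> set:
    """Return any keys in a raw row that look like personal data."""
    return {key for key in row if key.lower() in PERSONAL_FIELDS}


record = OpenRecord("example-model", "2025-10-01T12:00:00Z", "travel")
# An empty result means none of the assumed personal field names are present.
print(personal_fields_in(asdict(record)))
```

A key-name check like this only catches obviously labeled fields; it does not detect personal details embedded in free text.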

6. Responsible Use

We ask that anyone using our data:

  • Follows ethical research practices
  • Avoids re-identifying users or prompts
  • Shares improvements, findings, or derivative datasets back with the community when possible

Responsible reuse helps ensure open data stays open and respectful of user privacy.

7. Citation Example

If you publish a report, visualization, or paper using Public Prompt Project data, please cite it as:

Public Prompt Project. (2025). Open Dataset on AI Visibility and Prompts.
Available at https://publicpromptproject.ai/data.
Licensed under ODC-BY 1.0.

8. Contact & Collaboration

We welcome partnerships with research groups, journalists, educators, and developers.

If you'd like access to structured datasets or want to collaborate on transparency studies:
📧 privacy@publicpromptproject.ai

🜂 TL;DR

Public Prompt Project data is free, open, and attributed.
You can use it for anything — as long as you credit the source.
Together, we're building a more transparent internet.

Privacy for people. Transparency for machines. We collect anonymous, open-source data to help researchers, creators, and citizens understand how AI systems shape our shared internet.

Sponsored by Pixaura.com and GEOGrow.ai

© 2024 Public Prompt Project. Built with ❤️ for AI transparency and the public good.