Open Data
Last updated: October 2025
1. Our Commitment to Openness
The Public Prompt Project exists to make AI systems more transparent, accountable, and open. We believe that when AI models shape what the world sees, the underlying data — the prompts, responses, and visibility metrics — should belong to everyone.
That's why all non-personal data we collect is released under an open data license. Our goal is to support researchers, developers, educators, and journalists who want to understand and improve how AI platforms surface information.
2. What "Open Data" Means
"Open Data" means information that can be freely used, reused, and shared by anyone — provided that proper credit is given.
This mirrors how open source works for software, but applies to datasets instead of code.
You can:
- ✅ Use the data for research, education, or commercial projects
- ✅ Build tools or visualizations using the data
- ✅ Share subsets or derived data
- ✅ Combine our data with other open datasets
You must:
- 🔗 Attribute the Public Prompt Project as the data source
- 💬 Include a link to https://publicpromptproject.ai and the license text
- 🚫 Not misrepresent or rebrand the dataset as your own
3. License
All data published by the Public Prompt Project is made available under the Open Data Commons Attribution License (ODC-BY 1.0).
In simple terms:
You are free to:
- Share — copy, distribute, and use the data
- Adapt — transform, build upon, or combine the data for any purpose, even commercially
As long as you:
- Attribute — you must give appropriate credit, link to this page, and indicate if changes were made.
Example Attribution
Data source: Public Prompt Project (publicpromptproject.ai) — Licensed under the Open Data Commons Attribution License (ODC-BY 1.0).
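For anyone generating credits programmatically, here is a minimal sketch of how the attribution line above might be assembled. The function name `build_attribution` and its `modified` parameter are hypothetical and not part of any Public Prompt Project tooling; the wording follows the example above, with an optional note to indicate that changes were made.

```python
# Hypothetical helper: the names below are illustrative assumptions,
# not an official Public Prompt Project API.
SOURCE_NAME = "Public Prompt Project"
SOURCE_URL = "https://publicpromptproject.ai"
LICENSE_NAME = "Open Data Commons Attribution License (ODC-BY 1.0)"


def build_attribution(modified: bool = False) -> str:
    """Return a one-line credit for data reused from the project."""
    line = (
        f"Data source: {SOURCE_NAME} ({SOURCE_URL}). "
        f"Licensed under the {LICENSE_NAME}."
    )
    if modified:
        # The license summary asks reusers to indicate if changes were made.
        line += " Changes were made to the original data."
    return line


if __name__ == "__main__":
    print(build_attribution(modified=True))
```

Keeping the source name, URL, and license in one place makes it harder for a chart footer or README to drift away from the required credit.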
4. Why We Chose ODC-BY
We wanted a license that:
- Encourages open collaboration and research
- Allows both academic and commercial re-use
- Protects attribution without restricting innovation
- Aligns with open standards recognized by the Open Knowledge Foundation
ODC-BY gives others broad freedom to reuse the data while ensuring the Public Prompt Project remains credited as its original source.
5. What's Included
The open dataset may include:
- Anonymized prompt text (if consented)
- Response visibility data across AI engines
- Ranking and citation data from AI outputs
- Metadata (model name, timestamp, query category)
What's not included:
- Personal data (names, emails, IP addresses, or other identifiers)
- Individual user tracking
- Proprietary AI model data or internal logs
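As a rough illustration of the lists above, the sketch below models one record and rejects any input that carries identifier fields. The field names (`model_name`, `timestamp`, `query_category`, `prompt_text`) and the excluded keys are assumptions made for this example; the published schema may differ.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keys that should never appear in an open record.
EXCLUDED_KEYS = {"name", "email", "ip", "user_id"}


@dataclass(frozen=True)
class OpenRecord:
    """Assumed shape of one open-data record (illustrative only)."""
    model_name: str
    timestamp: str                     # e.g. an ISO 8601 string
    query_category: str
    prompt_text: Optional[str] = None  # present only if the user consented


def to_open_record(raw: dict) -> OpenRecord:
    """Build a record, refusing input that contains identifier fields."""
    leaked = EXCLUDED_KEYS.intersection(raw)
    if leaked:
        raise ValueError(f"personal fields present: {sorted(leaked)}")
    return OpenRecord(
        model_name=raw["model_name"],
        timestamp=raw["timestamp"],
        query_category=raw["query_category"],
        prompt_text=raw.get("prompt_text"),
    )
```

A check like this can be useful when combining the data with other sources, since it catches identifier columns introduced by a join before a derived dataset is shared.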
6. Responsible Use
We ask that anyone using our data:
- Follows ethical research practices
- Avoids re-identifying users or prompts
- Shares improvements, findings, or derivative datasets back with the community when possible
Responsible reuse helps ensure open data stays open and respectful of user privacy.
7. Citation Example
If you publish a report, visualization, or paper using Public Prompt Project data, please cite it as:
Public Prompt Project. (2025). Open Dataset on AI Visibility and Prompts.
Available at https://publicpromptproject.ai/data.
Licensed under ODC-BY 1.0.
8. Contact & Collaboration
We welcome partnerships with research groups, journalists, educators, and developers.
If you'd like access to structured datasets or want to collaborate on transparency studies, contact us at:
📧 privacy@publicpromptproject.ai
🜂 TL;DR
Public Prompt Project data is free, open, and attributed.
You can use it for anything — as long as you credit the source.
Together, we're building a more transparent internet.