From 029a59e6c88d005dd69462ead11df629c75881dc Mon Sep 17 00:00:00 2001
From: emmanuelgjr <129134995+emmanuelgjr@users.noreply.github.com>
Date: Mon, 18 May 2026 22:09:38 -0400
Subject: [PATCH] Delete entrydatasupport/2_0/2_0_candidates directory

Outdated files, removing for repo revamp.

Signed-off-by: emmanuelgjr <129134995+emmanuelgjr@users.noreply.github.com>
---
 .../2_0/2_0_candidates/BackdoorAttacks.md | 57 -------------------
 .../2_0/2_0_candidates/BypassingSIUSPL.md | 54 ------------------
 .../2_0_candidates/DangerousHallucinations.md | 54 ------------------
 .../2_0/2_0_candidates/DeepFakeThreat.md | 54 ------------------
 .../2_0/2_0_candidates/InsecureDesign.md | 53 -----------------
 entrydatasupport/2_0/2_0_candidates/LICENSE | 19 -------
 entrydatasupport/2_0/2_0_candidates/README.md | 36 ------------
 .../2_0_candidates/SensitiveInfoDisclosure.md | 53 -----------------
 .../2_0/2_0_candidates/SupplyChainV.md | 50 ----------------
 .../2_0/2_0_candidates/UnwantedAIActions.md | 53 -----------------
 .../VulnerableAutonomousAgents.md | 51 -----------------
 .../2_0/2_0_candidates/promptinjection.md | 53 -----------------
 12 files changed, 587 deletions(-)
 delete mode 100644 entrydatasupport/2_0/2_0_candidates/BackdoorAttacks.md
 delete mode 100644 entrydatasupport/2_0/2_0_candidates/BypassingSIUSPL.md
 delete mode 100644 entrydatasupport/2_0/2_0_candidates/DangerousHallucinations.md
 delete mode 100644 entrydatasupport/2_0/2_0_candidates/DeepFakeThreat.md
 delete mode 100644 entrydatasupport/2_0/2_0_candidates/InsecureDesign.md
 delete mode 100644 entrydatasupport/2_0/2_0_candidates/LICENSE
 delete mode 100644 entrydatasupport/2_0/2_0_candidates/README.md
 delete mode 100644 entrydatasupport/2_0/2_0_candidates/SensitiveInfoDisclosure.md
 delete mode 100644 entrydatasupport/2_0/2_0_candidates/SupplyChainV.md
 delete mode 100644 entrydatasupport/2_0/2_0_candidates/UnwantedAIActions.md
 delete mode 100644 entrydatasupport/2_0/2_0_candidates/VulnerableAutonomousAgents.md
 delete
mode 100644 entrydatasupport/2_0/2_0_candidates/promptinjection.md diff --git a/entrydatasupport/2_0/2_0_candidates/BackdoorAttacks.md b/entrydatasupport/2_0/2_0_candidates/BackdoorAttacks.md deleted file mode 100644 index cf0bb1e..0000000 --- a/entrydatasupport/2_0/2_0_candidates/BackdoorAttacks.md +++ /dev/null @@ -1,57 +0,0 @@ -# Backdoor Attacks in Large Language Models (LLMs) - -**Authors:** -Massimo Bozza, Matteo Meucci - -## Dataset -- **[Hugging Face Model Hub](https://huggingface.co/models):** Repository of models for various use cases, useful for testing backdoor vulnerabilities. -- **[OpenAI Model Gymnasium](https://github.com/Farama-Foundation/Gymnasium):** Datasets and models provided by OpenAI for evaluating the integrity of models. -- **[TensorFlow Hub](https://www.tensorflow.org/hub):** A repository of pre-trained models, helpful for analyzing vulnerabilities in the supply chain. - -## Research Papers and Relevant Research Blogs -1. **Research Paper:** [Backdoor Attacks and Countermeasures on Deep Learning: A Comprehensive Review](https://arxiv.org/abs/2007.10760) - - _Authors:_ Various - - _Abstract:_ This paper provides a comprehensive review of backdoor attacks and countermeasures in deep learning, including LLMs. - -2. **Research Paper:** [Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training](https://arxiv.org/abs/2401.05566) - - _Authors:_ Various - - _Abstract:_ Discusses methods for training deceptive LLMs that can persist through safety training, highlighting the risks of backdoor attacks. - -3. **Research Blog:** [A Survey on Backdoor Attack and Defense in Natural Language Processing](https://arxiv.org/abs/2211.11958) - - _Authors:_ Various - - _Abstract:_ Surveys the landscape of backdoor attacks and defenses in natural language processing, providing insights into current trends and challenges. - -4. 
**Research Blog:** [Backdoor Attacks on AI Models](https://www.cobalt.io/blog/backdoor-attacks-on-ai-models) - - _Author:_ Cobalt - - _Description:_ Explores the threat of backdoor attacks in AI models and discusses strategies to mitigate these risks. - -5. **Research Paper:** [Adversarial Attacks and Defenses in Machine Learning](https://arxiv.org/abs/1810.00069) - - _Authors:_ Various - - _Abstract:_ Analyzes various adversarial attacks and defenses in machine learning, with a focus on LLMs. - -6. **Research Paper:** [A Survey of Backdoor Attacks and Defenses on Large Language Models: Implications for Security Measures](https://arxiv.org/abs/2406.06852) - - _Authors:_ Various - - _Abstract:_ Authors systematically classified backdoor attacks into three categories: full-parameter fine-tuning, parameter-efficient fine-tuning, and attacks without fine-tuning. - -## Real-World Examples -1. **Example #1:** [Backdoor Attacks in AI Systems: A Comprehensive Review](https://arxiv.org/abs/2007.10760) - - _Source:_ Arxiv - - _Description:_ A comprehensive review of backdoor attacks in AI systems, including case studies and real-world examples. - -2. **Example #2:** [Sleeper Agents: Training Deceptive LLMs](https://arxiv.org/abs/2401.05566) - - _Source:_ Arxiv - - _Description:_ Discusses the training of deceptive LLMs that persist through safety training, including real-world implications. - -3. **Example #3:** [Backdoor Attacks and Defense in NLP](https://arxiv.org/abs/2211.11958) - - _Source:_ Arxiv - - _Description:_ Surveys backdoor attacks and defenses in natural language processing, with real-world examples. - -4. **Example #4:** [B3: Backdoor Attacks against Black-box Machine Learning Models](https://dl.acm.org/doi/10.1145/3605212) - - _Source:_ Cobalt - - _Description:_ Discusses the threat of backdoor attacks in AI models and provides real-world examples of such attacks. - -5. 
**Example #5:** [Adversarial Attacks and Defenses in Machine Learning](https://arxiv.org/abs/1810.00069) - - _Source:_ Arxiv - - _Description:_ Analyzes adversarial attacks and defenses in machine learning, with a focus on real-world examples of backdoor attacks. - -**Note:** This document outlines the risks and strategies for addressing backdoor attacks in Large Language Models, providing a foundation for further research and practical implementation. diff --git a/entrydatasupport/2_0/2_0_candidates/BypassingSIUSPL.md b/entrydatasupport/2_0/2_0_candidates/BypassingSIUSPL.md deleted file mode 100644 index 1e516bd..0000000 --- a/entrydatasupport/2_0/2_0_candidates/BypassingSIUSPL.md +++ /dev/null @@ -1,54 +0,0 @@ -# Bypassing System Instructions Using System Prompt Leakage - -**Author(s):** -Aditya Rana - -## Dataset -- **[OpenAI GPT-3 System Prompts](https://github.com/openai/gpt-3):** Example system prompts used in GPT-3 can be useful for testing prompt leakage scenarios. -- **[Red Teaming Data](https://github.com/rootsecdev/Azure-Red-Team?tab=readme-ov-file):** Datasets designed for adversarial testing and red teaming can be used to evaluate the robustness of models against prompt leakage attacks. -- **[Sensitive Data Protection Datasets](https://cloud.google.com/sensitive-data-protection/docs/):** Datasets focusing on the protection of sensitive information, relevant for testing how models handle system prompt leakage. - -## Research Papers and Relevant Research Blogs -1. **Research Paper:** [Investigating the prompt leakage effect and black-box defences for multi-turn LLM interactions](https://arxiv.org/html/2404.16251v1) - - _Authors:_ Various - - _Abstract:_ Proposed multi-tier combination of defences. - -2. 
**Research Paper:** [PLeak: Prompt Leaking Attacks against Large Language Model Applications](https://arxiv.org/abs/2405.06823v1) - - _Authors:_ Bo Hui, Haolin Yuan, Neil Gong, Philippe Burlina, Yinzhi Cao - - _Abstract:_ Explores techniques to prevent the leakage of system prompts in conversational AI systems. - -3. **Research Blog:** [Prompt Leakage: An Emerging Threat](https://www.prompt.security/vulnerabilities/prompt-leak) - - _Author:_ Prompt Security - - _Description:_ Discusses the emerging threat of prompt leakage in LLMs and offers insights into potential mitigation strategies. - -4. **Research Blog:** [System Prompts and Security](https://github.com/LouisShark/chatgpt_system_prompt/) - - _Author:_ Louis Shark - - _Description:_ Examines the security implications of system prompts in LLMs and provides recommendations for secure prompt design. - -5. **Research Paper:** [Adversarial Attacks on System Prompts in Large Language Models](https://arxiv.org/abs/2204.08312) - - _Authors:_ Alice Cooper, Bob Harris - - _Abstract:_ Investigates adversarial attacks targeting system prompts in large language models and proposes defences. - -## Real-World Examples -1. **Example #1:** [OpenAI’s Custom Chatbots Are Leaking Their Secrets](https://www.wired.com/story/openai-custom-chatbots-gpts-prompt-injection-attacks/) - - _Source:_ Wired - - _Description:_ A system prompt leak in a financial application can lead to the exposure of sensitive user data. - -2. **Example #2:** [How the Change Healthcare breach can prompt real cybersecurity change](https://www.securitymagazine.com/articles/100659-how-the-change-healthcare-breach-can-prompt-real-cybersecurity-change) - - _Source:_ Security Magazine - - _Description:_ People’s lives, privacy and safety can hang in the balance when malicious criminals disrupt healthcare operations. - -3. 
**Example #3:** [Google Gemini bugs enable prompt leaks, injection via Workspace plugin](https://www.scmagazine.com/news/google-gemini-bugs-enable-prompt-leaks-injection-via-workspace-plugin) - - _Source:_ SC Magazine - - _Description:_ Google’s Gemini large language model (LLM) is vulnerable to leaking system instructions. - -4. **Example #4:** [Samsung Software Engineers Busted for Pasting Proprietary Code Into ChatGPT](https://www.pcmag.com/news/samsung-software-engineers-busted-for-pasting-proprietary-code-into-chatgpt) - - _Source:_ PC Mag - - _Description:_ Developers sent lines of confidential code to ChatGPT. - -5. **Example #5:** [Three ways AI chatbots are a security disaster](https://www.technologyreview.com/2023/04/03/1070893/three-ways-ai-chatbots-are-a-security-disaster/) - - _Source:_ MIT Technology Review - - _Description:_ Large language models are full of security vulnerabilities, yet they’re being embedded into tech products on a vast scale. - -**Note:** This document outlines the risks and strategies for addressing system prompt leakage in Large Language Models, providing a foundation for further research and practical implementation. - diff --git a/entrydatasupport/2_0/2_0_candidates/DangerousHallucinations.md b/entrydatasupport/2_0/2_0_candidates/DangerousHallucinations.md deleted file mode 100644 index 4d958a5..0000000 --- a/entrydatasupport/2_0/2_0_candidates/DangerousHallucinations.md +++ /dev/null @@ -1,54 +0,0 @@ -# Dangerous Hallucinations in Large Language Models (LLMs) - -**Author(s):** -Steve Wilson - -## Dataset -- **[PubMedQA](https://github.com/pubmedqa/pubmedqa):** A dataset for biomedical research question-answering, useful for evaluating the accuracy of LLM-generated medical information. -- **[TruthfulQA](https://github.com/sylinrl/TruthfulQA):** A benchmark to test whether LLMs generate truthful answers to questions. 
-- **[FeTaQA](https://github.com/Yale-LILY/FETAQA):** A dataset designed for evaluating factual correctness and faithfulness of text generation in LLMs. - -## Research Papers and Relevant Research Blogs -1. **Research Paper:** [Unveiling the Generalization Power of Fine-Tuned Large Language Models](https://arxiv.org/html/2403.09162v1) - - _Authors:_ Haoran Yang, Yumeng Zhang, Jiaqi Xu, Hongyuan Lu, Pheng Ann Heng, Wai Lam - - _Abstract:_ Aim to contribute valuable insights into the evolving landscape of fine-tuning practices. - -2. **Research Paper:** [Reducing Hallucinations in Neural Machine Translation with Feature Attribution](https://arxiv.org/abs/2211.09878) - - _Authors:_ Joël Tang, Marina Fomicheva, Lucia Specia - - _Abstract:_ Propose a novel loss function that substantially helps reduce hallucinations. - -3. **Research Blog:** [Guide to Hallucinations in Large Language Models](https://www.lakera.ai/blog/guide-to-hallucinations-in-large-language-models) - - _Author:_ Lakera - - _Description:_ Beginner’s Guide to Hallucinations in Large Language Models - - -4. **Research Blog:** [How to Reduce Hallucinations from Large Language Models](https://thenewstack.io/how-to-reduce-the-hallucinations-from-large-language-models/) - - _Author:_ The New Stack - - _Description:_ Practical steps for reducing hallucinations in LLM outputs. - -5. **Research Paper:** [Hallucinations in Neural Machine Translation](https://arxiv.org/abs/1710.11363) - - _Authors:_ Katherine Lee, Orhan Firat, Ashish Agarwal, Clara Fannjiang, David Sussillo - - _Abstract:_ Introduce and analyze the phenomenon of "hallucinations" in NMT, or spurious translations unrelated to source text, and propose methods to reduce its frequency. - -## Real-World Examples -1. **Example #1:** [A news site used AI to write articles. 
It was a journalistic disaster](https://www.washingtonpost.com/media/2023/01/17/cnet-ai-articles-journalism-corrections/) - - _Source:_ Washington Post - - _Description:_ A news site utilized an AI to generate articles, which led to numerous factual inaccuracies and damaged the site's credibility. - -2. **Example #2:** [AI Lawsuits Worth Watching: A Curated Guide](https://www.techpolicy.press/ai-lawsuits-worth-watching-a-curated-guide/) - - _Source:_ TechPolicy - - _Description:_ A legal firm presented fabricated legal precedents in court due to reliance on an LLM, resulting in legal and reputational consequences. - -3. **Example #3:** [AI Hallucinations: Package Risk](https://vulcan.io/blog/ai-hallucinations-package-risk) - - _Source:_ Vulcan.io - - _Description:_ Developers were misled by an AI coding assistant to use a non-existent code library, which was then exploited by attackers. - -4. **Example #4:** [Software developers want AI to give medical advice](https://www.cbsnews.com/news/software-developers-want-ai-to-give-medical-advice-how-accurate-is-it/) - - _Source:_ CBS News - - _Description:_ An AI provided false medical advice, leading to potential harm and illustrating the risks of LLM hallucinations in healthcare. - -5. **Example #5:** [A Word of Caution: Company Liable for Misrepresentations Made by Chatbot](https://mcmillan.ca/insights/a-word-of-caution-company-liable-for-misrepresentations-made-by-chatbot/) - - _Source:_ MCMillan - - _Description:_ The chatbot provided the customer with incorrect information relating to Air Canada’s bereavement travel policy. - -**Note:** This document outlines the risks and strategies for addressing dangerous hallucinations in Large Language Models, providing a foundation for further research and practical implementation. 
- diff --git a/entrydatasupport/2_0/2_0_candidates/DeepFakeThreat.md b/entrydatasupport/2_0/2_0_candidates/DeepFakeThreat.md deleted file mode 100644 index d9ecd53..0000000 --- a/entrydatasupport/2_0/2_0_candidates/DeepFakeThreat.md +++ /dev/null @@ -1,54 +0,0 @@ -# Deepfake Threat - -**Authors:** -Ken Huang, Ads - GangGreenTemperTatum - -## Dataset -- **[Deepfake Detection Challenge Dataset](https://www.kaggle.com/c/deepfake-detection-challenge/data):** A large dataset of real and fake videos to support deepfake detection research. -- **[DFDC (Deepfake Detection Challenge) Preview Dataset](https://ai.meta.com/datasets/disc21-dataset/):** A dataset provided by Facebook for developing and testing deepfake detection algorithms. -- **[UADFV (Deepfake Video Detection Dataset)](https://github.com/cuihaoleo/kaggle-dfdc):** A dataset containing deepfake videos for research on detection methods. - -## Research Papers and Relevant Research Blogs -1. **Research Paper:** [Dangers of Deepfake: What to Watch For](https://uit.stanford.edu/news/dangers-deepfake-what-watch) - - _Authors:_ Stanford - - _Abstract:_ In the past few years, artificial intelligence technology has crossed a threshold with the capability to make people look and sound like other people. - -2. **Research Paper:** [Adversarially Robust Deepfake Detection via Adversarial Feature Similarity Learning](https://arxiv.org/html/2403.08806v1) - - _Authors:_ Sarwar Khan, Jun-Cheng Chen, Wen-Hung Liao, Chu-Song Chen - - _Abstract:_ Extensive experiments on popular deepfake datasets. - -3. **Research Blog:** [Cybersecurity, Deepfakes, and the Human Risk of AI Fraud](https://www.govtech.com/security/cybersecurity-deepfakes-and-the-human-risk-of-ai-fraud) - - _Author:_ Government Technology - - _Description:_ Examines the risks associated with deepfakes in cybersecurity and provides strategies for mitigation. - -4. 
**Research Blog:** [The Rise of Deepfake Phishing Attacks](https://bufferzonesecurity.com/the-rise-of-deepfake-phishing-attacks/) - - _Author:_ BufferZone Security - - _Description:_ Discusses the increasing prevalence of deepfake phishing attacks and offers prevention tips. - -5. **Research Paper:** [Deep Learning for Deepfakes Creation and Detection: A Survey](https://arxiv.org/abs/1909.11573) - - _Authors:_ Various - - _Abstract:_ Provides an overview of current deepfake detection methods and their effectiveness. - -## Real-World Examples -1. **Example #1:** [UK COMPANY SCAMMED $25 MILLION VIA DEEPFAKES](https://www.cnn.com/2024/02/04/asia/deepfake-cfo-scam-hong-kong-intl-hnk/index.html) - - _Source:_ CNN - - _Description:_ A UK company was scammed out of $25 million using deepfake technology. - -2. **Example #2:** [‘Everyone looked real’: multinational firm’s Hong Kong office loses HK$200 million after scammers stage deepfake video meeting](https://www.scmp.com/news/hong-kong/law-and-crime/article/3250851/everyone-looked-real-multinational-firms-hong-kong-office-loses-hk200-million-after-scammers-stage) - - _Source:_ South China Morning Post - - _Description:_ Scammers used a deepfake video meeting to steal HK$200 million from a multinational firm. - -3. **Example #3:** [Crypto Projects Scammed with Deepfake AI Video of Binance Executive](https://www.bitdefender.com/blog/hotforsecurity/crypto-projects-scammed-with-deepfake-ai-video-of-binance-executive/) - - _Source:_ Bitdefender - - _Description:_ Crypto projects were scammed using a deepfake AI video of a Binance executive. - -4. **Example #4:** [Fraudsters Used AI to Mimic CEO’s Voice in Unusual Cybercrime Case](https://www.wsj.com/articles/fraudsters-use-ai-to-mimic-ceos-voice-in-unusual-cybercrime-case-11567157402) - - _Source:_ Wall Street Journal - - _Description:_ Fraudsters used AI to mimic a CEO's voice in a cybercrime case, leading to significant financial loss. - -5. 
**Example #5:** [Deepfake on the Rise](https://www.techradar.com/pro/security/deepfake-threats-are-on-the-rise-new-research-shows-worrying-rise-in-dangerous-new-scams) - - _Source:_ TechRadar - - _Description:_ Deepfake scam involving fake identities and manipulated videos. - - -**Note:** This document outlines the risks and strategies for addressing deepfake threats in digital security, providing a foundation for further research and practical implementation. diff --git a/entrydatasupport/2_0/2_0_candidates/InsecureDesign.md b/entrydatasupport/2_0/2_0_candidates/InsecureDesign.md deleted file mode 100644 index c6879b2..0000000 --- a/entrydatasupport/2_0/2_0_candidates/InsecureDesign.md +++ /dev/null @@ -1,53 +0,0 @@ -# Insecure Design - -**Author(s):** -Ads - GangGreenTemperTatum - -## Dataset -- **[AI Ethics Dataset](https://www.kaggle.com/code/alexisbcook/identifying-bias-in-ai):** Contains data on AI ethics and biases, useful for understanding the ethical implications of AI model design. -- **[Privacy Policy Dataset](https://www.kaggle.com/datasets/krist0phersmith/ai-4-privacy-pii-masking-en-38k):** A collection of privacy policies for various AI products, helpful for evaluating compliance and data usage. -- **[Safety and Fairness Evaluation](https://www.kaggle.com/datasets/strikoder/llm-evaluationhub):** Presents an enhanced dataset tailored for the evaluation and assessment of Large Language Models (LLMs). - -## Research Papers and Relevant Research Blogs -1. **Research Paper:** [An Introduction to Training LLMs Using Reinforcement Learning From Human Feedback (RLHF)](https://wandb.ai/ayush-thakur/Intro-RLAIF/reports/An-Introduction-to-Training-LLMs-Using-Reinforcement-Learning-From-Human-Feedback-RLHF---VmlldzozMzYyNjcy) - - _Authors:_ Ayush Thakur - - _Abstract:_ Introduces the concept of training large language models using reinforcement learning from human feedback. - -2. 
**Research Paper:** [Scaling Laws for Reward Model Overoptimization](https://openai.com/research/scaling-laws-for-reward-model-overoptimization) - - _Authors:_ OpenAI Team - - _Abstract:_ Discusses the implications of overoptimizing reward models in large language models. - -3. **Research Blog:** [Secure Design Principles for AI Systems](https://www.lexology.com/library/detail.aspx?g=58bc82af-3be3-49fd-b362-2365d764bf8f) - - _Author:_ Lexology - - _Description:_ Explores secure design principles to mitigate risks in AI systems. - -4. **Research Blog:** [The Role of Privacy Policies in AI Development](https://platform.openai.com/docs/models/how-we-use-your-data) - - _Author:_ OpenAI - - _Description:_ Details how privacy policies should be integrated into AI development to protect user data. - -5. **Research Paper:** [Governance of artificial intelligence: A risk and guideline-based integrative framework](https://www.sciencedirect.com/science/article/abs/pii/S0740624X22000181) - - _Authors:_ Science Direct - - _Abstract:_ The framework constitutes a comprehensive reference point for developing and implementing AI governance strategies and measures in the public sector. - -## Real-World Examples -1. **Example #1:** [AI shows clear racial bias when used for job recruiting](https://mashable.com/article/openai-chatgpt-racial-bias-in-recruiting) - - _Source:_ Mashable - - _Description:_ An AI recruiting tool exhibited biases, leading to unfair hiring practices. - -2. **Example #2:** [The rise of AI-Generated propaganda: the impact of AI and deepfakes on US elections](https://cybernews.com/editorial/impact-ai-deepfakes-us-elections/) - - _Source:_ Cyber News - - _Description:_ Data poisoning in AI models was used to sway public opinion during an election. - -3. 
**Example #3:** [Lack of AI Training Leads to Insecure Application Design](https://www.healthcareitnews.com/video/emea/use-ai-effectively-clinicians-must-learn-proper-data-hygiene) - - _Source:_ Healthcare IT News - - _Description:_ Insufficient AI training for developers led to insecure application design, exposing sensitive patient data. - -4. **Example #4:** [Company Faces Penalties for Exposing Client Data](https://www.finextra.com/newsarticle/36584/company-faces-penalties-for-exposing-client-data) - - _Source:_ Finextra - - _Description:_ A company faced penalties for exposing client data due to a lack of understanding of AI privacy policies. - -5. **Example #5:** [AI hiring tools may be filtering out the best job applicants](https://www.bbc.com/worklife/article/20240214-ai-recruiting-hiring-software-bias-discrimination) - - _Source:_ BBC - - _Description:_ Software may be excising the best candidates due to insecure design. - -**Note:** This document outlines the risks and strategies for addressing insecure design in AI systems, providing a foundation for further research and practical implementation. diff --git a/entrydatasupport/2_0/2_0_candidates/LICENSE b/entrydatasupport/2_0/2_0_candidates/LICENSE deleted file mode 100644 index 9cf1062..0000000 --- a/entrydatasupport/2_0/2_0_candidates/LICENSE +++ /dev/null @@ -1,19 +0,0 @@ -MIT License - -Permission is hereby granted, free of charge, to any person obtaining a copy -of this software and associated documentation files (the "Software"), to deal -in the Software without restriction, including without limitation the rights -to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -copies of the Software, and to permit persons to whom the Software is -furnished to do so, subject to the following conditions: - -The above copyright notice and this permission notice shall be included in all -copies or substantial portions of the Software. 
- -THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -SOFTWARE. diff --git a/entrydatasupport/2_0/2_0_candidates/README.md b/entrydatasupport/2_0/2_0_candidates/README.md deleted file mode 100644 index ea0f415..0000000 --- a/entrydatasupport/2_0/2_0_candidates/README.md +++ /dev/null @@ -1,36 +0,0 @@ -# OWASP TOP 10 for LLM AI Apps - Data Support - -This repository provides the necessary data support gathered for the project **OWASP TOP 10 for LLM AI Apps**. The data included here aims to assist developers, researchers, and security professionals in understanding and mitigating risks associated with Large Language Model (LLM) applications. - -## Data and Resources - -This repository contains various datasets, research papers, and real-world examples that support the identification and management of security risks in LLM AI applications. The resources are organized according to different security concerns and vulnerabilities relevant to LLMs. - -## Disclaimer - -Please note that running tests with datasets on LLMs may incur costs, including but not limited to computational resources and storage. Users are responsible for managing and covering any incurred costs. - -Weblinks provided in this repository were verified at the time of creation. However, they may change over time. Users should verify the validity of these links and the fonts for their specific needs. - -## Usage - -This repository is intended for use by developers, researchers, and security professionals who are working on securing LLM applications. 
It is part of an open-source project and contributions are welcome. Please follow the contribution guidelines if you wish to add to this repository. - -## Contribution Guidelines - -- Fork the repository. -- Create a new branch for your changes. -- Make your changes and commit them with clear and concise messages. -- Submit a pull request for review. - -## License - -This project is licensed under the MIT License. See the LICENSE file for details. - -## Contact - -For any questions or further information, please contact the project maintainers. - ---- - -**Note:** This project is part of an open-source initiative aimed at enhancing the security of LLM AI applications. Your contributions and feedback are highly appreciated. diff --git a/entrydatasupport/2_0/2_0_candidates/SensitiveInfoDisclosure.md b/entrydatasupport/2_0/2_0_candidates/SensitiveInfoDisclosure.md deleted file mode 100644 index cceb6c7..0000000 --- a/entrydatasupport/2_0/2_0_candidates/SensitiveInfoDisclosure.md +++ /dev/null @@ -1,53 +0,0 @@ -# Sensitive Information Disclosure - -**Authors:** -Rachel James, Bryan Nakayama - -## Dataset -- **[Enron Email Dataset](https://edrm.net/resources/data-sets/):** Contains a large number of emails, useful for testing information disclosure and data sanitization techniques. -- **[Synthetic Training Data for LLMs](https://research.google/blog/protecting-users-with-differentially-private-synthetic-training-data/):** A synthetic dataset designed for evaluating privacy-preserving techniques in LLM training. -- **[Healthcare Data for ML Models](https://apps.who.int/gho/data/node.resources):** Contains medical data, useful for testing the handling of sensitive information in LLM applications. - -## Research Papers and Relevant Research Blogs -1. 
**Research Paper:** [Preserving data privacy in machine learning systems](https://www.sciencedirect.com/science/article/pii/S0167404823005151) - - _Authors:_ Soumia Zohra El Mestari, Gabriele Lenzini, Huseyin Demirci - - _Abstract:_ Discuss current challenges and research questions that are still unsolved in the field. - -2. **Research Paper:** [Mitigating Unintended Memorization in Language Models via Alternating Teaching](https://arxiv.org/abs/2210.06772) - - _Authors:_ Zhe Liu, Xuedong Zhang, Fuchun Peng - - _Abstract:_ Propose a novel approach called alternating teaching to mitigate unintended memorization in sequential modeling. - -3. **Research Blog:** [Preventing Data Leakage in Machine Learning Models](https://shelf.io/blog/preventing-data-leakage-in-machine-learning-models/) - - _Author:_ Shelf - - _Description:_ Discusses strategies to prevent sensitive data leakage in AI models, with a focus on practical implementations. - -4. **Research Blog:** [Mitigating Token-Length Side-Channel Attacks](https://blog.cloudflare.com/ai-side-channel-attack-mitigated) - - _Author:_ Cloudflare - - _Description:_ Examines the risks and mitigation strategies for token-length side-channel attacks in AI applications. - -5. **Research Paper:** [A Survey of Privacy Attacks in Machine Learning](https://dl.acm.org/doi/full/10.1145/3624010) - - _Authors:_ Maria Rigaki, Sebastian Garcia - - _Abstract:_ Analysis of more than 45 papers related to privacy attacks against machine learning that have been published during the past seven years. - -## Real-World Examples -1. **Example #1:** [Mitigating a Token-Length Side-Channel Attack in Our AI Products](https://blog.cloudflare.com/ai-side-channel-attack-mitigated/) - - _Source:_ Cloudflare - - _Description:_ A real-world example of mitigating a token-length side-channel attack in AI products. - -2. 
**Example #2:** [ChatGPT Leaks Sensitive User Data, OpenAI Suspects Hack](https://www.spiceworks.com/tech/artificial-intelligence/news/chatgpt-leaks-sensitive-user-data-openai-suspects-hack/) - - _Source:_ Spice Works - - _Description:_ The leaks exposed conversations, personal data, and login credentials. - -3. **Example #3:** [Amazon’s Q has ‘severe hallucinations’ and leaks confidential data in public preview](https://www.platformer.news/amazons-q-has-severe-hallucinations/) - - _Source:_ Plataformer - - _Description:_ A chatbot revealed confidential information in a public response due to inadequate data sanitization. - -4. **Example #4:** [Atlantic Health System CIDO offers lessons on AI in cybersecurity](https://www.healthcareitnews.com/news/atlantic-health-system-cido-offers-lessons-ai-cybersecurity) - - _Source:_ Healthcare IT News - - _Description:_ An AI model used in healthcare leaked patient information, highlighting the risks of sensitive data exposure. - -5. **Example #5:** [Side-Channel Attack Exposes AI Model Outputs](https://www.securityweek.com/how-quantum-computing-will-impact-cybersecurity/) - - _Source:_ SecurityWeek - - _Description:_ A side-channel attack of an AI model, demonstrating the need for robust security measures. - -**Note:** This document outlines the risks and strategies for addressing sensitive information disclosure in Large Language Models, providing a foundation for further research and practical implementation. diff --git a/entrydatasupport/2_0/2_0_candidates/SupplyChainV.md b/entrydatasupport/2_0/2_0_candidates/SupplyChainV.md deleted file mode 100644 index 3a2170b..0000000 --- a/entrydatasupport/2_0/2_0_candidates/SupplyChainV.md +++ /dev/null @@ -1,50 +0,0 @@ -# Supply-Chain Vulnerabilities in Large Language Models (LLMs) - -## Dataset -- **[Hugging Face Model Hub](https://huggingface.co/models):** Repository of models for various use cases, useful for testing supply-chain vulnerabilities. 
-- **[OpenAI Model Gymnasium](https://github.com/Farama-Foundation/Gymnasium):** Datasets and models provided by OpenAI for evaluating the integrity of models. -- **[TensorFlow Hub](https://www.tensorflow.org/hub):** A repository of pre-trained models, helpful for analyzing vulnerabilities in the supply chain. - -## Research Papers and Relevant Research Blogs -1. **Research Paper:** [Compromised PyTorch-nightly Dependency Chain](https://pytorch.org/blog/compromised-nightly-dependency) - - _Authors:_ PyTorch Team - - _Abstract:_ Discusses a real-world example of a compromised dependency chain in the PyTorch nightly builds. - -2. **Research Paper:** [PoisonGPT: How we hid a lobotomized LLM on Hugging Face to spread fake news](https://blog.mithrilsecurity.io/poisongpt-how-we-hid-a-lobotomized-llm-on-hugging-face-to-spread-fake-news) - - _Authors:_ Mithril Security Team - - _Abstract:_ Explores the method and impact of hiding a lobotomized LLM to spread misinformation. - -3. **Research Blog:** [LLM Applications Supply Chain Threat Model](https://github.com/jsotiro/ThreatModels/blob/main/LLM%20Threats-LLM%20Supply%20Chain.png) - - _Author:_ John Sotiropoulos - - _Description:_ Provides a threat model for understanding supply chain vulnerabilities in LLM applications. - -4. **Research Blog:** [Hijacking Safetensors Conversion on Hugging Face](https://hiddenlayer.com/research/silent-sabotage/) - - _Author:_ HiddenLayer Team - - _Description:_ Examines an attack on Safetensors conversion to inject vulnerabilities. - -5. **Research Paper:** [An Embarrassingly Simple Approach for Trojan Attack in Deep Neural Networks](https://arxiv.org/abs/2006.08131) - - _Authors:_ Ruixiang Tang, Mengnan Du, Ninghao Liu, Fan Yang, Xia Hu - - _Abstract:_ Investigates a simple approach for Trojan attacks in deep neural networks and its implications for LLMs. - -## Real-World Examples -1. 
**Example #1:** [ChatGPT Data Breach Confirmed as Security Firm Warns of Vulnerable Component Exploitation](https://www.securityweek.com/chatgpt-data-breach-confirmed-as-security-firm-warns-of-vulnerable-component-exploitation/) - - _Source:_ SecurityWeek - - _Description:_ A confirmed data breach involving ChatGPT due to vulnerable component exploitation. - -2. **Example #2:** [Compromised PyTorch-nightly Dependency Chain](https://pytorch.org/blog/compromised-nightly-dependency) - - _Source:_ PyTorch Blog - - _Description:_ A real-world example of a compromised dependency chain in the PyTorch nightly builds. - -3. **Example #3:** [PoisonGPT: How we hid a lobotomized LLM on Hugging Face to spread fake news](https://blog.mithrilsecurity.io/poisongpt-how-we-hid-a-lobotomized-llm-on-hugging-face-to-spread-fake-news) - - _Source:_ Mithril Security - - _Description:_ A demonstration of hiding a lobotomized LLM on Hugging Face to spread fake news. - -4. **Example #4:** [Large Language Models On-Device with MediaPipe and TensorFlow Lite](https://developers.googleblog.com/en/large-language-models-on-device-with-mediapipe-and-tensorflow-lite/) - - _Source:_ Google Developers Blog - - _Description:_ Discusses the implementation and risks of on-device LLMs with MediaPipe and TensorFlow Lite. - -5. **Example #5:** [Hijacking Safetensors Conversion on Hugging Face](https://hiddenlayer.com/research/silent-sabotage/) - - _Source:_ HiddenLayer - - _Description:_ An attack on Safetensors conversion to inject vulnerabilities into LLMs. - -**Note:** This document outlines the risks and strategies for addressing supply-chain vulnerabilities in Large Language Models, providing a foundation for further research and practical implementation. 
diff --git a/entrydatasupport/2_0/2_0_candidates/UnwantedAIActions.md b/entrydatasupport/2_0/2_0_candidates/UnwantedAIActions.md deleted file mode 100644 index 6f810a6..0000000 --- a/entrydatasupport/2_0/2_0_candidates/UnwantedAIActions.md +++ /dev/null @@ -1,53 +0,0 @@ -# Unwanted AI Actions by General Purpose LLMs - -**Author(s):** -Markus Hupfauer - -## Dataset -- **[Health Insurance Dataset](https://www.kaggle.com/datasets/hhs/health-insurance-marketplace):** Useful for testing LLMs in the context of health insurance to ensure they do not provide unauthorized advice. -- **[Financial Services Dataset](https://www.kaggle.com/datasets/ealaxi/paysim1):** Contains financial data, helpful for validating that LLMs do not offer unauthorized financial advice. -- **[Customer Service Dataset](https://www.kaggle.com/datasets/teejmahal20/airline-passenger-satisfaction):** A dataset of customer service interactions to ensure LLMs maintain neutrality and do not recommend competitors. - -## Research Papers and Relevant Research Blogs -1. **Research Paper:** [Ensuring Legal Compliance in AI Systems](https://www.traverselegal.com/blog/ai-data-privacy-compliance/) - - _Authors:_ Various - - _Abstract:_ Discusses methods for ensuring AI systems remain compliant with legal regulations, preventing unwanted AI actions. - -2. **Research Paper:** [AI RISK MANAGEMENT FRAMEWORK](https://www.nist.gov/itl/ai-risk-management-framework) - - _Authors:_ NIST - - _Abstract:_ Explores governance protocols and risk management strategies for deploying AI systems in sensitive environments. - -3. **Research Blog:** [AI Regulation Has Its Own Alignment Problem](https://hai.stanford.edu/policy-brief-ai-regulatory-alignment-problem) - - _Authors:_ Guha, Neel; Lawrence, Christie M., et al. - - _Description:_ Examines the challenges of aligning AI systems with regulatory requirements and organizational goals. - -4. 
**Research Blog:** [Building Robust AI Systems](https://dgallitelli95.medium.com/building-robust-ai-systems-with-dspy-and-amazon-bedrock-d0376f158d88) - - _Author:_ Medium - - _Description:_ Provides practical advice on building robust AI systems that avoid unintended actions and legal pitfalls. - -5. **Research Blog:** [AI in Customer Service: Balancing Privacy and Innovation](https://dialzara.com/blog/ai-customer-service-balancing-privacy-and-innovation/) - - _Authors:_ Dial Zara - - _Abstract:_ Analyzes the use of AI in customer service, focusing on maintaining privacy and avoiding unwanted actions. - -## Real-World Examples -1. **Example #1:** [Liability for harm caused by AI in healthcare](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10755877/) - - _Source:_ NLM (PubMed Central) - - _Description:_ An AI system unlawfully provided healthcare advice, leading to legal action and fines. - -2. **Example #2:** [Google’s A.I. Search Errors Cause a Furor Online](https://www.nytimes.com/2024/05/24/technology/google-ai-overview-search.html) - - _Source:_ NY Times - - _Description:_ The company’s latest A.I. search feature has erroneously told users to eat glue and rocks, provoking a backlash among users. - -3. **Example #3:** [Financial AI System Breaches Legal Requirements](https://www.finextra.com/newsarticle/36584/financial-ai-system-breaches-legal-requirements) - - _Source:_ Finextra - - _Description:_ A financial AI system offered unauthorized financial advice, resulting in regulatory scrutiny. - -4. **Example #4:** [Airline held liable](https://www.bbc.com/travel/article/20240222-air-canada-chatbot-misinformation-what-travellers-should-know) - - _Source:_ BBC News - - _Description:_ An AI used in customer service led to legal issues due to inappropriate recommendations. - -5.
**Example #5:** [AI tools in healthcare sector rife with legal and ethical risk](https://www.lexpert.ca/special-editions/technology/ai-tools-in-healthcare-sector-rife-with-legal-and-ethical-risk-that-must-be-mitigated/386725) - - _Source:_ Lexpert - - _Description:_ A healthcare AI system overstepped legal boundaries, providing unauthorized medical advice. - -**Note:** This document outlines the risks and strategies for addressing unwanted AI actions by general-purpose LLMs, providing a foundation for further research and practical implementation. diff --git a/entrydatasupport/2_0/2_0_candidates/VulnerableAutonomousAgents.md b/entrydatasupport/2_0/2_0_candidates/VulnerableAutonomousAgents.md deleted file mode 100644 index b8f70bc..0000000 --- a/entrydatasupport/2_0/2_0_candidates/VulnerableAutonomousAgents.md +++ /dev/null @@ -1,51 +0,0 @@ -# Vulnerable Autonomous Agents - -**Author(s):** -John Sotiropoulos - -## Dataset -- **[LangChain Autonomous Agents](https://js.langchain.com/v0.1/docs/use_cases/autonomous_agents/):** Resources for building and testing autonomous agents. -- **[LLM Models on Mobile](https://developers.googleblog.com/en/large-language-models-on-device-with-mediapipe-and-tensorflow-lite/):** Datasets and models for testing on-device LLMs. - -## Research Papers and Relevant Research Blogs -1. **Research Paper:** [TrustAgent: Ensuring Safe and Trustworthy LLM-based Agents](https://openreview.net/forum?id=zozQq4UWN3) - - _Authors:_ Various - - _Abstract:_ Discusses methods for ensuring the safety and trustworthiness of LLM-based agents. - -2. **Research Paper:** [Integrating LLM and Reinforcement Learning for Cybersecurity](https://arxiv.org/abs/2403.17674) - - _Authors:_ Various - - _Abstract:_ Explores integrating LLM and reinforcement learning to enhance cybersecurity measures for autonomous agents. - -3. 
**Research Blog:** [Here Come the AI Worms](https://www.wired.com/story/here-come-the-ai-worms/) - - _Author:_ Wired - - _Description:_ Examines the risks of AI worms and malware spreading through interconnected AI agents. - -4. **Research Blog:** [Building a Zero Trust Security Model for Autonomous Systems](https://spectrum.ieee.org/zero-trust-security-autonomous-systems) - - _Author:_ IEEE Spectrum - - _Description:_ Explores the implementation of zero trust security models for autonomous systems to mitigate risks. - -5. **Research Paper:** [AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks](https://arxiv.org/abs/2403.04783) - - _Authors:_ Various - - _Abstract:_ Discusses multi-agent defense mechanisms to protect against jailbreak attacks in LLM-based systems. - -## Real-World Examples -1. **Example #1:** [Here Come the AI Worms](https://www.wired.com/story/here-come-the-ai-worms/) - - _Source:_ Wired - - _Description:_ Researchers created the AI worm Morris II, which infects generative AI ecosystems by embedding itself in AI-assisted email applications. - -2. **Example #2:** [We Need to Control AI Agents Now](https://www.theatlantic.com/technology/archive/2024/07/ai-agents-safety-risks/678864/) - - _Source:_ The Atlantic - - _Description:_ Automated bots are about to be everywhere, with potentially devastating consequences. - -3. **Example #3:** [National Power Grid Compromise](https://www.gov.uk/government/publications/frontier-ai-capabilities-and-risks-discussion-paper) - - _Source:_ UK Government - - _Description:_ Discusses the potential for adversarial agents to manipulate power grid data, causing widespread outages. - -4. **Example #4:** [Personal Assistant Tampering](https://www.enisa.europa.eu/publications/considerations-in-autonomous-agents) - - _Source:_ ENISA - - _Description:_ An attacker exploits a misconfiguration in a mobile health assistant, leading to personal harm and data exfiltration. - -5. 
**Example #5:** [Military Drone Kills Operator Attempting to Abort Operation](https://news.sky.com/story/ai-drone-kills-human-operator-during-simulation-which-us-air-force-says-didnt-take-place-12894929) - - _Source:_ Sky News - - _Description:_ A military drone kills its operator in a simulation, highlighting the need for robust safeguards in autonomous systems. - diff --git a/entrydatasupport/2_0/2_0_candidates/promptinjection.md b/entrydatasupport/2_0/2_0_candidates/promptinjection.md deleted file mode 100644 index 0cd26cd..0000000 --- a/entrydatasupport/2_0/2_0_candidates/promptinjection.md +++ /dev/null @@ -1,53 +0,0 @@ -# LLM01: Prompt Injection - -**Author(s):** -Rachel James, Bryan (also combined with things from AdsDawson_AdversarialInputs) - -## Dataset -- **[DeBERTa v3 Base - Prompt Injection v2](https://huggingface.co/protectai/deberta-v3-base-prompt-injection-v2):** This dataset, hosted by Protect AI on Hugging Face, is designed to train and evaluate language models for robustness against prompt injection attacks. It includes a variety of prompts specifically curated to test for vulnerabilities related to both direct and indirect prompt injections. Researchers can utilize this dataset to enhance the security and robustness of large language models. - -## Research Papers and Relevant Research Blogs -1. **Research Paper:** [Adversarial Attacks on Machine Learning Models with Multiple Oracles](https://www.ijcai.org/proceedings/2019/0925.pdf) - - _Authors:_ K Wang - - _Abstract:_ Discusses methods for adversarial attacks using multiple oracles. - -2. **Research Blog:** [The ELI5 Guide to Prompt Injection](https://www.lakera.ai/blog/guide-to-prompt-injection) - - _Author:_ Lakera, R. (2023) - - _Description:_ A simplified guide to understanding prompt injection attacks. - -3. 
**Research Blog:** [GenAI Security Framework Blog Series 2/6: Prompt Injection 101](https://live.paloaltonetworks.com/t5/community-blogs/genai-security-framework-blog-series-2-6-prompt-injection-101/ba-p/590862) - - _Author:_ Palo Alto Networks. (2023) - - _Description:_ Introduction to prompt injection attacks and their implications. - -4. **Research Paper:** [Benchmarking and Defending Against Indirect Prompt Injection Attacks](https://arxiv.org/html/2312.14197v2) - - _Authors:_ Jingwei Yi, Yueqi Xie, Bin Zhu, Emre Kıcıman, Guangzhong Sun, Xing Xie, Fangzhao Wu - - _Abstract:_ Explores benchmarking and defensive strategies against indirect prompt injection attacks. - -5. **Research Paper:** [An Early Categorization of Prompt Injection Attacks on Large Language Models](https://arxiv.org/abs/2402.00898) - - _Authors:_ Sippo Rossi, Alisia Marianne Michel, Raghava Rao Mukkamala, Jason Bennett Thatcher - - _Abstract:_ Provides an early categorization of different types of prompt injection attacks on large language models. - -## Real-World Examples -1. **Example #1:** [Twitter bot hijack (2022)](https://incidentdatabase.ai/cite/352/) - - _Source:_ Incident report - - _Description:_ Incident involving the hijacking of a Twitter bot through prompt injection. - -2. **Example #2:** [Bing Chat manipulation (2023)](https://www.theverge.com/2023/2/15/23599072/microsoft-ai-bing-personality-conversations-spy-employees-webcams) - - _Source:_ The Verge - - _Description:_ Manipulation of Bing Chat through prompt injection to spy on employees. - -3. **Example #3:** [Grandma Exploit jailbreak](https://www.reddit.com/r/ChatGPT/comments/12sn0kk/grandma_exploit/?rdt=63684) - - _Source:_ Reddit discussion - - _Description:_ A jailbreak exploit allowing manipulation of a chatbot through prompt injection. - -4. 
**Example #4:** ["Haha pwned" demonstration](https://simonwillison.net/2022/Sep/12/prompt-injection/) - - _Source:_ Simon Willison's blog - - _Description:_ Demonstration of a prompt injection attack resulting in "Haha pwned" output. - -5. **Example #5:** [Cross-site scripting (XSS) in AI-powered web applications](https://www.cobalt.io/blog/prompt-injection-attacks) - - _Source:_ Cobalt blog - - _Description:_ Examination of cross-site scripting vulnerabilities in AI-powered web applications. - -6. **Example #6:** [Bypassing hate speech detection](https://www.technologyreview.com/2021/06/04/1025742/ai-hate-speech-moderation/) - - _Source:_ Technology Review - - _Description:_ Bypassing AI-based hate speech detection through prompt manipulation.