Unpacking NIST's New Draft Guidance on Managing Misuse Risk of Dual-Use Foundation Models

To meet its 270-day deadlines pursuant to U.S. President Joe Biden’s Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence, the U.S. Department of Commerce recently published a tranche of documents. As part of that release, the National Institute of Standards and Technology (NIST) issued detailed guidance on AI safety, risk management, and responsible development and deployment.

AI safety has been a priority theme amongst policymakers worldwide. This has been demonstrated by activity including: the G7’s efforts to develop International Guiding Principles and a Code of Conduct for Advanced AI Systems; European Union policymakers work to develop a code of practice to address general purpose AI models with systemic risk under the EU AI Act; the global AI Safety Summits in the United Kingdom and South Korea; and many taskings aimed at managing the risk associated with “dual-use foundation models” in the U.S. AI EO.

At the same time, there have been challenges with meaningfully advancing AI safety work because there is not a consistent understanding of risks, which continue to evolve. Notably, on the 270-day deadline, the NIST U.S. AI Safety Institute released for public comment, its first-ever guidance document, Managing Misuse Risks of Dual-Use Foundation Models, fulfilling task 4.1 of the Executive Order. At the outset, the document walks through a series of considerations that make it difficult to map and measure misuse risks. Still, the new draft voluntary guidance document has the potential to be a meaningful contribution to the discourse if it can further hone and/or develop critical areas.

The guidance document lays out seven objectives that organizations should seek to achieve to help prevent deliberate misuse of their systems, and includes practices, recommendations, and documentation that an organization can leverage to help with implementation. It aims to be consistent with the functions outlined in the NIST AI Risk Management Framework.

We appreciate that the AI EO tasked the U.S. government with preparing this guidance and that the U.S. AI Safety Institute is seeking feedback from stakeholders on ways it can improve the document. With that in mind, there are several areas in the guidance document that would benefit from further detail or clarification, and that we look forward to working with NIST and the U.S. AI Safety Institute as it finalizes the guidance document.

  • Technical red-teaming guidance. The document lays out measuring misuse risk as one of the topline objectives that organizations should seek to achieve. As a part of that, NIST explains that organizations should use red teams to evaluate whether malicious actors might be able to get past AI system safeguards. While we agree that red-teaming can play a critical role in assessing a system for flaws, we also encourage NIST to develop more technical red-teaming guidance for dual-use foundation models in order to make such activity practically implementable in a consistent way.

  • Technical guidance for measuring capability and/or risk. The guidance proposes activities to manage risk of model theft, including for example, combining estimates related to “capabilities of concern” with the “probability of model theft.” However, it is not clear how an organization would reasonably develop those metrics. Similarly, is there a consistent way in which an organization might go about measuring capabilities of concern? Technical guidance for such measurements would be useful for organizations seeking to practically implement these sorts of activities.

  • Considerations related to various actors’ roles, responsibilities, and capabilities in the value chain. The guidance initially recognizes that actors along the AI value chain and throughout the lifecycle play a role in managing AI risk. That said, there are certain activities proposed throughout the document that seem to misunderstand or mischaracterize the level of control that a foundation model developer has over its downstream deployers, and/or the level of insight it will have into potential downstream use cases. Further clarifying those activities where responsibility should be shared would be helpful.

  • More clearly reflect that depending on the objectives of transparency and disclosure, the level of detail that an organization share may differ. We appreciate that the guidance document reflects the important role that transparency plays in fostering trust and communicating information downstream. That being said, the guidance should further clarify what parties organizations should consider disclosing information to, and the ways in which that information might differ depending on the audience.

While there is certainly more work to be done, 91pro视频 looks forward to continuing to collaborate with NIST and the AI Safety Institute to shape these guidelines in a way that will be valuable to all organizations seeking to manage misuse risk.

Tags: Artificial Intelligence

Related