Enabling government data to be freely shared and accessed can expedite research and innovation in high-value disciplines, create opportunities for economic development, increase citizen participation in government, and inform decision-making in both public and private sectors.
Though federal agencies and policymakers alike support the idea of safely opening their data both to other agencies and to the research community, a substantial fraction of the U.S. federal government’s safely shareable data is not being shared.
The Biden Administration needs to designate open government data as a 2021 Cross-Agency Priority (CAP) Goal in the President’s Management Agenda (PMA). This goal should revitalize the 2018 CAP Goal, Leveraging Data as a Strategic Asset, to improve upon the 2020 U.S. Federal Data Strategy (FDS) and emphasize that open data is a priority for the U.S. Government.
Unless open data is elevated to a top priority in the President’s Management Agenda, the U.S. risks falling behind internationally. Many nations have surged ahead in building smart, prosperous AI-driven societies while the U.S. has failed to unlock its data. If the Biden Administration wants the U.S. to prevail as an international superpower and a global beacon of democracy, it must revitalize its waning open data efforts.
The COVID-19 pandemic took 592,776 American lives as of June 2, 2021.<fn-sp>1<fn-sp> This grim number would have been higher if not for the continuous stream of data on infection, mortality, and spread released by the Department of Health and Human Services (HHS) during the height of the pandemic.<fn-sp>2<fn-sp> This government data, made freely available and easily accessible, empowered data scientists to produce public-focused models, analyses, and predictive analytics, which accelerated scientific and public health insights and shortened the time it took for COVID-19 information to save American lives. Opening this data was essential to the U.S. pandemic response, and in retrospect, every day this stream of data was delayed, American lives were lost. As a country, we cannot afford to wait for the next crisis to ensure that our data is ready to be used to improve decision-making.
Open data, as defined by the Government Accountability Office (GAO), is data in a standardized format that is free to use, modify, and share.<fn-sp>3<fn-sp> Open Government Data, as defined by the Foundations for Evidence-Based Policymaking Act, are public data assets created by, collected by, under the control or direction of, or maintained by a federal agency that are machine-readable and available on a comprehensive data inventory in standardized, non-proprietary formats.<fn-sp>4<fn-sp> Open government data generates many public benefits, including increasing citizen participation in government,<fn-sp>5<fn-sp> spurring research and innovation,<fn-sp>6<fn-sp> creating opportunities for economic development,<fn-sp-extra-space>7<fn-sp-extra-space> and informing decision-making in both the private and public sectors.<fn-sp>8<fn-sp> Government-facilitated open data has benefited critical decision-making in communities torn apart by natural disasters<fn-sp-extra-space>9<fn-sp-extra-space> as well as, most recently, communities responding to the COVID-19 pandemic.<fn-sp>10<fn-sp>
Policymakers have shown consistent support over the years for open government data.<fn-sp>11<fn-sp> Figure 1 gives a brief timeline of some of the recent statutes and guidance documents, including major successes like the creation of data.gov and the appointment of a Chief Data Officer (CDO) in every agency. These advancements were direct results of the Executive Branch’s and Congress’s designation of open data as a priority for the U.S. government.
Despite these positive strides, there remains more work to be done. As part of the FDS 2020 Action Plan, agencies were asked to make data governance materials publicly available by January 31, 2020.<fn-sp>12<fn-sp> Over half of the agencies did not post.<fn-sp>13<fn-sp> Among the agencies that did post, 2,258 datasets remain non-public as of Q4 2020.<fn-sp>14<fn-sp> This means a sizable amount of government data that is legal to share with trusted non-government researchers is not being shared.<fn-sp>15<fn-sp> To get this non-public government data into the hands of researchers, government personnel need to address the various challenges that prevent agencies from opening their data.<fn-sp>16<fn-sp> Policymakers and practitioners alike agree on the nature of the challenges but struggle to put effective solutions into practice. Implementing a whole-of-government approach to open data will require interagency and interdisciplinary stakeholders to contribute to the design of a process that works not only for the individual agency but, more importantly, for the U.S. government as a whole.
Leadership in the U.S. government must act on opening government data because artificial intelligence (AI), a technology powered by data, will be key to the U.S. government’s national success in the 21st century.<fn-sp>17<fn-sp> The insights that can be derived from federal data have the potential to supply national actors with new information to make data-driven decisions that can drive American progress and competitiveness across multiple industries. Each day that government data remains inaccessible to researchers, American entities fall further behind internationally and potential scientific insights are deferred, leaving people worse off than they need to be.
Due to the government’s position as the largest and most important holder of data, our ability to build a smart, successful AI-driven society is dependent on the capacity to open our data as soon as possible.
The Biden Administration, with the support of the Deputy Director for Management (DDM) at the Office of Management and Budget (OMB), should explicitly emphasize that open government data is a top Administration priority. It can do this by designating open government data as a 2021 CAP Goal in the PMA. In accordance with President Biden’s desire to refresh and reinvigorate our national science and technology strategy,<fn-sp>18<fn-sp> the 2021 CAP Goal should revitalize the 2018 PMA CAP Goal, Leveraging Data as a Strategic Asset, to improve upon the 2020 FDS.
Currently, no government official has both the mandate to champion the 2021 CAP Goal and the authority to execute it. Upon filling the vacant U.S. Chief Technology Officer (CTO) seat in the Office of Science and Technology Policy (OSTP), the U.S. CTO should direct a Deputy CTO to focus solely on fulfilling the 2021 CAP Goal. The Deputy CTO should be a joint appointment with OMB.
Congressional and Executive leadership alike can support the Deputy CTO in fulfilling the 2021 CAP Goal by emphasizing that open data is a priority for the U.S. at all levels of government. Executive leadership can prioritize opening data that is in demand by national actors to restore America’s global standing. Legislative leadership can support the innovation economy and create new jobs by opening existing federal government data and mandating the creation of new data.
As part of the Deputy CTO’s strategy for fulfilling this CAP Goal, the Deputy CTO should address the following challenges that prevent many federal agencies from opening their data. The challenges outlined below have been sourced and synthesized from conversations conducted by the Plaintext Group with employees from various federal agencies.<fn-sp>19<fn-sp>
There are no explicit statutory appropriations to support and fund the work of an agency’s CDO<fn-sp-extra-space>20<fn-sp-extra-space> or the additional technical staff needed to open high-value datasets. The statutes and guidance documents mentioned in Figure 1 signal support from policymakers, but many are unfunded mandates, leaving agencies responsible for finding funding.
The public sector talent pipeline is in crisis: the need for talented public servants has skyrocketed,<fn-sp-extra-space>21<fn-sp-extra-space> and the government’s personnel systems are not currently designed to build, support, and promote a data workforce.<fn-sp>22<fn-sp> Many data teams lack technical expertise, full-time staff,<fn-sp>23<fn-sp> and continued training.
The Open, Public, Electronic and Necessary Government Data Act of 2018 (OPEN Government Data Act) required the OMB to issue guidance by July of 2019 for agencies to implement comprehensive data inventories, but this guidance has yet to be released.<fn-sp>28<fn-sp> This failure has limited agencies’ progress in implementing their requirements under the act.<fn-sp>29<fn-sp>
Mandates have come with myriad action items, each tied to milestones and target timelines. Agencies prioritize their data assets as they relate to their mission statements but lack a concrete incentive structure, and the corresponding motivation, to post, update, and maintain those assets.
The absence of interagency collaboration opportunities and public-private partnerships limits agencies’ ability to envision use cases for federal open data beyond its current uses.<fn-sp>34<fn-sp> Agencies may not know the value of their data to other stakeholders because they may not regularly communicate data needs or inventories beyond their own agency.
The quality of data in the possession of federal agencies varies widely.<fn-sp>37<fn-sp> Quality in this context refers to the sophistication of the data’s format and structure. Some data is clean and structured as machine-readable information, accessible in databases prepared for AI applications. Other data is formatted as PDF photos of handwritten notes hosted in folders on desktops. Many data assets include inaccurately labeled, incomplete, or missing data. Unsophisticated and messy data is useless.<fn-sp>38<fn-sp>
Many of federal agencies’ data assets contain personally identifiable information (PII)<fn-sp>41<fn-sp> and sensitive data. As opening this data risks disclosing PII,<fn-sp>42<fn-sp> privacy-enhancing techniques are necessary. However, many agencies do not have the necessary expertise or guidance to implement effective privacy-enhancing techniques.
Trust in government infrastructures is low. As recent high-profile hacks<fn-sp-extra-space>48<fn-sp-extra-space> have highlighted, government technology infrastructure is outdated and in need of major upgrades.<fn-sp>49<fn-sp> Agencies’ risk management strategies take this reality into account and often determine that the cybersecurity risk of opening government data is not worth the reward.
Agencies are more likely to be able to respond to requests for expanding access to data if government and private sector experts can identify specific datasets and high-impact use cases that would be enabled if this data were made available.<fn-sp>53<fn-sp>
OMB has statutory authority over many of the current open-data-related statutes and guidance documents but is primarily government-facing. OSTP lacks statutory authority but has more freedom to access external technologists for sourcing implementation expertise. Therefore, a joint appointment would place the Deputy CTO in the best position to successfully coordinate and execute the U.S. FDS.
No. Many of the challenges listed above are interconnected, and this is not an exhaustive list. Addressing one challenge effectively may require drawing on solutions to several others.