Open Source Software Health and Sustainability

Professor Goggins is a founding member of the Linux Foundation’s CHAOSS project, a current co-director, and a maintainer for the open source software project Augur: https://github.com/chaoss/augur. The most recent updates for Goggins and Germonprez on this work are noted in the remainder of this post.

Significant advancements have been made in the development of data and tools that are integral to the core of the CHAOSS community. Most notably, our badging program has now badged 48 distinct events[1], up from 7 in the previous reporting period, and a new working group focused on “metrics models”[2] has released its first functional prototype[3]. Metrics models are collections of CHAOSS metrics that organizations often use to understand complex phenomena such as newcomer retention, community welcomingness, and issue handling[4]. This new working group is also actively developing toolkits that help open source communities gather data, such as DEI information, through a systematic process.

Through support from Google Summer of Code for the third consecutive year, Augur implemented three risk metrics that are proving important in the wake of vulnerabilities discovered in the open source dependency chains that operate gasoline distribution on the East Coast in summer 2021, and a Log4j defect identified early in 2022[5]. Our participating organizations have increased their focus in this area as a result of national attention on the role of open source software in the cybersecurity landscape. Fourteen Augur instances are being operated with a significant focus on Augur’s implementation of the CHAOSS Libyear and upstream dependencies metrics, as well as Augur’s implementation of the OSSF Scorecard[6].
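As a rough illustration of what the Libyear metric captures, the sketch below sums, across a project's dependencies, the time in years between the release date of the version in use and the release date of the newest available version. This is a minimal sketch and not Augur's implementation: the function names, the 365.25-day year, and the two example dependencies are assumptions made for illustration only.

```python
from datetime import date

# Assumption: average year length used to convert days to years.
DAYS_PER_YEAR = 365.25


def libyear(used_release: date, latest_release: date) -> float:
    """Years between the release in use and the newest release (never negative)."""
    lag_days = (latest_release - used_release).days
    return max(lag_days, 0) / DAYS_PER_YEAR


def project_libyears(dependencies) -> float:
    """Sum the per-dependency lag over (name, used_release, latest_release) tuples."""
    return sum(libyear(used, latest) for _name, used, latest in dependencies)


# Hypothetical dependencies; names and dates are invented for this example.
deps = [
    ("example-lib-a", date(2020, 1, 1), date(2022, 1, 1)),  # about two years behind
    ("example-lib-b", date(2021, 6, 1), date(2021, 6, 1)),  # fully up to date
]

total = project_libyears(deps)
print(f"Total dependency lag: {total:.2f} libyears")
```

A larger total suggests that a project has drifted further from the current releases of its dependencies, which is one reason this kind of measure is useful as a risk signal.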

The following links represent the development of the CHAOSS community, which is focused specifically on open source project health and sustainability. These items represent our top-level work, in which a broad range of community members and other open source contributors now participate. They focus on the development of tools and the consumption of those tools by community members.


During the past year we initiated ongoing partnerships with Red Hat Software, the “All-In” program at GitHub, and the Research Software Alliance (ReSA), and we continue to seek outlets to fund our collaboration with Karthik Ram and James Howison in the scientific open-source software ecosystem mapping project area. Notable accomplishments during this reporting period include:

  1. A Diversity, Equity, and Inclusion survey in collaboration with the Linux Foundation and GitHub’s All-In initiative[1]. The Linux Foundation issued a preliminary report in fall 2021[2],[3], and Goggins, Germonprez, and Anita Sarma are writing a more comprehensive academic paper using methods designed to work with data where key populations have comparatively low numbers (i.e., the limited diversity in open source software requires specific statistical methods for valid analysis).
  2. Collaboration with Red Hat Software ($100,000) on a qualitative interview study aimed at understanding how the definition of “sustainability” is evolving in open source software[4].
  3. Collected comprehensive, updated Augur data sets for 2,030 CZI-sponsored repositories, 246 repositories used in a Howison paper on the impact of NSF funding on scientific open source software, 1,022 academic and scientific repositories identified through our collaboration with Harvard University, 113 Linux Foundation CNCF repositories, 3,540 VMware repositories, ~17,000 Red Hat Software repositories, and all repositories from the 2021 and 2022 Linux Foundation “Open Source Census”. During the next year we will develop a series of academic papers and blog posts highlighting ecosystems and the contrasts among them identified using Augur’s machine learning capabilities.

We have worked with scientific software, critical infrastructure, and safety-critical systems during this reporting period. Our work to bridge into more extensive open-source scientific software research with Karthik Ram and James Howison is ongoing, and our principal accomplishments this year include:

  1. Revision of a paper focused on social capital as a mechanism for community spanning in open source software, including technical contrasts.
  2. Active collaboration on proposals to extend our scientific open source software work with Karthik Ram, James Howison, and Michelle Barker.
  3. Two collaborative proposals to the Wellcome Trust with ReSA and FAIR4RS.
  4. Co-organized a conference focused on centering open source around Diversity, Equity, and Inclusion as a pathway to professionalizing research software engineering with Michelle Barker of ReSA, Anna-Lena Lamprecht of Utrecht University, and Mozhgan Chimeh of NVIDIA[5].
  5. We are partnering with Stephen Jacobs of the Rochester Institute of Technology, in work that began in early 2021, to develop university OSPO-focused CHAOSS metrics.
  6. We also started working with Atul Pokharel at New York University in early, though his personal obligations presently have us in a wait-and-see mode.
  7. In early 2022 we started a conversation with Lou Woodley and Katie Pratt at the Center for Scientific Collaboration and Community Engagement.


Long Term Sustainability

At a fundamental level, CHAOSS is becoming a sustainable project by iteratively defining open-source health and sustainability metrics and adapting to the changing needs of newcomers. Responsiveness and reputation at the five-year mark of this project’s life play a significant role in its sustainability. Our networks with related projects through the Sloan Foundation are further assisting the development of CHAOSS metrics and, in the past year, CHAOSS metrics models, as an important cornerstone for advancing other open-source projects. As we enter our sixth year as a Linux Foundation project, we are embarking on four interconnected foci: (1) the CHAOSS project and its core team, (2) managing our software ecosystem as an interconnected whole instead of two distinct projects, (3) development of metrics models, and (4) engagement with partners aligned with the core of the CHAOSS project.

Hiring our community manager, Elizabeth Barron, has had a significant positive impact over her nearly two years of leadership. Elizabeth’s deep knowledge of open-source software remains instrumental for making connections that show promise for building a sustainable model for CHAOSS. Strategically, our goal is to support Elizabeth as the CHAOSS community manager. We believe that, at this point, members have a clear understanding of the importance of her role. We continue to seek longer-term external financial support for the community manager role through direct discussions with prominent CHAOSS community members. These conversations include the Schmidt Foundation and the National Science Foundation.

Managing our software ecosystem in a way where Grimoirelab and Augur collaborate to build community is a new focus brought by Goggins taking on a project Co-Director role during the past year. We recently met with Bitergia (a commercial enterprise providing services around Grimoirelab) for their 10th anniversary, collaboratively proposed Google Summer of Code projects, and are actively exploring mechanisms that could enable our software to provide a revenue stream that helps sustain the project. Though in hindsight these decisions appear strategic, in reality they emerged from the demands of newcomers to the CHAOSS project over the past year.

Newcomers desire working examples of CHAOSS metrics and CHAOSS metrics models for the projects they care about. In a limited way, Grimoirelab’s Cauldron provides a working example. It is, however, limited because of its design as a high-level functional prototype. The depth of its metrics collection is not on par with Grimoirelab or Augur. CHAOSS is focusing the energy of each project toward the collaborative development of metrics models. Our Google Summer of Code and Outreachy projects are centered on applying the collective value of Grimoirelab and Augur to make “on demand” metrics models a possibility.

With respect to our collaborations, we regularly survey and seek out external funding that will provide additional and ongoing support for faculty and students. During the past year we have been deeply engaged in four partnerships: “All-In”, ReSA, Red Hat Software, and our Diversity, Equity, and Inclusion advisory group. Each partnership is contributing components of a sustainable CHAOSS.

[1] https://allinopensource.org/

[2] https://bit.ly/lf-DEI-survey-2021

[3] https://www.youtube.com/watch?v=ChznCaAZExw

[4] https://opensource.com/article/22/3/open-source-project-sustainability

[5] https://bit.ly/CHAOSS-ReSA-22

[1] https://github.com/badging/event-diversity-and-inclusion

[2] https://github.com/chaoss/wg-metrics-models

[3] https://bit.ly/community-welcomingness-implementation

[4] https://bit.ly/issue-handling

[5] https://on.wsj.com/377hK8M

[6] https://github.com/ossf/scorecard