Examining mailing list traffic to evaluate community health

Open source software communities have many choices when it comes to modes of communication. Among them, mailing lists have long been a common way to connect with other members of the community. The sentiment and communication style on a mailing list can give good insight into the health of the community, and those interactions can become a deciding factor for new and diverse members who are considering becoming active in it.

As the focus on diversity and inclusion increases in OSS communities, I have taken on the task of using ML/AI strategies to detect hate speech and offensive language within community mailing lists. The project’s scope starts with the Fedora devel and user mailing lists, and it will later be transitioned into a service applicable to all OSS mailing lists. In this three-part blog series, we will go through the process step by step: first the cleaning process, then model creation, and finally the creation of a service that notifies managers of concerning behaviors on their community’s mailing list. It is time to start using data science to help the efforts of D&I.


Consumption is Fractal: Open Source Sustainability

One of the more obscure terms one might hear bandied about in the free and open source software ecosystem is the so-called “bus factor.” The somewhat informal term describes how sustainable a given project is.

Specifically, bus factor is shorthand for the question: what would happen to your open source project if one of your community members were hit by a bus? Would the project survive? Or is so much workflow and institutional knowledge wrapped up in that one person that your project would be damaged, possibly to the point of no recovery?
