Practical Ethics for Working with Human-Centered and Online Data

How to be rule compliant and still innovate?

Data science projects often involve collecting and analyzing human-centered data (i.e., data about people or user-generated content) and/or online data (i.e., data publicly available from the web). These data are governed by multiple sets of norms and regulations, e.g., institutional and sectoral norms and rules, intellectual property law, privacy and security laws and regulations, terms of service, technical constraints, and personal ethics (Diesner & Chin, 2016b). Since digital data are often publicly available and can be collected without interacting with people, review by an institutional review board (IRB) might not apply. Problems can arise when researchers are unaware of other applicable rules, uninformed about their practical meaning and compatibility, and insufficiently skilled in implementing them (Diesner & Chin, 2016a). To address this issue, we have started to develop and deliver educational modules. This work was supported by the Midwest Big Data Hub and the Computing Community Consortium.


  1. Diesner, J., & Chin, C.-L. (2016a). Gratis, Libre, or Something Else? Regulations and Misassumptions Related to Working with Publicly Available Text Data. Proceedings of ETHI-CA² Workshop (ETHics In Corpus Collection, Annotation & Application), 10th Language Resources and Evaluation Conference (LREC), Portoroz, Slovenia.
  2. Diesner, J., & Chin, C.-L. (2016b). Seeing the forest for the trees: considering applicable types of regulations for the responsible collection and analysis of human centered data. Proceedings of Human-Centered Data Science (HCDS) Workshop at 19th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW), San Francisco, CA.


Midwest Big Data Hub & Computing Community Consortium (NSF sub-award)
Academia-Industry Big Data Collaboration for Early Career Researchers program (2016)

National Center for Supercomputing Applications Faculty Fellowship
Predictive modeling for impact assessment