Main Page

From CommunityData


CDSC members at Pok Pok in March 2017. Clockwise from top left: Sneha, Mako, Aaron, Emilia, Nate, Jeremy, Sayamindu, Salt.


The Community Data Science Collective is an interdisciplinary research group made up of faculty and students at the University of Washington Department of Communication and the Northwestern University Department of Communication Studies.

We are social scientists applying a range of quantitative and qualitative methods to the study of online communities. We seek to understand both how and why some attempts at collaborative production, like Wikipedia and Linux, build large volunteer communities and high-quality work products.

Our research is particularly focused on how the design of communication and information technologies shapes fundamental social outcomes with broad theoretical and practical implications, such as an individual’s decision to join a community or contribute to a public good, or a group’s ability to make decisions democratically.

Our research is deeply interdisciplinary, most frequently consists of “big data” quantitative analyses, and lies at the intersection of communication, sociology, and human-computer interaction.

Workshops and Courses

In addition to research, we run workshops and teach classes. Some of that work is coordinated on this wiki. A more detailed list of workshops and teaching material on this wiki is on our Workshops and Classes page. On this page, we list only ongoing classes and workshops.

Public Data Science Workshops

Community Data Science Workshops: The Community Data Science Workshops (CDSW) are a series of workshops designed to introduce absolute beginners to some of the basic tools of programming and of analyzing data from online communities. The CDSW have been held roughly twice a year in Seattle since 2014. So far, more than 100 people have volunteered their weekends to teach more than 500 people to program in Python, to build datasets from Web APIs, and to ask and answer questions using these data.

University of Washington Courses

  • [Winter 2017] COM521: Statistics and Statistical Programming: A quarter-long quantitative methods course that builds on a first-quarter introduction to quantitative methodology and that focuses on both the more mathematical elements of statistics and the nuts and bolts of statistical programming in the GNU R programming language.

Northwestern Courses & Workshop

Research Resources

If you are a member of the collective, perhaps you're looking for CommunityData:Resources, which includes details on email, TeX templates, documentation on our computing resources, etc.

Recent Research Blog Posts

Introducing the Cannabis Data Science Collective
In 2012, Washington State became one of the first two US states to legalize cannabis for non-medical use. Since then, sales tax revenues from the “green economy” have flooded state coffers. Washington’s academic institutions have been elevated by that rising tide. The University of Washington (one of our research group’s two institutional homes) is now home to pot-focused …
— Benjamin Mako Hill http://mako.cc 2017-04-20
Why do people start new online communities and projects?
Online communities have become ubiquitous, providing not only entertainment but wielding increasing cultural and political influence. While news organizations and researchers have focused a lot of attention on online communities after they become influential, very little is known about how or why they get started. Our survey of hundreds of Wikia.com founders shows that typical …
— Jeremy Foote http://www.jeremydfoote.com 2017-03-23
Searching for competition on Change.org with LDA topic models
You may have heard of Change.org. It’s a popular online petitioning platform. You may have even noticed there can be many online petitions about popular topics. For instance, it is easy to find dozens of petitions protesting the Lychee and Dog Meat Festival with varying levels of support. Imagine you want to start an online petition. …
— groceryheist 2017-03-15
New Dataset: Five Years of Longitudinal Data from Scratch
Scratch is a block-based programming language created by the Lifelong Kindergarten Group (LLK) at the MIT Media Lab. Scratch gives kids the power to use programming to create their own interactive animations and computer games. Since 2007, the online community that allows Scratch programmers to share, remix, and socialize around their projects has drawn more …
— Benjamin Mako Hill http://mako.cc 2017-02-03


About This Wiki

This wiki is open to the public and hackable by all, but it mostly contains information that will be useful to collective members, their collaborators, people enrolled in their projects, or people interested in building off of their work. If you're interested in making a change or creating content here, generally feel empowered to Be Bold. If things don't fit, somebody who watches this wiki will be in touch.

This is mostly a normal MediaWiki installation, although there are a few things to know:

  • There's a CAPTCHA enabled. If you create an account and then contact any collective member with the username (on or off wiki), they can turn the CAPTCHA off for you.
  • Extension:Math is installed so you can write math here. To add math, put TeX inside <math> tags like this: <math>\frac{\sigma}{\sqrt{n}}</math>
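
As a further illustration of the math markup, here is a minimal sketch that assumes this wiki's Extension:Math accepts the usual TeX commands. The expression used in the example above is the standard error of a mean, and it could be written as a labeled equation (the label SE is only an illustrative choice):

```latex
<math>SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}}</math>
```

Here \sigma is the population standard deviation and n is the sample size. Subscripts use an underscore and braces, as in ordinary TeX.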