Hack week: Study supports collaborative, participant-driven approach for researchers to learn data science from their peers

A scene from the 2018 Neurohackademy on Aug. 10, 2018 in the Alder Commons on the University of Washington campus. Mark Stone/University of Washington


Each night, high-definition cameras mounted to telescopes collect terabytes of data about objects in the sky. Each day, scientists sequence the genomes of people, animals, plants and microbes for biomedical and evolutionary research. Each year, the Large Hadron Collider produces 30 petabytes of data on particle collisions.

Science has become a big-data endeavor. But scientists are not universally adept in “data science,” the computing and statistical skillsets needed to handle, sort, analyze and draw conclusions from big data. The shortage of know-how in data science can hamper research, medicine and even private industry.

Now a team from the University of Washington, New York University and the University of California, Berkeley has developed an interactive workshop in data science for researchers at multiple stages of their careers. The course format, called “hack week,” blends elements of traditional lecture-style pedagogy with participant-driven projects. The most recent was a neuroscience-themed event held in July on the UW campus. As the team reports in a paper published Aug. 20 in the Proceedings of the National Academy of Sciences, participants rated the hack weeks as opportunities to learn about new concepts, foster new connections, share data openly, and develop skills and work on problems that will positively affect their day-to-day research lives.

“The idea behind hack week was to bring together people who were interested in data science and give them a place to meet, talk and exchange ideas,” said lead and corresponding author Daniela Huppenkothen, associate director of the UW’s astronomy-focused DIRAC Institute. “But instead of a traditional format with experts lecturing nonexperts, this would allow participants to mingle more and teach one another.”

Huppenkothen was involved in the inaugural hack week event, “Astro Data Hack Week,” held at the UW in 2014. That event brought together big-data researchers in astrophysics and cosmology. Since then, the team has held four additional Astro Hack Week events, three “Neuro Hack Week” events for neuroscience and two “Geo Hack Week” events for the geosciences.

All hack week events have the same basic design and organizing principles. They usually commence with some structured periods for instruction, and then shift toward time for participant-driven, open-ended projects, as well as peer networking and free discussion. The projects can resemble a hackathon, but with greater emphasis on collaboration and learning than on specific outcomes. Hack week participants tackle their projects in smaller groups, with organizers circulating to observe and provide feedback or encouragement.

The projects range from experiments that the participants brought from their home institutions to ideas that come up during the course. One project from the inaugural Astro Hack Week, for example, eventually became Stingray, a software project to provide algorithms to analyze time-series data in astronomy. At last month’s Neurohackademy, a new two-week version of Neuro Hack Week, one team worked on developing common ways to analyze different types of MRI scans.

The events’ open-ended structure places greater responsibility on the organizers of each hack week.

“A hack week takes a different kind of preparation, because you don’t have the security of ‘falling back’ on the structure of traditional talks and lectures,” said co-author Anthony Arendt, a research scientist with the UW Applied Physics Laboratory who has organized Geo Hack Week. “You have to set up ways to encourage participants at all levels of ability and comfort, creating a welcoming space for everyone to pitch ideas.”

Most hack weeks organized by the team cap the number of participants at 60. Organizers also strive to select participants to maximize diversity, including scientists of different abilities and backgrounds and at different stages of their careers. Participants also agree to abide by a code of conduct that emphasizes respect and positive interactions.

In surveys conducted after eight hack weeks, participants ranked the events positively as spaces to learn, teach, network and foster relationships. More than three-quarters ranked the hack weeks as successful learning experiences, while two-thirds reported teaching skills to someone else. This feedback was consistent across different backgrounds, showing that the unique format of hack weeks helps all participants feel included, said Huppenkothen.

“Now we want other scientific communities to learn about our experiences and see how they might start organizing their own events,” said Huppenkothen. “We also want feedback from other communities, both good and bad, and to widen the dialogue about data science and skill development.”

Their paper includes supplementary materials detailing the hack week experiences and advice for other groups interested in starting their own workshops.

Participants gave hack weeks high scores for promoting open-science principles, under which researchers publicly post and share their datasets, code and methods. Open-science principles are critical to addressing challenges that researchers face in making their research more reproducible, said co-author Ariel Rokem, a data scientist with the UW eScience Institute and co-organizer of the recent Neurohackademy, along with Tal Yarkoni at the University of Texas at Austin.

“One of our goals with the hack week format is to elevate the quality of science being done,” said Rokem. “The best way to do that is to try out ideas and share what you’ve learned.”

Additional co-authors are David Hogg with the NYU Center for Data Science; Karthik Ram at the Berkeley Institute for Data Science at the University of California, Berkeley; and Jake VanderPlas at the UW eScience Institute.
