Data is constantly being collected, organized, and evaluated, but all too often that data is kept private by corporations, government agencies, and research groups. The benefits of open data are numerous and diverse, ranging from more efficient public institutions and economic growth to improved social welfare and energy conservation.

Open Data Week at Virginia Tech, hosted by the University Libraries April 10 through 14, aims to celebrate and increase awareness of these benefits, encourage adoption of open data policies, and begin conversations on campus about improving access to data.

“We first hosted Open Data Day in 2015 and there was so much interest in open government data and creating and using open data sets at Virginia Tech, so beginning last year we expanded it into Open Data Week,” said Philip Young, scholarly communication librarian at the University Libraries.

Last year’s events focused on data anonymization, the Freedom of Information Act, incorporating open data into the classroom, and automating the collection of data from the web.

“This year we will start the week with a Monday evening forum on open data in research with a great panel, and we’re looking forward to a wide-ranging discussion with audience interaction,” said Young. “Our focus then shifts to text and data mining of peer-reviewed literature with our guests from ContentMine and a Wednesday afternoon discussion forum. This area of data mining holds tremendous potential, but there are many challenges to overcome.”

ContentMine provides free, open source software for text and data mining, which processes existing information in search of new information. The software allows users to gather papers from many different sources, standardize the material, and search it for key terms, phrases, patterns, and more. ContentMine representatives will lead sessions throughout the week to help participants learn how to use the software to begin text and data mining.

Featured events include:

Open Research/Open Data Forum: Transparency, Sharing, and Reproducibility in Scholarship

April 10, 6:30–8 p.m., in Torgersen Hall 1100

Join our panelists for a discussion on challenges and opportunities related to sharing and using open data in research, including meeting funding and journal guidelines. Panelists include:

  • Daniel Chen (Ph.D. candidate in Genetics, Bioinformatics, and Computational Biology)
  • Karen DePauw (Vice President and Dean for Graduate Education)
  • Sally Morton (Dean, College of Science)
  • Jon Petters (Data Management Consultant, University Libraries)
  • David Radcliffe (Professor, English, College of Liberal Arts and Human Sciences)
  • Laura Sands (Professor of Human Development, Center for Gerontology)

Text and Data Mining Forum

April 12, 2:30–3:45 p.m., in the Newman Library Multipurpose Room

Join panelists for an interactive discussion about opportunities and challenges in text and data mining, with a focus on research purposes and access to content. Panelists include:

  • Tom Arrow (Software Developer, ContentMine)
  • Tom Ewing (Associate Dean, College of Liberal Arts and Human Sciences)
  • Weiguo (Patrick) Fan (Professor, Pamplin College of Business)
  • Ed Fox (Professor, Department of Computer Science)
  • Leanna House (Associate Professor, Department of Statistics)
  • Bert Huang (Assistant Professor, Department of Computer Science)

Introduction to ContentMine Tools for Mining Scholarly & Research Literature

Several sessions offered April 11–12

Join ContentMine instructors for an overview of text and data mining and an introduction to ContentMine open source tools for mining scholarly and research literature.

ContentMine Tools to Explore Scholarly and Research Literature: In-Depth, Hands-On Workshop

April 13, 9 a.m.–4 p.m., in Newman Library 207A

During this workshop, participants will complete hands-on exercises to become familiar with ContentMine tools, with ContentMine instructors on hand to help. Attendees will also have the opportunity to experiment with these tools to mine scholarly literature and explore results specific to their own research goals. Coffee and lunch will be provided.

Registration required (limit: 20 participants) | NLI credit available

A full listing of events, including registration and NLI credit links, can be found on the Open Data Week event page. All Open Data Week events are free and open to the public.
