Digital Forensics: The Academic Library and Beyond

April 27 - 28, 2017
Norris University Center
Northwestern University 
Evanston, IL

The BitCurator Users Forum brings together representatives from libraries, archives, museums, and related information professions engaged in (or considering) digital forensics work to acquire, better understand, and make available born-digital materials. The 2017 forum will be expanded to two days, providing even more opportunities for community members and users to engage with and learn from each other. It will balance discussion of the theory and practice of digital forensics and related digital analysis workflows with hands-on activities for users at all levels of experience with the BitCurator environment, digital forensics methods in general, and other tools used in digital analysis and curation.


Thursday, April 27
 

8:00am

Registration and Coffee
Thursday April 27, 2017 8:00am - 8:45am
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

8:45am

Welcome Remarks
Thursday April 27, 2017 8:45am - 9:00am
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

9:00am

Workshop: Diving Deep with BitCurator

This workshop will be aimed at intermediate to advanced users who have a solid understanding of digital forensics and BitCurator basics, and are looking to extend and advance their knowledge and skills. 

Topics may include:

  • Repetitive tasks that could benefit from automation yet currently require human intervention
  • Workflow breakdowns and bottlenecks common to multiple users
  • Functionality that does not currently exist but would benefit multiple users


Depending on the issues and topics gathered from users, deliverables for the advanced track may include user stories, product requirements for desired features, proposed workflow and related diagrams, or scripts and other rudimentary tools for automating tasks.

Due to the hands-on nature of both tracks, participants will be asked to bring a laptop computer, preferably one that meets the minimum system requirements to run BitCurator in a virtual machine.


Thursday April 27, 2017 9:00am - 4:00pm
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

9:00am

Workshop: Testing the BitCurator Waters

Over the course of the day, attendees in the beginner track will participate in a range of activities meant to introduce the BitCurator environment and prepare participants for the proceedings on Day 2 of the Users Forum. The morning will start with a brief introduction to digital forensics and how it relates to the work performed in libraries, archives, and museums (note: due to time constraints and subject complexity, this introduction cannot be comprehensive). The introduction touches on the foundational concepts of forensic analysis, defines terms that will recur throughout the day, and closes by tying those concepts broadly to archival practice. Following the introduction, the remainder of the morning will consist of a tour of the BitCurator environment, including its various GUI and command-line tools. Participants will be invited to follow along on their own computers.

In the afternoon, a series of group and individual exercises will give participants the opportunity to run the core set of forensics tools on sample disk images and reflect on those experiences. The bulk of the exercises will be focused on the core set of BitCurator tools. Exercises and discussion will address running bulk_extractor, analyzing its reports, and the decision points raised by them; running fiwalk and exploring the metadata it generates; and running the BitCurator Reporting Tool and comparing the reports to those generated by bulk_extractor and fiwalk. A secondary set of exercises will be devoted to the additional tools included in the BitCurator environment. This latter group will be more discussion based, as the tools themselves are in varying states of support or maintenance, with the primary goal of communicating to participants the range of possibilities available to BitCurator users with these and other tools available in Linux environments.
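As a rough illustration of the exercises described above, the following Python sketch assembles the bulk_extractor and fiwalk command lines for a sample disk image. It is only a sketch: the function names, the image name, and the output locations are hypothetical placeholders, and the workshop itself may run the tools differently (for example, through the BitCurator GUI).

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(image, out_dir):
    """Return the two command lines as argument lists (nothing is run here)."""
    image = Path(image)
    out_dir = Path(out_dir)
    return {
        # bulk_extractor writes its feature files into the directory named by -o.
        "bulk_extractor": [
            "bulk_extractor", "-o", str(out_dir / "bulk_extractor"), str(image),
        ],
        # fiwalk -X writes its DFXML report to the named file.
        "fiwalk": ["fiwalk", "-X", str(out_dir / "fiwalk.xml"), str(image)],
    }


def run_tools(image, out_dir):
    """Run each tool that is installed; skip any that are not on PATH."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for name, cmd in build_commands(image, out_dir).items():
        if shutil.which(cmd[0]) is None:
            print(f"{name}: not found on PATH, skipping")
            continue
        subprocess.run(cmd, check=True)
```

Calling `run_tools("sample.dd", "reports")` inside a BitCurator session would leave a bulk_extractor feature directory and a fiwalk DFXML file side by side, ready for the comparison the exercises describe.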

Participants should bring a laptop computer that satisfies the minimum requirements for running BitCurator in a virtual machine. Further, participants should have downloaded the BitCurator environment. Time at the beginning of the day will be set aside to troubleshoot as much as possible; however, downloading the software over a wireless connection takes a fair amount of time and should be performed in advance of the Forum.

Speakers

[Matthew] Farrell

Digital Records Archivist, Duke University Archives


Thursday April 27, 2017 9:00am - 4:00pm
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

10:30am

Coffee Break
Thursday April 27, 2017 10:30am - 10:45am
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

12:00pm

Lunch
Thursday April 27, 2017 12:00pm - 1:15pm
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

2:30pm

Coffee Break
Thursday April 27, 2017 2:30pm - 2:45pm
TBA

4:00pm

Reception
Join us for drinks and light hors d’oeuvres at a post-workshop reception at the Northwestern University Library!

Thursday April 27, 2017 4:00pm - 5:30pm
Ver Steeg Faculty Lounge, Northwestern University Library
 
Friday, April 28
 

8:00am

Registration and Coffee
Friday April 28, 2017 8:00am - 8:45am
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

8:45am

Welcome Remarks
Friday April 28, 2017 8:45am - 9:00am
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

9:00am

Software Development: With Great Ability to Create Disk Images Comes Greater Responsibility to Support Disk Images
As the techniques facilitated by BitCurator and other forensic applications have entered regular usage, practitioners are identifying bottlenecks, breakdowns, and other problem areas in need of further software development. In this session, Forum participants will hear about software development projects from a range of institutional contexts, including individual-driven automation around disk image and logical file processing and documentation, cross-institutionally funded work to support HFS disk images in Archivematica, and grant-funded projects to provide access to, redact, and leverage natural language processing tools on the contents of disk images.

Automated Processing of Disk Images and Directories in BitCurator

As a means to more efficiently process large-scale digital archives and with inspiration from Jess Whyte's scripting work at the University of Toronto, the Canadian Centre for Architecture (CCA) is developing a set of software tools for automating triage, SIP creation, and description of born-digital archives within BitCurator. These tools -- collectively known as "CCA Tools" -- create consistent SIPs packaged for Archivematica from digital files, directories, or disk images, and generate pre-populated description spreadsheets containing information such as extent and date statements and a scope and content note. This talk will discuss why BitCurator is an ideal environment for automated processing, give an introduction to the CCA Tools, and discuss potential use cases and next steps.
Tim Walsh, Canadian Centre for Architecture
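To give a sense of what pre-populating a description spreadsheet might involve, here is a minimal Python sketch that derives a draft extent statement and date statement from a directory of files. It is not the CCA Tools code: the function name, the wording of the statements, and the choice of last-modified times as the date source are all assumptions made for illustration.

```python
import os
from datetime import datetime, timezone


def describe_directory(root):
    """Summarize a directory as draft extent and date statements."""
    count, total_bytes, years = 0, 0, []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            info = os.stat(os.path.join(dirpath, name))
            count += 1
            total_bytes += info.st_size
            # Assumption: the last-modified time stands in for the record date.
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            years.append(modified.year)
    if not years:
        date_statement = "undated"
    elif min(years) == max(years):
        date_statement = str(min(years))
    else:
        date_statement = f"{min(years)}-{max(years)}"
    return {
        "extent": f"{count} digital files ({total_bytes} bytes)",
        "date_statement": date_statement,
    }
```

A dictionary like this could be written out as one spreadsheet row per SIP with the standard `csv` module, leaving the scope and content note for an archivist to review and edit.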

BitCurator Access and BitCurator NLP - Updates and Future Directions
The BitCurator environment supports a variety of digital curation activities. The BitCurator Access project extended this to the point of interaction with end users, providing and supporting a variety of access mechanisms. We developed tools that support access to disk images through three exploratory approaches: (1) building tools to support web-based services, (2) enabling the export of file systems and associated metadata, and (3) using emulation environments. We’ll highlight two BitCurator Access software products: BitCurator Access Webtools, which supports browser-based search and navigation over data from disk images, and a set of scripts to redact sensitive data from disk images. Members of the BitCurator user community expressed that they would like tools to help in identifying and exploring information based on specific entities (e.g. people, places, organizations, events) associated with collections. The BitCurator NLP project aims to address this need by incorporating existing natural language processing (NLP) and visualization tools on top of the existing BitCurator environment and BitCurator Access Webtools. Disk images are internally complex and require the sorts of underlying software that are available through the BitCurator environment and BCA Webtools, adapted for this purpose. Disks can also contain a variety of data and document types, requiring considerable pre-processing to extract content to be processed by NLP tools. We’ll report on the BitCurator NLP project, which builds on and extends a variety of tools and initiatives to provide services that can be run independently or be called by existing software environments being used by LAMs.
Christopher Lee, School of Information and Library Science, University of North Carolina at Chapel Hill
Kam Woods, School of Information and Library Science, University of North Carolina at Chapel Hill

Developing Improved Disk Image Support in Archivematica: A Project Update
As digital archivists at New York Public Library and the University of California, Los Angeles, we both have a large number of HFS floppy disks in our collections. Our repositories have a focus on collecting in the humanities, and writers and artists in the late 1980s and early 1990s gravitated toward using early Apple computers. This would not be a problem in and of itself, but NYPL and UCLA also both use Archivematica, which was unable to identify and support work on HFS disk images. Previously, Archivematica relied solely on tsk_recover to identify file systems, but tsk_recover does not recognize HFS and many other file systems used by early computers. Together, we sponsored a project to develop functionality in Archivematica to ingest, characterize, and extract files from HFS disk images. This talk will discuss the impetus for the project, give a report on this work, which began in early February, and provide details on the specific development steps that make up the project. These include the development of a pre-ingest script, usable in BitCurator or in Automation tools, that identifies and records file system information and allows DFXML metadata to be generated, and of an extraction tool that extracts files from the disk image. Presenters will also suggest possible next steps and potential development tasks that might build on the groundwork that this project has laid.
Susan Malsbury, New York Public Library
Shira Peltzman, UCLA Library 
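The file system identification step mentioned in the Archivematica talk above can be illustrated with a short Python sketch. This is not the project's script; it is a swapped-in, simplified technique that checks the two-byte signature stored 1024 bytes into an HFS-family volume, and it assumes a raw, unpartitioned volume image such as a floppy disk image. The function name is hypothetical.

```python
import struct

# Two-byte big-endian signatures stored 1024 bytes into the volume.
SIGNATURES = {
    0x4244: "HFS",   # 'BD', classic HFS master directory block
    0x482B: "HFS+",  # 'H+', HFS Plus volume header
    0x4858: "HFSX",  # 'HX', case-sensitive HFS Plus variant
}


def identify_hfs(image_path):
    """Return 'HFS', 'HFS+', 'HFSX', or None for a raw volume image."""
    with open(image_path, "rb") as f:
        f.seek(1024)
        raw = f.read(2)
    if len(raw) < 2:
        # Image is too short to hold a signature at this offset.
        return None
    (signature,) = struct.unpack(">H", raw)
    return SIGNATURES.get(signature)
```

A pre-ingest step could record the returned label alongside the image and use it to choose an extraction tool; images that sit inside a partition map would need the volume offset located first, which this sketch does not attempt.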

Speakers

Cal Lee

University of North Carolina, United States of America

Susan Malsbury

New York Public Library

Shira Peltzman

Digital Archivist, UCLA Library
Shira is the Digital Archivist for UCLA Library Special Collections where she leads the development of a preservation program for born-digital archival material.

Tim Walsh

Digital Preservation Librarian, Concordia University Library
Tim Walsh is a digital preservationist and software developer based in Montreal. He works as the Digital Preservation Librarian at Concordia University Library. Prior to joining Concordia, Tim established a digital archives and digital preservation program at the Canadian Centre for...


Friday April 28, 2017 9:00am - 10:15am
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

10:15am

Coffee Break
Friday April 28, 2017 10:15am - 10:45am
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

10:45am

Spinning Plates and Moving Parts: Adventures in Workflow and Documentation Management
Managing even small collections of born-digital materials is a complex task, involving multiple processes that lend themselves to asynchronous work and may touch multiple staff members over a period of days. Such complexities require robust documentation and well-planned workflows. In this session, speakers will discuss approaches to such work, ranging from case studies to documentation platforms and guidelines for non-digital specialists.

More Floppies, Less Process: the Digital Media Log

This talk will discuss the RAC's Digital Media Log, a lightweight web app that integrates with ArchivesSpace to efficiently inventory digital media items and log digital forensics and preservation activities. The application is intended to track inventorying and preservation workflows in an automated digital forensics environment and harnesses ArchivesSpace's API to record contextual information. Instead of acting as a canonical source of data about our disk imaging, the Digital Media Log is intended to export preservation information in a way that can be combined with metadata created by digital forensics tools and stored in preservation systems. In addition to an overview of the application's features, this talk will focus on the principles and workflows behind the application's creation. Automating processes and integrating systems in this way means that we can get through our disk imaging backlog much more quickly, but it also makes it easier for archivists who don't have "digital" in their titles to participate in digital forensics activities.
Bonnie Gordon, Rockefeller Archive Center

Inbetweeners: Developing Guidelines and Documentation to Help Those Who Help the Donors
Often, those who are doing the nitty-gritty digital archiving and those who are working with donors are two separate entities. In these cases, providing strong guidelines, policies, and documentation to help archivists and curators in obtaining and managing born-digital content is key to both the successful completion of projects and the ongoing relationship management between collections and donors. This panel seeks to explore how better communication between internal units can lead to successful relationships with donors. Panelists from Indiana University will explore several perspectives, including that of an archivist who works directly with donors, of a technician who creates digital archiving workflows, and of the librarian who sits in between. The panel will also include a differing viewpoint from the University of Virginia, presented by a librarian who does a bit of everything. Some of the main points of discussion will include the impact of donor agreements and institutional policies on outreach and education efforts, the establishment of workflows and decision-making criteria to leverage non-specialized support on the technical end, and the overall impact of relationship management on born-digital transfer projects.
Mary Mellon, Indiana University
Luke Menzies, Indiana University
Lauren Work, University of Virginia 

Lots of Old Onions
Way back in the day, before there was a widely acknowledged set of skills and tools known as Digital Forensics, the boots on the ground needed to figure out ways to deal with information from a crazy number of incompatible word processors, minicomputers, and mainframes. The work was typically called “media conversion”. Today, we like to call dealing with arcane media and file types “data rescue”. Many of the lessons learned in those days have been incorporated into (and improved by) great initiatives such as BitCurator. But we believe that, even now, some of the old insights are still worth sharing, along with talk of some interesting projects. For example, we thought of the tasks as something like peeling back the layers of an onion:
  1. Media Compatibility (we’ll show several tape and diskette types)
  2. Age and Storage Conditions (e.g. tape cleaning/baking)
  3. Recording Method (density, interleaving, checksums, etc.) If no funding for 4-7, save those bits with disk and tape images. 
  4. Operating System/File System (IBM, DEC, Wang, Honeywell, etc. etc.)
  5. Backup, Exchange, or Archiving SW (several choices within each #4)
  6. Application File Structure (sequential, indexed, chained, etc.)
  7. Application File Encoding (database, wp, reports, images, A/V) 
Chris Muller, George Blood Audio/Video/Film/Data
George Blood, George Blood Audio/Video/Film/Data

All Together Now: Introducing the KryoFlux User Guide
The KryoFlux, a floppy disk controller card developed by the Software Preservation Society, has become the de facto standard for many digital archives for its ability to safely and effectively capture data from aging floppy disks. Although the KryoFlux is an extremely powerful tool for digital archivists, scant documentation and an unapproachable user forum have hampered wider adoption amongst archives. To address this gap and to encourage more robust use of this hardware, archivists using the KryoFlux at Duke, Emory, UCLA, the University of Texas, and Yale have developed a comprehensive KryoFlux user guide designed specifically for archival contexts and which we hope to make freely available in the coming months. The guide includes:  
  • An explanation as to why an archives might choose to use the KryoFlux;
  • An explanation of floppy disk formatting;
  • Step-by-step instructions on installation and use (both CLI and GUI); 
  • Troubleshooting tips and tricks;
  • A list of additional resources. 
We have now completed a first draft of our user guide and believe the BitCurator Users Forum offers an excellent opportunity to share our work with active practitioners working regularly with born-digital materials. Our session would explain the impetus behind this project, give an overview of the guide and its content, and aim to solicit feedback and questions from the audience. From our perspective, this opportunity to hear from digital forensic practitioners and potential users of the guide would be incredibly valuable as we prepare it for broad circulation.
Dorothy Waugh, Emory University 
Shira Peltzman, UCLA Library
Jennifer Allen, University of Texas Austin School of Information
 

Speakers

George Blood

Owner, George Blood Audio/Video/Film
George Blood has worked in classical music production since receiving his bachelor's degree in Music Theory from the University of Chicago in 1983. While recording live concerts (from student recitals to opera and major symphony orchestras) since 1982, he documented over 4,000 live...

Bonnie Gordon

Digital Archivist, Rockefeller Archive Center
Bonnie Gordon is a Digital Archivist on the Digital Programs team at the Rockefeller Archive Center, where she focuses on digital preservation, born digital records, and training around technology.

Mary Mellon

Assistant Archivist, Indiana University Archives

Luke Menzies

Digital Preservation Technician, Indiana University
Luke Menzies is currently a dual MLS/MA student at Indiana University. His fields of concentration are digital humanities, digital preservation, and Central Asian history. He also holds MA degrees in Slavic Linguistics and Islamic Studies.

Chris Muller

Data Rescue Manager, George Blood Audio/Video/Film/Data
Figuring out and converting legacy media and arcane file formats has been our focus (and enjoyment) for nearly forty years. We are now part of the great George Blood organization. For now, see www.mullermedia.com for our born-digital data-focused abilities. Eventually that will be...

Shira Peltzman

Digital Archivist, UCLA Library
Shira is the Digital Archivist for UCLA Library Special Collections where she leads the development of a preservation program for born-digital archival material.

Dorothy Waugh

Digital Archivist, Emory University
Dorothy Waugh is Digital Archivist at the Stuart A. Rose Manuscript, Archives, and Rare Book Library at Emory University. She received her MLS from Indiana University and her MA in English Literature from the Ohio State University.

Lauren Work

Digital Preservation Librarian, University of Virginia
Lauren Work is the Digital Preservation Librarian at the University of Virginia, where she is responsible for the implementation of preservation strategy and systems for university digital resources.


Friday April 28, 2017 10:45am - 12:00pm
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

12:00pm

Lunch
Friday April 28, 2017 12:00pm - 1:15pm
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

1:15pm

Lightning Talks
Interesting things are afoot at institutions implementing digital forensics techniques. In this session, as many presenters as can fit will each deliver a brief, quickfire presentation on a topic related to their work with BitCurator or related digital curation toolsets.

Friday April 28, 2017 1:15pm - 1:45pm
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

2:15pm

Birds of a Feather Discussions
Building off the lightning talk session, this session will feature loosely structured conversations. Topics may include those raised in earlier sessions, as well as anything that has sparked interest with respect to digital forensics and born-digital materials.

Friday April 28, 2017 2:15pm - 3:15pm
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

3:15pm

Coffee Break
Friday April 28, 2017 3:15pm - 3:30pm
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

3:30pm

Ethically Sourced Forensics
One does not simply keep disk images: ethical, risk tolerance, and sustainability issues with forensic disk image retention
Forensic disk images provide the best means of accurately capturing, and ensuring the authenticity of, all of the data on a physical carrier. However, unexpected, deleted, private or sensitive, and system data introduce ethical complexities for archivists and repository administrators when developing disk image retention policies. This session will explore ethical concerns, institutional risk tolerance, potential legal implications, and sustainability issues related to retaining forensic disk images.
Keith Pendergrass, Harvard Business School

The All-Seeing Eye: Digital Archives and Surveillance
"Delete the logs" has become a mantra for security experts post-election as organizations work to minimize risk to populations who are often already targeted for surveillance. But what happens when your professional responsibilities as an archivist demand that you preserve the logs? This presentation will discuss the nature and degree of the threats posed by retaining digital material and attempt to balance it against the ethics of digital preservation.
Talya Cooper, The Intercept 

Speakers

Keith Pendergrass

Digital Archivist, Harvard Business School
Keith Pendergrass is the digital archivist for Baker Library Special Collections at Harvard Business School, where he develops and oversees born-digital content workflows. He is also the Library's representative on the HBS Green Team, a School-wide staff group coordinating grassroots...


Friday April 28, 2017 3:30pm - 4:30pm
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

4:30pm