Name: Software Development: With Great Ability to Create Disk Images Comes Greater Responsibility to Support Disk Images
Start: 2017-04-28T09:00:00-0700
End: 2017-04-28T10:15:00-0700

Digital Forensics: The Academic Library and Beyond

April 27 - 28, 2017
Norris University Center
Northwestern University
Evanston, IL

The BitCurator Users Forum brings together representatives from libraries, archives, museums, and related information professions engaged in (or considering) digital forensics work to acquire, better understand, and make available born-digital materials. The 2017 forum will be expanded to two days, providing even more opportunities for community members and users to engage with and learn from each other. It will balance discussion of the theory and practice of digital forensics and related digital analysis workflows with hands-on activities for users at all levels of experience with the BitCurator environment, digital forensics methods in general, and other tools used in digital analysis and curation.

Software Development: With Great Ability to Create Disk Images Comes Greater Responsibility to Support Disk Images

As the techniques facilitated by BitCurator and other forensic applications have entered regular usage, practitioners are identifying bottlenecks, breakdowns, and other problem areas in need of further software development. In this session, Forum participants will hear about software development projects from a range of institutional contexts, including individual-driven automation around disk image and logical file processing and documentation, cross-institutionally funded work to support HFS disk images in Archivematica, and grant-funded projects to provide access to, redact, and leverage natural language processing tools on the contents of disk images.

Automated Processing of Disk Images and Directories in BitCurator
As a means to more efficiently process large-scale digital archives and with inspiration from Jess Whyte's scripting work at the University of Toronto, the Canadian Centre for Architecture (CCA) is developing a set of software tools for automating triage, SIP creation, and description of born-digital archives within BitCurator. These tools -- collectively known as "CCA Tools" -- create consistent SIPs packaged for Archivematica from digital files, directories, or disk images, and generate pre-populated description spreadsheets containing information such as extent and date statements and a scope and content note. This talk will discuss why BitCurator is an ideal environment for automated processing, give an introduction to the CCA Tools, and discuss potential use cases and next steps.
Tim Walsh, Canadian Centre for Architecture

BitCurator Access and BitCurator NLP - Updates and Future Directions
The BitCurator environment supports a variety of digital curation activities. The BitCurator Access project extended this to the point of interaction with end users, providing and supporting a variety of access mechanisms. We developed tools that support access to disk images through three exploratory approaches: (1) building tools to support web-based services, (2) enabling the export of file systems and associated metadata, and (3) using emulation environments. We’ll highlight two BitCurator Access software products: BitCurator Access Webtools, which supports browser-based search and navigation over data from disk images, and a set of scripts to redact sensitive data from disk images. Members of the BitCurator user community said that they would like tools to help identify and explore information based on specific entities (e.g., people, places, organizations, events) associated with collections. The BitCurator NLP project aims to address this need by incorporating existing natural language processing (NLP) and visualization tools on top of the existing BitCurator environment and BitCurator Access Webtools. Disk images are internally complex and require the sorts of underlying software available through the BitCurator environment and BCA Webtools, adapted for this purpose. Disks can also contain a variety of data and document types, requiring considerable pre-processing to extract content to be processed by NLP tools. We’ll report on the BitCurator NLP project, which is building on and extending a variety of tools and initiatives to provide services that can be run independently or be called by existing software environments used by LAMs.
Christopher Lee, School of Information and Library Science, University of North Carolina at Chapel Hill
Kam Woods, School of Information and Library Science, University of North Carolina at Chapel Hill

Developing Improved Disk Image Support in Archivematica: A Project Update
As digital archivists at New York Public Library and the University of California, Los Angeles, we both have a large number of HFS floppy disks in our collections. Our repositories focus on collecting in the humanities, and writers and artists in the late 1980s and early 1990s gravitated toward early Apple computers. This would not be a problem in and of itself, but NYPL and UCLA also both use Archivematica, which was unable to identify and support work on HFS disk images. Previously, Archivematica relied solely on tsk_recover to identify file systems, but tsk_recover does not recognize HFS and many other file systems used by early computers. Together, we sponsored a project to develop functionality in Archivematica to ingest, characterize, and extract files from HFS disk images. This talk will discuss the impetus for the project, report on this work, which began in early February, and provide details on the specific development steps that make up the project. These include the development of a pre-ingest script, usable in BitCurator or in Automation Tools, that identifies and records file system information and allows DFXML metadata to be generated, and of an extraction tool that extracts files from the disk image. Presenters will also suggest possible next steps and potential development tasks that might build on the groundwork that this project has laid.
Susan Malsbury, New York Public Library
Shira Peltzman, UCLA Library

Speakers

Cal Lee

Professor, University of North Carolina, Chapel Hill

Christopher (Cal) Lee is Professor at the School of Information and Library Science at UNC, Chapel Hill. He teaches courses and workshops in archives and records management. He is a Fellow of SAA, and he serves as editor of American Archivist.

Susan Malsbury

New York Public Library

Shira Peltzman

Digital Archivist, UCLA Library

Shira is the Digital Archivist for UCLA Library Special Collections where she leads the development of a preservation program for born-digital archival material.

Tim Walsh

Digital Preservation Librarian, Concordia University Library

Tim Walsh is a digital preservationist and software developer based in Montreal. He works as the Digital Preservation Librarian at Concordia University Library. Prior to joining Concordia, Tim established a digital archives and digital preservation program at the Canadian Centre for...

Friday April 28, 2017 9:00am - 10:15am PDT
WildCat Room, Norris University Center, Northwestern University 1999 Campus Drive, Evanston IL 60208

Panel

BitCurator Users Forum 2017
