signac - PyData Ann Arbor Meetup 2018

Ann Arbor, MI, Aug 7th, 2018

The signac framework supports researchers in managing project-related data by providing a well-defined, indexable layout for storing file-based data and metadata directly on the filesystem. In this way, signac enables efficient data access through an ad hoc database-like interface without the need for running a server.

The distributed file-based storage model is well-suited to high-performance computing applications. Additionally, signac aids in defining and executing workflows for operating on these managed data spaces both on local workstations and on leadership-class supercomputers.

The signac framework is open-source and freely available for Python versions 2.7.x and 3.4+.

These slides are hosted at: https://bit.ly/2vIYtnZ

Our Research

  • Computational materials research on the nano- to microscale on leadership-class supercomputers (XSEDE, INCITE, etc.)
  • Development of community-driven open-source software (HOOMD-blue, signac, freud, rowan, ...)
  • Application of machine-learning techniques for data-driven materials discovery

Video courtesy of Chrisy Xiyu Du.

Simple Example

Here is a typical problem we encounter when managing parameterized data spaces on the file system.

We run a series of (computational) experiments on a binary mixture and need to store related files on the file system.

This might be a good start:

concentration_A_0.25/
concentration_A_0.50/
concentration_A_0.75/

Or this?

concentration_A/0.25/
concentration_A/0.50/
concentration_A/0.75/

Maybe a bit shorter?

conc_A/0.25
conc_A/0.50
conc_A/0.75

Even shorter?

conc_A/.25
conc_A/.50
conc_A/.75

But now all of our data is hidden...

Better remove the dot.

conc_A/25
conc_A/50
conc_A/75

Turns out we need to vary the temperature:

conc_A/25/temp_08
conc_A/25/temp_1
conc_A/50
conc_A/75

Better keep things consistent:

conc_A/25/temp_08
conc_A/25/temp_1
conc_A/50/temp_08
conc_A/75/temp_08

Maybe this is a better schema?

temp_08/conc_A/25
temp_08/conc_A/50
temp_08/conc_A/75
temp_1/conc_A/25

Actually, we need to bring in a component C:

temp_08/conc_A/25/conc_B/05
temp_08/conc_A/50
temp_08/conc_A/75
temp_1/conc_A/25

The signac framework is named after the painter Paul Signac.

The technique of creating natural images out of many small dots of paint serves as a metaphor for signac's data model.
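To make the contrast with the hand-written directory schemas above concrete, here is a minimal, standard-library-only sketch of the underlying idea: each parameter set gets its own directory, the directory name is derived from the parameters, and the parameters themselves are stored as metadata inside it. This is an illustration under stated assumptions, not signac's actual API or on-disk layout; the helper names job_id, init_job, and find_jobs and the file name statepoint.json are hypothetical and chosen for this example only.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def job_id(statepoint):
    """Derive a stable directory name from the parameters (hypothetical helper)."""
    blob = json.dumps(statepoint, sort_keys=True).encode("utf-8")
    return hashlib.md5(blob).hexdigest()


def init_job(root, statepoint):
    """Create the directory for one parameter set and store its metadata."""
    job_dir = Path(root) / job_id(statepoint)
    job_dir.mkdir(parents=True, exist_ok=True)
    # The file name is an assumption made for this sketch.
    (job_dir / "statepoint.json").write_text(json.dumps(statepoint, sort_keys=True))
    return job_dir


def find_jobs(root, **criteria):
    """Return directories whose stored parameters match all key/value pairs."""
    matches = []
    for meta in sorted(Path(root).glob("*/statepoint.json")):
        statepoint = json.loads(meta.read_text())
        if all(statepoint.get(key) == value for key, value in criteria.items()):
            matches.append(meta.parent)
    return matches


# Use a throwaway directory so the sketch does not touch real data.
workspace = tempfile.mkdtemp()

for conc_A in (0.25, 0.50, 0.75):
    init_job(workspace, {"conc_A": conc_A, "temp": 0.8})

# Adding a parameter later does not require renaming existing directories.
init_job(workspace, {"conc_A": 0.25, "temp": 1.0})
init_job(workspace, {"conc_A": 0.25, "conc_B": 0.05, "temp": 0.8})

print(len(find_jobs(workspace, temp=0.8)))
print(len(find_jobs(workspace, conc_A=0.25)))
```

Because the directory name no longer encodes a parameter order or naming convention, questions such as "temperature first or concentration first?" disappear; selection happens by querying the stored metadata instead of parsing paths.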

Overview

How to install signac

You can install signac either with conda through the conda-forge channel:

$ conda install -c conda-forge signac signac-flow

Alternatively, you can install it with pip:

$ pip install --user signac signac-flow

Additional Information

Thank you very much for your attention!

Acknowledgment