Six Golden Rules for Writing, Using and Sharing Research Prototypes

by Martin Monperrus

A research prototype demonstrates an idea and collects some data for the sake of writing a scientific paper. When it’s software, it must be considered as a special kind of software. This post presents some rules that I recommend for authors and users of research software prototypes (“rules” as in the “rules of the game” of prototyping). These rules are largely inspired by Paul Feyerabend’s “Anything goes” in philosophy of science.

The Open-Science credo: share code and data

To do good open science, one needs to share data and code easily and in a timely manner.

However, one always wants to document, refactor or polish research code first. This takes valuable time, and as such it prevents good open science.

The solution is to NOT document, refactor or polish anything before sharing. This is the rationale behind these six golden rules for writing, using and sharing research prototypes.

Differences between Prototype Software and Normal Software

Prototype software and “normal” software are of a different nature. Normal software (whether commercial or open-source) aims at creating value and user satisfaction. Prototype software aims at letting creativity fully blossom and at quickly obtaining results (we equate “prototype” and “proof of concept”). Here are three rules that are specific to prototype software.

Rule #1: Software engineering best practices are only optional

A research prototype must be written fast. In the creation phase, ideas must fly, the jungle must be cleared out, and there is no time for encapsulation, documentation or testing. Taking care of those best practices can slow down the process to the extent of killing one’s creativity and the outcome of an initially brilliant idea.

Rule #2: Prototypes only work on the provided examples

A research prototype has been tested on some examples. It is neither designed nor implemented to work on other cases. It’s perfectly fine for a research prototype to crash miserably as soon as one changes the provided examples a little bit.

Rule #3: There is no support

PhD students move on from their theses, and researchers’ goals evolve. A research prototype can come with no support at all, whether documentation, email or forum. No support is to be expected with respect to compilation, configuration, or execution. But when working with source code (see rule #4), even tough bugs can be fixed.

Research Prototype Process and Lifecycle

Rule #4: Open-source but not publicly available

A research prototype has to be open-source for the sake of reproduction, comparative evaluation, and extension. I take open-source in its literal meaning. First, the source code is much more important than the binary code (for debugging and extension). Second, open-source does not mean publicly available on the Internet. As Gordon Fraser puts it, there may be “embarrassing things” in a research prototype. Since not everybody understands the deep nature of research prototypes (e.g. students or recruiters), it’s perfectly fine that the source code is only available upon request (as with MuJava, for instance).

Rule #5: Long-term archival

Again for the sake of reproduction, a research prototype must be stored on a long-term archival site. It can be at the research group level or the university/institution level. Even better, it can be supplementary material of a paper stored on the publisher’s website, or uploaded as ancillary files to open-access archival sites such as arXiv or HAL.

Rule #6: A good prototype has many lives

If there are no software engineering rules, how can one encourage reuse among students? How can one build a startup? How can one have an impact? To me, a good idea has many lives. A successful research prototype will be reimplemented many times. Some of the reimplementations will remain research prototypes, some will mature into open-source software with a user community, and others will eventually be sold as commercial products. This maturation, with many lives and rewritings, is an essential part of the process, and anyway “programs are like pancakes – throw the first one away” (Ivan Sutherland).

The academic CRAPL license [1] is along the same lines: “If the Program shows any evidence of having been properly tested or verified, You will disregard this evidence”, “You agree to hold the Author free from shame, embarrassment or ridicule for any hacks, kludges or leaps of faith found within the Program.”, “You recognize that any request for support for the Program will be discarded with extreme prejudice.”

Jason Fried has also pointed out [2] that for companies to remain creative, a sense of quick’n’dirty is indeed important.

Feedback welcome!

–Martin Monperrus
Lille, April 2013

Bibliography

[1] The CRAPL: An academic-strength open source license (Matt Might)

[2] The Importance of Quick and Dirty (Jason Fried)

Acknowledgement

I’d like to thank Gordon Fraser for an insightful discussion on this topic and Raphael Marvie for pointing me to Jason Fried’s post.
