There are, generally, two types of corpora (in licensing terms) that are being imported into the EarlyWritings.com site, and a third class of corpora that could be integrated only offline.
- The first type of corpora is that designated 'public domain'. Whether a text comes from Project Gutenberg, from modules in existing Bible software, from scans on Google Books, from other sites such as Christian Classics Ethereal Library, or is simply deduced to be public domain from its date of publication (works published in 1922 or earlier are public domain in the United States), these works carry no licensing restrictions.
- The second type of corpora is that designated 'non commercial', or more specifically Creative Commons Attribution - Noncommercial - Share Alike. This is the license type used by the Perseus Project at Tufts and (roughly speaking) the Oxford Text Archive, among others. Works under this type of license cannot be made public domain (as some copyright rights are reserved), but the reverse is possible: public domain works can be subsumed into a work that, in general, has a Creative Commons license.
- The third type consists of those that prohibit redistribution entirely. The most significant sources of this type are the Thesaurus Linguae Graecae (of UCI) and the PHI #5 and #7 CD-ROMs (from the Packard Humanities Institute). The latter are free to individuals who write for them, and the former is available to individuals by subscription (formerly on CD-ROM). One might also include in this category any modules or data for proprietary software packages such as Logos and Accordance, assuming that they could be read (and that there is no legal issue with doing so). As stated, the website could not present such material, but offline software could read it if the user had already obtained it on their own.
A fourth category is presented by the input from the creators and users of the site, which could be licensed in whatever way the site's terms state. A custom license might, for example, make provision for mirrors so long as there is a prominent notice of the www.earlywritings.com domain as the place where the material was originally collected.
A last, fifth category is the software that runs the site and, if implemented, an offline client. Options here include free / libre / open source software licensing (which makes most sense if other sites would want to take advantage of the software and contribute to its code themselves) or a proprietary, copyrighted license (which makes sense if it isn't likely to be a reused tool and, rather, is likely to be maintained only by me and maybe one or two others). The latter would make it more likely that users of the offline software would pay a reasonable amount for it, allowing funding to go to the site itself and to the pockets of those who build it.
With these five different licensing terms being considered, here is how I see them interacting:
- Public domain materials are clearly marked and demarcated as "public domain" where they appear (for purposes of citation and reuse) and subsumed into the larger site licensing (license #4).
- Creative Commons Attribution - Noncommercial - Share Alike materials are clearly marked and demarcated as "cc-by-nc-sa" with a link to the full license (again, for purposes of citation and reuse) and subsumed into the larger site licensing (license #4).
- These materials are accessed by software under full copyright (license #5) but are themselves under the copyright of their respective copyright holders.
- The site as a whole has a "cc-by-nc-sa" license because it is impossible to adapt a "share-alike" licensed work into a work that is licensed in a different way. (See the Creative Commons FAQ.) But the "by attribution" part does allow the work's creators to specify the manner of attribution, which would be by prominent display (and hyperlink if possible) of the Early Writings URL.
- The software doesn't need to be exposed to the web user. It can be licensed for a user when sold (if proprietary), or it can go the open source route. As I've suggested already, I am inclined to the proprietary route due to lack of predicted outside assistance (if it is open source) and because it seems the most natural way for me, the hacker, to profit off of the whole affair (doing the work on compiling and formatting all the public domain and Creative Commons content, more or less, as a "loss leader").
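To make the interaction above a little more concrete, here is a minimal sketch in Python of how each work might carry its own license label while the site as a whole stays "cc-by-nc-sa". It is only an illustration under my own assumptions: the names `License`, `Work`, `can_publish_online`, and `license_notice` are hypothetical, the titles are placeholders, and nothing here reflects the actual software behind the site.

```python
from dataclasses import dataclass
from enum import Enum


class License(Enum):
    """The three corpus license types discussed above (labels are illustrative)."""
    PUBLIC_DOMAIN = "public domain"
    CC_BY_NC_SA = "cc-by-nc-sa"
    NO_REDISTRIBUTION = "no redistribution"


# Assumption: the site as a whole is "cc-by-nc-sa" (license #4 above).
SITE_LICENSE = License.CC_BY_NC_SA


@dataclass(frozen=True)
class Work:
    title: str      # placeholder title, not a real catalogue entry
    source: str     # where the text was obtained
    license: License


def can_publish_online(work: Work) -> bool:
    """Only public domain and cc-by-nc-sa works may appear on the website."""
    return work.license in (License.PUBLIC_DOMAIN, License.CC_BY_NC_SA)


def license_notice(work: Work) -> str:
    """Build the label shown next to a work for citation and reuse."""
    if not can_publish_online(work):
        raise ValueError(f"{work.title!r} cannot be redistributed online")
    return f"{work.title} ({work.source}): {work.license.value}"


if __name__ == "__main__":
    works = [
        Work("Example text A", "Project Gutenberg", License.PUBLIC_DOMAIN),
        Work("Example text B", "Perseus", License.CC_BY_NC_SA),
        Work("Example text C", "TLG", License.NO_REDISTRIBUTION),
    ]
    for work in works:
        if can_publish_online(work):
            print(license_notice(work))
        else:
            print(f"{work.title}: offline only, if the user already has it")
```

The point of keeping the label on each work, rather than only on the site, is that a reader can still tell which texts are public domain even though they are presented inside a "cc-by-nc-sa" collection.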
There was a time when I was nearly (but never quite fully) enchanted by the mantra that information "wants" to be free and that there is never a good reason for restrictive licensing. Ironically, it is a book by an open source advocate (titled The Cathedral and the Bazaar) that inclines me against such a dogmatic approach to free (as in beer or as in speech) information. The concept (of "free" or "open source" licenses) is not adopted because it is inherently charitable or especially humanistic, but because it works in satisfying the needs of those who are involved with creating and using the information.
In this case, the stuff that has already been and is being produced by online communities is the raw information (the texts, ebooks, and so on). The free or open source licensing (be that public domain or Creative Commons) works for them because it is a powerful motivator. The idea that a work, once digitized and machine-readable, can remain so in perpetuity strokes the ego in the sense of being an immortality project (something that will outlast you) as well as being, broadly, an exercise in goodwill.
Meanwhile, the Early Writings website requires software that will run on the webserver for providing the information requested by users. That means, from the get-go, that software must be written and a webserver must be paid for. At the same time, the initial legwork of gathering content (primary and secondary) to be accessed through the site is likely to be the work of one man (that is, me). And the reason that I am doing all this, full-time, when I could instead be applying for jobs, is quite honestly because it is not only more rewarding in itself, but it pays better. The Early Christian Writings and Early Jewish Writings CD-ROM sales proved that there are a number of people who are willing to support the efforts of disseminating these materials and who wish to have an offline program.
Moreover, as far as income for the site goes, it is far from a pet project in its conceived form. I have already, a couple of years ago, commissioned Chris Weimer to produce an Early Latin Writings website, which has been essentially completed so far as the secondary material (comments from Chris) is concerned. I also gave a contract to my sister to proofread the scanned editions of Charles's Pseudepigrapha, and now the entire work will be available in machine-readable HTML format. In the future I plan to commission fresh translations of parts of the Nag Hammadi Library and Dead Sea Scrolls that can be displayed online.
I hope readers are as glad as I am to see the websites growing again after years of resting on laurels. However well-earned those laurels may have been, anything that is not getting stronger is dying in the online world. Please be sure to email me at peterkirby@gmail.com if you have any of your own thoughts on the directions of the sites in the future.