An Economic Argument for Open Data by Efraim Feinstein

There are two principles on which the success of data on the contemporary web rests: the web makes content available, and it adds value to that content by linking it to other related information.

When considering bringing old content online, both of these aspects are important. A first level of digitization involves simply making data available. Google Books and Hebrewbooks.org work at this level, providing PDFs and/or OCR-ed transcriptions of the material. A second level of digitization involves semantic linkage of the data, both within the site and to other sites. The Open Siddur Project and Open Scriptures digitize at the semantic level. This second-level digitization is required to do all of the cool things we expect to be able to do with online texts: click on a word and find its definition or grammatical form, find the source of a passage in one text in another text, find how the text has evolved historically, etc. Even the simplest form of a link, a reference from another site, requires some kind of internal division.

Digitization that takes advantage of the web therefore requires a number of steps: (1) getting the basic text online, (2) getting it in an addressable form (to make it more like typed text, instead of a picture of a page), (3) assuring the text’s accuracy, and (4) marking it up for semantic linkage. Some of these steps, or parts of them, can be done automatically, but, overall, they require some degree of intelligent input. Even step 1, which is primarily mechanical in nature, requires design of the procedures.
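The four steps above can be made concrete with a minimal sketch in Python. Everything in it is an illustrative assumption, not the data model of the Open Siddur Project or of any other site: the `Segment` class, the `link` and `ready_for_linking` helpers, the identifier scheme, and the relation name `quotes` are hypothetical, and the segment texts are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One addressable unit of a transcribed text (step 2)."""
    segment_id: str          # hypothetical address, e.g. "siddur/shema/1"
    text: str                # the transcription itself (step 1)
    proofread: bool = False  # step 3: has a person checked the transcription?
    links: list = field(default_factory=list)  # step 4: (relation, target id) pairs


def link(source: Segment, target: Segment, relation: str) -> None:
    """Record a semantic link from one segment to another (step 4)."""
    source.links.append((relation, target.segment_id))


def ready_for_linking(segments):
    """Return the ids of segments whose text has already been proofread."""
    return [s.segment_id for s in segments if s.proofread]


# Placeholder texts; a real transcription would go here.
verse = Segment("tanach/deuteronomy/6/4", "<text of Deuteronomy 6:4>", proofread=True)
prayer = Segment("siddur/shema/1", "<first line of the Shema>", proofread=True)
draft = Segment("siddur/shema/2", "<unchecked OCR output>")

# The first line of the Shema quotes Deuteronomy 6:4.
link(prayer, verse, "quotes")

print(prayer.links)
print(ready_for_linking([verse, prayer, draft]))
```

The point of the sketch is that a link can only be recorded once both ends have stable identifiers, which is why step 2 has to come before step 4, and that the proofreading flag in step 3 is set by a person, not by the program.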

I hope that this outline of the steps required to get a text online suggests that the most expensive part of making content available is human labor: it takes time to do it, and it takes even more time to do it right.

And now for the rhetorical questions:

  • How many times has the Tanach been digitized?
  • … the siddur?
  • … the Talmud?
  • … major commentaries on the siddur, Torah, Talmud (Rashi, Tosefot)?
  • … full codes of Jewish law (Mishneh Torah, Tur, Shulchan Aruch, Aruch Hashulchan)?
  • … uncommon piyyutim (liturgical poems)?

In some cases, the answer is: it’s been done many times. In other cases, the answer is: it’s never been done. And both answers lead to the all-important question: why? Why are there so many digitizations of the Tanach and no full digitizations of Shulchan Aruch online? Why isn’t the siddur already hyperlinked to its Talmudic sources?

I would propose that we have been wasteful with our resources. Earlier, I pointed out that the primary resources that go into these advanced digitizations are time and human labor. In some cases, these resources equate directly to money; in others, the linkage is more indirect.

The core material of all of the above-mentioned works comes from the public domain. It is ownerless, and free for anyone to copy for any purpose. Every time we encounter a basic text that we have to digitize again because of “new copyright” claims or EULA-style contractual constraints, that is an indication of a failure somewhere in the system. This is particularly true if the claims are being made by non-profits, “social” businesses, or academic institutions. In the Jewish world, even for-profit published books are sometimes donation-supported. Each common text that has to be digitized a second, third, or hundredth time equates to another less common text that is not being digitized. Redoing basic OCR work and transcription takes resources away from establishing semantic linkages.

Some people and organizations get it. As of now, we only need one digitization of the Leningrad Codex (Masoretic Bible). That’s because Christopher Kimball and the J. Alan Groves Center for Advanced Biblical Research digitized it, transcribed it, and released it as free data. The Westminster Leningrad Codex is now perhaps the most widely built-upon version of the Hebrew Bible online. The base texts (which may be used “without restriction”) are present in both commercial and non-commercial products. The Open Siddur Project is using it both for its technology demonstrations and as the basis of all biblical texts in the siddur.

There are precious few examples of free data in the Jewish community, even on the Internet. There are copious examples of donation-funded organizations presenting primarily public domain data with new copyright claims.

Free data removes the need to duplicate effort, which in turn spares the community as a whole from wasteful spending. Particularly for organizations with a social mission, its use is a win for everyone.

“An Economic Argument for Open Data by Efraim Feinstein” is shared by Efraim Feinstein with a Creative Commons Attribution-ShareAlike 4.0 International license.

About Efraim Feinstein


Efraim Feinstein is the lead developer of the Open Siddur web application.
