Talk:Bibletime2Backend

From BibleTime
Revision as of 17:26, 18 April 2007 by Eelik (Talk | contribs)

With all this discussion of handling different formats, storing the data, ... it seems that this is moving into Sword's area of functionality. It seems like BibleTime could even be a stand-alone Bible program, and just handle Sword modules if the user wants, instead of being a true Sword frontend. Is that the direction BibleTime is moving? It seems to be duplicating a lot of the work on the Sword engine. Or maybe I'm just misunderstanding things? -Benjie

You are correct. We are moving away from Sword and implementing our own backend. We'll use Sword for data import and will still support Crosswire as a central place to share modules and code, but we decided to drop Sword as the internal backend for a number of reasons. We'll support alternative versification schemes from the start, for example. --mgruner

I'm extremely interested in hearing what those reasons are. -Eeli

Rendering and markup languages

The Unbound Bible

The Unbound Bible looks interesting because it seems to have some translations which are not in the Crosswire repos. Maybe it would be worth supporting. The file format is simple: it's zipped plain text. Here is the text of Readme.txt from a Bible module:

The file ending in .html gives any known version or copyright information.

The file ending in _utf8.txt is the Bible file in UTF-8 encoding, with the following data format:
Book_Index
Chapter
Verse
SubVerse
Verse_Order_Index
Verse_Text

The file ending in _utf8_mapped_to_NRSVA.txt is the Bible file in UTF-8 encoding, where each verse in the Bible is mapped to the matching verse(s) in the English New Revised Standard Version (with Apocrypha).
This mapping may not be entirely accurate. At this point, it is a first attempt to map verse references of the two Bibles to each other.
The file's data format:
NRSVA_Book_Index
NRSVA_Chapter
NRSVA_Verse
Bible_Book_Index
Bible_Chapter
Bible_Verse
Bible_SubVerse
Bible_Verse_Order_Index
Bible_Verse_Text

The file ending in _mapping_to_NRSVA.txt is a list of the special mappings known to exist between the verse reference system in this Bible and the verse reference system in the NRSVA. The file's format:
NRSVA_Book_Index
NRSVA_Chapter
NRSVA_Verse
Bible_Book_Index
Bible_Chapter
Bible_Verse
Bible_SubVerse
Verse_Exists_In_NRSVA_But_Not_In_This_Bible_Version [Values are 0 or 1]

The book_names.txt file lists the Book_Index codes and the books of the Bible they represent.

And here is the beginning of a module:

#THE UNBOUND BIBLE (www.unboundbible.org)
#name	English: King James Version
#filetype	Unmapped-BCVS
#copyright	Public Domain
#abbreviation	
#language	eng
#note	
#columns	orig_book_index	orig_chapter	orig_verse	orig_subverse	order_by	text
01O	1	1		10	In the beginning God created the heaven and the earth.
01O	1	2		20	And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters.
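A minimal sketch of reading such a file, assuming the six tab-separated columns listed in the Readme (Book_Index, Chapter, Verse, SubVerse, Verse_Order_Index, Verse_Text) and treating every line that starts with # as a comment. The names Verse and read_unbound are made up for this illustration and are not part of any existing BibleTime or Sword code.

```python
import io
from typing import Iterable, Iterator, NamedTuple


class Verse(NamedTuple):
    book: str       # Book_Index code as found in the file, e.g. "01O"
    chapter: int
    verse: int
    subverse: str   # frequently empty
    order: int      # Verse_Order_Index
    text: str


def read_unbound(lines: Iterable[str]) -> Iterator[Verse]:
    """Yield one Verse per data line of an Unmapped-BCVS file (hypothetical helper)."""
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/meta-data lines
        fields = line.split("\t")
        if len(fields) != 6:
            raise ValueError(f"line {number}: expected 6 columns, got {len(fields)}")
        book, chapter, verse, subverse, order, text = fields
        yield Verse(book, int(chapter), int(verse), subverse, int(order), text)


# The two data lines quoted above, preceded by two of the header lines.
SAMPLE = (
    "#name\tEnglish: King James Version\n"
    "#filetype\tUnmapped-BCVS\n"
    "01O\t1\t1\t\t10\tIn the beginning God created the heaven and the earth.\n"
    "01O\t1\t2\t\t20\tAnd the earth was without form, and void; and darkness was "
    "upon the face of the deep. And the Spirit of God moved upon the face of the waters.\n"
)

verses = list(read_unbound(io.StringIO(SAMPLE)))
```

For a real module the same function could be fed an open file object, for example open(path, encoding="utf-8"), since the Readme says the text files are UTF-8.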

Comments by mgruner

Eeli, good suggestion. We do want to make more material available. Our backend design will require all content to be XML-based. For simple formats like the Unbound material (though I do not know it in detail) we can use import-time transformation utilities that convert these formats to OSIS or another directly supported markup language. Is there any markup in the Bible text of the Unbound texts? At first glance this would seem very easy to do. --mgruner
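As a rough illustration of such an import-time transformation, here is a sketch that turns (book, chapter, verse, text) rows into OSIS-like XML. It is only an assumption about how an importer might look, not the planned backend code: the BOOK_TO_OSIS table is hypothetical (a real one would be built from book_names.txt), the function name and work_id parameter are invented, and the output leaves out the header and the book/chapter containers that a complete OSIS document would need.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Unbound Book_Index codes to OSIS book names.
# Only the one code seen in the sample module is listed here.
BOOK_TO_OSIS = {"01O": "Gen"}


def to_osis_like(rows, work_id="KJV"):
    """Build an OSIS-like element tree from (book, chapter, verse, text) rows."""
    root = ET.Element("osis")
    osis_text = ET.SubElement(root, "osisText", {"osisIDWork": work_id})
    for book, chapter, verse, verse_text in rows:
        osis_book = BOOK_TO_OSIS[book]  # raises KeyError for unmapped books
        element = ET.SubElement(
            osis_text, "verse", {"osisID": f"{osis_book}.{chapter}.{verse}"}
        )
        element.text = verse_text  # ElementTree escapes &, < and > on output
    return root


rows = [("01O", 1, 1, "In the beginning God created the heaven and the earth.")]
xml_text = ET.tostring(to_osis_like(rows), encoding="unicode")
```

If the source text really contains no markup, the only per-verse work is escaping and building the osisID, which is why this kind of conversion looks cheap to do at import time.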

I think the Readme tells all about the format - if so, there is no extra markup. Of course we should contact The Unbound Bible before starting anyways, so we can ask about the details. (I think they would be happy to have this kind of great frontend for their format.) --Eeli

Sounds very good. Feel free to contact them! We can open a page in the wiki where we can post all permissions/official statements that we received from Bible societies or similar organizations. --mgruner

I sent a message, let's see if they respond. However, I'm afraid they have some issues with copyrights. They seem to have taken some modules from Crosswire, and I think you know the situation with Portuguese and some other modules which have been removed from Crosswire.

Here is their answer (I chose to put it here publicly, there's no confidential information). It's very clear, except that they did not explicitly say whether there is markup in the text, but I think "no" is implied.

Supporting the format that Unbound Bible uses is fine. It's currently
designed to try to make Bibles available in a parallel format, using the
NRSVA as the point of commonality. Some of the versions (actually about
40%) have not been mapped at all yet, and these have a different file
format than those that have been mapped.

The files also include some lines that start with #, which are
comments/meta-data
These may include:
#THE UNBOUND BIBLE (www.unboundbible.org)
#name<tab>The name of the Bible
#filetype<tab>The filetype of this file (see below)
#copyright<tab>copyright information
#abbreviation<tab>an abbreviation for the Bible
#language<tab>ISO 639-3 language code
#note<tab>some note about the Bible
#columns<tab>A tab-delimited list of the column names 

The filetype options are currently:
1. Unmapped-BCVS (unmapped means it has not had its verses mapped to
parallel the NRSVA, BCVS stands for Book,Chapter,Verse,Subverse)
2. Unmapped-BCV (this does not include the Subverse field)
3. Mapped-BCVS (mapped means it has had its verses mapped to parallel
the NRSVA)

The status of some Bibles in Unbound Bible may be unclear, and you
should use your own research and judgment in deciding what you want to
do with different Bibles on a case by case basis. Some are public
domain, some of them were approved by signed contracts, some by email
approvals, and some have unknown status or unclear status.
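The # meta-data lines and the three filetype values described in the reply could be read along these lines. This is a sketch under assumptions: it assumes the meta-data lines all come before the first data line (as in the sample module above), and the names read_metadata and KNOWN_FILETYPES are invented for the example.

```python
KNOWN_FILETYPES = {"Unmapped-BCVS", "Unmapped-BCV", "Mapped-BCVS"}


def read_metadata(lines):
    """Collect '#key<tab>value' lines from the top of a file into a dict."""
    meta = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("#"):
            break  # assumption: meta-data ends at the first non-# line
        key, sep, value = line[1:].partition("\t")
        if not sep:
            # A line without a tab, such as the banner; keep the first one only.
            meta.setdefault("banner", key)
        elif key == "columns":
            meta[key] = value.split("\t")  # tab-delimited list of column names
        else:
            meta[key] = value
    return meta


HEADER = [
    "#THE UNBOUND BIBLE (www.unboundbible.org)\n",
    "#name\tEnglish: King James Version\n",
    "#filetype\tUnmapped-BCVS\n",
    "#copyright\tPublic Domain\n",
    "#abbreviation\t\n",
    "#language\teng\n",
    "#note\t\n",
    "#columns\torig_book_index\torig_chapter\torig_verse\torig_subverse\torder_by\ttext\n",
    "01O\t1\t1\t\t10\tIn the beginning God created the heaven and the earth.\n",
]

meta = read_metadata(HEADER)
filetype_is_known = meta.get("filetype") in KNOWN_FILETYPES
# BCVS files carry a Subverse field; BCV files do not.
has_subverse = meta.get("filetype", "").endswith("BCVS")
```

Reading the column names from the #columns line, rather than hard-coding them per filetype, would let an importer cope with the mapped and unmapped layouts without guessing their exact order.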

STEP

People ask about STEP format support every now and then. It's a very complicated format (that M$ rtf "standard" which we all hate, plus something more) and it's not reasonable to think about it right now. However, we already have a complete rtf->XML importer. It belongs to the KWord filter collection: http://websvn.kde.org/trunk/koffice/filters/kword/rtf/import/. It depends only on Qt and some KDE macros as far as I can see (EDIT: wrong, it also depends on some KDE stuff, but not critically). It would be a good starting point. --Eeli

Sounds good as well. Which materials are available in STEP? --mgruner

I thought you would know better than I do. http://en.wikipedia.org/wiki/Standard_Template_for_Electronic_Publishing. Actually, the Crosswire folks should know something - they host some documentation for STEP.

Check out http://lightbydesign.net/index.htm. They offer STEP books (not Bibles but Christian classics). They don't have their own reader, but give a link to e-Sword, which has STEP support. As far as I know, STEP is not supported by any other free reader, let alone FLOSS, and e-Sword is Windows only. Therefore STEP support in BT would be a good idea.

Maybe the lightbydesign folks would even be willing to help in some way. At least they know STEP.

If we really start the STEP support, there have to be good reasons. I know two:

  • Support the "third party" material like lightbydesign, who don't have their own reader.
  • Support the material from some commercial applications. STEP may be a withering format and it may become important for some people to have a modern reader for older material which they have purchased. An old reader may not work anymore. Also some people want to use Linux and read their STEP material there even though their commercial programs still work.

In either case we have to know which features need to be supported. Lightbydesign material is probably easier (no WMF files, etc.).

Back in 2000 there was a discussion on sword-devel about STEP support. It seems to have faded away. Troy and others could still be interested in helping in that area.

--Eeli