Analytic Septuagint with morphology tags

Discussion on theWord modules and other resources
Rodrigo Samy
Posts: 60
Joined: Mon Sep 12, 2011 2:25 pm

Analytic Septuagint with morphology tags

Post by Rodrigo Samy »

Hi people!
Is there any volunteer, able to work with Perl scripts, who would be interested in creating a module of the Greek Septuagint Old Testament with morphology tags included?

Mr. Steve Amato has developed the Analytic Septuagint, from the LXXM and CATSS LXX projects. He kindly agreed to the conversion of his Analytic Septuagint to work as a theWord module. He gave me the following link to the source file (which generated modules for other programs): http://www.bcbsr.com/ftp/lxxm_utf8_v4.zip. Within the attachment (below in this same Forum message), you may see the technical information and background of the work he has sent me.

The only problem is that this source file needs some Perl scripts to be converted, and I have no knowledge at all in this area. So every helper is welcome!

Thanks.
Rodrigo Samy
Attachments
Analytic LXX_technical information.rar
technical information and background on Analytic Septuagint of Mr. Steve Amato
(2.51 KiB) Downloaded 825 times
RubioTerra
Posts: 732
Joined: Wed Sep 23, 2009 5:13 pm
Location: Brasília, Brazil

Re: Analytic Septuagint with morphology tags

Post by RubioTerra »

Hello, Rodrigo.

Check this module out: viewtopic.php?p=8954. It has Strong# and morphology.

I started working on an OT Septuagint module some months ago, with accentuation, lemmas and morphology. I could borrow Strong# from the module above, but I think that would be superfluous, considering it already has the lemmas. I halted due to the versification differences between it and the KJV; adapting it will require a lot of work.
Rúbio R. C. Terra
Brasília/DF - Brasil
mathetes
Posts: 421
Joined: Sat Jan 05, 2008 6:08 pm

Re: Analytic Septuagint with morphology tags

Post by mathetes »

My vote is still for a proper analytical lexicon for theWord, so we can move away from a defective Strong's numbering system and from all the extra tags that need to be put into each original language module. If Robinson is not willing for his database to be used, would there be a way to take a fully accented New Testament module which already has this information and write a program to find all unique words, then take each word and the information from its tags and put it into a database? Variants or additional words from the Septuagint could be added by doing automated searches on other modules for unique words not yet in the database. It might then be an easier job for someone to go through each entry and verify the information. Once it's done, the work would never need to be repeated every time there is some new module, and its accuracy would be more reliable.
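A minimal Python sketch of the extraction step described above, assuming the tagged module has already been parsed into (form, lemma, morphology) triples. The function names, the sample words and the Robinson-style tag strings are illustrative assumptions, not theWord's actual module format.

```python
from collections import defaultdict


def build_lexicon(tagged_tokens):
    """Collect every distinct (lemma, morphology) analysis seen for each word form.

    tagged_tokens: iterable of (form, lemma, morph) triples taken from a
    tagged, fully accented module (hypothetical input format).
    """
    lexicon = defaultdict(set)
    for form, lemma, morph in tagged_tokens:
        lexicon[form].add((lemma, morph))
    # Sorted lists make the entries stable and easy to proofread.
    return {form: sorted(analyses) for form, analyses in lexicon.items()}


def unseen_forms(lexicon, words):
    """Return the unique words from another module that have no entry yet."""
    return sorted({word for word in words if word not in lexicon})


# Illustrative data only; the tags are placeholders in a Robinson-like style.
sample_tokens = [
    ("λόγος", "λόγος", "N-NSM"),
    ("λόγος", "λόγος", "N-NSM"),  # repeated occurrences collapse into one entry
    ("λόγον", "λόγος", "N-ASM"),
]
sample_lexicon = build_lexicon(sample_tokens)
```

Words reported by `unseen_forms` would still need an analysis from somewhere, so this only narrows down the manual verification work; it does not replace it.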
RubioTerra
Posts: 732
Joined: Wed Sep 23, 2009 5:13 pm
Location: Brasília, Brazil

Re: Analytic Septuagint with morphology tags

Post by RubioTerra »

mathetes wrote: My vote is still for a proper analytical lexicon for theWord, so we can move away from a defective Strong's numbering system and from all the extra tags that need to be put into each original language module. If Robinson is not willing for his database to be used, would there be a way to take a fully accented New Testament module which already has this information and write a program to find all unique words, then take each word and the information from its tags and put it into a database? Variants or additional words from the Septuagint could be added by doing automated searches on other modules for unique words not yet in the database. It might then be an easier job for someone to go through each entry and verify the information. Once it's done, the work would never need to be repeated every time there is some new module, and its accuracy would be more reliable.
That's an idea worth pursuing. I'll give it a try. I'm not sure how to deal with homographs, though. They could confuse users.
Rúbio R. C. Terra
Brasília/DF - Brasil
mathetes
Posts: 421
Joined: Sat Jan 05, 2008 6:08 pm

Re: Analytic Septuagint with morphology tags

Post by mathetes »

I believe all that would be taken care of by the accents. If a word has the same letters but uses different accents to give it a different meaning, then it should be a new record. It should be assumed that users who are going to do original language research understand some basics. If you click on a word in a Bible module, it should bring up that exact same word in the lexicon and list the lemma, the morphology, and maybe even the definition for the lemma from Thayer's lexicon if available. A problem with homographs (if I understand what you are saying) arises if a non-accented module is being used: then the same word could potentially match several different records. In that case I see the option of either listing all the records it could be or informing the user that an accented module needs to be used. Is there a valid reason anymore why a non-accented module would need to be used? I guess the Strong's number could also be a field in the record, so English modules, non-accented modules, etc. which already include the Strong's number could use the same database. Of course, that would lead right back to possible inaccuracies due to Strong's numbering being used, but it would be no worse than things are now for a person who knows nothing about Greek.
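The two lookup modes described above (exact match for accented modules, "list all the records it could be" for non-accented ones) might be sketched like this in Python. The class name, record layout and plain-English morphology labels are assumptions for illustration; the τίς / τις pair is just one familiar example of words distinguished only by accent.

```python
import unicodedata
from collections import defaultdict


def strip_diacritics(word):
    """Remove accents, breathings and other combining marks from a Greek word."""
    decomposed = unicodedata.normalize("NFD", word)
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", bare).lower()


class HomographIndex:
    """Hypothetical lexicon index with an exact key and an accent-free key."""

    def __init__(self, records):
        # records: iterable of (form, lemma, morph) triples.
        self.exact = defaultdict(list)
        self.bare = defaultdict(list)
        for form, lemma, morph in records:
            entry = (lemma, morph)
            self.exact[unicodedata.normalize("NFC", form)].append(entry)
            self.bare[strip_diacritics(form)].append(entry)

    def lookup(self, word, accented=True):
        """Accented modules get the exact record; others get every candidate."""
        if accented:
            return self.exact.get(unicodedata.normalize("NFC", word), [])
        return self.bare.get(strip_diacritics(word), [])
```

With records for τίς (interrogative) and τις (indefinite), a click in an accented module resolves to a single record, while a click in a non-accented module returns both candidates for the user to choose from.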
csterg
Site Admin
Posts: 8627
Joined: Tue Aug 29, 2006 3:09 pm
Location: Corfu, Greece

Re: Analytic Septuagint with morphology tags

Post by csterg »

I don't think there are so many of these cases to worry about... They can be fixed in a couple of hours by someone who knows Greek.
Costas
Rodrigo Samy
Posts: 60
Joined: Mon Sep 12, 2011 2:25 pm

Re: Analytic Septuagint with morphology tags

Post by Rodrigo Samy »

Hello people!
I never dreamed this topic would arouse such great interest! :) I apologize for the long post, but since the discussion has split into different fields, I will divide it into numbered parts.

1. Thanks, mathetes; your work seems to be exactly what I need for now, and there would be no reason to do the work again. I suppose the original source of his work on the LXX is also the CATSS or LXXM text. Since his module has also been through thorough proofreading, I wonder why it is not published as an official theWord module, as it is useful to everyone who wants to work with Byzantine texts of both the Old and New Testaments.

If Costas prefers not to publish compilations, I fully understand: the work on the Old and on the New Testament is based on different sources, and the tagging was done by different teams with different views, aims, technical procedures, degrees of reliability, etc. Of course, a compilation is used mainly for practical reasons. Even so, why not publish an official OT module of the LXX with morphology, taken from mathetes's module? Dear Costas, I think it would be great! :D

2. That said, I agree that an accented module with all the tagging would also be great. However, my intuition is that any automatic way of tagging could generate errors which, if not numerous, would at least be serious, hard to detect, and hard to revert once made. Such unexpected errors often happen when we work with find & replace (and are difficult to revert...), so how much more with more complex tools on such a huge volume of text? So I agree that the better way would be to work word by word.

3. Finally, I think I could help with the versification issues, if needed, regarding the difference between LXX and King James numbering. It would be hard work :cry: and I cannot commit to a deadline, but it is something worth doing.
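To give an idea of what the versification work involves, here is a small Python sketch covering only the best-known case, the chapter numbering of the Psalms, where the LXX and the KJV (which follows the Hebrew numbering) are offset by one for most of the book. The function name is hypothetical, and verse-level differences inside a psalm or in other books are not handled at all.

```python
def lxx_psalm_to_kjv(lxx_psalm):
    """Map an LXX psalm number to its KJV number where the mapping is one-to-one.

    Psalms that are split or merged between the two numberings
    (LXX 9, 113-115, 146-147) and LXX 151 are deliberately not mapped here:
    they need verse-by-verse treatment by hand.
    """
    if 1 <= lxx_psalm <= 8 or 148 <= lxx_psalm <= 150:
        return lxx_psalm          # identical numbering
    if 10 <= lxx_psalm <= 112 or 116 <= lxx_psalm <= 145:
        return lxx_psalm + 1      # KJV runs one ahead of the LXX
    raise ValueError(f"LXX Psalm {lxx_psalm} has no simple one-to-one KJV equivalent")
```

For example, LXX Psalm 22 corresponds to KJV Psalm 23. The cases that raise an error are exactly the ones that make the adaptation laborious.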

Thanks to all,
Samy
mathetes
Posts: 421
Joined: Sat Jan 05, 2008 6:08 pm

Re: Analytic Septuagint with morphology tags

Post by mathetes »

Rodrigo,

It has been my sentiment for a long time that we need to move away from tagging each Greek Bible module. IMO, it would be far better to have one lexicon module that no matter what accented Greek module is being used, whenever a word is clicked on it will automatically find that word in the lexicon complete with morphology. The Strong's numbering system is inaccurate, Robinson has updated his morphology coding, and it is too much of a hassle to correct each module if a correction needs to be made. It would be far better to update one lexicon module whenever an error is found or new data needs to be added.

The LXX module I created is that of Brenton's edition, which was based on the Codex Vaticanus and, as far as I know, is the only text copy available on the Internet. It was taken from an Orthodox edition which was edited and proofread twice to match that of Brenton. The text of CATSS and LXXM is from Rahlfs' critical edition.

The verse order in the module for Brenton's edition should match with what's needed for theWord. If an error is found you can PM me so I can fix it.

An easy way to get the texts you want for the Old and New Testaments is to use the "Compare" view and set the options for what texts you want accordingly.
RubioTerra
Posts: 732
Joined: Wed Sep 23, 2009 5:13 pm
Location: Brasília, Brazil

Re: Analytic Septuagint with morphology tags

Post by RubioTerra »

RubioTerra wrote: I started working on an OT Septuagint module some months ago, with accentuation, lemmas and morphology. I could borrow Strong# from the module above, but I think that would be superfluous, considering it already has the lemmas. I halted due to the versification differences between it and the KJV; adapting it will require a lot of work.
The work I started is based on the LXXMorph, which uses yet another morphology coding (a bit more detailed, I must say), so I created a custom dictionary.

There is one issue that must be solved though -- the lemmas for preposition-based (compound) verbs are broken into their parts. For instance, ἐπιφέρω comes as ἐπι+φέρω, διανοίγω as δια+ἀνα+οἴγω. In the case of ἐπιφέρω it's just a matter of concatenating the two parts to find the correct lemma, but this is not the case with διανοίγω.
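One possible heuristic for rejoining the split lemmas, sketched in Python: drop the breathing from every part after the first, and elide a prefix's final vowel when the next part starts with a vowel. This is an assumption-laden sketch, not how LXXMorph or theWord handle it. It does not cover prefixes that keep their vowel (περί, πρό), aspiration (ἐπι+ἵστημι gives ἐφίστημι), ἐκ becoming ἐξ, assimilation of συν/ἐν, or accent shifts, so every result should still be checked against a real lemma list.

```python
import unicodedata

BREATHINGS = {"\u0313", "\u0314"}  # combining smooth and rough breathing
VOWELS = set("αεηιουω")


def nfc(text):
    return unicodedata.normalize("NFC", text)


def base_letter(ch):
    """Lower-case base letter of a (possibly accented) precomposed character."""
    return unicodedata.normalize("NFD", ch)[0].lower()


def strip_breathing(part):
    """Remove breathing marks, which cannot stand in the middle of a word."""
    decomposed = unicodedata.normalize("NFD", part)
    return nfc("".join(ch for ch in decomposed if ch not in BREATHINGS))


def join_split_lemma(split_lemma):
    """Rejoin a lemma such as 'δια+ἀνα+οἴγω' using simple vowel elision."""
    parts = [nfc(p) for p in split_lemma.split("+") if p]
    joined = []
    for i, part in enumerate(parts):
        if i > 0:
            part = strip_breathing(part)
        is_prefix = i < len(parts) - 1
        if (is_prefix and base_letter(part[-1]) in VOWELS
                and base_letter(parts[i + 1][0]) in VOWELS):
            part = part[:-1]  # elide the prefix's final vowel before a vowel
        joined.append(part)
    return nfc("".join(joined))
```

Under these rules ἐπι+φέρω stays a plain concatenation (ἐπιφέρω), while δια+ἀνα+οἴγω becomes διανοίγω. Anything the heuristic produces that is not found in a trusted lemma list should be flagged for manual correction.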

If Rodrigo Samy or someone else finds it worth the time to adjust the versification, I can pass along the individual book files, which must be put together and adjusted to form the final module. The broken lemmas issue would still remain. I attached the dictionary and the Bible module containing the first books for consideration.
Attachments
lxxmorph.zip
(1.5 MiB) Downloaded 793 times
Rúbio R. C. Terra
Brasília/DF - Brasil
RubioTerra
Posts: 732
Joined: Wed Sep 23, 2009 5:13 pm
Location: Brasília, Brazil

Re: Analytic Septuagint with morphology tags

Post by RubioTerra »

RubioTerra wrote:
mathetes wrote: My vote is still for a proper analytical lexicon for theWord, so we can move away from a defective Strong's numbering system and from all the extra tags that need to be put into each original language module. If Robinson is not willing for his database to be used, would there be a way to take a fully accented New Testament module which already has this information and write a program to find all unique words, then take each word and the information from its tags and put it into a database? Variants or additional words from the Septuagint could be added by doing automated searches on other modules for unique words not yet in the database. It might then be an easier job for someone to go through each entry and verify the information. Once it's done, the work would never need to be repeated every time there is some new module, and its accuracy would be more reliable.
That's an idea worth pursuing. I'll give it a try. I'm not sure how to deal with homographs, though. They could confuse users.
In order for the analytical lexicon to work, theWord should normalize accents (grave to acute, not sure about tonos) before doing the search. Otherwise, I would have to create an entry for every accent variant. What do you say, Costas?
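A rough Python sketch of the normalization I have in mind, using only the standard `unicodedata` module; the function name is made up for illustration, and how theWord would actually do it is up to Costas. On the tonos question: Unicode defines the polytonic "oxia" characters as canonically equivalent to the monotonic "tonos" ones, so NFC normalization already folds them together.

```python
import unicodedata

GRAVE = "\u0300"  # combining grave accent (varia)
ACUTE = "\u0301"  # combining acute accent (oxia / tonos)


def normalize_accents(word):
    """Turn grave accents into acute ones and unify oxia with tonos."""
    decomposed = unicodedata.normalize("NFD", word)
    # After decomposition every accent is a separate combining character,
    # so the grave can be swapped without touching breathings or letters.
    return unicodedata.normalize("NFC", decomposed.replace(GRAVE, ACUTE))
```

Applying the same function to both the clicked word and the lexicon keys means that τὸν in running text finds the entry stored as τόν, without one entry per accent variant.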
Rúbio R. C. Terra
Brasília/DF - Brasil
csterg
Site Admin
Posts: 8627
Joined: Tue Aug 29, 2006 3:09 pm
Location: Corfu, Greece

Re: Analytic Septuagint with morphology tags

Post by csterg »

There has been some private discussion lately on working out something similar for Hebrew searching, so I can implement the same for Greek, yes.
Costas
RubioTerra
Posts: 732
Joined: Wed Sep 23, 2009 5:13 pm
Location: Brasília, Brazil

Re: Analytic Septuagint with morphology tags

Post by RubioTerra »

csterg wrote: There has been some private discussion lately on working out something similar for Hebrew searching, so I can implement the same for Greek, yes.
Costas
That would be great, thanks! I'm already doing some tests so feel free to do it whenever you find the time.
Rúbio R. C. Terra
Brasília/DF - Brasil
ALbeSh
Posts: 64
Joined: Mon May 23, 2011 5:14 pm

Re: Analytic Septuagint with morphology tags

Post by ALbeSh »

I have been watching this discussion with a great deal of interest. Because my knowledge of Greek is minimal, I was relating it to Hebrew issues, where they are mainly similar, and it seems mathetes is on to something. While putting together the Shoroshim Thesaurus and CT (by the way, either the WHM or the Shoroshim CT Bible is necessary for it to work properly), I had considered that reducing an inflected word to a lemma and then building it back up again was inefficient.

This is an important discussion that could lead to a change in the way we use TW. If moving away from a dependence on text tagging to individual words carries over to the book view, this may also have implications beyond Biblical texts to original language works of all kinds.

Of course, this is all easy to say, but please let me add my encouragement to the process.

ALbeSh
DarrelW
Posts: 1260
Joined: Fri Sep 11, 2009 1:04 am
Location: Klamath Falls, Oregon

Re: Analytic Septuagint with morphology tags

Post by DarrelW »

Forgive my question if not appropriate. Has this project been completed, and if so what is the name of the completed module? Thanks!

Darrel
Rodrigo Samy
Posts: 60
Joined: Mon Sep 12, 2011 2:25 pm

Re: Analytic Septuagint with morphology tags

Post by Rodrigo Samy »

I hope the conditions exist for this project to move forward, since it would be great, although I can imagine how much hard work would be needed to get it working. I do have a suggestion, though I am not sure whether it is doable.

There is a very good Greek-Latin dictionary program, Diogenes, which works with the Perseus project database (although it works offline). It quickly displays the lemma and the morphological analysis for any Greek or Latin word, giving the entries of the Liddell-Scott Greek Lexicon or the Lewis-Short Latin Lexicon. It is quite good, as long as the Greek word carries all its accents; otherwise, it will not do the parsing but just give the closest word from the dictionary entries.

The "about" text says: "Diogenes is an application for searching and browsing databases in the format used by the Packard Humanities Institute and the Thesaurus Linguae Graecae. Diogenes is free software by Peter Heslin."

To have access to the whole corpus of Greek or Latin literature, one has to buy a license. However, the dictionary entries work very well and are well formatted, with bold type for definitions, paragraphs and indentation, all of which make the text clear to read.

I would say that virtually all New Testament and LXX Greek words are recognized and parsed, as long as they carry all the proper accents. Later terms, as used in Byzantine literature, would have to be sought in Lampe's Greek Patristic Lexicon, but this is no problem as long as we are thinking of primarily Biblical software.

I once wrote to Mr Peter Heslin, and he was very kind and receptive to some questions and suggestions I had. I imagine he would be receptive to some kind of cooperation so that his databases might work within theWord, as long as he is credited. If this integration is technically doable, it would give theWord users a very thorough morphological database for both Greek and Latin. Sure, Latin would be quite secondary, as it is not an original language, but if the integration could be automated to some degree, it would be a nice extra.

Costas, do you think this idea has legs? :) I can certainly write to Mr Heslin if needed, although I have no knowledge of programming languages.

PS: DarrelW, this thread began with a question about a tagged LXX text (and there is a link in the thread to a good one, although not yet official), but the discussion shifted to another, more ambitious project: creating a HUGE database to work in theWord that would recognize each and every inflected word (in either Hebrew or Greek), identify its lemma (its dictionary form) and analyze its morphology (gender, number, case, person, tense, mood, etc.) with no need for tagged files.