oddity with automatic detection

blcjr
Posts: 141
Joined: Wed Mar 14, 2012 4:56 pm

oddity with automatic detection

Post by blcjr »

Costas,

I've encountered a curious inconsistency, if not a bug, in automatic detection. I'm working with a source that has numerous lists of scriptures without a book designation, like this:

5:12, 9:51, 14:1, 17:11, 19:15, 24:4

It happens to be a commentary on Luke, so these are references to Luke. What I've been doing, to facilitate automatic detection, is put an "Lk" in front of the string:

Lk 5:12, 9:51, 14:1, 17:11, 19:15, 24:4

If I use "Control-D" over a larger highlighted unit of text, say a paragraph, or just do a Control-D on the entire topic (usually a verse) I get:

Lk 5:12, 9:51, 14:1, 17:11, 19:15, 24:4

I.e., after the first chapter and verse, in the succeeding chapter and verse pairs, the chapter references are detected as verses to the first chapter in the string, and succeeding chapter and verse pairs are not recognized correctly.
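To make the symptom concrete, here is a minimal sketch that reproduces it. It is only a hypothetical model of the behavior described above, not theWord's actual detection code; the function name `naive_detect` and its rules are assumptions made up for illustration.

```python
import re

def naive_detect(text):
    """Hypothetical model of the mis-detection (not theWord's code).

    After the first "book chapter:verse", each comma-separated item
    contributes only its leading number, read as a verse of the FIRST chapter.
    """
    first = re.match(r"\s*(\w+\.?)\s+(\d+):(\d+)", text)
    if first is None:
        return []
    book = first.group(1)
    chapter = int(first.group(2))
    refs = [(book, chapter, int(first.group(3)))]
    # Everything after the first reference is split on commas.
    for item in text[first.end():].split(","):
        lead = re.match(r"\s*(\d+)", item)
        if lead:
            # "9:51" is read as verse 9 of the first chapter; ":51" is dropped.
            refs.append((book, chapter, int(lead.group(1))))
    return refs

print(naive_detect("Lk 5:12, 9:51, 14:1, 17:11, 19:15, 24:4"))
```

Under this model the string comes back as Lk 5:12, 5:9, 5:14, 5:17, 5:19, 5:24, which is the pattern I see after Control-D on a whole paragraph.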

However, if I highlight just the string (rather than trying to detect all scriptures in a paragraph or topic at once), it works as expected. Actually, what I've become accustomed to doing is to place the cursor at the end of the string and insert a space; that works correctly also.

I suspect that the detection algorithm is confused by extraneous commas. I do know that if there are multiple verses to a particular chapter, say a string like:

5:12, 24, 9:51, 14:1, 17:11, 19:15, 24:4

(note the extra "24" now in 5:12, 24) even just highlighting the specific string no longer works correctly; the chapter verse references now have to be separated by semi-colons, not commas.
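For what it's worth, a comma-separated string like this is not hopelessly ambiguous if one assumes that every new chapter is written with a colon. The sketch below illustrates that reading; it is an assumption about how such a list could be interpreted, not a description of how theWord does it, and `parse_reference_list` is a hypothetical name. It handles plain numbers only (no verse ranges).

```python
import re

def parse_reference_list(book, items):
    """Colon-aware reading of a comma/semicolon separated list (a sketch).

    An item containing ":" starts a new chapter; a bare number is treated
    as another verse in the chapter most recently seen.
    """
    refs = []
    chapter = None
    for item in re.split(r"[;,]", items):
        item = item.strip()
        if not item:
            continue
        if ":" in item:
            chap_text, verse_text = item.split(":", 1)
            if chap_text.strip().isdigit() and verse_text.strip().isdigit():
                chapter = int(chap_text)
                refs.append((book, chapter, int(verse_text)))
        elif chapter is not None and item.isdigit():
            refs.append((book, chapter, int(item)))
    return refs

print(parse_reference_list("Lk", "5:12, 24, 9:51, 14:1, 17:11, 19:15, 24:4"))
```

With this rule the "24" after "5:12" becomes Lk 5:24, and "9:51" starts chapter 9. The cost is that a source which writes a bare chapter number after a comma would be misread, which may be why semi-colons are the safer convention.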

There may be nothing you can do about this, except to note it. I can imagine that the variations that your detection algorithm must deal with are horribly complex, and will never work in every conceivable case (much less inconceivable ones). I gather that this is part of the "experimental mode" mentioned in the manual, as I could find nothing else about how the automatic detection feature works. But if nothing can be done about it, then at some point, if the feature is ever documented further, this peculiarity should be noted.

Basil
csterg
Site Admin
Posts: 8627
Joined: Tue Aug 29, 2006 3:09 pm
Location: Corfu, Greece

Re: oddity with automatic detection

Post by csterg »

Hi Basil,
it's true that theWord would like the , to be a ;.
Yet, I tried this also and it does work properly even with CTRL+D.
Are you using the latest beta?
Costas
blcjr
Posts: 141
Joined: Wed Mar 14, 2012 4:56 pm

Re: oddity with automatic detection

Post by blcjr »

It has been a while since I updated; I'm using 1280. I'll get the latest beta, and work with it for a bit, and report back whether or not the matter persists.

Thanks.

Basil
blcjr
Posts: 141
Joined: Wed Mar 14, 2012 4:56 pm

Re: oddity with automatic detection

Post by blcjr »

Costas,

Okay, I installed the latest beta, and the matter persists. (It is just a "matter." I doubt that it rises to the level of "problem" or "issue." It may be a circumstance that is just too nuanced for the detection algorithm to deal with. But I want you to understand it correctly, just in case it could be fixed.)

Here's an illustration:

Image

The paragraph was highlighted, and Control-D used to enforce automatic detection. The arrows point to the strings that were not correctly detected. Okay, you say, that's because of the commas, where there should be semi-colons. However, if I place a cursor at the end of the string to the right that did not detect correctly, i.e. after the "24.47" in the string of verses for Lk, and press the space bar, the string is correctly detected, commas notwithstanding. On the other hand, if I try the same thing after the "12:17" for the string of verses for Heb. on the left, nothing happens. The difference between the two instances is clear. In the first instance, the string of chapter references has only a single verse for each chapter. In that case, separating the chapter:verse pairs with commas will work, but only so long as Control-D is applied specifically to the string, or automatic detection is triggered by a space entered at the end of the string. In the case on the left, this approach doesn't work because one (or more) of the chapter:verse references has multiple verses separated with commas.

Maybe I'm making a mountain out of a molehill. I should think it will be impossible to ever have a detection algorithm that can correctly work with every conceivable way verses and chapters might be designated. However, the particular "oddity" noted here is persistent and precisely repeatable, and so perhaps there might be some clue in its repeatability that would inspire a fix. If not, I've learned to spot the incorrectly detected strings quickly, and can quickly see which can be fixed with a simple space inserted at the end of the string, and which require editing to change commas to semi-colons.

Basil

Postscript (added): I think I should balance the amount of time I've spent describing this to you with a bit of praise for the way TW works with using hyperlinks for scripture "tooltips." In the image above, there was originally no "Lk 3:" in the string of verses for Lk under discussion; it was simply a string beginning with "8,". I inserted the "Lk 3:" to enable automatic detection. As noted in the first post of the thread, this is a commentary on Luke (Plummer's, in the ICCNT series), and repeatedly omits any reference to Luke in such strings. This is actually a very modest example; in many instances the strings are several times longer. There are quite literally thousands of such scripture references, and they are not "tooltipped" in existing modules of this commentary. But I can enable detection with an insertion like this, and then just remove the insertion after detection, leaving the string as it appears in the printed text, but with the scriptures now viewable with a mouseover of the verses in the string. And then there are the ubiquitous "see on..." references where I can hyperlink the comment note, and not just the scripture itself, so that the internal commentary reference is viewable in a popup. I do not think these features are available in competitive software. When I'm done with this remake of Plummer's commentary, there will be literally thousands of popups not featured in previous versions, thanks to the unique way TW uses hyperlinks.
royhieatt
Posts: 258
Joined: Thu Jan 08, 2009 4:23 am

Re: oddity with automatic detection

Post by royhieatt »

Hi Costas and Basil,

Is not the problem the formatting? Notice that it hyperlinks chapter:verse as a verse where the separator is a comma. Use the semicolon and it hyperlinks chapter:verse properly.

Roy
JG
Posts: 4599
Joined: Wed Jun 04, 2008 8:34 pm

Re: oddity with automatic detection

Post by JG »

Just to verify what Basil is saying. Test these. Note that some have a space at the end after the punctuation, and it makes a difference to the detection.

Lk 3:8, 5:32, 15:7, 24:47;
Lk 3:8, 5:32, 15:7, 24:47;
Lk 3:8, 5:32, 15:7, 24:47,
Lk 3:8, 5:32, 15:7, 24:47,
Lk 3:8, 5:32, 15:7, 24:47 and the rest of the sentence.
(Lk 3:8, 5:32, 15:7, 24:47)
Jon
theWord 6 Bible Software
OS for testing: Windows 10
blcjr
Posts: 141
Joined: Wed Mar 14, 2012 4:56 pm

Re: oddity with automatic detection

Post by blcjr »

royhieatt wrote:
Hi Costas and Basil,

Is not the problem the formatting? Notice that it hyperlinks chapter:verse as a verse where the separator is a comma. Use the semicolon and it hyperlinks chapter:verse properly.

Roy
Roy, there is no question that the "proper" way would be to use semi-colons as separators always. However, we do not always control the formatting of the sources used to create modules, and must work with what we're given. In the case at hand, it would be exceedingly tedious to convert every comma to a semi-colon. And fortunately, in some common cases, it isn't necessary to do so. I'm just trying to call attention to such cases, in case there is the ability to extend detection to even more cases where the separators are commas.
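If one did want to convert the separators outside of theWord, before pasting the text in, the tedium could in principle be scripted. The following is a rough sketch under the assumption that a comma followed by "number:number" always begins a new chapter; `commas_to_semicolons` is a hypothetical helper of my own, not a feature of theWord.

```python
import re

# A comma followed by "chapter:verse" starts a new chapter, so it becomes ";".
# A comma followed by a bare number (another verse) is left alone.
_CHAPTER_COMMA = re.compile(r",(?=\s*\d+:\d+)")

def commas_to_semicolons(text):
    """Replace only those commas that precede a chapter:verse pair."""
    return _CHAPTER_COMMA.sub(";", text)

print(commas_to_semicolons("Lk 3:8, 5:32, 35, 15:7, 24:47"))
# Lk 3:8; 5:32, 35; 15:7; 24:47
```

The obvious drawbacks are that the module text would then no longer match the printed source, and that anything else in the text shaped like "number:number" after a comma would be changed as well, so the results would still need checking.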

Basil
royhieatt
Posts: 258
Joined: Thu Jan 08, 2009 4:23 am

Re: oddity with automatic detection

Post by royhieatt »

Hi Basil,

I totally agree with you. Maybe I should not have even commented, because I knew you fellows were on top of the situation. It's amazing how well the program works with the wide variety of module formats. Costas and his helpers do a terrific job.

Roy
blcjr
Posts: 141
Joined: Wed Mar 14, 2012 4:56 pm

Re: oddity with automatic detection

Post by blcjr »

JG wrote:
Just to verify what Basil is saying. Test these. Note that some have a space at the end after the punctuation, and it makes a difference to the detection.

Lk 3:8, 5:32, 15:7, 24:47;
Lk 3:8, 5:32, 15:7, 24:47;
Lk 3:8, 5:32, 15:7, 24:47,
Lk 3:8, 5:32, 15:7, 24:47,
Lk 3:8, 5:32, 15:7, 24:47 and the rest of the sentence.
(Lk 3:8, 5:32, 15:7, 24:47)
And just to note, if any of the chapter references have extra verse references, say like

Lk 3:8, 5:32, 35, 15:7, 24:47;

then all bets are off (about how it all works) and the chapter references must be delineated by semi-colons. I suspect that the best we can do here is just understand why the examples given by Jon sometimes work, and sometimes not.

The main thing I've learned from all of this is that when creating a module from another source (as opposed to writing one's own), where you have little or no control over how the verse references were styled, any attempt to automatically detect scripture references for large amounts of text will undoubtedly contain errors, frequently numerous errors. So I've learned to basically do the CONTROL-D thing, if I use it at all, just a paragraph or two at a time, check the results for accuracy, and fix any errors noted. But just as often as not, I do not use CONTROL-D, but use the "insert space at end of string" approach, as this will frequently work for strings that get mis-detected when CONTROL-D is used over a larger block of text.

And that is really what started all this discussion: noting that "insert space at end of string" sometimes worked correctly for chapter references separated by commas, and not semi-colons, where using CONTROL-D over a larger block of text will lead to misdetection of that same string. And I think it has to do with how the detection algorithm handles commas. It may well be that the fact that it works at all with "insert space at the end of string" with commas instead of semi-colons was unintended. And it may also be that this is one of those "if it ain't broke, don't fix it" situations. I'm happy that I can frequently correctly detect a lengthy string, where the verses are separated by commas and not semicolons, using the "insert space at end of string" approach, without having to manually change the commas to semicolons. I'd hate for a "fix" to "break" that "feature." I'm just noting it for the record, so that if this is ever written up in more detail for a manual, it can be properly noted that there are times when detection will work with commas separating the chapter references.

Basil
csterg
Site Admin
Posts: 8627
Joined: Tue Aug 29, 2006 3:09 pm
Location: Corfu, Greece

Re: oddity with automatic detection

Post by csterg »

Can you pls try build 1315?
Costas
blcjr
Posts: 141
Joined: Wed Mar 14, 2012 4:56 pm

Re: oddity with automatic detection

Post by blcjr »

csterg wrote:
Can you pls try build 1315?
Costas
Got it. And initial impression is very favorable. I've tried it out on several instances where I would expect to find the "odd" behavior I described, doing a Control-D over multiple paragraphs, and so far, it has always accurately detected the chapter/verse pairs separated with commas.

I just finished chapter 8 of the 24-chapter book I'm working on, so I will have countless more occasions to put the latest beta through the wringer on this. I'll report back if I encounter any problems.

Nice work!

Basil
blcjr
Posts: 141
Joined: Wed Mar 14, 2012 4:56 pm

Re: oddity with automatic detection

Post by blcjr »

Well, a few verses into chapter 9, and I ran across a problem. Here is the block of text that I pasted into a topic:
6. εὐαγγελιζόμενοι καὶ θεραπεύοντες. Comp, ver. 2. Union of care for men’s bodies with care for their souls is characteristic of Christ and of Christian missions. The miraculous cures of the apostolic age have given place to the propagation of medical and sanitary knowledge, which is pursued most earnestly under Christian influences. For διήρχοντο see on 2:15, and for εὐαγγελιζόμενοι see on 2:10. Excepting Mk. 1:28, 16:20, 1 Cor. 4:17 πανταχοῦ occurs only here and three or four times in Acts: here it goes with both participles.
When I do Control-D, the "16:20" for Mark is not detected correctly; the "16" is detected as Mk. 1:16, and nothing is returned for the "20." However, inserting a space after the "20" leads to the correct result.

:?
csterg
Site Admin
Posts: 8627
Joined: Tue Aug 29, 2006 3:09 pm
Location: Corfu, Greece

Re: oddity with automatic detection

Post by csterg »

The problem here is the space in "1 Cor"...
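To illustrate why that space matters: after "16:20," the next thing in the text is a bare "1", which looks like one more number in Mark's list until the "Cor." is seen. The sketch below is not theWord's code; the patterns `BOOK_REF` and `FOLLOWING` and the function `find_refs` are assumptions for illustration. It shows one way around the ambiguity: match a numbered book name as a single unit, and accept a follow-on item only when it contains a colon.

```python
import re

# Hypothetical patterns: an optional leading "1 ", "2 " or "3 ", then a
# capitalized abbreviation with an optional period, then chapter:verse.
BOOK_REF = re.compile(r"\b((?:[1-3] )?[A-Z][a-z]+\.?) (\d+):(\d+)")
# A follow-on item in the same book: comma, then chapter:verse.
FOLLOWING = re.compile(r",\s*(\d+):(\d+)")

def find_refs(text):
    """Collect (book, chapter, verse) tuples from running text (a sketch)."""
    refs = []
    pos = 0
    while True:
        m = BOOK_REF.search(text, pos)
        if m is None:
            break
        book = m.group(1)
        refs.append((book, int(m.group(2)), int(m.group(3))))
        pos = m.end()
        # Keep consuming ", chapter:verse" items that belong to this book.
        while True:
            nxt = FOLLOWING.match(text, pos)
            if nxt is None:
                break
            refs.append((book, int(nxt.group(1)), int(nxt.group(2))))
            pos = nxt.end()
    return refs

print(find_refs("Excepting Mk. 1:28, 16:20, 1 Cor. 4:17 πανταχοῦ occurs only here"))
```

Here ", 1 Cor." fails the follow-on pattern because the "1" is not followed by a colon, so the list for Mk. ends at 16:20 and "1 Cor. 4:17" is picked up as a new book reference.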
Costas
blcjr
Posts: 141
Joined: Wed Mar 14, 2012 4:56 pm

Re: oddity with automatic detection

Post by blcjr »

csterg wrote:
The problem here is the space in "1 Cor"...
Costas
Got it. I figured it was something in the string that I wasn't seeing. Except for the rare case like that, the latest beta is almost always handling the chapter:verse pairs separated with commas much better than earlier versions. Now I only need to replace commas with semi-colons when a chapter is followed by multiple verses. The new build has really helped with the current project, which must have thousands of scripture references separated by commas rather than semi-colons.

As an aside, I just noted that TW will parse "Jon." as either John or Jonah, defaulting to John. The commentary I'm working on uses "Jn." for John, and "Jon." for Jonah. While you cannot anticipate all the different ways authors will abbreviate books of the Bible, it would seem to me that "Jon." would be more likely for Jonah, than for John. I like the way TW tries to be as flexible and comprehensive as possible, but here I would think Jonah ought to be the default.
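What I have in mind could be sketched as a lookup table with a default for each ambiguous abbreviation, plus an optional override for a particular source. This is purely illustrative: the table contents, `DEFAULT_ABBREVIATIONS`, and `resolve_book` are made-up names, and I have no idea how theWord stores its abbreviations internally.

```python
# Hypothetical table: each abbreviation maps to its candidate books,
# with the first candidate acting as the default.
DEFAULT_ABBREVIATIONS = {
    "Jn.": ["John"],
    "Jon.": ["John", "Jonah"],  # models the default described above
    "Lk.": ["Luke"],
}

def resolve_book(abbrev, overrides=None):
    """Return the book for an abbreviation, honoring per-source overrides."""
    if overrides and abbrev in overrides:
        return overrides[abbrev]
    candidates = DEFAULT_ABBREVIATIONS.get(abbrev)
    return candidates[0] if candidates else None

print(resolve_book("Jon."))                     # John
print(resolve_book("Jon.", {"Jon.": "Jonah"}))  # Jonah
```

With something like this, a commentary that consistently uses "Jn." for John and "Jon." for Jonah could be handled by a one-line override, whatever the built-in default is.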

Basil
Doctordavet
Posts: 354
Joined: Fri Aug 07, 2009 6:18 am

Re: oddity with automatic detection

Post by Doctordavet »

Could the user edit the abbreviation table for detecting references?
Dave
Free Modules at
http://www.DoctorDaveT.com