Posted: Thu Sep 12, 2019 2:00 am
by jonathangkoehn
This is a source-data issue, not a theWORD issue. When data is supplied by a publisher, it contains the publisher's content as they entered it. So if one module has a straight apostrophe ' and another has a curly one ’, for example, that can cause some search issues. The only way around it would be to normalize every module, which may go against the publisher's wishes, and what would count as "normal" anyway?

Now imagine dealing with other characters. In Greek lexicons it is common to normalize the final accent and write καί, but a Bible may contain both καὶ and καί, and this can change a search. That is why Costas has built in an ignore option, which can help. Sometimes we do want that level of specificity, so it helps as long as we know what it involves.
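To make the καί / καὶ point concrete, here is a minimal Python sketch of accent-insensitive matching. It only illustrates the general idea behind an "ignore accents" style of search; the function name strip_accents is made up for this example, and nothing here is taken from how theWord actually implements its option.

```python
import unicodedata

def strip_accents(text: str) -> str:
    """Decompose to NFD, then drop every combining mark (accents, breathings)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

acute = "\u03ba\u03b1\u03af"  # καί, final acute accent
grave = "\u03ba\u03b1\u1f76"  # καὶ, final grave accent

print(acute == grave)                                # False: different characters
print(strip_accents(acute) == strip_accents(grave))  # True: both reduce to και
```

The trade-off is the one mentioned above: once accents are stripped, a search can no longer tell the two forms apart, so this is only useful when that distinction is not wanted.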

To get a better idea of this whole searching process check out regex searches at ... syntax.htm
There are some very powerful search expressions. You should see what JG has come up with at times!

Posted: Thu Sep 12, 2019 2:15 am
by rdwray
Changing all text to one font eliminates the issue of apostrophe type; it is the encoding that changes the style. If you change the encoding from UTF-8 to ANSI, the style of the apostrophe will change. My experiment from a previous post:
I just changed the encoding of WEB bible and here is what I got for one verse:

Acts 1:6 Therefore, when they had come together, they asked him, “Lord, are you now restoring the kingdom to Israel?”
The character before "Lord" should be a quotation mark; the characters at the end should be a quotation mark and <CM>.

But this is what the text actually looks like in Notepad2:
Therefore, when they had come together, they asked him, “Lord, are you now restoring the kingdom to Israel?”

The issue is that something in TW is changing the encoding when it loads the module, possibly because some modules are encrypted.
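For reference, the garbled sequence in the Notepad2 excerpt has the shape you get when UTF-8 bytes are read as Windows-1252 ("ANSI" on many Windows systems). The Python sketch below reproduces that pattern on a short sample string. It is only an illustration of the byte-level effect; whether this is what actually happened inside theWord or Notepad2 is an assumption, not something confirmed in this thread.

```python
# Curly quotes around a word, as they appear in the WEB text.
original = "\u201cLord\u201d"

# Encode as UTF-8, then (wrongly) decode the same bytes as Windows-1252.
raw = original.encode("utf-8")
# Byte 0x9D has no character in Python's cp1252 table, so it is dropped here.
misread = raw.decode("cp1252", errors="ignore")

print(raw.hex(" "))  # e2 80 9c 4c 6f 72 64 e2 80 9d
print(misread)       # “Lordâ€
```

Each curly quote is three bytes in UTF-8, which is why one character turns into two or three when the bytes are read one at a time.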

Posted: Thu Sep 12, 2019 9:15 pm
by jonathangkoehn
Lord's and Lord’s look nearly the same, but the apostrophes in 's and ’s are entirely different characters. It is not a UTF-8 or ANSI issue, nor a font issue. The same goes for -, – and —, which look similar but are different characters. All of these can be found inside the data. Searching for and replacing every instance would not be useful, because sometimes Lord’s simply looks nicer than Lord's.
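A quick way to see that these really are different characters is to print their Unicode code points. This is a small Python sketch using only the standard library; the helper name describe is made up for the example.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# Straight apostrophe, curly apostrophe, hyphen, en dash, em dash.
for ch in ["'", "\u2019", "-", "\u2013", "\u2014"]:
    print(describe(ch))

# The two spellings are different strings, so an exact search treats them differently.
print("Lord's" == "Lord\u2019s")  # False
```

The code points differ regardless of font or encoding, which is why an exact search for one form does not find the other.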

The search engine could perhaps sort out all these differences and try to normalize them. However, that could be very time-consuming to program and could slow down searches. Also, which things should be normalized across a multitude of languages? It is better to keep it straightforward and powerful. If you want to search for Lord's versus Lord’s, you can, because the search engine is that powerful.
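To give a feel for what such normalization would involve, here is a rough Python sketch that folds a handful of lookalike characters before comparing. The table FOLD and the functions fold and find_folded are hypothetical names for this example, and the choice of characters is arbitrary; deciding what belongs in that table for every language is exactly the open question raised above. This is not how theWord searches.

```python
# Illustrative folding table: curly quotes and dashes mapped to plain ASCII.
FOLD = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # single curly quotes
    "\u201c": '"', "\u201d": '"',   # double curly quotes
    "\u2013": "-", "\u2014": "-",   # en dash, em dash
})

def fold(text: str) -> str:
    """Replace lookalike punctuation with its plain ASCII counterpart."""
    return text.translate(FOLD)

def find_folded(query: str, verses: list[str]) -> list[str]:
    """Return verses containing the query once both sides are folded."""
    needle = fold(query).lower()
    return [v for v in verses if needle in fold(v).lower()]

verses = ["This is the Lord\u2019s doing.", "The Lord's day.", "Lords of the Philistines."]
print(find_folded("lord's", verses))  # matches the first two verses only
```

Folding makes both spellings match, but it also removes the ability to search for one form specifically, and every search pays the cost of folding the text.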

Perhaps a suggestion for theWORD would be to pop up a notification, or change the background color of the search bar, when the user enters a character that does not come from the virtual keyboard. (That way we would at least know whether the character is in the data.)

Posted: Thu Sep 12, 2019 11:59 pm
by Jerol
For anyone whose word processor automatically converts straight apostrophes to curly ones: just type your word, say lord's, into theWord's search box with a straight apostrophe. Then type the same word into your word processor, where the apostrophe will become a curly one. Copy and paste that into theWord's search box alongside the word entered earlier and do an "or" search. theWord will find it either way. Works for me.
PS: What a fantastic program. I have 3 other Bible programs, and each one has some feature that I like better than theWord, but with theWord I find myself studying more. The program's simplicity and flexibility mean it stays in the background, doesn't get in the way, and lets me get on with my work.
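The "or" search described above can also be written as a single regular expression with a character class that accepts either apostrophe. The sketch below uses Python's re module to show the idea; whether theWord's regex search accepts exactly the same syntax is an assumption, so check its syntax page before relying on it.

```python
import re

# ['’] matches either the straight or the curly apostrophe.
pattern = re.compile("lord['\u2019]s", re.IGNORECASE)

samples = ["the Lord's day", "the Lord\u2019s doing", "the lords of the Philistines"]
for text in samples:
    print(text, "->", bool(pattern.search(text)))
```

The first two samples match and the third does not, which is the same result as the two-word "or" search without needing a word processor to produce the curly form.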