Wrong character encoding after editing

RubioTerra
Posts: 732
Joined: Wed Sep 23, 2009 5:13 pm
Location: Brasília, Brazil

Post by RubioTerra »

This problem was reported in the Portuguese section of the forum, and I couldn't answer it. The user wants to edit a dictionary module by adding information to existing topics. But whenever he adds text containing diacritics (and Portuguese has quite a few), the text looks OK at first, yet all accented characters become garbled once he leaves the topic and comes back.

I attach a sample of the module here. To reproduce the problem, add any non-ASCII text to a topic, switch to another topic, and then go back. All non-ASCII text, including text that existed before the edit, becomes garbled. I checked the RTF: it has no font tables, and all non-ASCII characters are encoded as \uXXXX?.
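For readers unfamiliar with the \uXXXX? notation: in RTF, the number after \u is a decimal code point (written as a signed 16-bit value), and the character that follows is a fallback for readers that do not understand Unicode. Below is a minimal Python sketch of decoding such escapes. The function name `decode_rtf_unicode` is made up for illustration, it assumes the default one-character fallback, and it is not how theWord itself processes RTF.

```python
import re

# Matches \uN followed by a one-character "?" fallback, e.g. \u237?
# (assumes the default \uc1 setting, i.e. exactly one fallback character).
_U_ESCAPE = re.compile(r"\\u(-?\d+) ?\?")


def decode_rtf_unicode(text):
    """Replace RTF \\uN? escapes with the characters they stand for."""

    def replace(match):
        code = int(match.group(1))
        if code < 0:
            # RTF stores the value as a signed 16-bit integer.
            code += 65536
        return chr(code)

    return _U_ESCAPE.sub(replace, text)


print(decode_rtf_unicode(r"Bras\u237?lia"))  # Brasília
```

If the escapes in a topic decode to the expected letters this way, the RTF itself is probably intact and the garbling happens elsewhere.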
Attachments
Almeida.7z
(54.24 KiB) Downloaded 304 times
Rúbio R. C. Terra
Brasília/DF - Brasil
JG
Posts: 4604
Joined: Wed Jun 04, 2008 8:34 pm

Re: Wrong character encoding after editing

Post by JG »

It appears that the data in the sample database is in UTF-16le encoding. I am not sure whether SQLite or theWord is supposed to deal with this. A workaround appears to be to drag and drop the topics into a new user module within theWord. However, I would wait for Costas's more informed reply.
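For anyone who wants to check a module's encoding outside theWord, here is a minimal Python sketch that asks SQLite directly through its `PRAGMA encoding` statement. The helper name `database_encoding` and the file name in the usage line are assumptions for illustration; substitute the path of the unpacked module.

```python
import sqlite3


def database_encoding(path):
    """Return the text encoding SQLite reports for the database file."""
    conn = sqlite3.connect(path)
    try:
        # SQLite answers with "UTF-8", "UTF-16le" or "UTF-16be".
        return conn.execute("PRAGMA encoding").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical file name; use the module extracted from the attachment.
    print(database_encoding("Almeida.dct.twm"))
```

Note that `sqlite3.connect` creates an empty database if the path does not exist, so a mistyped path will simply report the default encoding.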
Jon
theWord 6 Bible Software
OS for testing; Windows 10
RubioTerra
Posts: 732
Joined: Wed Sep 23, 2009 5:13 pm
Location: Brasília, Brazil

Re: Wrong character encoding after editing

Post by RubioTerra »

Thanks, JG. This workaround does away with the problem. I'll copy all topics to a new module and send it to the user that reported the problem.
Rúbio R. C. Terra
Brasília/DF - Brasil
csterg
Site Admin
Posts: 8627
Joined: Tue Aug 29, 2006 3:09 pm
Location: Corfu, Greece

Re: Wrong character encoding after editing

Post by csterg »

theWord will only deal with UTF-8 databases. This is a system error with the db itself; I don't plan to change it.
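Since SQLite fixes a database's encoding when the file is first created, one way to get a UTF-8 copy outside theWord is to dump the UTF-16 file and load the dump into a fresh database. The Python sketch below is only an illustration of that idea: the function name `copy_as_utf8` is made up, it has not been tested against theWord modules, and it is not the drag-and-drop workaround described above. Work on a backup copy.

```python
import sqlite3


def copy_as_utf8(src_path, dst_path):
    """Copy schema and data from src_path into a new UTF-8 database.

    dst_path must not exist yet: the encoding can only be chosen
    before the first table is created.
    """
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        # UTF-8 is already SQLite's default; stated here to be explicit.
        dst.execute("PRAGMA encoding = 'UTF-8'")
        # iterdump() yields the SQL statements that recreate the source
        # database; Python hands the text over as str, so the new file
        # stores it in its own (UTF-8) encoding.
        dst.executescript("\n".join(src.iterdump()))
    finally:
        src.close()
        dst.close()
```

Running `PRAGMA encoding` on the new file afterwards should report UTF-8.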
Costas
RubioTerra
Posts: 732
Joined: Wed Sep 23, 2009 5:13 pm
Location: Brasília, Brazil

Re: Wrong character encoding after editing

Post by RubioTerra »

That's alright. It's rare to find UTF-16 content anyway.
Rúbio R. C. Terra
Brasília/DF - Brasil