trinity-devel@lists.pearsoncomputing.net

Month: March 2016

Re: [trinity-devel] Re: Re: Re: Re: TQString::fromUtf8 vs TQString::fromLatin1 [possible bug in parseVCard]

From: Fat-Zer <fatzer2@...>
Date: Mon, 28 Mar 2016 09:47:58 +0300
2016-03-28 1:00 GMT+03:00 deloptes <deloptes@...>:
> Fat-Zer wrote:
>
>> 2016-03-27 15:07 GMT+03:00 deloptes
>> <deloptes@...>:
>>> Fat-Zer wrote:
>
>>
>> Yes, it is how it should be, a fix for tdelibs. But note if vcard will
>> have a field in a different encoding (e.g. "charset" parameter is set)
>> the code will likely fail... To fix it completely the whole api
>> changes are required (pass TQByteArray to the parser rather than
>> TQString).
>>
>
> You mean charset different than UTF-8?
> But this "else" refers to the case when no charset+encoding is specified, so
> it should really default to UTF (IMO)
>

This "else" refers to the case when no "encoding" is specified. The charset
is handled later. It seems it will be handled correctly if both encoding
and charset are specified, but it will be wrong if only charset is
set.
Here is the general mistake: QString is used in those functions as a
container for a sequence of bytes with an undefined charset, which is
generally wrong.
It works because QString internally consists of two independent data
sets: a zero-terminated const char* for fast return from ascii() or
latin1() (in case the string is latin1) and a QChar[]. But this is a
very bad practice...

So to make the code work in its current state, obscure and
unintuitive code is required:
 converter.parseVCard( TQString::fromLatin1(str.utf8()) );
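
To illustrate why this trick keeps the bytes intact, here is a minimal
standalone sketch in plain C++ (no TQt involved). The helpers
wrapBytesAsLatin1() and unwrapLatin1() are hypothetical stand-ins for
what fromLatin1() and latin1() do conceptually, not the real
implementation: each byte becomes one 16-bit unit with the same value,
so the original UTF-8 bytes can be recovered unchanged.

```cpp
#include <string>

// Hypothetical stand-in for a Latin-1 decode: every input byte becomes
// exactly one 16-bit code unit with the same numeric value.
std::u16string wrapBytesAsLatin1(const std::string &bytes) {
    std::u16string out;
    out.reserve(bytes.size());
    for (unsigned char b : bytes) {
        out.push_back(static_cast<char16_t>(b));
    }
    return out;
}

// Hypothetical stand-in for a Latin-1 encode: code units up to 0xFF map
// back to a single byte, anything larger is replaced with '?'.
std::string unwrapLatin1(const std::u16string &s) {
    std::string out;
    out.reserve(s.size());
    for (char16_t c : s) {
        if (c <= 0xFF) {
            out.push_back(static_cast<char>(static_cast<unsigned char>(c)));
        } else {
            out.push_back('?');
        }
    }
    return out;
}
```

With this, the two UTF-8 bytes of "é" (0xC3 0xA9) are stored as two
separate units and come back as the same two bytes. The string is only
a byte container at that point, not readable text.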

The correct solution is to change the API so that parseVCard accepts a
QByteArray rather than a QString. Also note that this work was started
back in KDE times: see the KABC_VCARD_ENCODING_FIX ifdefs in tdepim...

>>
>> Here you implicitly use QString (const char*) which is an equivalent
>> to QString::fromAscii (), which is equivalent of fromLatin1 () as far
>> as you don't set QTextCodec::codecForCStrings(). So the code will
>> likely fail if it will have some other encoding.
>>
>
> You mean charset different than UTF-8?
>
> I'm not sure because I observed some strange behavior - just played around
> with the test code until it worked. The most frustrating was to see all
> looks fine in the test program and after sync it was mangled in the address
> book. So I did some testing on the parseVCard until I found out it works
the best when passing c_str(). I tried all options that were highlighted in
> the thread here or in syncevolution.
>
> Thanks for explanation on the above. I think it is pity I do not have more
> time to track it further, but I still do not understand when you say it is
> equivalent to .... and the code will fail if encoding is set.

"Equivalent" here means that the following code will have exactly the
same results:
converter.parseVCard(item.c_str());
converter.parseVCard(TQString(item.c_str()));
converter.parseVCard(TQString::fromAscii(item.c_str()));
// if you haven't set TQTextCodec::codecForCStrings() explicitly
converter.parseVCard(TQString::fromLatin1(item.c_str()));
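
To show what the Latin-1 interpretation does to UTF-8 input, here is a
small standalone sketch in plain C++ (again no TQt). decodeLatin1() and
decodeUtf8Bmp() are hypothetical helpers written for this example; the
UTF-8 decoder is deliberately minimal and only handles 1- to 3-byte
sequences, emitting U+FFFD for anything else.

```cpp
#include <cstddef>
#include <string>

// Latin-1 decode: one byte in, one 16-bit code unit out.
std::u16string decodeLatin1(const std::string &in) {
    std::u16string out;
    for (unsigned char b : in) {
        out.push_back(static_cast<char16_t>(b));
    }
    return out;
}

// Minimal UTF-8 decode for 1- to 3-byte sequences (BMP only).
// Continuation bytes are not validated; unsupported or truncated
// sequences yield U+FFFD.
std::u16string decodeUtf8Bmp(const std::string &in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned b0 = static_cast<unsigned char>(in[i]);
        if (b0 < 0x80) {
            out.push_back(static_cast<char16_t>(b0));
            i += 1;
        } else if ((b0 & 0xE0) == 0xC0 && i + 1 < in.size()) {
            unsigned b1 = static_cast<unsigned char>(in[i + 1]);
            out.push_back(static_cast<char16_t>(((b0 & 0x1F) << 6) | (b1 & 0x3F)));
            i += 2;
        } else if ((b0 & 0xF0) == 0xE0 && i + 2 < in.size()) {
            unsigned b1 = static_cast<unsigned char>(in[i + 1]);
            unsigned b2 = static_cast<unsigned char>(in[i + 2]);
            out.push_back(static_cast<char16_t>(
                ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)));
            i += 3;
        } else {
            out.push_back(static_cast<char16_t>(0xFFFD));
            i += 1;
        }
    }
    return out;
}
```

For the UTF-8 bytes of "André", decodeUtf8Bmp() gives five units ending
in U+00E9, while decodeLatin1() gives six units ending in U+00C3 U+00A9,
which is the mangled text you would see in the address book.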

>
> In vCard 2.1 you have the option to specify charset+encoding
> In vCard 3.0 it looks like it is default to UTF and I've not tested
> charset+encoding
>
> The code I produced operates based on what is coming from syncevolution and
> offers vCard 3.0, so we receive automatically UTF input. Perhaps I should
> test with vCard 2.1. Or better someone else, but this is good point to make
> a todo note.

2016-03-28 1:09 GMT+03:00 deloptes <deloptes@...>:
>
> BTW did you raise a bug to fix this?
No, I haven't...