trinity-devel@lists.pearsoncomputing.net

Month: March 2016

Re: Re: Re: Re: TQString::fromUtf8 vs TQString::fromLatin1 [possible bug in parseVCard]

From: deloptes <deloptes@...>
Date: Mon, 28 Mar 2016 00:00:56 +0200
Fat-Zer wrote:

> 2016-03-27 15:07 GMT+03:00 deloptes
> <deloptes@...>:
>> Fat-Zer wrote:

>>>
>>> diff --git a/tdeabc/vcardparser/vcardparser.cpp b/tdeabc/vcardparser/vcardparser.cpp
>>> index 7ac07ce..db33263 100644
>>> --- a/tdeabc/vcardparser/vcardparser.cpp
>>> +++ b/tdeabc/vcardparser/vcardparser.cpp
>>> @@ -152,7 +152,7 @@ VCard::List VCardParser::parseVCards( const TQString& text )
>>>              KCodecs::quotedPrintableDecode( input, output );
>>>            }
>>>          } else {
>>> -          output = TQCString(value.latin1());
>>> +          output = TQCString(value.utf8());
>>>          }
>>>
>>>          if ( params.findIndex( "charset" ) != -1 ) { // have to convert the data
>>>
>>> Note that VCardParser::parseVCards() is generally encoding-unsafe...
>>>
>>
>> Yes, I also looked into this; I closed the file less than 60 seconds later
>> because I started getting a headache.
>> I don't understand what this diff means: is it how the code should be, or
>> is it something from the history of the file?
> 
> Yes, it is how it should be: a fix for tdelibs. But note that if the vCard
> has a field in a different encoding (e.g. the "charset" parameter is set),
> the code will likely fail... To fix it completely, changes to the whole API
> are required (pass a TQByteArray to the parser rather than a
> TQString).
> 

You mean a charset other than UTF-8?
But this "else" covers the case where no charset+encoding is specified, so
it should really default to UTF-8 (IMO).
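
To make the difference between the two lines of the diff concrete, here is a
minimal standalone sketch (plain C++, no TQt). toLatin1() and toUtf8() are
made-up stand-ins for TQString::latin1() and TQString::utf8(); I am assuming
that the Latin-1 conversion replaces everything above U+00FF with '?', which
is how I understand the Qt3-style behaviour, but I have not checked the
TQt sources.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for TQString::latin1(): one byte per character.
// Assumption: characters above U+00FF cannot be represented and become '?'.
std::string toLatin1(const std::u32string& s) {
    std::string out;
    for (char32_t c : s)
        out += (c <= 0xFF) ? static_cast<char>(c) : '?';
    return out;
}

// Hypothetical stand-in for TQString::utf8(): plain UTF-8 encoding.
std::string toUtf8(const std::u32string& s) {
    std::string out;
    for (char32_t c : s) {
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else if (c < 0x800) {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        } else if (c < 0x10000) {
            out += static_cast<char>(0xE0 | (c >> 12));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (c >> 18));
            out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

With this, a Cyrillic letter such as U+0416 survives toUtf8() as two bytes
but collapses to a single '?' in toLatin1(), so anything outside Latin-1
would be lost before the parser even looks at the charset parameter.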

>>>
>>> PS, some notes about your code:
>>>> std::string data_str(data.utf8(),data.utf8().length());
>>>>        SE_LOG_DEBUG(getDisplayName(), "TDE addressbook ENTRY AFTER\n%s\n",data_str.c_str() );
>>> Note that there is no need to create an intermediate
>>> std::string here; the following code should work by itself:
>>>
>>> SE_LOG_DEBUG(getDisplayName(), "TDE addressbook ENTRY AFTER\n%s\n",data.utf8() );
>>>
>>> If not, just cast it to (const char *).
>>>
>>
>> Thank you, you are voicing some of my own thoughts. I think I have also
>> tested the above, but I am not sure anymore.
>>
>>>>TDEABC::Addressee addressee = converter.parseVCard(item.c_str());
>>> This is equivalent to fromLatin1(), so it will work only for your
>>> locale...
>>>
>>
>> I'm not sure I understand this well. fromLatin1 means I use ISO-8859-1,
>> but I use UTF-8. It is also obvious that the input (item) is received in
>> UTF-8. I had a different experience when using TQString::fromLatin1 ()
> 
> Here you implicitly use QString (const char*), which is equivalent
> to QString::fromAscii (), which in turn is equivalent to fromLatin1 () as
> long as you don't set QTextCodec::codecForCStrings(). So the code will
> likely fail if the input is in some other encoding.
> 

You mean a charset other than UTF-8?

I'm not sure, because I observed some strange behavior and just played around
with the test code until it worked. The most frustrating part was seeing
everything look fine in the test program and then finding it mangled in the
address book after a sync. So I did some testing on parseVCard until I found
out that it works best when passing c_str(). I tried all the options that
were mentioned in this thread or in syncevolution.

Thanks for the explanation above. It is a pity I do not have more time to
track this further, but I still do not understand what you mean when you say
it is equivalent to .... and that the code will fail if an encoding is set.
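
For the todo note, here is how I currently read the fromLatin1() remark, as a
minimal standalone sketch (plain C++, no TQt). decodeLatin1() and
decodeUtf8() are made-up helpers that only imitate what fromLatin1() and
fromUtf8() are supposed to do, and the UTF-8 decoder assumes well-formed
input.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for fromLatin1(): every byte becomes one character.
std::u32string decodeLatin1(const std::string& bytes) {
    std::u32string out;
    for (unsigned char b : bytes)
        out += static_cast<char32_t>(b);
    return out;
}

// Hypothetical stand-in for fromUtf8(): minimal decoder without validation.
std::u32string decodeUtf8(const std::string& bytes) {
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        char32_t cp;
        int extra;  // number of continuation bytes that follow the lead byte
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else                         { cp = b & 0x07; extra = 3; }
        ++i;
        for (int k = 0; k < extra && i < bytes.size(); ++k, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
        out += cp;
    }
    return out;
}
```

Feeding the UTF-8 bytes of "Müller" (M, 0xC3, 0xBC, l, l, e, r) through
decodeLatin1() gives seven characters, with the "ü" split into two Latin-1
characters, while decodeUtf8() gives the expected six. Pure ASCII input comes
out the same either way, which may be why a test can look fine by accident.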

In vCard 2.1 you have the option to specify charset+encoding.
In vCard 3.0 it looks like the default is UTF-8, and I have not tested
charset+encoding.

The code I produced works with what comes from syncevolution and offers
vCard 3.0, so we automatically receive UTF-8 input. Perhaps I should test
with vCard 2.1. Or, better, someone else should, but this is a good point
for a todo note.
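
As a starting point for such a vCard 2.1 test, a rough standalone sketch of
the quoted-printable step (plain C++, no TQt). The real parser uses
KCodecs::quotedPrintableDecode(), as seen in the diff; decodeQuotedPrintable()
below is only a made-up helper so that the expected bytes can be checked by
hand, and the sample value is invented, not taken from real sync data.

```cpp
#include <cassert>
#include <string>

// Returns the value of a hex digit, or -1 if c is not a hex digit.
int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Hypothetical helper: decodes "=XX" escapes and drops soft line breaks
// ("=" followed directly by a line ending). Malformed escapes are kept as-is.
std::string decodeQuotedPrintable(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '=') { out += in[i]; continue; }
        if (i + 1 < in.size() && in[i + 1] == '\n') { i += 1; continue; }
        if (i + 2 < in.size() && in[i + 1] == '\r' && in[i + 2] == '\n') {
            i += 2;
            continue;
        }
        if (i + 2 < in.size()) {
            int hi = hexValue(in[i + 1]);
            int lo = hexValue(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);
                i += 2;
                continue;
            }
        }
        out += '=';  // not a valid escape, keep the character literally
    }
    return out;
}
```

A 2.1 property like N;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:M=C3=BCller
should decode to the bytes M, 0xC3, 0xBC, l, l, e, r, and only after that
does the charset decide how those bytes are turned into a string.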

thanks again, appreciated
regards