Annotation encoding in C#
Misha Konvisar
2013-08-15 20:40:21 UTC
Hello everyone,

I'm having a minor issue.
I'm writing a plugin for Notepad++. The plugin adds annotations to lines
in the document.
The problem is that I can't display Cyrillic text in an annotation
when the document is in UTF-8 encoding.
When I switch the document to ANSI, I can see the Cyrillic annotation
text, but not the Cyrillic text in the document.
With UTF-8, I see only strange unreadable characters in the annotation box.

This is how I add the annotation box:
public static void AddCommentToLine(int position, string text)
{
    Encoding unicode = Encoding.Unicode;

    // Here I've tried different output encodings, e.g. .GetEncoding(1253),
    // but no result...
    Encoding encoder = Encoding.UTF8;

    string strEncoded = encoder.GetString(
        Encoding.Convert(unicode, encoder, unicode.GetBytes(text)));
    Win32.SendMessage(curScintilla, SciMsg.SCI_ANNOTATIONSETTEXT, position, strEncoded);
}

I had a similar problem when trying to read a line from the document, but
that was solved by treating Scintilla's output as UTF-8; after that my C#
code was able to work with Scintilla text correctly.

Has anybody faced this problem?
I think my problem is in transferring a Unicode string to the Scintilla editor.
Should I use some styles for annotations?

Thanks in advance!
You received this message because you are subscribed to the Google Groups "scintilla-interest" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scintilla-interest+***@googlegroups.com.
To post to this group, send email to scintilla-***@googlegroups.com.
Visit this group at http://groups.google.com/group/scintilla-interest.
For more options, visit https://groups.google.com/groups/opt_out.
Neil Hodgson
2013-08-16 00:01:42 UTC
Writing a plugin for Notepad++. This plugin is adding annotations to lines in document.
The problem is that I'm not able to display a Cyrillic text in annotation when I have my document in UTF encoding.
The encoding used for annotations is the same as for the document. For a document in UTF-8, the annotations should also be UTF-8.
Should I use some styles for annotations?
Yes, you should set the styles for annotations. First try the same style settings as the text being annotated.
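
The point that annotation bytes must match the document encoding can be illustrated without Scintilla. The following is a minimal sketch, not code from the thread: `AnnotationEncoding` and its `ansiFallback` parameter are hypothetical names. The value 65001 is Scintilla's `SC_CP_UTF8` constant, which a plugin would normally obtain by sending `SCI_GETCODEPAGE` to the editor.

```csharp
using System;
using System.Text;

public static class AnnotationEncoding
{
    // Scintilla's SC_CP_UTF8 constant; SCI_GETCODEPAGE returns it for UTF-8 documents.
    public const int ScCpUtf8 = 65001;

    // Hypothetical helper: choose the encoding for annotation bytes from the
    // document's code page. ansiFallback is whatever 8-bit encoding the caller
    // uses for non-UTF-8 documents (an assumption of this sketch).
    public static Encoding For(int documentCodePage, Encoding ansiFallback)
    {
        return documentCodePage == ScCpUtf8 ? Encoding.UTF8 : ansiFallback;
    }

    // Encode annotation text into the bytes the document expects.
    public static byte[] Encode(string text, int documentCodePage, Encoding ansiFallback)
    {
        return For(documentCodePage, ansiFallback).GetBytes(text);
    }
}
```

The same string therefore yields different bytes depending on the document: with code page 65001 the letter "Я" becomes the two bytes D0 AF, while an 8-bit fallback produces a single byte. Sending one form to a document that expects the other is what shows up as unreadable characters.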

Misha Konvisar
2013-08-16 08:33:50 UTC
Hi Neil,

thank you for the help.

I'm trying to set the annotation style this way:

// Before adding the annotation to the line, read the style at the position
// (I guess this should be a character position, not a line).
int style = (int)Win32.SendMessage(curScintilla, SciMsg.SCI_GETSTYLEAT, position, 0);

// Add the annotation to the line.
Win32.SendMessage(curScintilla, SciMsg.SCI_ANNOTATIONSETTEXT, position,

// Apply the saved style to the annotation.
Win32.SendMessage(curScintilla, SciMsg.SCI_ANNOTATIONSETSTYLE, position, style);

But my annotation is still unreadable.

When I switch the Npp encoding to ANSI, annotations are displayed correctly,
but Cyrillic text in the document is unreadable.

When the Npp encoding is switched to UTF-8, the situation is reversed.
In the Scintilla documentation I couldn't find any style message that
sets the annotation encoding.
Could you please clarify what "First try the same style settings as
the text being annotated" means? Is the SCI_GETSTYLEAT message a good way
to do that?

Thank you.
Post by Neil Hodgson
Writing a plugin for Notepad++. This plugin is adding annotations to lines in document.
The problem is that I'm not able to display a Cyrillic text in annotation
when I have my document in UTF encoding.
The encoding used for annotations is the same as for the document. For
a document in UTF-8, the annotations should also be UTF-8.
Here's an image of a UTF-8 file in SciTE showing Unicode annotations by
Should I use some styles for annotations?
Yes, you should set the styles for annotations. First try the same
style settings as the text being annotated.
zebrox
2013-08-16 13:00:16 UTC
Still no success.
I've defined AnnotationStyleId == 3 and set its character set to Cyrillic:

Win32.SendMessage(curScintilla, SciMsg.SCI_STYLESETCHARACTERSET,
AnnotationStyleId, (int)SciMsg.SC_CHARSET_CYRILLIC);

Then, in the code that adds the annotation, I convert the text to UTF-8,
add the annotation, and apply the AnnotationStyleId style to it.
But I still get those strange characters...

public static void AddCommentToLine(int position, string text)
{
    Encoding encSrc = Encoding.Unicode;
    Encoding encDest = Encoding.UTF8;
    string strEncoded = encDest.GetString(
        Encoding.Convert(encSrc, encDest, encSrc.GetBytes(text)));

    Win32.SendMessage(curScintilla, SciMsg.SCI_ANNOTATIONSETTEXT, position, strEncoded);
    Win32.SendMessage(curScintilla, SciMsg.SCI_ANNOTATIONSETSTYLE, position, AnnotationStyleId);
}

Now I'm a bit stuck...
Neil Hodgson
2013-08-16 13:21:27 UTC
Post by zebrox
Encoding encSrc = Encoding.Unicode;
Encoding encDest = Encoding.UTF8;
string strEncoded = encDest.GetString(Encoding.Convert(encSrc, encDest, encSrc.GetBytes(text)));
That looks confusing. Dump the bytes before and after conversion along with what you think the text should be.
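
A byte dump of the kind requested here can be produced with a small helper. This is a sketch under assumptions: `ByteDump`, `Describe`, and `RoundTrip` are hypothetical names, and the sample word in the note below is chosen only for illustration.

```csharp
using System;
using System.Text;

public static class ByteDump
{
    // Hypothetical helper: a label followed by the hex bytes of text
    // in the given encoding, e.g. "utf8: D0-94-D0-B0".
    public static string Describe(string label, Encoding encoding, string text)
    {
        byte[] bytes = encoding.GetBytes(text);
        return label + ": " + BitConverter.ToString(bytes);
    }

    // The conversion from the quoted code:
    // UTF-16 bytes -> UTF-8 bytes -> back to a string.
    public static string RoundTrip(string text)
    {
        byte[] utf16 = Encoding.Unicode.GetBytes(text);
        byte[] utf8 = Encoding.Convert(Encoding.Unicode, Encoding.UTF8, utf16);
        return Encoding.UTF8.GetString(utf8);
    }
}
```

For the word "Да", `Describe("utf16", Encoding.Unicode, "Да")` gives `utf16: 14-04-30-04` and `Describe("utf8", Encoding.UTF8, "Да")` gives `utf8: D0-94-D0-B0`. `RoundTrip("Да")` returns the same string it was given, so the conversion has no effect on the string itself; what reaches Scintilla is presumably decided by how the string argument is marshalled, not by this conversion.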

Misha Konvisar
2013-08-16 13:36:13 UTC
Hi Neil,

thanks for the answer. I already did that, but I don't know how to
interpret the results...

The code is as follows:
Encoding encSrc = Encoding.Unicode;
Encoding encDest = Encoding.UTF8;

text = "ÙÙÙÙ";
ShowBytes("before", encSrc, text);
string strEncoded = encDest.GetString(
    Encoding.Convert(encSrc, encDest, encSrc.GetBytes(text)));
ShowBytes("after", encDest, strEncoded);

and the ShowBytes outputs are:
Misha Konvisar
2013-08-16 14:11:20 UTC
public static void AddCommentToLine(int position, string text)
{
    // Error 1: direction of conversion
    Encoding encSrc = Encoding.UTF8;
    Encoding encDest = Encoding.Unicode;

    // Error 2: wrong procedure
    string strEncoded = encDest.GetString(encSrc.GetBytes(text));

    // Error 3: some tricks for passing managed strings to unmanaged code
    IntPtr strPtr = Marshal.StringToHGlobalUni(strEncoded);
    Win32.SendMessage(curScintilla, SciMsg.SCI_ANNOTATIONSETTEXT, position, strPtr);
}
So finally I got my annotations in Russian!

Thanks for help, Neil.
Dave Brotherstone
2013-08-16 15:20:47 UTC
Post by Misha Konvisar
public static void AddCommentToLine(int position, string text)
//Error 1, direction of conversion
Encoding encSrc = Encoding.UTF8;
Encoding encDest = Encoding.Unicode;
//Error 2, wrong procedure
string strEncoded = encDest.GetString(encSrc.GetBytes(text));
//Error 3, some tricks of passing managed strings to unmanaged code
IntPtr strPtr = Marshal.StringToHGlobalUni(strEncoded);
Win32.SendMessage(curScintilla, SciMsg.SCI_ANNOTATIONSETTEXT,
position, strPtr);
You shouldn't actually need to marshal the string when passing UTF-8 text
to Scintilla (only when you want Scintilla to fill a buffer for you, and
even then a StringBuilder with a reserved capacity is normally marshalled
correctly automatically). The problem was your encoding and subsequent
decoding of the string. Strings in C# are always UTF-16 (that is,
Encoding.Unicode). Whenever you have a string object, it is internally
encoded in UTF-16; there's no such thing as a C# "string" object that has
a different encoding. So when you call encSrc.GetBytes(text), you get the
UTF-8 bytes of the string. When you then pass those to
encDest.GetString(...), you hand over a UTF-8 byte sequence and ask for it
to be treated as UTF-16, which is then converted to a string object. This
obviously comes out as garbage. What you want to do is *just* convert the
string to UTF-8, then pass *those* bytes to Scintilla.
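
The mismatch described above can be reproduced without Scintilla. The sketch below is illustrative only: `EncodingMismatch` and its method names are hypothetical, and the sample word in the note is an assumption.

```csharp
using System;
using System.Text;

public static class EncodingMismatch
{
    // What the quoted code does: produce UTF-8 bytes, then decode them
    // as if they were UTF-16. Every two UTF-8 bytes become one code unit.
    public static string DecodeUtf8BytesAsUtf16(string text)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        return Encoding.Unicode.GetString(utf8);
    }

    // What a UTF-8 document needs: the UTF-8 bytes themselves,
    // with no decoding step afterwards.
    public static byte[] Utf8Bytes(string text)
    {
        return Encoding.UTF8.GetBytes(text);
    }
}
```

For the sample word "Привет", `Utf8Bytes` returns the 12 bytes D0-9F-D1-80-D0-B8-D0-B2-D0-B5-D1-82, while `DecodeUtf8BytesAsUtf16` returns a six-character string that starts with U+9FD0 and contains no Cyrillic at all.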

I don't know the signature of the Win32.SendMessage method, but I expect
the following would do what you're after.

byte[] utf8Text = Encoding.UTF8.GetBytes(text);
Win32.SendMessage(curScintilla, SciMsg.SCI_ANNOTATIONSETTEXT, position, utf8Text);

Depending on the signature, you might need to cast it to something.

Hope that helps,

PS If you've not seen it before,
http://www.joelonsoftware.com/articles/Unicode.html is a great article on
how all this unicode/utf8/utf16 stuff fits together.
Misha Konvisar
2013-08-16 22:44:35 UTC
Hi Dave,

thanks for the comment and the interesting link.
Your suggestion works. No need for strange conversions.

But there were two things I still had to do:
1. Add a terminating zero to the original string, as Scintilla was
displaying random characters at the end of the annotation.
2. As I don't have an overload of Win32.SendMessage that accepts byte[] as
the fourth parameter, I had to obtain an IntPtr pointer to the UTF-8 byte[]
array.

So, the final code looks like this:
public static void AddCommentToLine(int position, string text)
{
    // Add a terminating zero to the string.
    text += char.MinValue;
    byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);

    IntPtr unmanagedPointer = Marshal.AllocHGlobal(utf8Bytes.Length);
    Marshal.Copy(utf8Bytes, 0, unmanagedPointer, utf8Bytes.Length);
    Win32.SendMessage(curScintilla, SciMsg.SCI_ANNOTATIONSETTEXT, position, unmanagedPointer);
}
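
The final code allocates unmanaged memory with Marshal.AllocHGlobal but never frees it. A minimal sketch of one way to tidy this up follows. It assumes Scintilla copies the annotation text during the call, so the buffer can be released afterwards; `Utf8Interop`, `ToNullTerminatedUtf8`, and `WithUnmanagedCopy` are hypothetical names. The actual SendMessage call is left to the caller's delegate, because the thread does not show the signature of `Win32.SendMessage`.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class Utf8Interop
{
    // Encode text as UTF-8 and append the single zero byte that C APIs expect.
    public static byte[] ToNullTerminatedUtf8(string text)
    {
        int length = Encoding.UTF8.GetByteCount(text);
        byte[] bytes = new byte[length + 1];   // the last byte stays 0
        Encoding.UTF8.GetBytes(text, 0, text.Length, bytes, 0);
        return bytes;
    }

    // Copy bytes into unmanaged memory, run the callback with the pointer,
    // and free the memory even if the callback throws.
    public static void WithUnmanagedCopy(byte[] bytes, Action<IntPtr> use)
    {
        IntPtr buffer = Marshal.AllocHGlobal(bytes.Length);
        try
        {
            Marshal.Copy(bytes, 0, buffer, bytes.Length);
            use(buffer);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

Inside the plugin this might be called as `Utf8Interop.WithUnmanagedCopy(Utf8Interop.ToNullTerminatedUtf8(text), ptr => Win32.SendMessage(curScintilla, SciMsg.SCI_ANNOTATIONSETTEXT, position, ptr));`, assuming an overload of `Win32.SendMessage` that takes an IntPtr as its last argument.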