Discussion:
Why does LexAccessor::Flush reset startPos?
Eric Promislow
2013-10-24 22:51:53 UTC
Permalink
A customer sent us a single-line file that's taking a long time
to be processed by our UDL lexer (which reads in state-machines
for processing languages).

I noticed the lexer was spending much of its time in LexAccessor::Fill.
The UDL lexer calls styler.Flush() whenever it needs to access previous
states. I do this because I found if I don't call styler.Flush() I often
get zero values for recently set styles. However, I don't see why startPos
needs to be reset. I commented out that line of code, and don't see any
problems.

- Eric
--
You received this message because you are subscribed to the Google Groups "scintilla-interest" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scintilla-interest+***@googlegroups.com.
To post to this group, send email to scintilla-***@googlegroups.com.
Visit this group at http://groups.google.com/group/scintilla-interest.
For more options, visit https://groups.google.com/groups/opt_out.
Neil Hodgson
2013-10-25 00:40:59 UTC
Permalink
Post by Eric Promislow
I noticed the lexer was spending much of its time in LexAccessor::Fill.
The UDL lexer calls styler.Flush() whenever it needs to access previous states. I do this because I found if I don't call styler.Flush() I often get zero values for recently set styles. However, I don't see why startPos
needs to be reset.
That code was written for BufferAccess when that class was used in situations where the application was both reading and writing the document.

While it is possible for the application to change the document inside a SC_MOD_CHANGEFOLD or SC_MOD_CHANGESTYLE notification while styling is occurring, it shouldn't and lexers aren't written to handle this.

Since LexAccessor is now only used for lexers and the document should not change during lexing, it should be safe to retain the cached document text and not reset startPos.

Neil
--
You received this message because you are subscribed to the Google Groups "scintilla-interest" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scintilla-interest+***@googlegroups.com.
To post to this group, send email to scintilla-***@googlegroups.com.
Visit this group at http://groups.google.com/group/scintilla-interest.
For more options, visit https://groups.google.com/groups/opt_out.
Eric Promislow
2013-10-25 01:04:46 UTC
Permalink
Actually the bottleneck was in another spot, repeatedly calling strlen() on
a
750k-character line. This is in our UDL lexer, not a core scintilla lexer.

After I fixed that, benchmarks showed no difference with and without the
startPos reset in LexAccessor::Flush

- Eric
Post by Eric Promislow
Post by Eric Promislow
I noticed the lexer was spending much of its time in LexAccessor::Fill.
The UDL lexer calls styler.Flush() whenever it needs to access previous
states. I do this because I found if I don't call styler.Flush() I often
get zero values for recently set styles. However, I don't see why startPos
Post by Eric Promislow
needs to be reset.
That code was written for BufferAccess when that class was used in
situations where the application was both reading and writing the document.
While it is possible for the application to change the document inside
a SC_MOD_CHANGEFOLD or SC_MOD_CHANGESTYLE notification while styling is
occurring, it shouldn't and lexers aren't written to handle this.
Since LexAccessor is now only used for lexers and the document should
not change during lexing, it should be safe to retain the cached document
text and not reset startPos.
Neil
--
You received this message because you are subscribed to the Google Groups
"scintilla-interest" group.
To unsubscribe from this group and stop receiving emails from it, send an
Visit this group at http://groups.google.com/group/scintilla-interest.
For more options, visit https://groups.google.com/groups/opt_out.
--
You received this message because you are subscribed to the Google Groups "scintilla-interest" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scintilla-interest+***@googlegroups.com.
To post to this group, send email to scintilla-***@googlegroups.com.
Visit this group at http://groups.google.com/group/scintilla-interest.
For more options, visit https://groups.google.com/groups/opt_out.
Neil Hodgson
2013-10-25 21:08:57 UTC
Permalink
After I fixed that, benchmarks showed no difference with and without the startPos reset in LexAccessor::Flush
OK.

Since this could potentially cause lower performance and the change appears safe, the reset has been removed.
https://sourceforge.net/p/scintilla/code/ci/db03fb05359fda6d99bccfaf3694ba2b98bcfa35/

Neil
--
You received this message because you are subscribed to the Google Groups "scintilla-interest" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scintilla-interest+***@googlegroups.com.
To post to this group, send email to scintilla-***@googlegroups.com.
Visit this group at http://groups.google.com/group/scintilla-interest.
For more options, visit https://groups.google.com/groups/opt_out.
Loading...