Discussion:
Lexing table keys in Lua Lexer
Paul K
2013-07-15 17:34:56 UTC
Neil:

One of the users of my Scintilla-based IDE noticed that "text" is not
styled as a string on line three in the following Lua fragment:

a={}; a['text']=true
a={['text']=true}
a={text=true} --<-- not styled

Is it something that can/should be fixed in the Lua lexer, or would you
prefer to keep it the way it is? One (small) advantage for me is that I
skip some of the processing for identifiers that appear in strings, so this
would allow me to handle it correctly if "text" were marked as "string"
as well. Thank you.

Paul.
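
For reference, the three spellings in the fragment above all store the same entry. A minimal sketch that can be run to confirm it (the names t1, t2 and t3 are illustrative only, not part of the original fragment):

```lua
-- Three ways of writing the same table entry.
-- The names t1, t2 and t3 are illustrative only.
local t1 = {}; t1['text'] = true   -- index assignment with a string key
local t2 = {['text'] = true}       -- constructor with a bracketed string key
local t3 = {text = true}           -- constructor with a name as the key

-- Each table ends up with the string "text" as its only key.
for _, t in ipairs({t1, t2, t3}) do
  assert(t.text == true and t['text'] == true)
  assert(next(t) == 'text')
end
print('all three forms store the key "text"')
```

At run time the forms are indistinguishable; the question in this thread is only about how the source text should be styled.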
--
You received this message because you are subscribed to the Google Groups "scintilla-interest" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scintilla-interest+***@googlegroups.com.
To post to this group, send email to scintilla-***@googlegroups.com.
Visit this group at http://groups.google.com/group/scintilla-interest.
For more options, visit https://groups.google.com/groups/opt_out.
Neil Hodgson
2013-07-15 22:19:35 UTC
Post by Paul K
> a={}; a['text']=true
> a={['text']=true}
> a={text=true} --<-- not styled
> Is it something that can/should be fixed in the Lua lexer, or would you prefer to keep it the way it is?

The last line is similar to a.text, where the piece after the dot has to look like an identifier. I see these as more similar to identifiers than strings, so they should be coloured as identifiers.

Post by Paul K
> One (small) advantage for me is that I skip some of the processing for identifiers that appear in strings, so this would allow me to handle it correctly if "text" were marked as "string" as well.

If someone thinks it's important enough to implement as an option, it can be included, but it's not how most people think about this. Many languages have secondary mechanisms to access entities through strings: should everything be coloured as a string?

Neil
Paul K
2013-07-15 23:19:22 UTC
Post by Neil Hodgson
> The last line is similar to a.text where the piece after the dot has to
> look like an identifier. I see these as more similar to identifiers than
> strings so they should be coloured as identifiers.

I tend to agree with that, although this last part only *looks* like an
identifier. Some Lua lexers/parsers (for example, David Manura's parser:
https://github.com/davidm/lua-parser-loose) do identify these as
"String" event: 'String', name - string or table field.

Post by Neil Hodgson
> If someone thinks it's important enough to implement as an option, it can
> be included but it's not how most people think about this. Many languages
> have secondary mechanisms to access entities through strings: should
> everything be coloured as a string?

Possibly. In a strict sense I think it should be identified as a string,
but I'm not the one to make a patch for it unfortunately.

Also, I agree that it goes against what most people would expect:

foo = 123 --<-- 'foo' is an identifier
_G.foo = 456 --<-- 'foo' is a string??

Paul.
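
The two lines above do refer to the same entry in the globals table, which is what makes the colouring question awkward. A small sketch, assuming the default global environment (where _G is the table that holds globals):

```lua
-- Assumes the default global environment, where _G is the globals table.
foo = 123                      -- global assignment by name
assert(_G.foo == 123)          -- same entry via dot syntax
assert(_G['foo'] == 123)       -- same entry via an explicit string key

_G['foo'] = 456                -- write through the string key...
assert(foo == 456)             -- ...and read it back as a plain name
print(foo)
```

Styling by run-time meaning would have to treat all three spellings alike, including the plain global name on the first line.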
KHMan
2013-07-16 01:23:32 UTC
Post by Paul K
> Post by Neil Hodgson
>> The last line is similar to a.text where the piece after the
>> dot has to look like an identifier. I see these as more similar to
>> identifiers than strings so they should be coloured as identifiers.
> I tend to agree with that, although this last part only *looks*
> like an identifier. Some Lua lexers/parsers (for example, David
> Manura's parser: https://github.com/davidm/lua-parser-loose) do
> identify these as "String" event: 'String', name - string or table
> field.
> Post by Neil Hodgson
>> If someone thinks it's important enough to implement as an
>> option, it can be included but it's not how most people think about
>> this. Many languages have secondary mechanisms to access entities
>> through strings: should everything be coloured as a string?
> Possibly. In a strict sense I think it should be identified as a
> string, but I'm not the one to make a patch for it unfortunately.

IMHO, in a strict sense it is lexically an identifier. Then syntax
sugar kicks in and makes the string equivalence.

For a={text=true}, 'text=true' is supposed to look like an
assignment, hence it should be lexed like an assignment. Our
expectation when seeing this pattern is that the RHS is a value or
expression and the LHS is an assignable LHS object, just as it has
always been done everywhere.
Post by Paul K
> foo = 123 --<-- 'foo' is an identifier
> _G.foo = 456 --<-- 'foo' is a string??

Developers have been using the dot syntax in structs and objects
for a very long time. 'foo' as a string in _G.foo would just be
totally weird. You have locked onto an idea, but IMHO this
idea is unsound.

Put it another way: it is wrong to make the colour of the
syntax sugar version of a syntax look more like the
actual-executed syntax, because the syntax sugar is supposed to
look like a common syntax pattern, but one that is _different_
from the actual-executed syntax. So those syntaxes are supposed to
evoke assignment and object member syntax patterns, _not_ table
lookup with a string. Meeting halfway would be... ewwwwwwwww :-0 ;-p
--
Cheers,
Kein-Hong Man (esq.)
Kuala Lumpur, Malaysia
Philippe Lhoste
2013-07-16 11:13:36 UTC
Post by Paul K
> One of the users of my Scintilla based IDE noticed that "text" is not styled as a string
> a={}; a['text']=true
> a={['text']=true}
> a={text=true} --<-- not styled
> Is it something that can/should be fixed in the Lua lexer, or would you prefer to keep it
> the way it is? One (small) advantage for me is that I skip some of the processing for
> identifiers that appear in strings, so this would allow me to handle it correctly
> if "text" were marked as "string" as well. Thank you.

I agree it must *not* be styled as text, for the good reasons already given, and more.
If you write:

a = { for=true }

'for' will not be highlighted as an identifier, but as a keyword, with good reason: the
expression isn't valid.

Besides, changing this behavior to the one you wish would make the lexer more complex
without satisfying all users... :-)
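
The reserved-word restriction can be checked directly. A hedged sketch: the helper name compiles is made up for illustration, and "loadstring or load" is only there so the same text runs on Lua 5.1 as well as 5.2 and later:

```lua
-- load() accepts a source string in Lua 5.2+; Lua 5.1 uses loadstring().
local compile = loadstring or load

-- Hypothetical helper: returns true if the source text compiles as a chunk.
local function compiles(src)
  return compile(src) ~= nil
end

assert(compiles("a = { text = true }"))        -- ordinary name: accepted
assert(not compiles("a = { for = true }"))     -- reserved word: syntax error
assert(compiles("a = { ['for'] = true }"))     -- string key: accepted
assert(not compiles("a = {}; a.for = true"))   -- dot syntax has the same limit
```

So the position before '=' in a constructor accepts a name, not an arbitrary string, and a keyword there is still a keyword.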
--
Philippe Lhoste
-- (near) Paris -- France
-- http://Phi.Lho.free.fr
-- -- -- -- -- -- -- -- -- -- -- -- -- --
Paul K
2013-07-17 18:15:34 UTC
Philippe, Kein-Hong Man, thank you for the comments.

Post by Philippe Lhoste
> Besides, changing this behavior to the one you wish would make the lexer
> more complex without satisfying all users... :-)

I agree with the reasons for not doing this, although it does get confusing
at times: a.b "looks" like an identifier even though it's not (from the
manual: "The language supports this representation by providing a.name as
syntactic sugar for a["name"]").

Paul
Jorge Visca
2013-07-19 01:24:19 UTC
Post by Paul K
> Philippe, Kein-Hong Man, thank you for the comments.
> Post by Philippe Lhoste
>> Besides, changing this behavior to the one you wish would make the lexer
>> more complex without satisfying all users... :-)
> I agree with the reasons for not doing this, although it does get
> confusing at times: a.b "looks" like an identifier even though it's not
> (from the manual: "The language supports this representation by providing
> a.name as syntactic sugar for a["name"]").

But not just any string: it must respect the restrictions on identifier naming.
It's a "user space" identifier :)
Perhaps this sort of thing should be handled by a "known field" indicator.
I've seen something to that effect somewhere, but don't remember where...
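
A rough sketch of that restriction, for illustration only: is_name is a hypothetical helper, not something from Scintilla or any existing library. It assumes the C locale (so %a matches ASCII letters only) and treats "goto" as reserved, which is true from Lua 5.2 onwards but not in 5.1:

```lua
-- Reserved words of Lua 5.1, plus "goto", which became reserved in Lua 5.2.
local reserved = {}
for word in ([[and break do else elseif end false for function goto if in
local nil not or repeat return then true until while]]):gmatch("%a+") do
  reserved[word] = true
end

-- Hypothetical helper: can this key be written as a.key or {key = ...}?
local function is_name(key)
  return type(key) == "string"
     and key:match("^[%a_][%w_]*$") ~= nil  -- letter or underscore first
     and not reserved[key]                  -- and not a reserved word
end

assert(is_name("text"))
assert(is_name("_x1"))
assert(not is_name("for"))        -- reserved word
assert(not is_name("two words"))  -- contains a space
assert(not is_name("2nd"))        -- starts with a digit
```

Keys that fail this test can only be reached with the bracket form, a["two words"].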
Jorge Visca
2013-07-17 15:40:31 UTC
I'm the one who made the observation to Paul K in the first place, and I find this discussion very interesting.

It probably all comes down to this: if you are mixing both styles (a.text and a['text']), then you are doing something fishy. Normally you are either providing arbitrary strings, and then should use the dictionary-like access, or you are storing valid Lua identifiers (e.g. methods and attributes), so you can use the "dot syntax". If you are doing the former, the dot syntax doesn't work; if the latter, there is no reason to resort to the bracket syntax.

Therefore, I retract my observation :)