Pygments

generic syntax highlighter


Ticket #367 (closed defect: fixed)

Opened 2 years ago

Last modified 22 months ago

Scala Highlighter is broken

Reported by: guest
Owned by: gbrandl
Priority: major
Milestone:
Component: lexers
Keywords:
Cc: florian@…

Description

The scala highlighter in pygments is quite useless.

I attach a patch that fixes most of the obvious problems and even makes some feeble attempts at semicolon inference (i.e. nicely formatted code should be highlighted acceptably, but it should be easy to write code that breaks the highlighting). It produces acceptable highlighting on the scalaz code I tested it on.

Florian Hars <florian (at) hars.de>

Attachments

pygments-scala.diff (11.7 kB) - added by guest 2 years ago.
Errors.scala (327 bytes) - added by guest 23 months ago.

Change History

Changed 2 years ago by guest

Changed 23 months ago by thatch

  • cc florian@… added

Hi Florian,

Thanks for the submission. Can you provide example files that didn't work well under the old lexer?

Changed 23 months ago by guest

Changed 23 months ago by guest

I attached a file showing some of the problems:

  • Neither `interface` nor interface is highlighted as an identifier, although both are
  • val isn't highlighted as a keyword
  • Nested comments are broken
  • Multiline strings are broken
  • is highlighted as an error, although it is a valid name
  • foo_+ is highlighted as an identifier followed by an operator, although it is a single identifier
  • foo_⌬⌬ is totally borked.
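To illustrate the nested-comment item above: Scala allows `/* ... */` comments to nest, so a scanner has to track depth instead of stopping at the first `*/`. The following is a minimal standalone sketch of that idea, not code from the attached patch; the function name `comment_end` is hypothetical.

```python
def comment_end(text, start=0):
    """Return the index just past the nested comment opening at `start`.

    Returns -1 if the comment is never closed.
    Hypothetical helper for illustration; not part of Pygments.
    """
    if not text.startswith("/*", start):
        raise ValueError("no comment opens at this position")
    depth = 0
    i = start
    while i < len(text):
        if text.startswith("/*", i):
            depth += 1          # entering a (possibly nested) comment
            i += 2
        elif text.startswith("*/", i):
            depth -= 1          # leaving one nesting level
            i += 2
            if depth == 0:
                return i        # outermost comment is closed
        else:
            i += 1
    return -1                   # ran out of input while still inside a comment


src = "/* outer /* inner */ still a comment */ val x = 1"
rest = src[comment_end(src):]   # text following the whole comment
```

A lexer that ends the comment at the first `*/` would instead treat `still a comment */` as code.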

More problems with the old lexer:

  • The deprecated type names boolean|byte|char|double|float|int|long|short|void are highlighted differently from their preferred uppercase versions
  • The class names String|Int|Array|HashMap are highlighted differently from all other class names

Things the new lexer doesn't solve:

  • Any name can designate almost anything (a type, a class, a method, a variable, an operator); you can only make the decision after type checking. So I make some common-sense assumptions:
    • an identifier that starts with an uppercase letter is a Name.Class
    • an identifier that starts with any other type of letter is a Name
    • a backticked identifier is a Name
    • an identifier consisting of opchars is an Operator

This can be arbitrarily wrong, but it captures conventional practice.
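The four assumptions above can be sketched as a small classifier. This is an illustration only, not the code in the attached patch: the function names are hypothetical, the returned strings merely mirror Pygments token names, and the operator-character set is an assumption based on Scala's lexical rules (ASCII operator characters plus Unicode categories Sm and So).

```python
import unicodedata

# Assumed set of ASCII operator characters in Scala.
ASCII_OPCHARS = set("!#%&*+-/:<=>?@\\^|~")


def is_opchar(ch):
    """True if `ch` is treated as an operator character (assumption, see above)."""
    return ch in ASCII_OPCHARS or unicodedata.category(ch) in ("Sm", "So")


def classify(ident):
    """Guess a token type for one complete identifier, by convention only."""
    if len(ident) >= 2 and ident[0] == "`" and ident[-1] == "`":
        return "Name"                 # backticked identifier
    if ident and all(is_opchar(c) for c in ident):
        return "Operator"             # made up entirely of opchars
    if ident[:1].isupper():
        return "Name.Class"           # starts with an uppercase letter
    return "Name"                     # starts with any other letter


for ident in ["HashMap", "foo_+", "`interface`", "::"]:
    print(ident, classify(ident))
```

Note that `foo_+` is classified as a whole, which matches the point above that it is a single identifier and not a name followed by an operator.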

Changed 23 months ago by guest

Oh, by the way, your CC: seems not to work.

Changed 22 months ago by gbrandl

Thanks, applied the patch in changeset [703:aeb0be1650c5].

Changed 22 months ago by gbrandl

  • status changed from new to closed
  • resolution set to fixed