I've been working on documentation for the last little
while and I needed a break and some real coding. So I
decided to make semicolons optional in the Suneido
language. I've been toying with this idea for a while and
over my coffee at Tim Hortons one morning, I worked out how
to do it and it seemed pretty straightforward. In the end,
it took me about half a day to get it 99% working, and
another day and a half, and several complete rewrites, to
get the last 1%. Typical. On the positive side, the
current version is a lot cleaner than the initial version.
One thing I was afraid of was that the changes would
"break" a bunch of the existing code in stdlib. So I wrote
a quick function to syntax check all the records in stdlib.
QueryApply('stdlib')
{|x| try x.text.Eval();
catch (e) Print(x.name $ " - " $ e); }
Then I put this in a text file so I could run it even if
there were too many errors to get to the WorkSpace. (With
the Print changed to output to a text file.)
My first stab at the changes resulted in about 500 records
with syntax errors. After fixing a few blatant issues, it
was down to about 30. A few more fixes and voila, no
errors. But that was on code that had semicolons on every
statement. When I started to test examples without
semicolons I ran into a bunch more problems that took quite
a bit longer to fix.
I tried to follow a good refactoring approach, although
technically this wasn't refactoring because I was changing
functionality. But I was preserving all the existing
functionality. The basic plan was:
1. Change the scanner to return a NEWLINE token instead of
a WHITESPACE token for any run of whitespace that contained
a linefeed or return. Then change the parser to ignore
NEWLINE tokens.
This should not have changed anything - and it didn't. All
my tests ran, and no syntax errors were introduced.
2. Add a variable to track nesting of (), [], and {} and
ignore NEWLINE tokens if inside one of these. Also skip
NEWLINEs after binary and ternary operators.
Seems simple but this ended up taking quite a bit of
fiddling to get right. At first, I was adjusting the
nesting counter in various parsing methods. (Suneido uses
a recursive descent parser so there is a method for each
grammar construct.) But this got pretty ugly, so I ended
up counting (), [], and {} in the method (match) that reads
new tokens from the scanner. After this breakthrough
(which, of course, seems obvious now) it went fairly
smoothly.
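As a rough illustration of the two steps, here is a Python
sketch (not Suneido's actual scanner or parser). scan()
returns a NEWLINE token for any whitespace run containing a
line break, and Tokens.match() counts bracket nesting and
drops NEWLINE tokens inside brackets or right after an
operator. The token set, the Tokens class, and the
statements() helper are all made up for the example, and
every operator is treated as binary.

```python
import re

# Hypothetical, heavily simplified token set (an assumption for this sketch).
TOKEN_RE = re.compile(
    r"(?P<NEWLINE>\s*[\r\n]\s*)"      # whitespace run containing a line break
    r"|(?P<WHITESPACE>[ \t]+)"        # whitespace with no line break
    r"|(?P<IDENT>[A-Za-z_]\w*)"
    r"|(?P<NUMBER>\d+)"
    r"|(?P<OP>[-+*/=.?:<>])"          # all treated as binary operators here
    r"|(?P<OPEN>[(\[{])"
    r"|(?P<CLOSE>[)\]}])"
    r"|(?P<SEMI>;)"
)


def scan(text):
    """Step 1: yield (kind, value) tokens, dropping plain WHITESPACE."""
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r}")
        pos = m.end()
        if m.lastgroup != "WHITESPACE":
            yield m.lastgroup, m.group()


class Tokens:
    """Step 2: the single place where tokens are read from the scanner."""

    def __init__(self, text):
        self._scan = scan(text)
        self.nesting = 0   # depth inside (), [] and {}
        self.prev = None   # kind of the last token handed to the parser

    def match(self):
        """Return the next significant token, or None at end of input."""
        while True:
            tok = next(self._scan, None)
            if tok is None:
                return None
            kind, _ = tok
            # Ignore line breaks inside brackets or right after an operator.
            if kind == "NEWLINE" and (self.nesting > 0 or self.prev == "OP"):
                continue
            if kind == "OPEN":
                self.nesting += 1
            elif kind == "CLOSE":
                self.nesting -= 1
            self.prev = kind
            return tok


def statements(text):
    """Split top-level statements on ';' or a surviving NEWLINE."""
    toks = Tokens(text)
    out, current = [], []
    while (tok := toks.match()) is not None:
        kind, value = tok
        if kind == "NEWLINE" or (kind == "SEMI" and toks.nesting == 0):
            if current:
                out.append(" ".join(current))
                current = []
        else:
            current.append(value)
    if current:
        out.append(" ".join(current))
    return out
```

With these rules, "x = 1 +" followed by "2" on the next line
comes out as a single statement, while "x = 1" followed by
"y = 2" comes out as two. Keeping the nesting counter in
match() means the individual parsing methods never need to
know about it.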
When I released this version internally, we found two
records in stdlib and a few more in other libraries, where
there were no syntax errors, but the new interpretation was
different from the old interpretation. For example:
return
... ;
used to be one statement, but was now two, i.e.
return ; ... ;
Another case was:
s = s
.Replace(...)
.Replace(...);
This was now three statements instead of one. The solution
is to put the operators at the end of the lines instead of
at the beginning, which is what our normal style guideline
says anyway.
s = s.
Replace(...).
Replace(...)
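To see why operator placement matters, here is a crude,
line-based Python sketch of the rule: a newline ends a
statement unless the line ends with an operator. The names
count_statements and TRAILING_OPERATORS are invented for the
illustration, and the sketch ignores brackets, strings, and
comments.

```python
# Operators that keep a statement open at end of line (an assumed subset).
TRAILING_OPERATORS = (".", "+", "-", "*", "/", "=", "?", ":")


def count_statements(source):
    """Count statements, treating a newline as a terminator unless
    the line it follows ends with an operator."""
    count = 0
    continuing = False  # True while the previous line ended with an operator
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        if not continuing:
            count += 1  # this line starts a new statement
        continuing = line.endswith(TRAILING_OPERATORS)
    return count


leading = "s = s\n.Replace(...)\n.Replace(...);"
trailing = "s = s.\nReplace(...).\nReplace(...)"
print(count_statements(leading), count_statements(trailing))
```

With the dots at the start of the lines, each line is taken
as its own statement. With the dots at the end, the whole
chain stays one statement.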
Overall, I'm pretty happy with this change. Personally,
I'm so used to having to have semicolons in C and C++ that
it's not really an issue for me. But I have noticed it can
be a problem for beginners. And if you don't need them,
why require them?
The next step on this path is to make braces optional and
use indenting instead, like Python. One of these days when
I need another fix of serious hacking...
Andrew McKinlay
Suneido Software