I have previously had a largely neutral opinion about this language. However, after more serious development I would like to share some doubts.
not being funny or anything but if you're doing "serious" development, then you will have automatically gone straight into producing test cases, unit tests and regression tests.
without fail, one hundred percent, you _will_ have a test suite that is significantly larger (two to five times larger) than the rest of the code put together.
you will, of course, have been developing the test cases along-side the code, and in many instances have written the test cases _before_ writing the actual code.
the developers will also have been bluntly told that they will, without fail, check in code into a revision control repository WITHOUT FAIL on a regular basis, as well as running the test suites and regression tests without thinking twice and without being asked or reminded.
thus, you have a clearly-delineated linear progression of the project, where you can, if a bug is found (and it will be, very quickly, because of the regression tests), back-track through the repository history to find out which commit broke the relevant test.
this is standard coding practice, and you are, naturally, sticking to good programming practices with such a large project.
the thing with python is that, as a dynamic language, you MUST stick to good programming practices. use pylint as a matter of course, where you define your "rules" regarding variable names etc.
if you don't, you can expect to run into difficulties.
with python, i strongly advise you to do an absolute maximum of 20 lines of code before writing a test (or three) and doing a repository commit.
preferably as little as 5 to 10 lines.
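As a rough illustration of that rhythm, here is a minimal sketch using the standard unittest module. The function `clamp`, the class `ClampTest` and the helper `run_tests` are made-up names for this example, not taken from any real project; the point is only the ratio: a handful of lines of code, three tests, then a commit.

```python
import unittest


def clamp(value, low, high):
    """Return value limited to the inclusive range [low, high]."""
    return max(low, min(value, high))


class ClampTest(unittest.TestCase):
    # Hypothetical test case: three tests for three lines of real code.
    def test_inside_range_is_unchanged(self):
        self.assertEqual(clamp(5, 1, 10), 5)

    def test_below_range_is_raised_to_low(self):
        self.assertEqual(clamp(-3, 1, 10), 1)

    def test_above_range_is_lowered_to_high(self):
        self.assertEqual(clamp(11, 1, 10), 10)


def run_tests():
    """Run the suite and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClampTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Running `run_tests()` (or the usual `python -m unittest`) before every commit is what makes the back-tracking through the repository history useful later.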
if that's awkward to do, because of the revision control system being used, get a better revision control system, such as git. you can even back-end git into svn (git-svn) and in that way, developers can work independently and "sync up" using the svn repository as the clearing-house. so they can even commit "broken" code into git, fix it, and push it to svn when they're ready.
I can more than feel your pain.
WRT auto declaration of variables, I no like. At the very least a
language should allow you to get the compiler to do this work by
insisting on declarations. This almost always causes bugs and they are
completely unnecessary, IMO. It might be easier for beginners, but I
don't think beginners should be writing production code.
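To make the complaint concrete, here is a small sketch of the kind of bug that implicit declaration allows in Python. Both function names are invented for the example. The misspelled name `totl` is accepted without complaint, so the function quietly returns the wrong answer; a linter such as pylint will normally flag `totl` as an unused variable, which is about as close as you get to the compiler doing this work.

```python
def total_with_typo(prices):
    """Intended to sum prices, but a misspelled name hides the bug."""
    total = 0
    for price in prices:
        # Typo: 'totl' is silently created as a new local variable,
        # so 'total' is never updated and no error is raised.
        totl = total + price
    return total


def total_fixed(prices):
    """Same loop with the name spelled correctly."""
    total = 0
    for price in prices:
        total = total + price
    return total
```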
Somebody said:
'let me ask you: without the "self." convention, how would you expect to
distinguish between a local variable called "x" and a member instance
variable called "x"? '
If the language was half-way sensible, it would have scoping rules. If
the programmer was half-way sensible they would not be reusing variable
names in near-scope like this unless they are things for which
conventions exist (i,j,k) and are typically in a local scope on the same
page.
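For anyone who has not seen the convention being argued about, this is roughly what it looks like in Python. The class `Counter` and its method are hypothetical, written only to show a local `x` next to an instance attribute `x`.

```python
class Counter:
    """Hypothetical class with an attribute and a local sharing one name."""

    def __init__(self):
        self.x = 10  # instance attribute, lives as long as the object

    def bump(self):
        x = 1                # local variable, gone when bump() returns
        self.x = self.x + x  # 'self.' is the only thing telling them apart
        return x, self.x
```

Whether the explicit `self.` is a feature or a tax is exactly the disagreement here; the sketch only shows the mechanics.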
That sorting thing is bizarre and (IMO) violates the 'Rule of least
surprise'.
------
At the risk of insulting everybody's favorite language -- they ALL suck.
I have done most of my production work in C, C++, Java and (I am not
kidding) Visual Basic. Of the group, C is the least offensive. I will
write a more coherent article about this when I go begging for someone
to help me create a language that does not suck. However, let me just
beat my breast about a few things:
I assume that a competent general purpose language is possible. Most of
the world's code is written in C, it seems. Certainly, just about
anything *can* be written in C (I assume inline assembly and the
emission of arbitrary bytes). For most, if not all, of the world's
devices for which a programming language has much meaning, there exists
a C compiler.
Before I go on about C, let me just dismiss a few things:
Scripting languages and/or interpreted languages are fine as it goes.
However, without facilities to compile, create large systems, program
down to bare metal, etc, they are not general purpose in and of
themselves. At some level, shell scripting is necessary and desirable.
However, before we insist on picking whichever of the incumbents is the
winner, let's defer that and assume for the moment that our general
purpose language *can* be a scripting and interpreted and byte-code
targeted language as well as a properly compiled language.
Java might be a good stop-gap and it is certainly gathering steam.
However, it is targeted to a virtual machine and that means a
performance hit any way you slice it.
C++ is also popular. However, for me, the cure is worse than the
disease. It is cumbersome and ungainly. It makes unnecessary
distinctions, IMO (like structs and classes) and adds junk like:
cout << "I should not have been allowed to design a language";
I know people are going to go nuts over that, but I am not the only one
that thinks that is just ugly and superfluous.
I came originally through Machine Code (sic) -> Assembler -> Basic ->
Fortran -> PL/1 -> Pascal -> Modula2. By the time I got to C, I had
developed good habits from Modula2, but was keenly aware of its
limitations. C was a nightmare to learn at first and I still call it
'the language of memory corruption'. However, C allowed me to do
whatever I needed to do and provided plenty of ability to build
abstractions back when memory was measured in bytes and Kilobytes.
One of the main complaints about C is that it is 'unsafe'. This is
true. However, a powerful language will always be potentially unsafe. It
allows you to attempt direct memory access, manipulate the stack, do
arithmetic on pointers, etc. Sometimes these things are necessary.
People complain that there is some problem with pointers. I have a vague
recollection that it was difficult to grasp a long time ago when I was
learning the language. However, once you get the hang of it, it is very
natural and though it is 'low-level' it is more reflective of the actual
thing you are programming. It is easy to create a 'safe' string type and
a set of operations. However, the fact that seasoned C programmers don't
generally bother should tell you something about how necessary it is.
The problem with C is not that it allows you to do system programming.
The problem is that it makes it very cumbersome to express certain
desirable things. I don't want to say 'object oriented programming'
here, because I don't want to bring in all the baggage associated with
the paradigm. I need facilities that make it possible, but those
facilities serve other purposes as well.
I need to be able to create structs (classes in C++) that have some way
to self-reference and (potentially) to have prologues and epilogues
(constructors and destructors in C++). C++ provides this, but it is
clumsy, not very good looking and unnatural.
Some might say that quibbling about giving a second name to structs that
have new abilities is silly. However, it is not that I object to the
name change. It is that I object to creating an unnecessary distinction
between a record that has function pointers (and associated 'behavior')
and a record that (for now) does not have such things.
Unless there is a reason to distinguish between something that points to
data that happens to be code and something that points to data that
happens to be data of another type, I don't think it should be enforced
in the language.
Similarly, it seems to me that it should be possible and desirable to
remove the distinctions between declarations of code and data.
C suffers from the problem that it does not have facilities for OO
programming. Without getting into a debate about their advisability,
some of the encapsulation and 'syntactic sugar' provided by these
facilities is pretty much a necessity in large modern software systems.
It is certainly a nasty business to attempt any kind of GUI work without
some kind of object encapsulation at some level.
Other languages address various things, but then make you pay an
unnecessary price and fall down in places that make many of them unusable.
A general purpose language should allow you to do anything you wish. It
should make it easy to do things well, default sensibly, but allow you
to give the compiler all the clues it needs to optimize the finished
product.
It seems to me that, warts and all, C is still the best starting point
for this and that it should be relatively easy to design the language
and build the compiler.
There really is no clear general purpose language that is small,
powerful and highly extensible. If there were, we would all be using it.
However, I do not see any reason why there should not be such a language.
Although I have my own reasons for preferring C, I think it would be a
good candidate starting point because so much of the world's code is
built in it already and it would be easier to write translators to the
new target language.
I would keep the macro pre-processor, but beef it up to handle
template-like stuff. I know a lot of people hate the pre-processor, but
to them I would say "don't use it".
I am not a big fan of the assert, break, goto, continue, register,
setjmp, longjmp, auto and similar keywords that are either semantically
empty or bust structure. However, I think that some facility should
exist to allow them to either be created or invoked to allow compatibility.
Although I am not entirely certain how you would implement it elegantly,
I think that there should be 'syntactic sugar' such that the compiler
figures out whether or not you have attempted to capture an event and if
so, trigger the event handler. Thus:
void SomeBlock( void ) {
    int x = 2;
    int y;
    void x.=( int x ) // What to do when assignment to x
    {
        if( x < 1 || x > 10 ) { // S/B manifest consts
            printf( "x value is out of range\n" ); // Value not set
        }
        else {
            this = x;
        }
    }
    void x.x( int y ) // What to do when assignment from x
    {
        if( y > 0 && y < 11 ) { // S/B manifest consts
            printf( "y value is already in range\n" ); // Value not set
        }
        else {
            y = this;
        }
    }
    x = 11; // invoke 'onSet' handler
    y = x; // y will equal 2, x will equal 2;
    x = 3; // x will equal 3;
}
If you have done this, then the compiler should implement the semantics.
If not, then the compiler should implement as an ordinary int.
If only y were referenced later, then a smart compiler should be able to
simply replace the entire block by assigning y = 2 and strip the rest of
the code out. Though it might be tricky to implement, the code above
would allow such an optimization.
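For comparison only, Python already offers something close to the assignment handler at the attribute level through the built-in `property`. The sketch below is not the syntax proposed above, and the names `Block` and `some_block` are invented for the example; it just mirrors the same three assignments.

```python
class Block:
    """Holds an int 'x' whose assignments are range-checked."""

    LOW, HIGH = 1, 10  # manifest constants for the allowed range

    def __init__(self, x):
        self._x = x  # initial value is stored without checking

    @property
    def x(self):
        # Runs on every read of b.x
        return self._x

    @x.setter
    def x(self, value):
        # Runs on every assignment to b.x
        if value < self.LOW or value > self.HIGH:
            print("x value is out of range")  # value not set
        else:
            self._x = value


def some_block():
    b = Block(2)
    b.x = 11   # rejected by the setter, x stays 2
    y = b.x    # y is 2
    b.x = 3    # accepted, x becomes 3
    return y, b.x
```

Note that Python pays for this at run time on every access; the idea above is that a compiler could see when no handler is defined and fall back to a plain int.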
The code above brings up other points. As I juggle code about, I should
be able to do nested functions. I should also be able to access a
calling context. The call stack should be available. The name of the
function should be available. Information sufficient to locate it in
code should be available (usually filename, line number).
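As a point of reference, this is roughly what that wish list looks like in Python, using the standard `inspect` module. The helper `where_am_i` and the functions `outer` and `inner` are hypothetical names for this sketch, and it assumes CPython, where `inspect.currentframe()` returns a real frame.

```python
import inspect


def where_am_i():
    """Return (function name, filename, line number) of the caller."""
    # f_back is the frame of whoever called this helper (CPython assumed).
    caller = inspect.currentframe().f_back
    return caller.f_code.co_name, caller.f_code.co_filename, caller.f_lineno


def outer():
    def inner():  # nested function
        return where_am_i()
    return inner()
```

Calling `outer()` reports the name `inner` along with the file and line of the call, which covers the nested function, the calling context and the source location in one go.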
Although I like the convenience of breaking interfaces into header files
and implementations into code files and that is normally sufficient, I
think it is a mistake to conflate the file with the role. That is, I
should be able to specify nested interfaces such that many could exist
in one file or one could exist in many files. For that matter, I should
be able to put code and interface in the same file if I wish. Let the
compiler figure out how to deal.
I am obliged to do it, but I am not fond of having to:
#ifndef STUFFUSEDALLOVER
// define that stuff
#define STUFFUSEDALLOVER
#endif
Let the compiler figure that out.
There should be some common sense to the compiler. If the standard
stream output library is Con.StreamIO, and it has standard calls like
Println, and you have a file named 'hello.g', then instead of:
/* Whole bunch of stuff the compiler should deal with */
implementation hello
use definition hello in "hello.h"
import Con.StreamIO in "constuff.h";
int main( int argc, char **argv, char **envp )
{
    int SomeInventedReturnVal = 0;
    // Bunch of warnings because arguments not used
    (void) Con.StreamIO.Println( "Hello? This is too much work!\n" );
    (void) Con.StreamIO.Println( "argc was:" );
    (void) Con.StreamIO.Println( argc.ToString() );
    (void) Con.StreamIO.Println( "\n" );
    return( SomeInventedReturnVal );
}
The above is extreme, but it is variations on a theme with many
languages. I am the first to agree that printf is *way* not pretty.
However, it does a job. Sure, I should be able to specify where my
tokens are coming from, but if pretty much any program is going to use
them, let the compiler default to something sensible. For purists, we
could make a default implied import file on the fly.
Let the compiler implicitly include defaults for common definitions
encountered, construct whatever wrapper it needs for the operating
system, use the same implicit environment and 'self' references it would
normally, divine a return value, etc.
All the above should be a one liner:
printf( "Hello! This is more like it.\nArgc was:%i\n", ctxt["argc"] );
Where the return value is the value of the function printf and ctxt[] is
an associative array taken from the local 'this' context.
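Python has a rough analogue of that context array in the built-in `locals()`, which returns a dictionary of the names in the current scope. The function `report` below is a made-up example, not the proposed syntax, and it takes `argc` as a parameter so that there is something in the local context to look up.

```python
def report(argc):
    """Format the message by looking 'argc' up in the local context."""
    ctxt = locals()  # dict of the local names visible at this point
    return "Hello! This is more like it.\nArgc was:%i\n" % ctxt["argc"]
```

The body is still two lines rather than one, but nothing had to be imported or declared to get at the context.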
Things that 'bug' me:
The return keyword is not consistent. Using it without brackets should
be a syntax error.
The keywords if, for, while, etc should all require braces, even if they
are only followed by a single line. This is a constant cause of bugs,
and again is not consistent.
boolean is a legitimate type because it is returned from things. I am
pretty sure they have corrected this in one of the later ANSI C
standards, but really... Anything that you are consistently using
typedefs for should just be given a default existence. Again, let the
compiler worry about it. Given that boolean is a type and though it
might be implemented as a bit, byte, int or whatever, it also might not.
Values assignable to a boolean should be true/false. I am torn on this
point: should it allow a third type of 'undefined'? Unless the language
enforces defaults for every declaration, a declared boolean that has not
been assigned is not properly true or false -- it's undefined. If I had
to make a snap decision, I would say boolean can only take 1 = 'on/true'
or 0 = 'off/false' because a lot of the world is using bits as booleans.
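To show what the three-valued option would look like in practice, here is a small Python sketch that uses `None` as the 'undefined' state via `typing.Optional`. This is only one possible reading of the idea, and the function name `describe` is invented. Python's own `bool` takes the snap-decision route: `True` and `False` compare equal to 1 and 0.

```python
from typing import Optional


def describe(flag: Optional[bool]) -> str:
    """Name the state of a boolean that may not have been assigned yet."""
    if flag is None:  # declared but never given a value
        return "undefined"
    return "on/true" if flag else "off/false"
```

The cost of the third state is visible immediately: every consumer now needs the extra `is None` branch.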
Surely there must be some way to make a kick-ass general purpose
language with a spare, consistent syntax and elegant semantics. Because
C is so spartan and already fairly consistent (well, ish) we should be
able to make it sufficiently similar to C so that the porting effort
would be reasonable.
I think a lot of the non C/C++/C#/Java people might scream. However,
with the right libraries, I think they could likely be bought off by the
greater power and expressiveness of the language. Functional programmers
are likely sophisticated enough to know that functional programming is
*possible* in such a language, just not enforced (and with good reason).
I keep hearing about how this thing or that thing is wonderful, but I
don't see a lot of clean, vanilla working code for non-trivial things.
In fairness, most of the C code underlying these things is pretty awful
and it is usually difficult or impossible to get it to compile.
This year marks the 40th anniversary of the original design of C. It is
time for an upgraded language and
Java/C++/C#/Python/Ruby/Haskell/Scheme, etc just don't do the trick.
Surely we can do better.
What a mess. Sorry this is just a rant. I hope to eventually write a
proper article and contact people to see if something can be done.