Older blog entries for ta0kira (starting at number 8)

redi and redi,

You're entirely correct about everything you said, yet you're practically making my point for me. Maybe the confusion is that I reference gcc specifically. gcc is an excellent implementation of what irritates me in a very minor way. In fact, writing this much about it makes it sound like a huge problem either for me or for C++.

There are four reasons for #include <my_callback.h> in C++ to be taken as C++. The first is that C++ started as a pre-processor and the standard headers ended with .h before the ISO and ANSI standards. The second is a failure to subsequently differentiate between C and C++ headers, leading to widespread use of .h for C++ headers. The third is to retain compatibility with C, which can be used to justify letting .h be either C or C++. The fourth is that the standard C++ headers have no extension; therefore, everything must be assumed to be C++.

If allowing .h to be C++ is to allow for compatibility with C then the point is lost because of the details I described in previous posts. If allowing .h to be C++ is to take into consideration older C++ headers ending with .h, that's something that could have been standardized along with everything else. The more likely reason is that a large number of programmers still use .h for C++ headers. You've already lost compatibility with C to some extent because one has to explicitly let the compiler know that C is being used. At that, some parts of standard C will never compile when #included into a C++ file no matter how you try to do it.

It's not the behavior that irritates me; I know there are "official reasons" for it. It's the fact that one file can be C in one context and C++ in another under the same compiler. You can even #include "hello, this is kevin", and something isn't right about that.

Lastly, I can't believe you picked this to dispute out of my entire original post, but you have indeed made your point.

Kevin Barry

redi, you misunderstand my point. Take the example below:

  1. I want to create a shared library with the function my_callback. Because I want to support C programs, dlsym, and C++, I want my_callback to be unmangled; therefore, I put it in my_callback.h.

  2. I compile my 100%-C library libmc.so.

  3. program-a needs my_callback without hard-linking. Because my_callback isn't mangled, dlsym is an option.

  4. program-b needs my_callback with hard-linking. Because my_callback isn't mangled, program-b must either use extern "C" when including my_callback.h or my_callback.h needs to conditionally use extern "C" if C++ compilation is detected. This is because gcc infers that my_callback.h is meant to be a C++ header rather than at least implicitly giving it C linkage. The problem isn't apparent until link time, however; gcc mangles the name and an "undefined reference" error occurs.

I use gcc above to point out that it isn't just g++ that will do this.
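
As a minimal sketch of the conditional extern "C" described in step 4: the post never gives a signature for my_callback, so the one below is hypothetical, and the header and its C source are shown together in one listing for brevity.

```c
/* ---- my_callback.h (hypothetical contents) ---- */
#ifndef MY_CALLBACK_H
#define MY_CALLBACK_H

/* A C++ compiler defines __cplusplus, so only C++ sees the linkage block. */
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical signature; the symbol is exported unmangled as "my_callback". */
int my_callback(int value);

#ifdef __cplusplus
}
#endif

#endif /* MY_CALLBACK_H */

/* ---- my_callback.c, compiled as C into libmc.so ---- */
int my_callback(int value)
{
    /* Placeholder body, for illustration only. */
    return value + 1;
}
```

With the guard in the header, a C++ program such as program-b can include my_callback.h directly, and the name stays unmangled for both the linker and dlsym.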

Kevin Barry

Belated Response to appenwar's 'tokenizers' Blog

Most of the points have been well-spoken by others with more experience than I have; therefore, I'll stick to my own points. This has less to do with what you actually said and more to do with the principle.

One thing that always irritates me is how gcc will ignore C/C++ file extensions and take a guess, or it will default to C++. For example, a .h will only be taken as C if included strictly by a chain of C files and only if you don't use g++. One must therefore include the awkward #ifdef __cplusplus followed by extern "C" { because some people don't know how to use the correct file extensions; otherwise, you might have linking problems if your header is actually backed by a C source. If you have to use a C feature not carried over to C++ in a C file (e.g. the .sym = member initializer), you can't #include your file in a C++ file even with extern "C". You can also get away with not qualifying structure variables with struct in C headers if a C++ file includes them. All of this leads to less concise code, all because of acceptable ambiguity. I do concede that early C++ used .h extensions for the standard headers, so it's partly lack of foresight.
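
To illustrate the two C-only habits mentioned above, here is a small sketch using a designated initializer and the struct keyword. Only the .sym member name comes from the text; the struct name, the second member, and the helper function are hypothetical. C++20 later added a restricted form of designated initializers (members must appear in declaration order), so whether something like this compiles as C++ depends on the standard in use.

```c
#include <string.h>

/* Hypothetical record describing one exported callback. */
struct callback_entry {
    const char *sym;   /* symbol name, e.g. for a dlsym lookup */
    int priority;      /* hypothetical extra member */
};

/* C99 designated initializer. The members are named out of declaration
   order, which C allows but C++ (even C++20) rejects. */
static const struct callback_entry default_entry = {
    .priority = 1,
    .sym = "my_callback"
};

/* In C the struct keyword is required in the parameter type; a C++
   compiler would also accept a bare callback_entry. */
int entry_matches(const struct callback_entry *entry, const char *name)
{
    return strcmp(entry->sym, name) == 0;
}
```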

Today I finally got around to using libxml2, which struck me as extensively (yet somehow poorly) documented and extremely ambiguous. On the other hand, it will save having to write my own compliant parser to parse the ~1.4M lines of XML I need to convert and load into a database. This has little to do with libxml2 not accepting partial errors because the data I received was probably exported from SQL using the same library. I'd actually copy the trees created by libxml2 into a more usable structure if they weren't going right into a database, but XML is meant as a format, not as a run-time representation.

If someone is actually hand-writing XML-proper, chances are they're missing the point (or they're dealing with a software interface that misses the point.) Additionally, if someone is using software other than libxml2 to generate XML, they're either missing the point or they lack the appropriate language bindings. That being said, I use my own library to assemble and parse "XML-like" structures (closer to HTML, I guess) for IPC. It wouldn't make sense for me to use formal XML for the application, and especially not libxml2. Though the formats are very similar, the run-time organization used by libxml2 isn't anywhere near being suitable for what I use the data for. Then again, I don't need any sort of standardization because the data doesn't go anywhere outside of the application. It's a symmetrical system because data importation and exportation are designed concurrently to complement each other, which I can only assume is the case with libxml2.

Something many formal projects lack (software and otherwise) is an explicit correlation between the core purposes of the project and the aspects of implementation (yes, I'm guilty, too.) If I were to author something comparable to XML, I'd explicitly state that it isn't meant to be hand-written and it's primarily intended to allow data transfer between applications with different maintainers. At the point of deciding whether or not to accept simple errors, I'd defer back to those principles and conclude that errors should not be accepted. If I were to author something like HTML on the other hand, I would account for hand-written code and acknowledge that rendering with errors is better than rejecting a file. All too often projects are approached with founding principles, yet they fail to rationally extrapolate those principles to the level of implementation (guilty, again.)

Rather than getting into everything already brought up, I'll leave it at that.

Kevin Barry

21 Feb 2009 (updated 22 Feb 2009 at 17:15 UTC)
RE: Advogato posters: leech or seed? by cdfrey

I was actually very tempted to write a post about syndicated blogs today. cdfrey essentially said what I would have, but I probably would have been more elaborate and possibly less considerate. This is actually a great article topic, although I'm on my way to bed and I'm too lazy to compose right now.

I find myself pre-scanning the recent blogs for those that aren't syndicated. That's about 10%, which certainly saves me a lot of reading. Syndication just tells me "what I have to say is so important that many people on many sites will read it, but I don't have time to go to all of those sites and read what other people post." That might not be the truth, and indeed some people do generate more valuable entries that are of interest to a wide community. It might be better to have a "most-recently syndicated" list separate from the "I actually signed in, making it possible for me to read others' writing" list.

Many of the syndicated blogs provide useful information, but I don't think they belong in the same section as those originating from this site. I can't think of any other site where an RSS feed gets interleaved with original content as if it were the same.

Kevin

20 Feb 2009 (updated 20 Feb 2009 at 22:29 UTC)
Today's Successful Migrations

I successfully migrated my largest project and a somewhat-smaller project from SourceForge to BerliOS since my last post (about 24 hours ago.) The migration was extremely easy: most of that time was spent sleeping and waiting for cron on the BerliOS servers to build my repositories.

Migration included copying the project web-sites, subversion repository mirrors, and previous file releases from one site to the other. One thing that made the migration extremely easy was being able to control the hook scripts for my repositories, allowing me to svnsync them with my development machine without site-admin intervention (that's where I left off at SourceForge.)

Please take a look at the new home for Resourcerver and hparser if you have time. On a side note, I'll be looking for help later to document hparser so other developers can actually use it.

Kevin

Moving Out of SourceForge

After nearly 4 years of hosting my projects at SourceForge, I've finally decided it's time to move on. I haven't decided for sure where, but the best option at the moment seems to be BerliOS. The commercialism at SourceForge is just getting too out-of-hand.

The first two projects to move will be those I have listed here. In case that changes and you read this later, those would be Resourcerver and hparser. These are my "flagship" project and its unsung "workhorse," respectively. Resourcerver also relies heavily on two of my C++ template libraries (a list class and a series of data-encapsulation classes). I'll put those up here later, probably after I transfer their hosting from SourceForge.

Kevin

Why are there so many inactive observer accounts? Maybe I should have checked the list before signing up. I feel like an impending casualty of a popularity contest. I'd like to think my work is worth something, but what can one do when one's blog entries are bumped down by automatically imported entries from other sites? All of this blog activity got me excited until I realized most of them are from people who aren't actually logging in to post. Chances are 99% of the users reading this come from the 9k observer base.

Anyway, I'm in a negative mood today in case it wasn't apparent. I normally loathe "how I'm feeling" blogs; therefore, I'd like to say something important rather than waste your time.

I went searching for a new host for a few of my projects today, and I must say, the outlook was very bleak. I understand that hosting costs money; however, the commercial sites don't seem to offer anything better than the free sites. Many of the more optimal free sites strike me as exclusive, or they're missing something like web hosting.

I'm sure there's a large group out there with the mindset that a project needs a strong purpose and a place in the open-source world to be important. This might well be true, in which case I should pack up my projects and go home. The principle of "find a requirement in need of fulfilling" is very relevant and valid; however, nothing I've ever programmed arose from a requirement. This is partly my fault for not seeking out projects in need of help, which was a side-effect of thinking my imperfection/"uniqueness" as a programmer was equivalent to "of no use." In any case, my own unfounded endeavors provided me with a lot of practical experience in programming, documentation, and software design.

Whether or not my work is of use to anyone is at the whim of the community. With projects living near the bottom of the ocean, visibility comes only by chance. I suppose the real question, then, is whether I got enough out of my experiences with my projects for my time to not have been wasted if those projects never go anywhere.

If the context in question is all-around programming in research and academics, the answer is definitely "yes." I'll still use my projects even if no one else does. In the context of the open-source community at-large, the answer is "uncertain" at best. A great weakness of mine is advertising myself. I can promote a project all day because that's somewhat tangible and quantifiable, but that's subject to projection onto a larger context.

In any case, this is getting too long and this is indeed the Web where everyone can read what I say, even ex-girlfriends and my mother. This is probably a good time to shut my mouth.

Kevin

17 Feb 2009 (updated 17 Feb 2009 at 05:07 UTC)
Resourcerver Source Online

I finally put Resourcerver on public Subversion and public browsing today. It's been a long time coming. I had the project on CVS a long time ago, in fact when the project started, but my early source-tree structural changes were so frequent and drastic that neither I nor CVS could keep up with tracking them. I set aside version control for all releases up until now. It took about a year of design and programming to get it to the point where I felt confident releasing an alpha version, but I dropped CVS a few months before that. It would be nice to have those changes tracked for regression testing, but in the end the project has quite a bit more structural efficiency than I would have had the patience to achieve while using a VCS.

I'm not sure exactly how I feel about having the non-packaged files out there. I don't have anything private, but it does include a few scripts and other files I don't ever intend to include in the package.

Anyway, it's out there now, so please take a look if you have an interest in the project. Reading the source and changes online certainly beats downloading and extracting a package. It is a lot of code, just to warn you (~53k lines.)

Kevin

16 Feb 2009 (updated 16 Feb 2009 at 00:47 UTC)

I don't do a lot with peer-networking sites. I've never had a blog. In fact, I don't really know what to write here, nor anywhere else on this site. I do, however, write software and I've never been paid for it. I started with BASIC on an Apple IIgs in 1991. My father fried the RAM around 1995, and the only thing available to program on was my HP-48g, which I programmed day and night for lack of a real computer. I lacked other programming resources until 2003, when I finally had a computer of my own. I sought out and learned what I know about programming independently, mostly through websites, message boards, manpages, tearing apart code I've come across, and coding for days at a time until I figure out how to make something work. Programming steals my life, so I try to save it for good ideas. It steals my sleep, my dreams, and even the world right in front of me. My life is elsewhere, yet I remain a slave to my text editor.

I'm not a professional developer, I don't have a degree in CS, nor will either apply to me in the future. I study cognitive science and mathematics. I'm sure I'll have more to say about that later.

I don't generally use IDEs and I don't have much of an interest in GUI programming. Most of what I develop takes the form of algorithms, frameworks, infrastructures, libraries, and many other things not readily usable by the non-programmer.

I have several "open-source projects," made so by virtue of being hosted as such, but I put most of my time into one. Ironically, the one that consistently has zero downloads. I actually don't program for others to use my work; I publish my work so my time doesn't go to waste. I'm a compulsive perfectionist with my code, so when I get something right I like the idea of someone else being able to come across it and see what I've done. I'd like to think that everything I write can be of some use to someone, but that really isn't the point.

This isn't to say I don't care about what I put out there or what other developers think. I often retract a download after noticing a misspelled word in the README for fear of publishing something with an error. I feel quite ashamed when I come across bugs in my own work, even in the alpha and beta versions. It always strikes me as a misrepresentation when I put my name on something with a bug.

I'm just starting to get into collaborative development for a research project I'll be working on. I'm the informal development lead, but the actual algorithm design will be done by computer scientists.

For now, please take a look at Resourcerver, my main project. I'd really like feedback on the design; however, please keep in mind it's only loosely related to dbus, dcop, etc. (multi-process app control vs. IPC framework.)

Kevin
