6 Jan 2004 (updated 6 Jan 2004 at 03:22 UTC) »

Oh wow, responses about my gpg comment. Let me get replies in before it all leaves the recent log:

tmorgan: While I find the idea of somehow distinguishing between different levels of trust in the key infrastructure interesting, I think that would make it too complex for the non-technical user. Keeping it at "I trust this person not to send spam" is straightforward and has an obvious, big payoff for everyone involved (the spam problem is only getting worse).

I don't know too much about how it all works either, but I have read that some people have one key for signing and another (usually longer) key for encrypting. So perhaps one's signing key could be a spam trust key, and one's encrypting key could be a super-duper I rilly know you trust key. Mere mortals could be happy with the anti-spam key.

dyork: You're right, of course, about client support. I have been impressed enough with the usability of the Thunderbird/Enigmail combo to think that might be the app that will work for non-techies. (I've never tried gpg on Windows, though -- I'm just guessing things work similarly in that universe.) Home users have their choice of clients, and effective spamfighting might be enough of a draw to make people switch. Thunderbird is a typical GUI mail client; I think anyone could get used to it without much effort.

I know you don't have much choice about mail clients at work; I know that Outlook has a nice plug-in architecture which makes it seem like it might be possible to add a gpg plugin there. (I don't know much about it, but I installed the SpamBayes Outlook plugin at my work and that integrated seamlessly into the client.)

Critical mass is the key, and I can see a glimmer of hope that changing the definition of "trust" to make signed messages useful in blocking spam, combined with new, easier to use clients, could just make it all take off.

Everyone hates spam.

GnuPG for spamfighting?

I've been thinking lately how PGP/GnuPG could be used as a spam-prevention mechanism. I'm impressed with how easy Enigmail makes signing and checking signatures -- it seems like it should be usable by non-geeks.

If GPG-signing were to be used as a spamfighting tool, the meaning of "trust" (which always seemed vague to me anyway) would have to be changed. You would "trust" someone, even if you didn't know them personally, as long as it seems reasonable that person isn't going to send spam. If you receive signed spam that is somewhere in your web of trust, you mark the signers of that key as untrustworthy.

The lower threshold of "trust" would make more sense to a lot of non-technical users -- which is important, because everything I've read on the net about signing keys says, in effect: don't mess around, this is serious business, only sign the keys of people you've really met in person. And that just means that, outside a small geek community, no one signs any keys.

I trust person X not to spam me ... hell, I'd sign a lot of keys of people from mailing lists and such that I'd probably never meet IRL. And that web of trust seems like it would actually be useful -- signing the keys of "real people" would be a matter of course, and if it were simple enough and gained critical mass, everyone would want to jump onboard.
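
To make the idea a bit more concrete, here is a rough Python sketch of how a mail client might decide whether a signed message comes from inside a "won't spam me" web of trust. Everything in it is an assumption for illustration: the `signatures` mapping, the `max_hops` limit, and the function names are hypothetical, and none of this is how GnuPG actually computes trust.

```python
from collections import deque

def is_trusted(signatures, me, sender, distrusted=frozenset(), max_hops=3):
    """Return True if `sender` is reachable from `me` by following key signatures.

    signatures maps a signer to the set of keys that signer has signed.
    Keys in `distrusted` are never followed or accepted.
    """
    seen = {me}
    queue = deque([(me, 0)])
    while queue:
        key, hops = queue.popleft()
        if key == sender:
            return True
        if hops == max_hops:
            continue  # do not look any further along this path
        for signed in signatures.get(key, ()):
            if signed not in seen and signed not in distrusted:
                seen.add(signed)
                queue.append((signed, hops + 1))
    return False

def signers_of(signatures, key):
    """Everyone who signed `key`; these get marked untrustworthy after signed spam."""
    return {signer for signer, signed in signatures.items() if key in signed}
```

When signed spam arrives, the client would add `signers_of(signatures, spammer_key)` to its distrusted set, which cuts every path that ran through those signers.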


It looks like only one person has downloaded my alternative iRATE implementation. That's too bad, but I guess it is rather specialized -- limited to people who

  • are on a platform where xmms runs
  • are comfortable installing perl modules from CPAN
  • run the official iRATE client (still necessary to download tunes), and
  • are not entirely satisfied with the official client, or at least curious to try an alternative.
Looks like that's just me.

I wanted to get my track-selection algorithm into the official client, of course, but it was a substantial change which would have been hard for me to write in an unfamiliar language. Combine that with a lukewarm reception for my proposal that made it seem like a patch would have a hard time getting accepted anyway, and I decided it was better to write something for myself in a language I was much more comfortable with.

I'm continuing to develop it and put the latest out on my website, but I'm not bothering with any official announcements.

Hanzi Quiz

I haven't touched this code in ages, but I have an idea how to improve it which I want to implement soon.

I wrote a utility program in perl (which you can find in the hanziquiz tarball) which takes pinyin with tone numbers (e.g. ni3 hao3) and converts it to pinyin with tone marks using utf-8 characters (something like nǐ hǎo -- although that's with unicode numeric entities instead of raw utf-8). Now I'm thinking, if I move that conversion into the javascript of Hanzi Quiz itself, I can

  • make it easier to edit the test entries
  • have a fallback for browsers which can't display the accented characters (determined by a can you see this? up front)
Sounds pretty good, eh? The problem is the pinyin character ü (coded as a named entity here). That's hard to enter in text editors (except perhaps as an html entity ... oh that's hard too, just not impossible). In my perl conversion tool I use the character 'v', which isn't used in pinyin, to represent ü. I understand that's common, but perhaps not as common as 'u:' or 'uu'.

Should I code to expect the input to use 'v' (which is reasonable if I'm the one entering data), or should I try to handle other representations? What if I encounter html entities, or non-ASCII utf-8?

Eh, best start coding for the simplest ('v') case, and work from there. Yeah, I've talked myself into it just now.
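
For the record, the simplest ('v') case might look something like the sketch below. It's in Python rather than the perl of my tool or the javascript it would end up in, and the function names are made up for illustration. The placement rule it assumes is the usual one: the mark goes on 'a' or 'e' if present, on the 'o' of 'ou', and otherwise on the last vowel. Tone 5 (or 0) is treated as neutral and gets no mark.

```python
import re
import unicodedata

# Combining diacritics for tones 1-4: macron, acute, caron, grave.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiou\u00fc"  # the last one is u-umlaut

def mark_syllable(syllable, tone):
    """Return one pinyin syllable with 'v' expanded and its tone mark applied."""
    s = syllable.replace("v", "\u00fc").replace("V", "\u00dc")
    if tone not in TONE_MARKS:
        return s  # neutral tone: no mark
    lower = s.lower()
    if "a" in lower:
        pos = lower.index("a")
    elif "e" in lower:
        pos = lower.index("e")
    elif "ou" in lower:
        pos = lower.index("o")
    else:
        vowel_positions = [i for i, ch in enumerate(lower) if ch in VOWELS]
        if not vowel_positions:
            return s  # nothing to put a mark on
        pos = vowel_positions[-1]
    # NFC folds the vowel plus combining mark into one precomposed character.
    marked = unicodedata.normalize("NFC", s[pos] + TONE_MARKS[tone])
    return s[:pos] + marked + s[pos + 1:]

def numbered_to_marked(text):
    """Convert every ASCII syllable followed by a tone digit, e.g. 'ni3 hao3'."""
    return re.sub(
        r"([A-Za-z]+)([0-5])",
        lambda m: mark_syllable(m.group(1), int(m.group(2))),
        text,
    )
```

Input that already contains html entities or non-ASCII utf-8 is deliberately not handled here; the regular expression only looks at ASCII letters.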


I have a personal blog over at LJ. I had hoped to keep a blog on my own site, but I eventually decided that LJ was all set up for me, so why not just blog there?

The entries are few and far between now, but will probably become more frequent in the future.

Verizon Spyware Warning

My wife and I recently got a cellphone plan from Verizon. With our cellphones, they included a tutorial CD. I put it in my wife's Win2k box (it's Windows-only, of course) to see if there was anything worth seeing. It appeared to just be a gee-whiz flash presentation of the manual for people who can't read (I'm thinking there must be a lot of those these days!) Yawn.

At least that's what it appeared to be until I shut down the tutorial app. As soon as I clicked the close box, ZoneAlarm informed me that something named noptify.exe wanted to access the internet.

The CD installs noptify.exe as a hidden file in c:\winnt\temp, and it tries to contact the internet periodically as long as you have it installed. Verizon clearly goes to some length to deceive the user and cover their tracks.

Why would the largest U.S. wireless provider want to do something so ethically dubious? What sort of information are they gathering? Why would they want to risk their reputation by maliciously compromising their customers' computers?

I definitely want to make some noise about this one, but I haven't formed my plan of attack yet. I'm thinking of writing the FTC and/or my elected representatives and cc'ing Verizon customer support.

1 Dec 2003 (updated 1 Dec 2003 at 04:02 UTC) »

I just got my Perl/Tk alternative iRATE client to a point where I can release it to the world.

It doesn't talk to the iRATE server yet, so you have to fire up the standard client every so often to download tunes. It does write the standard irate trackdatabase.xml file, so the standard client knows about the ratings and such from AlterniRATE.

It uses xmms to play the mp3's so it's limited to platforms on which that's an option. I suspect that Winamp::Control could be used as a Windows alternative, but I'm not too excited about writing that part. Maybe someone else will.

The primary motivation for writing this was that I think that the way the standard iRATE client selects which tunes to play is flawed; it seems like it totally forgets about older tracks, except perhaps if they're rated 7 or 10.

AlterniRATE uses probability weights which grow exponentially for each track until it's played again. The weights grow much faster for high-rated tracks than for low-rated ones, but eventually any tune which hasn't been played in long enough will be screaming "choose me! choose me!"
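
Here is a rough illustration of that kind of weighting, in Python rather than the Perl the client is written in. It is not the actual AlterniRATE formula: the `base_rate` constant, the use of days as the time unit, and the function names are all assumptions picked to show the shape of the idea.

```python
import random

def track_weight(rating, days_since_played, base_rate=0.02):
    """Weight grows exponentially with idle time; higher ratings grow faster.

    base_rate is a made-up tuning constant, not a value from AlterniRATE.
    """
    growth = 1.0 + base_rate * rating
    return growth ** days_since_played

def pick_track(tracks, rng=random):
    """Pick one track name at random, in proportion to its current weight.

    tracks is a list of (name, rating, days_since_played) tuples.
    """
    weights = [track_weight(rating, days) for _, rating, days in tracks]
    return rng.choices(tracks, weights=weights, k=1)[0][0]
```

With these numbers a track rated 10 outgrows a track rated 2 after the same idle time, but a low-rated track that has sat unplayed for long enough still ends up with the bigger weight.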

I hope some iRATE users here will check it out.

Ok, I'm going to blow off steam about a linux desktop user interface problem; there may be more appropriate places, but I guess this isn't the worst place.

I was googling for something like "free music download sites" and I found a link to http://www.mp3downloadhq.com/. I'm using Moz 1.5 and clicked on the link to load in a tab in the background. Boom! My browser is resized to fullscreen. I have my menus on top, but the bottom was hidden below GNOME's bottom bar.

I managed to drag the GNOME bars to the sides instead of top and bottom, but the bottom of my browser was still below the bottom of the screen and I couldn't resize it.

Well, eventually I managed to get it, although I can't say for sure what I did. But I fought with my browser for at least five minutes.

This is not my idea of being in control of my computer. I've had this happen with fvwm, and while it's bad, I can hit Alt-F11 (I think; it's been a while) to get the top of the browser window back so I can move the window into another screen where the corner is grabbable. It took me a couple of times to learn that maneuver, but even when I know what to do, it's a pain in the butt.

I'm supposed to be in charge of my computer here! Ok, I just went under Edit->Preferences->Advanced->Scripts & Plugins and turned off "Move or resize existing windows". I hate to do that, because it could have some useful purpose ... oh wait, I've only ever seen that used for evil. So, I guess it's OK.

I created a page for dinky little programs I've written which I find very useful and which may prove useful to someone else.

Only two out there so far, but I'm sure there will be others.

18 Oct 2003 (updated 18 Oct 2003 at 05:12 UTC) »

Alarming Privacy Violation

I'm sure I must not be the only one here who invests in Vanguard Funds; they have a reputation for low overhead.

Their website is clearly geared towards IE, the only browser they guarantee to work. Mozilla under Linux usually works though, and that's what I usually use. To do a buy transaction you're shepherded through a series of scrollbar-less windows. They offer you the option to print a record of the final transaction, but you're not supposed to save the html, as evidenced by this bit of javascript:


function noRight(e) { if (event.button > 1) { alert("Sorry, the right click has been disabled for this application."); return false; } }

Of course, I saved the html: I wanted to store a record on my computer and the above code presented no restriction to me.

Just now I was looking at the html source so I could enter my data into Gnucash, when I saw something that made a chill run up my spine:

	<div class="gh"><img SRC="https://ad.doubleclick.net/activity;src=9999;type=vangu99;cat=mfbuy9999;qty=1;
cost=999;ord=99999999999;u=99999|Individual|prd;tran=9999999999?" WIDTH=1 HEIGHT=1 BORDER=0></div>

I changed all the numbers to random strings of 9s to obscure my personal financial information (and added a newline to make the formatting less obnoxious), but from the original content it's clear that information about my transaction was sent. To ad.doubleclick.net.

I feel violated. I'd feel really violated if ad.doubleclick.net didn't resolve to on my system.

I guess I'd better go re-read their privacy policy with a fine-tooth comb.

P.S. I know I've read some things before about why Doubleclick in particular is a very dubious entity to trust with one's personal information. I know I can google for it, but if anyone can help me out by pointing me to the best articles to reference in my upcoming complaint to Vanguard, that'd be great.

A Math Question

I have this idea to create a musical composition which consists of richly modulated overtones of a single note. (Hmmm ... wish I could describe that better ... )

I wrote a C program using libsndfile to do simple additive synthesis. I use a base frequency around 50 to 60 Hz and am adding harmonics from around the 7th to the 16th with different amplitude envelopes.
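
For anyone who wants to picture what that does, here is a minimal additive-synthesis sketch in Python. It is not my C/libsndfile program; the function name, the `envelopes` dictionary, and the default sample rate are assumptions for illustration, and it returns raw sample values instead of writing a sound file.

```python
import math

def additive_synth(base_freq, envelopes, duration, sample_rate=44100):
    """Sum the harmonics of base_freq, each scaled by its own amplitude envelope.

    envelopes maps a harmonic number h to a function of time (in seconds)
    that returns the amplitude of the partial at frequency h * base_freq.
    """
    n_samples = int(duration * sample_rate)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        value = sum(
            env(t) * math.sin(2 * math.pi * base_freq * h * t)
            for h, env in envelopes.items()
        )
        samples.append(value)
    return samples

# Example: harmonics 7 through 16 of a 55 Hz base, each with a slow
# sinusoidal envelope running at a slightly different (made-up) rate.
example_envelopes = {
    h: (lambda t, h=h: 0.05 * (1 + math.sin(2 * math.pi * 0.1 * h * t)))
    for h in range(7, 17)
}
example = additive_synth(55.0, example_envelopes, duration=0.01)
```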

My plan is to create a melody with those harmonics, and calculate amplitude envelopes such that the note that is "played" reaches a defined peak amplitude at that time.

Here's the tricky part. When a note is not being "played", I still want the amplitude envelope for that harmonic to have a rich texture. I'd like it to stay below a certain threshold which I define, but it needs to be there, sinusoidally varying.

So, the question is, how do I get the Fourier composition of a curve defined by the location of various maxima and a threshold which the rest of the curve must not exceed? I guess I also need a parameter somehow controlling how quickly the curve must jump up above the threshold for a maximum and get back down.

My intuition tells me that if I get the formula right, the curve with the fewest sinusoidal components meeting my conditions will also have the richest variation in those regions of the curve which are beneath the threshold, and thus the richest sound texture.

Is this just senseless blathering to all of you? I wish I could make myself clearer.

6 Sep 2003 (updated 6 Sep 2003 at 23:39 UTC) »

A Tale of Digressions

So, I have this code at work that pretends to be a browser and talks to a secure web server. For various reasons, we'd like to make the code use HTTP/1.1 and persistent connections. The current code is in perl and invokes a subprocess for each request.

I say, "I know, I'll rewrite it in java, and use the Jakarta Commons HttpClient!" That gives me a chance to get more up-to-speed in java (another work requirement) and use what looks to be a featureful and dependable http client library. There are also more java people than perl people around there to look after the code.

This is kinda hot, so I'm working on it at home over the weekend. First I decide I need to set up SSL on my testing webserver to have something to test against. That way I can also learn something about setting up SSL on Apache 2.x -- what I have installed here on my Debian box -- and that's not exactly a waste of time.

So, to figure out this SSL stuff, I go to the manual. I have apache2-doc installed, so there's a link on my main webpage which takes me to http://localhost/manual, where I see

URI: index.html.de
Content-Language: de
Content-type: text/html; charset=ISO-8859-1

URI: index.html.en
Content-Language: en
Content-type: text/html; charset=ISO-8859-1

URI: index.html.fr
Content-Language: fr
Content-type: text/html; charset=ISO-8859-1

URI: index.html.ja.jis
Content-Language: ja
Content-type: text/html; charset=ISO-2022-JP

URI: index.html.ko.euc-kr
Content-Language: ko
Content-type: text/html; charset=EUC-KR

... at least with "view source" that's what I see. Rendered as html, it's all run together.

Hmm, it appears that there's some problem with the language negotiation. I should have done then what I did just now: found the existing bug report and let it go, but instead I poked around in my configuration files and read up on mod_negotiation trying to figure out what's up.

Then I decide I should try this from a different browser to see if that has anything to do with it, so I go over to my wife's win2k box (which I normally dread to touch) and point it at my apache 2 manual. Same result. I decide to check the manual on the official apache website to see how that behaves and make a wrong guess at the URL ...

My God! Some evil slimeware program has hijacked the 404s to take IE to their scummy advertising "search engine"!!! Now, my wife's not as paranoid as I am, but she is definitely an intelligent and aware computer user, so there's no way she installed anything "cute" or something. This scumware had to come totally stealthily, just as a side-effect of browsing. I am totally amazed that people can even use Windows, when evil marketing assholes compromising your computer is a typical everyday occurrence. I'm sure it must cause many non-technical people to give up on the internet altogether.

Now, I know I can just run AdAware and get rid of this crap, but I was so amazed by this evil attack on my wife's computer that I expended some energy (fruitlessly) trying to figure out what happened.

... then the next day I had to post on advogato about all that. Ok, about that http client program I need to write ...

Addendum: A bold scam e-mail

Ok, I really should get back to work, but I got a scam e-mail pretending to be from eBay, and attempting to get me to fill in my personal information at a site that isn't eBay's. Internet criminals are getting bolder and bolder these days! I've seen some of these for e-gold before, but no one uses e-gold, so that doesn't matter. Everyone uses eBay. I guess I'd better report this to the FTC -- I'm sure someone else will, but what if everyone just said that?


I get a lot of bogus MAILER-DAEMON messages these days, either from unsuccessful spam attempts which used my domain in the From address, or as a clever trick to send me spam. That led me to have this great idea (drumroll, please):

Why can't my mailserver remember what mails it actually sent, and figure out which bounce messages are legit? Bogus MAILER-DAEMON messages would get dropped on the floor. Optionally, three bogus ones from the same address could cause that address to be blocked.

Seems straightforward enough to me ... what would be the problems with doing that?
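
In rough Python, the bookkeeping might look like the sketch below. It assumes the bounce quotes the headers of the original message so its Message-ID can be matched; the class, its method names, and the three-strikes default are hypothetical, and a real mailserver would also need to expire old entries.

```python
import re
from collections import Counter

# Finds Message-ID headers anywhere in the text of a bounce.
MESSAGE_ID_RE = re.compile(r"^Message-ID:\s*(<[^>]+>)", re.IGNORECASE | re.MULTILINE)

class BounceFilter:
    """Remember what we sent and judge incoming bounces against it."""

    def __init__(self, block_after=3):
        self.sent_ids = set()          # Message-IDs of mail we really sent
        self.bogus_counts = Counter()  # bogus bounces seen per sender address
        self.block_after = block_after

    def record_sent(self, message_id):
        self.sent_ids.add(message_id)

    def is_blocked(self, sender):
        return self.bogus_counts[sender] >= self.block_after

    def check_bounce(self, sender, bounce_text):
        """Return True if the bounce quotes a Message-ID we actually sent."""
        quoted_ids = MESSAGE_ID_RE.findall(bounce_text)
        if any(mid in self.sent_ids for mid in quoted_ids):
            return True
        self.bogus_counts[sender] += 1  # drop it on the floor and keep count
        return False
```

One likely catch is visible right in the sketch: a legitimate bounce that doesn't quote the original headers would be counted as bogus.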

Trying to figure out IIIMF

I have installed the iiimf-htt-server and iiimf-htt-le-newpy packages, which are parts of the Internet/Intranet Input Method Framework. I want a Simplified Chinese input method which will allow me to send mail from Mozilla, and these packages ... hmmm ... are somehow related? I've been looking all over for documentation, though, and I can't figure out what the heck I need to do. It's possible I need to use a Chinese locale to use the Chinese input method, but I certainly don't want to do that.

Does anyone know anything about this stuff?

