Name: Alan James Salmoni
Member since: 2004-12-14 09:38:36
Last Login: 2008-05-14 17:24:17
Homepage: http://www.cardiff.ac.uk/medicine/dermatology/staff/salmoni
Notes:
I am someone who likes open source and free software so much that I released my own stuff. My programs are SalStat, a Python and wxPython based application for statistical analysis, and TrackBrowser, a web browser designed to record user behaviour. I built the latter for my professional work. I use Ubuntu for my Linux development.
In real life, I research education and human-computer interaction (usability stuff). I received my doctorate from Cardiff University in January 2005 (my thesis was accepted in 2004) and my degree in psychology from the University of Bristol. I'm also supposed to have a postgrad diploma in research methods and statistics somewhere, but I'm not sure where it's gone. My thesis discussed the problems that people have in searching for information on the Internet, studied with a high degree of ecological validity and application.
I work at Cardiff University in the Wales College of Medicine (Department of Dermatology) where I am investigating medical education under a post-doctoral research fellowship. We have lots of cool e-learning things that I am applying myself to.
And how would I describe myself? If I had to use five words I would say creative, perceptive, intelligent, enthusiastic, curious. Of course, I could also use impatient (with bad interfaces at least!), over-enthusiastic, arrogant, single-minded, dispersed. The last refers to me always taking on too many projects at once.
Development is going well, though I have been stuck a lot on importing data. However, the tool is extremely flexible and useful, and it's great for merging data from different sources into one unified dataset, which is something I think advanced users will appreciate.
I have also been trying to work on the interactive results without too much luck and have instead asked the opinions of the very knowledgeable people on the wxPython mailing list. They seem to come up with extremely helpful answers, but why not ask here?
My situation is this: I have a wxHTML frame displaying HTML results. These need to be dynamic - users will be able to select options that will mean the HTML needs to be changed and then redisplayed. The best way I can think of dealing with this is just to get the HTML (stored in a temporary memory file system) and remove the old code and insert the new code in its place and then re-display it. Does this seem like too much of a bad hack?
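One way the replace-and-redisplay idea could look is sketched below. It is only an illustration under stated assumptions: each replaceable block of results is wrapped in a pair of HTML comment markers, and the marker format and the function name `replace_section` are made up for this example. The sketch works on a plain string so it runs without wxPython.

```python
import re

def replace_section(html, section_id, new_body):
    """Return html with the content between the markers for section_id replaced."""
    # Hypothetical marker convention: <!-- begin:ID --> ... <!-- end:ID -->
    start = "<!-- begin:%s -->" % section_id
    end = "<!-- end:%s -->" % section_id
    pattern = re.compile(re.escape(start) + r".*?" + re.escape(end), re.DOTALL)
    replacement = start + new_body + end
    # A function replacement keeps backslashes in new_body from being interpreted.
    new_html, count = pattern.subn(lambda m: replacement, html, count=1)
    if count == 0:
        raise KeyError("no section named %r" % section_id)
    return new_html

page = (
    "<html><body><h1>Results</h1>"
    "<!-- begin:descriptives --><p>mean = 4.2</p><!-- end:descriptives -->"
    "</body></html>"
)
updated = replace_section(page, "descriptives", "<p>median = 4.0</p>")
```

In a wxPython application, the updated string could then be handed back to the HTML window (for example through its SetPage method) to redisplay it. Keeping the markers in place means the same section can be swapped again later.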
I just wasted most of a day trying to sort out the data import GUI and problems with sizers. It was quite frustrating, but I finally managed to get most of the problems sorted out. It now connects to various databases and shows a sample of the data, which users can browse to select what they want to import.
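The "show a sample before importing" step could be sketched roughly as follows. This is not the actual import code: it assumes an SQLite database accessed through the standard library's sqlite3 module, and the function name `preview_table` and the example table are invented for illustration.

```python
import sqlite3

def preview_table(conn, table, limit=5):
    """Return (column_names, sample_rows) for one table."""
    # Table names cannot be bound as parameters, so check the catalogue first.
    known = [row[0] for row in
             conn.execute("SELECT name FROM sqlite_master WHERE type='table'")]
    if table not in known:
        raise ValueError("unknown table: %r" % table)
    quoted = '"%s"' % table.replace('"', '""')
    cur = conn.execute("SELECT * FROM %s LIMIT ?" % quoted, (limit,))
    columns = [d[0] for d in cur.description]
    return columns, cur.fetchall()

# Made-up example data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (subject TEXT, score REAL)")
conn.executemany("INSERT INTO scores VALUES (?, ?)",
                 [("a", 1.5), ("b", 2.5), ("c", 4.0)])
columns, sample = preview_table(conn, "scores", limit=2)
```

The column names double as the variable names offered to the user, and the small sample is enough to fill a preview grid without loading the whole table.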
Oh, and it imports the variables too which is good. It is so nice when problems eventually finish. I have lots more work to do tomorrow (csv importing - I wrote my own csv module to deal with little problems like missing data in the middle of a row) but I am also going to my wife's family's village for a fiesta. It's been raining all day, so here's hoping the weather improves. Here's a picture of the village in sunnier times.
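For the missing-data problem, a minimal sketch is given below. It uses Python's standard csv module rather than the custom module mentioned above, and the function name `read_rows` and the sample data are assumptions made for the example. Empty cells in the middle of a row become a missing-value marker, and short rows are padded to the width of the header.

```python
import csv
import io

def read_rows(text, missing=None):
    """Parse CSV text into (header, rows), marking empty cells as missing."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = []
    for row in reader:
        if not row:
            continue  # skip completely blank lines
        # Pad short rows so every row has one cell per column.
        row = row + [""] * (len(header) - len(row))
        rows.append([cell.strip() if cell.strip() else missing
                     for cell in row[:len(header)]])
    return header, rows

sample = "id,age,score\n1,34,5.5\n2,,6.1\n3,29\n"
header, rows = read_rows(sample)
```

Values stay as strings here; converting them to numbers would be a separate step once the type of each variable is known.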
Either way, the work is coming along really nicely now. The project is not yet 50% finished (my estimate), but it already imports data from databases, allows a range of operations on them, and can produce even complex descriptive analysis. It's looking good so far.
In other work, I'm managing to tame wxPython again and am producing a consistent and simple interface for importing data from different sources (databases, spreadsheets, text files). It could form the basis of a data manager, but it's all for the statistics program which is itself coming along.
The program has an interactive interpreter which is fun: it's all based on Python's 'code' module and I've organised it so that users can import data with awkward field names (like: 'Variable (1) & Variable (2) mixed'), and they can still be used on the command line, thus:
Variable (1) & Variable (2) mixed.mean
Not a big change, but it's one less thing to explain to demanding users. The work on the main GUI is still ongoing (choosing a test is the hardest thing) but we're getting there.
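A rough sketch of how awkward field names might be made usable on the command line follows. It is an illustration only, not the program's actual code: the class names `Variable` and `FieldNameConsole` and the `_vars` dictionary are hypothetical. The idea is to rewrite each input line so that known field names become dictionary lookups before Python's 'code' module compiles it.

```python
import code
import re

class Variable:
    """Hypothetical stand-in for an imported data column."""
    def __init__(self, values):
        self.values = list(values)

    @property
    def mean(self):
        return sum(self.values) / len(self.values)

class FieldNameConsole(code.InteractiveConsole):
    """Console that accepts raw field names in place of Python identifiers."""
    def __init__(self, variables):
        self.variables = dict(variables)
        super().__init__(locals={"_vars": self.variables})
        # Longest names first, so a name that contains another one wins.
        names = sorted(self.variables, key=len, reverse=True)
        self._pattern = (re.compile("|".join(re.escape(n) for n in names))
                         if names else None)

    def rewrite(self, line):
        """Replace each known field name with a lookup in _vars."""
        if self._pattern is None:
            return line
        return self._pattern.sub(lambda m: "_vars[%r]" % m.group(0), line)

    def push(self, line, *args, **kwargs):
        # Rewrite the line, then let the normal console machinery run it.
        return super().push(self.rewrite(line), *args, **kwargs)

console = FieldNameConsole(
    {"Variable (1) & Variable (2) mixed": Variable([1, 2, 3])})
rewritten = console.rewrite("Variable (1) & Variable (2) mixed.mean")
result = eval(rewritten, {"_vars": console.variables})
```

This naive substitution also rewrites field names that happen to appear inside string literals, so a real version would need to be more careful about where it substitutes.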
The thing will be released under the Affero GPL license so it's even relevant to this site.
On the administrative side, I managed to get some more marketing research done (all promising but lots of things to think about), and the company is getting closer to being officially founded. It's all very exciting stuff.
I have a questionnaire here if anyone feels like completing it: it should take about 5-10 minutes and concerns people who use computers to perform statistical analysis. I cannot offer any money in return (we have zero investment - any offers will be carefully considered!), but it would be extremely helpful in getting open source to the top.
The questionnaire is at Survey Monkey. TIA to anyone who completes it.
The stats program is coming on very well thank you! It now handles different types of data with ease and total transparency for the user. The output is nicely formatted and looks good. The architecture seems to be about right and we're looking to release the first version (obviously a beta) maybe next week. It will be under the AGPL because we intend to put in networking capabilities. Complex data analysis through a web-browser? Heh, why not?
Uraeus - I've tried noise-cancelling headphones on long flights and found them to be quite good. If they're the "tighter" ones (i.e. the ones that fit tightly around the ears), they can also help prevent ears from popping.
My business partner and I are going to form a company which will concentrate on statistics software. Our product will be called Ecstatistics, which is a seriously good update of my old project SalStat. It differs in that:
a) It will be able to read data from CSV files, databases (a whole range), and spreadsheets. We also plan to import SPSS and SAS files, as well as any other format we can code for;
b) It will output to a range of formats (PDF, OOo, databases, MS Office, HTML). The HTML is interesting because it will allow online analysis;
c) It will have a nice range of tests;
d) It will have a great graphing / charting capability;
e) It will be modular and easy to upgrade;
f) It will be far more usable than existing programs;
g) And of course, it will be open source.
Our plan is to get the product working (the database browser already does, quite nicely) and produce a version for the OLPC project. Some people have already asked for a stats program that works there, and it makes sense to equip students with (possibly) the most useful tool in scientific research: statistical analysis. So far, it can import from a range of databases and analyse the data descriptively. Output is only text for now and interaction is via a custom interactive interpreter, but it's early days yet. From what we've read, the important thing is to get something released and we hope to do that very soon.
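As a rough idea of what "import from a database and analyse descriptively" involves, here is a minimal sketch. It uses only the standard library (sqlite3 and statistics) as a stand-in, not the program's own code or libraries, and the function name `describe` and the example table are invented for illustration.

```python
import sqlite3
import statistics

def describe(values):
    """Return basic descriptive statistics, ignoring missing (None) values."""
    data = [v for v in values if v is not None]
    return {
        "n": len(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        # The sample standard deviation needs at least two values.
        "sd": statistics.stdev(data) if len(data) > 1 else None,
    }

# Made-up example data, including one missing value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (score REAL)")
conn.executemany("INSERT INTO scores VALUES (?)",
                 [(2.0,), (4.0,), (None,), (6.0,)])
values = [row[0] for row in conn.execute("SELECT score FROM scores")]
summary = describe(values)
```

The text-only output described above could then be produced by printing the entries of `summary` one per line.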
Ecstatistics is coded in Python with NumPy, SQLAlchemy, SQLite and lots of other stuff. Because of this, we can code the OLPC version down to about 200k which competes extremely well with the opposition like R, SPSS and SAS. The interface is designed not just to be useful but also to instill good statistical practice, so it's educational too.
The interface will be designed with non-expert users in mind, particularly students. We aren't aiming at calloused statisticians; they have their favourite tools (and often write them for themselves anyway). We are aiming at all those people who have to do stats but don't like it.
In other news, I saw a couple of laptops here in the Phils in a major chain of electrical stores. They came with Linux preinstalled which was nice to see.
Finally, but most importantly, my wife had her scan earlier this week - we're expecting a little baby girl! The due date is the end of July and we're both very excited.