Older blog entries for ralsina (starting at number 518)

Smiljan, a Small Planet Generator

I maintain a couple of small "planet" sites. If you are not familiar with planets, they are sites that aggregate RSS/Atom feeds for a group of people who are related somehow. It makes for a nice, single, thematic feed.

Recently, when I moved them from one server to another, everything broke. Old posts showed up as new, and feeds that had not been updated in 2 years always had all their posts on top... a disaster.

I could have gone to the old server and started debugging why rawdog was doing that, or switched to planet, or looked for other software, or used an online aggregator.

Instead, I started thinking... I had written a few RSS aggregators in the past... Feedparser is again under active development... rawdog and planet seem to be pretty much abandoned... how hard could it be to implement the minimal planet software?

Well, not all that hard, that's how hard it was. Like it took me 4 hours, and was not even difficult.

One reason why this was easier than what planet and rawdog achieved is that I am not writing a static site generator, because I already have one. So all I need this program (I called it Smiljan) to do is:

  • Parse a list of feeds and store it in a database if needed.
  • Download those feeds (respecting etag and modified-since).
  • Parse those feeds looking for entries (feedparser does that).
  • Load those entries (or rather, a tiny subset of their data) in the database.
  • Use the entries to generate a set of files to feed Nikola.
  • Use Nikola to generate and deploy the site.
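The "respecting etag and modified-since" step boils down to sending two conditional HTTP headers, so the server can answer "304 Not Modified" instead of resending the whole feed. feedparser takes care of this when it is given etag and modified arguments; what follows is only a minimal standard-library sketch (Python 3, unlike the Python 2 code in this post). The build_conditional_request name and the example URL are made up for illustration:

```python
import urllib.request
from email.utils import formatdate


def build_conditional_request(url, etag=None, last_modified_ts=None):
    """Build a GET request that allows a "304 Not Modified" answer.

    `etag` is the ETag string saved from the previous response and
    `last_modified_ts` is a Unix timestamp; both are optional.
    (Hypothetical helper, not part of Smiljan.)
    """
    request = urllib.request.Request(url)
    if etag:
        request.add_header("If-None-Match", etag)
    if last_modified_ts is not None:
        # HTTP dates are always expressed in GMT.
        request.add_header("If-Modified-Since",
                           formatdate(last_modified_ts, usegmt=True))
    return request


# Nothing is sent over the network here; we only inspect the headers.
req = build_conditional_request("http://example.com/feed.xml",
                                etag='"abc123"', last_modified_ts=0)
# urllib stores header names capitalized, hence the odd spelling.
print(req.get_header("If-none-match"))
print(req.get_header("If-modified-since"))
```

Storing the etag and the last-modified date per feed, as the Feed model in this post does, is what makes it possible to send these headers on the next run.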

So, here is the final result: http://planeta.python.org.ar which still needs theming and a lot of other stuff, but works.

I implemented Smiljan as 3 doit tasks, which makes it very easy to integrate with Nikola (if you know Nikola: add "from smiljan import *" in your dodo.py and a feeds file with the feed list in rawdog format) and voilà, running this updates the planet:

doit load_feeds update_feeds generate_posts deploy
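The feeds file is plain text. Judging only from how the loader in this post reads it, each feed is a "feed <period> <url>" line followed by a "define_name <name>" line. That layout, the sample entries and the parse_feed_list name are assumptions for illustration. A minimal Python 3 sketch of the parsing step, without the database part:

```python
def parse_feed_list(lines):
    """Yield (name, url) pairs from rawdog-style config lines.

    Assumed layout (inferred from the loader in this post):
        feed <period> <url>
            define_name <name>
    """
    url = name = None
    for raw_line in lines:
        parts = raw_line.split()
        if not parts:
            continue
        if parts[0] == "feed" and len(parts) >= 3:
            url = parts[2]
            name = None  # a new feed starts, forget any previous name
        elif parts[0] == "define_name" and len(parts) >= 2:
            name = " ".join(parts[1:])
        if url and name:
            yield name, url
            url = name = None


# Made-up sample data, only to show the expected shape of the file.
sample = """\
feed 30m http://example.com/alice/rss.xml
    define_name Alice Example
feed 1h http://example.org/bob/atom.xml
    define_name Bob
"""
feeds = list(parse_feed_list(sample.splitlines()))
print(feeds)
```

Resetting both variables after each pair keeps a name from one feed from leaking into the next one.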

Here is the code for smiljan.py, currently at the "gross hack that kinda works" stage. Enjoy!

# -*- coding: utf-8 -*-
import codecs
import datetime
import glob
import os
import sys

import feedparser
import peewee

class Feed(peewee.Model):
    name = peewee.CharField()
    url = peewee.CharField(max_length = 200)
    last_status = peewee.CharField()
    etag = peewee.CharField(max_length = 200)
    last_modified = peewee.DateTimeField()

class Entry(peewee.Model):
    date = peewee.DateTimeField()
    feed = peewee.ForeignKeyField(Feed)
    content = peewee.TextField(max_length = 20000)
    link = peewee.CharField(max_length = 200)
    title = peewee.CharField(max_length = 200)
    guid = peewee.CharField(max_length = 200)

Feed.create_table(fail_silently=True)
Entry.create_table(fail_silently=True)

def task_load_feeds():

    def add_feed(name, url):
        f = Feed.create(
            name=name,
            url=url,
            etag='caca',
            last_modified=datetime.datetime(1970,1,1),
            )
        f.save()

    feed = name = None
    for line in open('feeds'):
        line = line.strip()
        if line.startswith('feed'):
            feed = line.split(' ')[2]
        if line.startswith('define_name'):
            name = ' '.join(line.split(' ')[1:])
        if feed and name:
            f = Feed.select().where(name=name, url=feed)
            if not list(f):
                yield {
                    'name': name,
                    'actions': ((add_feed,(name, feed)),),
                    'file_dep': ['feeds'],
                    }
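            # Reset even when the feed already exists, so a stale name
            # is never paired with the next feed's URL.
            name = feed = None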

def task_update_feeds():

    def update_feed(feed):
        modified = feed.last_modified.timetuple()
        etag = feed.etag
        parsed = feedparser.parse(feed.url,
            etag=etag,
            modified=modified
        )
        try:
            feed.last_status = str(parsed.status)
        except AttributeError:  # No HTTP status: probably a timeout
            # TODO: log failure
            return
        if parsed.feed.get('title'):
            print parsed.feed.title
        else:
            print feed.url
        feed.etag = parsed.get('etag', 'caca')
        modified = tuple(parsed.get('date_parsed', (1970,1,1)))[:6]
        print "==========>", modified
        modified = datetime.datetime(*modified)
        feed.last_modified = modified
        feed.save()
        # No point in adding items from missing feeds
        if parsed.status >= 400:
            # TODO log failure
            return
        for entry_data in parsed.entries:
            print "========================================="
            date = entry_data.get('updated_parsed', None)
            if date is None:
                date = entry_data.get('published_parsed', None)
            if date is None:
                print "Can't parse date from:"
                print entry_data
                return False
            date = datetime.datetime(*(date[:6]))
            title = "%s: %s" %(feed.name, entry_data.get('title', 'Sin título'))
            content = entry_data.get('description',
                    entry_data.get('summary', 'Sin contenido'))
            guid = entry_data.get('guid', entry_data.link)
            link = entry_data.link
            print repr([date, title])
            entry = Entry.get_or_create(
                date = date,
                title = title,
                content = content,
                guid=guid,
                feed=feed,
                link=link,
            )
            entry.save()
    for feed in Feed.select():
        yield {
            'name': feed.name.encode('utf8'),
            'actions': ((update_feed,(feed,)),),
            }

def task_generate_posts():

    def generate_post(entry):
        meta_path = os.path.join('posts',str(entry.id)+'.meta')
        post_path = os.path.join('posts',str(entry.id)+'.txt')
        with codecs.open(meta_path, 'wb+', 'utf8') as fd:
            fd.write(u'%s\n' % entry.title.replace('\n', ' '))
            fd.write(u'%s\n' % entry.id)
            fd.write(u'%s\n' % entry.date.strftime('%Y/%m/%d %H:%M'))
            fd.write(u'\n')
            fd.write(u'%s\n' % entry.link)
        with codecs.open(post_path, 'wb+', 'utf8') as fd:
            fd.write(u'.. raw:: html\n\n')
            content = entry.content
            if not content:
                content = 'Sin contenido'
            for line in content.splitlines():
                # Keep one line per line, or the HTML collapses into one
                fd.write(u'    %s\n' % line)

    for entry in Entry.select().order_by(('date', 'desc')):
        yield {
            'name': entry.id,
            'actions': ((generate_post, (entry,)),),
            }
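The post files are reStructuredText, and the "raw" directive expects its content as an indented block, so every line of the HTML has to be indented and end with a newline. A minimal Python 3 sketch of that step in isolation (render_raw_html is a hypothetical helper, not part of smiljan.py):

```python
def render_raw_html(content, fallback="Sin contenido"):
    """Wrap HTML in a reST ``raw`` directive, indenting every line."""
    lines = (content or fallback).splitlines()
    body = "".join("    %s\n" % line for line in lines)
    return ".. raw:: html\n\n" + body


print(render_raw_html("<p>Hola</p>\n<p>Mundo</p>"))
```

Building the text as a string first makes it easy to check before anything is written to disk.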


Syndicated 2012-04-16 23:12:00 from Lateral Opinion

Nikola 2.1.1 + GitHub

By popular request, Nikola now has its source code at GitHub.

Also, if you tried version 2.1 and it failed, try 2.1.1, because I forgot to add a couple of files in one of the themes in 2.1.


Syndicated 2012-04-14 11:33:00 from Lateral Opinion

Nikola 2.1 out!

Released version 2.1 of Nikola, my static blog/site generator. More information at the Official Nikola Blog on the Official Nikola Site, both of which are, of course, done with Nikola :-)


Syndicated 2012-04-13 18:13:00 from Lateral Opinion

Alimento para la Culpa (Food for Guilt)

Originally a Spanish-only post, sorry!


About two years ago I wrote a piece of a book. Every now and then I look at it fondly and think it would be nice to finish it, or redefine it and wrap it up, and things like that. I even made a plan that I never managed to put into practice, because life takes you off to do other things.

It turns out that my humble chunk of a book is being used as study material in the course IWI-131 Programación de computadores - 1er Semestre 2012 at the Universidad Técnica Federico Santa María in Chile.

On one hand, it makes me happy. On the other hand, it makes me nervous that a book that quotes this is study material for 18-year-olds:

Until a man is twenty-five, he still thinks, every so often, that under the right circumstances he could be the baddest motherfucker in the world. If I moved to a martial-arts monastery in China and studied real hard for ten years. If my family was wiped out by Colombian drug dealers and I swore myself to revenge. If I got a fatal disease, had one year to live, and devoted it to wiping out street crime. If I just dropped out and devoted my life to being bad.

-- Neal Stephenson (Snow Crash)

On yet another hand, it makes me want to finish it. On the last hand (which means what I have is a four-sided feeling), I think: how strange it would be to be handed a book at university that is all incomplete. It's like becoming a fan of The Event and never finding out what the famous Event was, like having watched Twin Peaks and never getting rid of the doubts about what all that weirdness was. Like still waiting for Mel Brooks to make the second part of the history of the world so you can finally understand "Jews in Space".

Now I could leave it unfinished as an artistic decision!

And of course:


Syndicated 2012-04-13 10:40:00 from Lateral Opinion

Visitors from Strange Lands Arrived

http://lateral.netmanagers.com.ar/galleries/random/visitors.png

Strange lands indeed.


Syndicated 2012-04-13 00:25:00 from Lateral Opinion

Dogfooding a new theme

I am doing a second theme for Nikola, and what better way to test it than migrating this site to it!

In the process, I stopped using custom templates here, organized how a theme is done, how translations work, and other stuff.

Comments about broken / untranslated / missing stuff are much appreciated!


Syndicated 2012-04-12 21:21:00 from Lateral Opinion

Nikola 1.2 is out!

Version 1.2 of Nikola, my static site generator and the software behind this very site, is out!

Why build static sites? Because they are light on resources, they are future-proof, they are easy, they are safe, and you avoid lock-in.

New Features:

  • Image gallery (just drop pics on a folder)
  • Built-in webserver for previews (doit -a serve)
  • Helper commands to create new posts (doit -a new_post)
  • Google Sitemap support
  • A Handbook!
  • Full demo site included
  • Support for automatic deployment (doit -a deploy)
  • Client-side redirections

And of course the old features:

  • Write your posts in reStructuredText
  • Clean, customizable page design (via bootstrap)
  • Comments via Disqus
  • Support any analytics you want
  • Build blogs with tags, feeds, feeds for your tags, indexes, and more
  • Works like a simple CMS for things outside your blog
  • Clean customizable templates using Mako
  • Pure Python, and not a lot of it (about 600 lines)
  • Smart builds (doit only rebuilds changed pages)
  • Easy to extend and improve
  • Code displayed with syntax highlighting

Right now Nikola does literally everything I need, so if you try it and need something else... it's a good time to ask!

More info at http://nikola-generator.googlecode.com


Syndicated 2012-04-09 21:41:00 from Lateral Opinion

Senses

While walking along the river before dawn I lay down on a bench and looked up, and saw the tree, clear and green against the orange clouds in the night sky, and thought, hey, that looks cool, and tried to take a picture.

The screen in my camera stayed obstinately black. I changed settings, moved ISOs, touched on different places trying to convince it to focus and set aperture for the darkest or the lightest areas of what I knew to be there.

And it remained black. And suddenly, I had a dissenting opinion, that there was not a clear green tree there, and that the sky was not full of orange clouds, but that it was all black, starless and empty, empty of tree, of cloud.

I placed my hand above the camera, hoping to catch a glimmer of it, and still, the display was a square of darkness separating my fingers from my arm, as empty as before, mocking me featureless.

Why was it so black, if I could see clearly? If there were lampposts giving light, and I could see clearly, and there was a tree. I knew the camera worked. What was I doing, by the river, at 4AM, on a Tuesday, lying on a bench, looking up, with a camera?

You expect your senses to work. You expect to perceive what is there, and not perceive what is not. You expect to see reality, to not see irreality, to listen to things, to not listen to unthings, to touch truth, to smell shit.

What would happen if you had two sets of senses, two visions, and they disagreed, and you were not sure which one to trust, which one is right, which one is true? What would happen if the camera was right and my eyes were wrong, and I was actually not seeing, but imagining, and the truth was empty, and the tree was not there, and the sky was black?

Then I enabled flash, and the ugly picture convinced me to, someday, get a better camera, and never forget to take my gastritis medicine when going on trips to isolated locations.


Syndicated 2012-04-04 20:09:00 from Lateral Opinion

Nikola 1.1 is out!

A simple yet powerful and flexible static website and blog generator, based on doit, mako, docutils and bootstrap.

I built this to power this very site you are reading, but decided it may be useful to others. The main goals of Nikola are:

  • Small codebase: because I don't want to maintain a big thing for my blog
  • Fast page generation: Adding a post should not take more than 5 seconds to build.
  • Static output: Deployment using rsync is smooth.
  • Flexible page generation: you can decide where everything goes in the final site.
  • Powerful templates: Uses Mako
  • Clean markup for posts: Uses Docutils
  • Don't do stupid builds: Uses doit
  • Clean HTML output by default: Uses bootstrap
  • Comments out of the box: Uses Disqus
  • Tags, with their own RSS feeds
  • Easy way to do a blog
  • Static pages outside the blog
  • Multilingual blog support (my own blog is English + Spanish)

I think this initial version achieves all of those goals, but of course, it can be improved. Feedback is very welcome!

Nikola's home page is currently http://nikola-generator.googlecode.com


Syndicated 2012-03-30 22:59:00 from Lateral Opinion

Unicode in Python is Fun!

As I hope you know, if you get a string of bytes, and want the text in it, and that text may be non-ASCII, what you need to do is decode the string using the correct encoding name:

>>> 'á'.decode('utf8')
u'\xe1'

However, there is a gotcha there. You have to be absolutely sure that the thing you are decoding is a string of bytes, and not a unicode object. Because unicode objects also have a decode method but it's an incredibly useless one, whose only purpose in life is causing this peculiar error:

>>> u'á'.decode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in position 0: ordinal not in range(128)

Why peculiar? Because it's an Encode error. Caused by calling decode. You see, on unicode objects, decode does something like this:

def decode(self, encoding):
    return self.encode('ascii').decode(encoding)

The user wants a unicode object. He has a unicode object. By definition, there is no such thing as a way to utf-8-decode a unicode object. It just makes NO SENSE. It's like asking for a way to comb a fish, or climb a lake.

What it should return is self! Also, it's annoying as all hell in that the only way to avoid it is to check for type, which is totally unpythonic.

Or even better, let's just not have a decode method on unicode objects, which I think is the case in Python 3, and I know we will never get in Python 2.
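For the curious, here is roughly what the type check looks like, written as a minimal Python 3 sketch rather than the Python 2 used in the rest of this post. In Python 3 the text type (str) really has no decode method, so only bytes ever need decoding. The to_text name is made up for illustration:

```python
def to_text(value, encoding="utf8"):
    """Return `value` as text, decoding only when it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


assert to_text(b"\xc3\xa1") == "\xe1"   # UTF-8 bytes are decoded
assert to_text("\xe1") == "\xe1"        # text passes through untouched
assert not hasattr("", "decode")        # Python 3 str has no decode()
```

It is still a type check, but at least it lives in one place instead of being repeated at every call site.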

So, be aware of it, and good luck!


Syndicated 2012-03-30 13:58:00 from Lateral Opinion
