28 May 2002 lindsey   » (Journeyer)

On Academic Computer Science

Whilst I was back home a couple of weeks ago, cmiller asked whether I was as dissatisfied with Computer Science Academia as my advogato entries seem to indicate.

I've just completed my first year of graduate school in Computer Science at UNC Chapel Hill. It's been a lot of fun, but I was alarmed to realize a few facts:

  • CS grad students aren't like top CS undergrads. I was surprised to learn that many of the grad students I've met didn't study CS as undergraduates; that many of them have only been using computers for a few years; and that many of them don't even particularly like programming. My theory to explain the difference between CS undergrads and CS grad students is this: most CS undergrads who are genuinely good at making computers do things find gainful and satisfying employment.

  • ``Computer Science is embarrassed by the Computer'' There is a strong current through CS that is at once irritated by the difficulties of real-world implementations, and seeks to trivialize the value of making computers do things. In the worst cases, PhD Computer Scientists can be little more than mathematicians, ready and willing to burn years on thought experiments and research based on apparently- arbitrary assumptions. To these theoreticians, it's more important to fully explore every point along every dimension of an old problem than it is to find a new problem which might have a helpful solution. (Some other Computer Scientists claim that this isn't really CS.) Theoreticians seem to be afraid of tackling the details of real-world systems.

    I've noticed that if a professor carries a laptop, he's likely to be more realistic and do more-interesting things than otherwise.

  • Computer Science trivializes important problems. For example, Computer Security has not been considered a significant issue within CS. Nor has large-scale system configuration. These problems, CS people have said to me, are just issues of proper software engineering practice. But both of these are leading causes of operational failure in existing systems.

    Lesson: Some fields which deserve concerted CS research have not been embraced by the CS community. My theory is that CS has really solved a problem when it's not causing a problem any more for real systems and their real users. That is, it's not enough for CS types to publish some abstract descriptions of how the world should be.

So, then, why did I still enjoy a year of it, and intend to continue on further?

  • I get to have great stuff explained to me. In my first year, I've had professors carefully explain to me how some of the best-designed systems in all of CS work: the domain name system (DNS); the network file system (NFS), the Andrew File System (AFS), HTTP, TCP Congestion Control, IP, TCP, and UDP protocol operation, DiffServ, IntServ, Class-Based Queueing (CBQ), online Real-Time scheduling algorithms, MPEG audio (mp3) and video encoding, JPEG video, RTP, NTSC video encoding -- and others.

    Of course, I could have read about these myself -- and I already knew about some of them -- but there's something great about having someone carefully explain a system or an algorithm with the aid of slides, and to know that they're explaining something that's actually useful because I use them.

  • CS course provide motivation to do good things. Even CS practitioners like me can fall for the trap of believing that something is easy because we can conceptually imagine an approach to it. These are all of the programs that I imagine that I know how to write, but I don't actually go to the effort to write them. CS professors actually make me wrote some of those programs, and wrestle with the details inherent.

    I learned from undergrad CS programming assignments, but rarely very much. They weren't very hard; most of them took four hours or less, even in the senior-level classes. By constrast, graduate level programming assignments can be rough -- some take weeks of daily effort. But I implemented some neat stuff -- many parts of an operating system including distributed IPC, scheduling, and process management; a framework for distributed, fault-tolerant applications; two real-time scheduling simulators; a web server (in C); an RPC-accessible cache server; and a model checker for software validation (in ML).

  • I'm learning about scientific evaluation. CS is about identifying problems, developing solutions, and showing that the solutions actually solve the problem. This later part helps to make CS a science, but it's not something that we do naturally.

    It's about claims: if I claim that this program does something, I need to be prepared to prove it. And not just prove it to a willing accomplice. I need to be able to show that a system necessarily has certain properties. This is done with analytical proof (by developing theoretical models about which properties can be logically deduced); with simulations, and with actual experiments.

    The quality of the simulation or of the experiment matters, too -- even a skeptic should not be able to argue that my results are invalid. And the quality of your evaluation matters, too: I'm learning to be skeptical, and to require that you substantiate your claims with quality evaluation.

    Having good ideas is just the beginning.

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!