3 Feb 2008 olecom   » (Observer)


,--
| [1] ftp://flower.upol.cz/dts/Sed0000/hacks/strip-c.sh
`--

Viewing this in a browser window can scare anybody. That's why no one so far has read the whole links in the previous post. But let's put aside the fact that an open source browser doesn't support syntax highlighting of the most basic thing: the shell language and UNIX tools, available for 20-30 years.


,--
| http://kerneltrap.org/Linux/Decoding_Oops
`--

Here is the main point of kernel development: understanding, knowledge, ways of debugging. The starting point is assembly.

The assembly language of text processing is sed. The assembly language of userspace is the shell (mainly POSIX, but not ksh or bash). Both are in a very, very upsetting state.


,--
|[3] http://lxr.linux.no/linux/scripts/cleanfile 3492 bytes
|[4] http://lxr.linux.no/linux/scripts/cleanpatch 5132 bytes
|
|I've come up with this one:
|
|ftp://flower.upol.cz/clean-whitespace/clean-whitespace.sh
`--
Looking at the 1024 bytes of this one, it is clear how important syntax highlighting is for understanding and developing BRE+sed+shell tools.

So, what is the progress?

First, the shell language is very complicated in terms of parsing. I don't know if any academic compiler theory professors have actually looked at it. Kenneth Almquist, who wrote ash, was from the EDU domain, but he did a practical implementation.

I must edit this post to add: "What utter crap!" (see "An Introduction to ANTLR: A parser toolkit for problems large and small").

Recursive nesting of subshells in ``, $() and quoting are the main features. Even using them is not a trivial thing at all. But i saw that trivial scripting isn't that easy for some kernel developers (script in 47128D76.5050407@garzik.org).


#!/bin/sh -e

#
# usage: mkmsg [branch] [output text file]
#

# The branch name is mandatory; complain on stderr and stop without it.
test "$1" && BRANCH=$1 || { echo "Empty BRANCH, exiting." >&2; exit 1; }

# With a second argument, send everything printed below to that file.
if test "$2"
then exec >"$2"
else : "Print to stdout"
fi

# should work much better than 2 fork()s
REPO=${PWD##*/}

echo "Please pull from '$BRANCH' branch of
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/$REPO.git
$BRANCH

to receive the following updates:
"
git diff -M --stat --summary "master..$BRANCH"
echo
git log --no-merges "master..$BRANCH" | git shortlog
git diff "master..$BRANCH"

So, with this one it is even possible to use the script interactively and/or in a pipe chain to mail stdout directly, in addition to better, i.e. more efficient, usage of the system and the kernel...
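
The two shell features mentioned above, nested command substitution with quoting and parameter expansion that avoids extra processes, can be compared side by side. This is only a minimal sketch: the function names and the sample paths are made up for illustration and are not part of the script above.

```sh
#!/bin/sh
# Hypothetical helpers, named here only for illustration.

# Last path component by parameter expansion, as in REPO=${PWD##*/}:
# the shell does the work itself, no external tool is started.
last_component() {
    printf '%s\n' "${1##*/}"
}

# Name of the parent directory using external tools and a nested
# command substitution. The quotes inside $(...) are parsed on their
# own, so they do not end the outer double-quoted string.
parent_by_tools() {
    basename "$(dirname "$1")"
}

# The same answer for ordinary paths, using two parameter expansions.
parent_by_expansion() {
    parent=${1%/*}                  # strip the last "/component"
    printf '%s\n' "${parent##*/}"   # keep what follows the last "/"
}

last_component /home/user/src/myrepo       # prints: myrepo
parent_by_tools /usr/local/bin/tool        # prints: bin
parent_by_expansion /usr/local/bin/tool    # prints: bin
```

The expansion variants only do string work, so they do not normalise odd paths (trailing slashes, "/" itself) the way basename and dirname do.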

Second. While trying to implement highlighting i've discovered that the kernel doesn't apply the background attribute to the tab symbol. I.e. tabs are always black, no matter what background color is set. This led me to a deeper understanding of tty processing, mainly tabstops (theory can be found in UNIX Power Tools).

The problem can be solved by converting tabs to spaces. But knowing what tabstops are, it's not such a trivial thing to do. In clean-whitespace.sh i've used the `expand` UNIX tool. Now i wanted to do it in sed.
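
Why it is not trivial: a tab is not a fixed number of spaces, it moves the cursor to the next tab stop, so its width depends on the column where it occurs. The sketch below shows one straightforward way to express that in sed. It is not the expand.sed.sh script linked in this post and it does use a loop. It assumes tab stops every 8 columns and one column per character; the function name expand_tabs is made up.

```sh
#!/bin/sh
# Hypothetical helper: expand tabs to spaces, tab stops every 8 columns.
expand_tabs() {
    tab=$(printf '\t')      # a literal tab for use inside the regex
    spaces='        '       # 8 spaces, shortened by one on every round
    script=':next'
    k=0
    while [ "$k" -lt 8 ]; do
        # The first tab of a line, preceded by 8*m + k non-tab
        # characters, is replaced by 8 - k spaces.
        script="$script
s/^\\(\\([^$tab]\\{8\\}\\)*[^$tab]\\{$k\\}\\)$tab/\\1$spaces/"
        spaces=${spaces# }
        k=$((k + 1))
    done
    # Repeat while any substitution succeeded, i.e. while tabs remain.
    script="$script
t next"
    sed "$script"
}

# "a" occupies column 1, so the tab becomes 7 spaces and "b" lands in column 9.
printf 'a\tb\n' | expand_tabs
```

Building the eight substitutions in a shell loop keeps the 8-k arithmetic in one place; writing the eight s commands out by hand would work just as well.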

Andreas Schwab makes very interesting, mostly one-line comments on LKML (i even checked old archives from the 90s). Once he pointed out that i had wrongly explained the "\{\}" BRE construction. While i knew what it means, i hadn't really used it at all. But after that occasion i sat down and did read+exercise. This helped me to do my version of `expand`.

Funny that this[0] turned out to be even more efficient than the one from recognised sed guru Greg Ubben[1].
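
For readers who, like me back then, rarely use it: "\{m,n\}" is the BRE interval, it repeats the preceding item between m and n times ("\{m\}" exactly m times, "\{m,\}" at least m times). A small sketch with two made-up helper functions:

```sh
#!/bin/sh
# Hypothetical helper names, for illustration only.

# " \{2,\}" matches a run of two or more spaces; replace each run by one.
squeeze_spaces() {
    sed 's/ \{2,\}/ /g'
}

# "^.\{3\}" matches exactly the first three characters of a line.
mark_first_three() {
    sed 's/^.\{3\}/[&]/'
}

printf 'a    b  c\n' | squeeze_spaces    # prints: a b c
printf 'abcdef\n' | mark_first_three     # prints: [abc]def
```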

[0] ftp://flower.upol.cz/dts/Sed0000/hacks/expand.sed.sh

4 ("3 basic" + "1 speed optimization") op-lines, 2 numbers, no loops

[1] http://sed.sourceforge.net/local/scripts/expand.sed.html

6 (5 + 1) op-lines, 2 numbers, loop

@(#)14apr89/31aug01 expand.sed by Greg Ubben

means that in more than 10 years there was no better solution, heh. Also, the coloring is a bit broken: opening "\(" and closing "\)" do not match.

I hope i have no errors there. Checking is left to the reader.

Thus, progress on such a basic thing is kind of unexpected. After understanding the achievement, i was depressed. More depression followed when i discovered the original ash/atty sources.


-- coming next --
shell, tty, text user interface
-- 
-o--=O`C
 #oo'L O
<___=E M
