Older blog entries for mdupont (starting at number 24)

Progress on treecc and cwm.

Hacking CWM scripts to process the GCC rdf output of the introspector.

Soon we will have a script that will convert a C program into a skeleton RDF ontology!

Is it possible to transform this CWM proof into a Perl program that uses Redland directly, or into a C++ program?

How can you make CWM dump all the data about a proof and how it is executing it, so that it is possible to optimize it?

Maybe a trace of the execution, written in such a way that recurring elements are easy to recognise. It should be possible to turn the proof into a finite state automaton.

mike

I have been reading some of the other journal entries, and noticed that mine are not very well written in comparison.

Just been working on CWM for processing the n3 files from the introspector.

I think the next step for me is to extract data and create ontologies. Instead of writing new code, we need to use the existing tools to understand the code of the introspector's target modules.

In its current state the introspector is good enough to create ontologies automatically from the struct, field and function information in gcc.

There is no need to have a full ontology, just enough information so that one can relate the various models to each other.
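
A minimal sketch of what such a skeleton generator could look like, in Python. The input dictionary, the `example.org` namespace and the `struct_field` naming scheme are all assumptions made for illustration; this is not the introspector's real output format or code.

```python
# Hypothetical input: struct name -> list of (field name, C type).
# This is NOT the introspector's real data format, only an illustration.
STRUCTS = {
    "point": [("x", "int"), ("y", "int")],
    "segment": [("start", "point"), ("end", "point")],
}

PREFIXES = (
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix : <http://example.org/skeleton#> .\n"  # assumed namespace
)


def skeleton_ontology(structs):
    """Return an n3 string with one class per struct and one property per field."""
    lines = [PREFIXES]
    for name, fields in structs.items():
        lines.append(f":{name} a owl:Class .")
        for field, ctype in fields:
            prop = f":{name}_{field}"
            # A field typed as another known struct links two classes;
            # anything else is treated as a plain data value.
            is_struct = ctype in structs
            kind = "owl:ObjectProperty" if is_struct else "owl:DatatypeProperty"
            lines.append(f"{prop} a {kind} ; rdfs:domain :{name} .")
            if is_struct:
                lines.append(f"{prop} rdfs:range :{ctype} .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(skeleton_ontology(STRUCTS))
```

The output only says which classes exist and how they point at each other, which is about as much as is needed to relate one model to another.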

mike

I am on an ontology warpath.

Finally something that allows pure declaration and no meat! I love it.

OWL is the vaporware backbone of the new millennium!

mike

GNU FUD : http://gcc.gnu.org/ml/gcc/2003-08/msg01136.html

"""" The tree and rtl dumps are intended as debugging aids only. There is no need for them to be complete, they only need to include info we need for debugging. Since we never try to process them, we would never notice if they were incomplete.

As for completing them, that is a potential problem. An FSF policy, intended to prevent people from subverting the GPL, prevents us from emitting debug/intermediate files that could be used by others to use proprietary code with gcc without linking to gcc. This is an inconvenience, but it is current FSF policy so we must respect it. """"

What a primitive and unproductive policy.

Whatever happened to xmlterm?

http://www.xml.com/pub/a/2000/06/07/xmlterm/

http://sourceforge.net/projects/xmlterm/

Why can't we replace printf with something that emits RDF, or some other semantically marked-up stream of data, that identifies the variable printed, the format used, the datatype and the context of that emission?

Then it would be really easy to make really nice xmlterms and other funky things.

20 Aug 2003 (updated 20 Aug 2003 at 14:48 UTC) »

I have been thinking about the idea of the introspector and the relevance of rdf.

If we were to see the entire compiler toolchain as a set of RDF consumers and producers, what would be the result?

Each program would read in RDF and emit RDF. Each algorithm would add new predicates to the soup.

For example: the graph layout tool would add x and y positions.
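
As a sketch of one such stage, assuming triples are plain Python tuples and using a made-up `example.org` layout vocabulary: the stage passes its input through untouched and appends position triples. The "layout" here is a trivial row, and only subjects are positioned; a real tool would do something smarter.

```python
EX = "http://example.org/layout#"  # hypothetical vocabulary


def layout_stage(triples, spacing=100):
    """Return the input triples plus x/y position triples for every subject."""
    subjects = sorted({s for s, _, _ in triples})
    added = []
    for i, subject in enumerate(subjects):
        # Place the nodes in a single row, `spacing` units apart.
        added.append((subject, EX + "x", str(i * spacing)))
        added.append((subject, EX + "y", "0"))
    return list(triples) + added


if __name__ == "__main__":
    graph = [("n:a", "p:calls", "n:b"), ("n:b", "p:calls", "n:c")]
    for triple in layout_stage(graph):
        print(triple)
```

Because the stage only adds statements, later stages that do not know the layout predicates can ignore them.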

But what about context? How can we represent the partitioning of rdf data?

What about the idea of multiple views on the same base data? A view would be a context. When an object is viewed, it occurs in that context with a viewed predicate.
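
One possible reading of this, sketched with quads (a triple plus a context) as Python tuples. The function names and the `viewed` predicate URI are assumptions for illustration only.

```python
EX = "http://example.org/view#"  # hypothetical vocabulary


def add_to_view(quads, view, node):
    """Return a new quad list in which `node` is marked as viewed in `view`."""
    return quads + [(view, EX + "viewed", node, view)]


def statements_in(quads, context):
    """Return the triples that belong to one context."""
    return [(s, p, o) for s, p, o, c in quads if c == context]


if __name__ == "__main__":
    base = [("n:a", "p:calls", "n:b", "ctx:base")]
    quads = add_to_view(base, "ctx:callgraph", "n:a")
    print(statements_in(quads, "ctx:callgraph"))
```

The base data stays in its own context, so several views can refer to the same nodes without copying or changing them.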

more to come

20 Aug 2003 (updated 20 Aug 2003 at 11:26 UTC) »

Sent to the GCC List

--- Joe Buck <jbuck@synopsys.com> wrote:
> On Tue, Aug 19, 2003 at 12:09:13AM -0700, James Michael DuPont wrote:
> > Dear all,
> > Ashley Winters and I have produced what is the first prototype for a
> > gcc ontology. It describes the facts extracted from the gcc using the
> > introspector.
>
> Merriam-Webster defines ontology as follows:
>
> 1 : a branch of metaphysics concerned with the nature and relations
> of being
> 2 : a particular theory about the nature of being or the kinds of
> existents
>
> I don't think that this is the right term for a piece of code that
> produces XML output from a high-level language input.

You're right. The ontology here is a very high-level description of the gcc tree nodes that allows you to *understand* the RDF/XML output.

The code that produces the data matching this ontology is in my cvs, very boring stuff.

I think that this ontology is interesting because it allows you to express the high-level semantics of the tree structures in a somewhat implementation-independent manner.

When this is all done and 100% tested, we should be able to generate sets of C functions to process data from that ontology, databases to store it, and other things like Perl and Java classes to process the data as well.

n3 coupled with CWM and Euler makes a logic programming language like Prolog: you can express not only data schemas but also proofs, filters and algorithms in n3/RDF format.

I hope that, in the very, very long term, the proofs expressed in CWM and Euler can be translated automatically into new C functions of the compiler.

In any case this ontology is meant to be human readable and editable, even if not very pretty. Later on, a lower-level ontology will contain mappings to the exact gcc structures and functions that implement these ASTs.

This ontology should be of interest and value to anyone wanting to study the gcc ASTs, not just to someone who wants to deal with an external representation.

The proofs expressed in n3 should be executable directly on the gcc data structures in memory, without any external representation, once we are able to map out all the data structures and generate binding code.

Then users will be able to write custom filters, algorithms and rules that run inside the gcc for them on their own programs.
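
To give a feel for what a rule of this kind does, here is a toy forward-chaining sketch in Python. It is not how CWM or Euler are implemented, and the `calls` predicate and fact names are made up. Variables start with `?`, and one rule is applied repeatedly until no new facts appear.

```python
def match(pattern, triple, bindings):
    """Extend `bindings` so that `pattern` equals `triple`, or return None."""
    result = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Bind the variable, or check it against its earlier binding.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def apply_rule(premises, conclusion, facts):
    """Forward-chain one rule over `facts` until a fixed point is reached."""
    facts = set(facts)
    while True:
        envs = [{}]
        for pattern in premises:
            # Keep every way of matching this premise under earlier bindings.
            envs = [b2 for b in envs for f in facts
                    if (b2 := match(pattern, f, b)) is not None]
        new = {tuple(b.get(term, term) for term in conclusion) for b in envs}
        new -= facts
        if not new:
            return facts
        facts |= new


if __name__ == "__main__":
    facts = {("a", "calls", "b"), ("b", "calls", "c"), ("c", "calls", "d")}
    rule_if = [("?x", "calls", "?y"), ("?y", "calls", "?z")]
    rule_then = ("?x", "calls", "?z")
    for fact in sorted(apply_rule(rule_if, rule_then, facts)):
        print(fact)
```

In n3 the same rule would be written roughly as `{ ?x :calls ?y . ?y :calls ?z } => { ?x :calls ?z } .`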

19 Aug 2003 (updated 19 Aug 2003 at 14:01 UTC) »

Regarding this file : http://introspector.sourceforge.net/2003/08/16/introspector.n3

Basically this is a high-level class model for the GCC internal tree structures as used by the C compiler and, not yet completely, the C++ compiler.

The file is based on the OWL[1] vocabulary, an RDF[2] application whose syntax can be written in RDF/XML[3], n3[4] or ntriples[5] format.

""""The Web Ontology Language OWL is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of RDF (the Resource Description Framework) and is derived from the DAML+OIL Web Ontology Language. """"

This file describes the data extracted by the introspector [0] from the gcc. The format of the file is closely related to the -fdump-translation-unit format, but more usable. I patched the gcc using the Redland RDF Application Framework [8] to serialize these tree dump statements into RDF statements, using the Berkeley DB backend for fast storage.

The DB is then available for querying from C/C++, Java, Perl, Python and many other languages via the Redland SWIG interface. What is more, you can filter out interesting statements into RDF/XML format for interchange with other tools.

You can find an example file extracted from the source code of internals of the pnet runtime engine here [9].

The ontology file is basically a powerful class model. Many tools can edit and view such files, though I have not tried most of them; two of them are the rdfviz tool and the OWL validator [10].

I used the Closed World Machine [6] from Tim Berners-Lee to process and check this file. That tool, along with the EulerSharp [7] that I am working on, will allow you to run queries, filters and proofs over the data extracted from the gcc.

Further still, my intent is to embed a small version of the Euler machine into the gcc and dotgnu/pnet to allow proofs to be made at compile time.

mike

[0] Introspector - introspector.sf.net

[1] OWL - http://www.w3.org/TR/owl-ref/

[2] RDF - http://www.w3.org/RDF/

[3] RDF/XML - http://www.w3.org/TR/rdf-syntax-grammar/

[4] n3 - http://www.w3.org/2000/10/swap/Primer

[5] ntriples - http://www.w3.org/2001/sw/RDFCore/ntriples/

[6] CWM from timbl - http://www.w3.org/2000/10/swap/doc/cwm.html

[7] Eulersharp - http://eulersharp.sourceforge.net/2003/03swap/

[8] Redland - http://www.redland.opensource.ac.uk/

[9] Example n3 file - http://demo.dotgnu.org/~mdupont/introspector/cwm.rdf.gz

[10] RDFVIZ and validator - http://www.ilrt.bristol.ac.uk/discovery/rdf-dev/rudolf/rdfviz/ and http://owl.bbn.com/cgi-bin/vowlidator.pl

I have posted the first version of the introspector ontology here: http://introspector.sourceforge.net/2003/08/16/introspector.n3

Please review.
