/dev/blogg

måndag 9 maj 2011

By the way... sqlite visualization script is on GitHub

SQLite is great, and I needed a script to visualise the schema of the data base I am using for the documentation. So, this meant the birth of sqlite2dot (https://github.com/dargosch/sqlite2dot) which takes an SQLite database and outputs a .dot file wich when vizualised using graphviz shows the tables and the foreign key constraints between these tables in a graph. Very handy.

Back to oriented programming...

I find myself going back to the object oriented programming paradigm more and more for my Tcl project. Just a very convenient way of encapsulating data, and it fits well with how I think about things too. And it maps well with the relational database idea.

But I refuse to do inheritance though. Just plain refuses..
Inheritence is evil!

fredag 3 december 2010

Putting my cheek out there...

I posted on comp.lang.tcl today. I need to make people see that documentation is important. This is what I posted:

...
I posted the initial question because I wanted the experts' point of
view. I am still learning, so I though it would be good to see what
the features are that keep people within the Tcl language.

However, I have to confess that when I saw that list, I saw a list
that sounded a bit "not updated" to me. I mean no disrespect, but as a
newcommmer to the language, I can't help but provide something of a
newbe's view of these points. I hope you'll see them as friendly
comments on something where I think more could be done:

* Rapid development
Certainly true, but up to a point. Finding good, solid information
about how to set up a development cycle for a package with large
number of internal dependencies is not easy (are there any?). For
instance, the "pkg_mkIndex" command seems to assume that all of the
parts of the package is there already. What do I do then when I just
need to check that the part I am working on right now works? I don't
need help on this really (I think I now have a good system for it) but
I need you to see that the process of actually figuring this very
simple and basic step of the development cycle is, to say the least,
opaque. The infrastructure for developing an extension through is much
more transparent.

* Graphical user interfaces
Yes, certainly true. Tk is a feature. Although, it is not only a
feature of Tcl anymore.

* Cross-platform applications
Yes, this is true. Although not unique to Tcl, the cross-platform
nature of the language is a strong feature, and not highlighted
enough. Many languages are "cross platform" but usually not as solidly
so.

* Easy to learn
Yes, well. This is true, up to a point. I think everyone is aware that
the syntax is a bit unorthodox in places, and that scares people. I
know that the rules are simple and consistent, but I have to say that
for a person with mainly C/Java/Python/Perl experience, you get
surprised quite often.

* Mature but Evolving
Yes, mature is a strong point. This should be highlighted more. I
think "stability" also goes into this point.

* Extend, Embed and Integrate
Yes, this is the main feature, and why I stay with Tcl for my current
project. I need cooperation with the R language, and the possibility
of embedding my code into other programs, and Tcl gives me that in an
efficient way.

* Deployment
Hmm, not sure that this is a strong point anymore. I think most
languages have a viable deployment option nowadays. I'm sure that
starkit/starpacks are more evolved than other solutions, but I don't
see a big surge of starpack-based applications. So, maybe this needs
to be proven?

* Testing
Sure, the testing harness and benchmarking command are quite nice.
Needs some documentation though... And, again, testing the accurate
functioning of a command loops back into the problems in setting up
the development environment properly.

* Network-aware applications
Certainly true, but not really unique. The event-based approach is
nice though.

* The Tcl community
Yes, I have gotten good response from this list.

* It's free!
Yes, well.

I would add the excellent handling of UTF-8 and encodings in general.
I see this as a major point - somewhere around 5/6th or something of
the earths population does not speak a language where the ascii
alphabet would be sufficient representation of the writing system, so
this is a strong point (ok, this is a very rough estimate based on the
number of native speakers of languages, so please do not consider this
statement as anything other than wild speculation and a ballpark
figure, so don't quote me on it).
Anyway, the no-fuss handling of UTF-8 should be highlighted more, I
think :-)

What about speed? Is speed a feature? Tcl is not included in, for
instance, the "Computer Language Benchmark Game" (http://
shootout.alioth.debian.org/). Maybe someone would be interested in
submitting code?

What about documentation? Is that a feature? I see some modules with
lots of documentation, and some with none. Maybe there is cause for a
standard API documentation procedure + a standard way of documenting
how to use the code? I suspect that the lack of a purposeful
documentation procedure makes people not put any effort into
documentation. It does not matter what the format is, as long as it is
accessible and produces nice results. Maybe the recent effort to get
Tcl capability into doxygen is the way to go (I've used Natural Docs
myself, successfully), as long as it affords the package developer to
write documentation on the usage of the _use_ of the code, not just
what each procedure does. I think it could be a feature, if the
community put focus into one effort.

What about interoperability with code developed in other languages? Is
this a feature? I see code for calling (or be called from) C (of
course), Java, Python and R (at least call). Are there others?

I think what I want to say here is that the list I found on the tcl.tk
did not sound current. There are issues, I think, with some of the
points put forward there, and there are other that I would stress
instead.
...

Hope noone is offended by this...

lördag 20 november 2010

Iterative thinking dulls the mind...

After having some more research online, I have come to the conclusion that developers are not always clever people. Maybe it's the iterative thinking that stands in the way of enlightenment..? Anyway, you can read some really stupid things online on this topic. Like this one:

"The problem with triggers is that they cannot be "seen." It is easy to view table relationships, constraints and indexes in the database. On the client-application side, it is also simple to analyze the code. On the other hand, triggers are automatic programs defined in the database that execute behind the scenes as part of the command that fired them. It is hard to follow their logic and in time, it is easy to forget about the triggers in SQL Server, especially if they are not well-documented.

Here are some trigger design tips to keep in mind:

Avoid using nested triggers
By default, if a trigger is changing other tables, the triggers declared for these tables are not fired. The "allow nested triggers" server option sets databases to have the opposite behavior. Triggers are nested when a trigger performs an action that initiates another trigger, which can initiate another trigger and so on. Triggers can be nested up to 32 levels. It is very difficult to follow the logic of nested triggers and they can affect performance.

Avoid using recursive triggers
There are two types of recursion:
- Direct recursion occurs when a trigger fires and performs an action that causes the same trigger to fire again. You can prevent that from happening by setting the "recursive trigger" database option to OFF.
- Indirect recursion occurs when a trigger fires and performs an action that causes a trigger on another table to fire. This second trigger causes an update to occur on the original table, which causes the original trigger to fire again. This can be prevented with the "nested triggers" server option."

I'm sorry, but that is just stupid. Ok, you get some of hte same issues as with threads using triggers, but if you plan it carfully, you should have no problem. Like in my example from my previous post, the creation of a transitive closure of a graph, nothing could logically go wrong. Worst case scenario, the TC will have been built for n-1 cases when the last trigger is executed. This means an entire run is done just because of one missing piece of the TC, but hey! Not a big problem, and really not dangerous. If SQLite featured "FOR EACH STATMENT" triggers, this would not even be a problem at all.

So why state that triggers should be avoided? Pointers are powerful and therefore also dangerous too. So, maybe they should not be used either? :-) Good luck developing software this way....

I think more functional programming thinking is needed, and less interative thinking.

tisdag 16 november 2010

Recursive triggers..

Iterative programming is too dominant. Some how, most programmers have been trained in using for/while loops, iterators, coroutines and similar constructs so sucessfully to see that they are not needed, or indeed not very elegant.

One case in point is computing the transitive closure of a graph in SQL. We can easilly find iterative solutions (like this one or even this one) which could easilly have been written in an interative language. SQL is not an interative language, and I see no point of using control constructs like that. All we really need is triggers.
(I'm using SQLite for this, other RDMS may work too :-) ):

So, this is what I am playing with:

PRAGMA recursive_triggers = 1;
--- SQlite verison > 3.6.18 is needed for this to work.

-- Clears the TC store and insert simple connections
-- This is the start point in the call sequence and is triggered by the user inserting data

create trigger ll_tcll_insert after insert on level_level for each row begin
delete from tc_level_level where depth > 1;
insert into tc_level_level select NEW.parent_id ancestor_id,NEW.child_id descendant_id,1 depth,NEW.parent_id || ',' ||NEW.child_id path from level_level;
end;

-- This trigger does the recursive TC building by merging of subgraphs
create trigger tcll_tcll_insert after insert on tc_level_level for each row begin
insert into tc_level_level select tc1.ancestor_id, tc2.descendant_id, tc1.depth + tc2.depth,tc1.path || substr(tc2.path, length(tc2.ancestor_id)+1) from tc_level_level tc1, tc_level_level tc2 where tc1.descendant_id = tc2.ancestor_id;
end;

-- This trigger clears the TC store and deletes the simple connection when a simple connection is deleted

create trigger ll_tcll_delete after delete on level_level for each row begin
delete from tc_level_level where depth > 1 or (OLD.parent_id = ancestor_id and OLD.child_id = descendant_id) ;
end;

-- This trigger is executed when something in the TC store is deleted. It runs exactly one TC run, so that teh main TC builing trigger is executed.

create trigger tcll_tcll_delete after delete on tc_level_level for each row when (select max(depth) from tc_level_level) = 1 begin
insert into tc_level_level select tc1.ancestor_id, tc2.descendant_id, tc1.depth + tc2.depth,tc1.path || substr(tc2.path, length(tc2.ancestor_id)+1) from tc_level_level tc1, tc_level_level tc2 where tc1.descendant_id = tc2.ancestor_id;
end;

These triggers combined results in the following example run (level_level links adjecent nodes together and the transitice closure (TC) is then stored in tc_level_level):

sqlite> select * from level_level;
sqlite> select * from tc_level_level;
sqlite> INSERT INTO "level_level" VALUES(1,2);
sqlite> INSERT INTO "level_level" VALUES(1,3);
sqlite> INSERT INTO "level_level" VALUES(3,4);
sqlite> INSERT INTO "level_level" VALUES(4,5);
sqlite> INSERT INTO "level_level" VALUES(2,5);
sqlite> select * from level_level;
parent_id child_id
---------- ----------
1 2
1 3
3 4
4 5
2 5
sqlite> select * from tc_level_level;
ancestor_id descendant_id depth path
----------- ------------- ---------- ----------
1 2 1 1,2
1 3 1 1,3
3 4 1 3,4
4 5 1 4,5
1 4 2 1,3,4
3 5 2 3,4,5
1 5 3 1,3,4,5
2 5 1 2,5
1 5 2 1,2,5
sqlite> delete from level_level where parent_id = 1 and child_id = 2;sqlite> select * from tc_level_level;
ancestor_id descendant_id depth path
----------- ------------- ---------- ----------
1 3 1 1,3
3 4 1 3,4
4 5 1 4,5
2 5 1 2,5
1 4 2 1,3,4
3 5 2 3,4,5
1 5 3 1,3,4,5
sqlite> delete from level_level where parent_id = 3 and child_id = 4;
sqlite> select * from tc_level_level;
ancestor_id descendant_id depth path
----------- ------------- ---------- ----------
1 3 1 1,3
4 5 1 4,5
2 5 1 2,5
sqlite> INSERT INTO "level_level" VALUES(1,2);
sqlite> select * from tc_level_level;
ancestor_id descendant_id depth path
----------- ------------- ---------- ----------
1 3 1 1,3
4 5 1 4,5
2 5 1 2,5
1 2 1 1,2
1 5 2 1,2,5

So, by using just a couple of triggers (actually, really just one for the TC computation, the rest of them are just maintenance triggers) we are able to get a TV computed. We even get the full path (which is the main reason for the complicated code sections above).

No for/while loops, and just simple clean code, written in SQLite SQL.

Quite neat really. Yay for recursion.

måndag 15 november 2010

In the beginning, there was nothing..

The point of this blog is to record thougths that came up while developing. Not much happening in that area right now, hence the empty space.

Stay tuned.