These are my notes in the fields of computer science and technology. Everything is written with ABSOLUTELY NO WARRANTY of fitness for any purpose. Of course, feel free to comment on anything.

Monday, April 20, 2009

Count distinct values in all columns of a table

The following rake task will show the number of different values of each column of a given table in the database. I wrote it to see statistics on which columns are really used.
namespace :db do
  desc "Count distinct values for each non-empty column of a table (TABLE=xxx) " +
       "optionally using a condition (WHERE=\"yyy\")"
  task :count_distincts => :environment do
    raise "Specify option TABLE=<table_name>" unless ENV["TABLE"]
    puts "Distinct values in table #{ENV['TABLE']}:"
    puts "(condition: where #{ENV['WHERE']})" if ENV['WHERE']
    c = ActiveRecord::Base.connection
    # "describe" is MySQL syntax; each result row holds the column name under "Field"
    columns = c.select_all("describe #{ENV['TABLE']};").map { |a| a["Field"] }
    columns.each do |col|
      sql = "select count(distinct #{col}) as c from #{ENV['TABLE']}"
      sql << " where #{ENV['WHERE']}" if ENV['WHERE']
      n = c.select_one(sql)['c']
      # the adapter may return the count as a String or as an Integer
      puts "#{col}: #{n}" unless n.to_i == 0
      STDOUT.flush
    end
  end
end

Thursday, April 16, 2009

assert.h

Assertions: you state your expectations at a certain position in the code, and this way you catch bugs that would otherwise be very difficult to trace.

In C use the standard library macro assert. You must #include <assert.h>, then you write assert(expression); in the code, and if the expression evaluates to 0 / false, the program aborts with an error message printed on stderr that tells you which assertion failed and where it is (file/line).

It is also possible to turn assertions off by adding a #define NDEBUG before the #include <assert.h>.
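
A minimal sketch of how this looks in practice. The function name average and its preconditions are my own invention for illustration, they are not from any library:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the caller must pass a non-empty array. */
int average(const int *values, int count)
{
    assert(values != NULL);
    assert(count > 0);      /* if false: message on stderr, then abort() */

    long sum = 0;
    for (int i = 0; i < count; i++)
        sum += values[i];
    return (int)(sum / count);
}
```

Calling average with count == 0 aborts the program at the second assert instead of dividing by zero later. If NDEBUG is defined before the #include, both checks disappear from the compiled code.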

Tuesday, April 14, 2009

Why main is int and not void.

The standard says so. And it actually makes a difference: here is a detailed discussion about this topic: http://users.aber.ac.uk/auj/voidmain.shtml

Why do you have to pass scanf the address of the variable to write in?

When I was first learning C, I remember asking myself this... I would have preferred it to return the scanned values, something like:

myVar = scanf("%s") /* don't do this :) */

but of course, in this case (1) you could assign only one variable, and (2) you would not have had the return value (the number of items read)...

Here is some discussion of it: ...
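
A small sketch of both points. To keep it easy to try out, it uses sscanf (same family, reads from a string instead of stdin), and parse_pair is a made-up helper name:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads two integers from text into *a and *b.
 * Returns the number of items converted, exactly as sscanf reports it. */
int parse_pair(const char *text, int *a, int *b)
{
    assert(text != NULL && a != NULL && b != NULL);
    /* The addresses let one call fill several variables,
     * while the return value stays free for the item count. */
    return sscanf(text, "%d %d", a, b);
}
```

With int x, y; a call like parse_pair("3 4", &x, &y) fills both variables and returns 2, while parse_pair("7 oops", &x, &y) returns 1 and leaves y untouched, so the caller can tell that the second item was not read.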

Turn colors on in vi/vim

If the color syntax highlighting is off, you can turn it on by editing (or creating) ~/.vimrc, adding the following line:

:syn on

Monday, April 13, 2009

square root

When I have some time, I will have a look at this algorithm to calculate the integer square root of a number:
    // Named isqrt so that it does not clash with sqrt() from <math.h>.
    int isqrt(int num) {
        int op = num;
        int res = 0;
        int one = 1 << 30; // The second-to-top bit of a 32-bit int (1 << 14 for a 16-bit short)

        // "one" starts at the highest power of four <= the argument.
        while (one > op)
            one >>= 2;

        while (one != 0) {
            if (op >= res + one) {
                op -= res + one;
                res = (res >> 1) + one; // same as "res += one << 1; res >>= 1;" but cannot overflow
            } else {
                res >>= 1;
            }
            one >>= 2;
        }
        return res;
    }

(from Wikipedia)

Saturday, April 11, 2009

XSLT

XSLT: take an XML doc and, through a sort of "stylesheet", transform the data into something else (another XML doc, or perhaps an XHTML doc). I don't know this language but I guess it's rather useful (although I have no use for it at the moment).

related stuff: libxml, libxml2, libxslt, xsltproc

Links:
- specifications@w3: http://www.w3.org/TR/xslt


PragProg media: Expression Engine Techniques

I often have a look at the list of media of "PragProg" (http://www.pragprog.com/categories/all?sort=pubdate). The newest title is the screencast series "Expression Engine Techniques".

Well, I definitely don't need it, but I'm always curious, so here is the info I collected (mainly from Wiki) about the topic:

ExpressionEngine (http://expressionengine.com/) is a CMS developed by EllisLab. There is a free version and two paid ones. A "2.0" is expected in 2009 and will be based on CodeIgniter, which is a PHP framework.


capistrano using git

When you are using git, the cap deploy:update task runs the following git command:

git checkout -q -b deploy 3fe75somehash.....

meaning:

- creates a new branch named "deploy" (what happens if there is already one called like that?)
- the source of the branch is the commit identified by the given hash
- -q option is "quiet mode"

rsync and symbolic links

some possible behaviours of rsync (from the man page):

(1) SKIP:
default case => symlinks are simply not followed

(2) COPY THE LINK:
"symlinks are recreated with the same target on the destination"
rsync --links
also: rsync --archive implies --links

(3) FOLLOW THE LINK
rsync --copy-links

safe/unsafe:

relevant for case 3 is the "safe" vs. "unsafe" difference, which I did not understand well.
This is what is written: "An example where this might be used is a web site mirror that wishes to ensure the rsync module they copy does not include symbolic links to /etc/passwd in the public section of the site. Using --copy-unsafe-links will cause any links to be copied as the file they point to on the destination. Using --safe-links will cause unsafe links to be omitted altogether."
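
As far as I understand the man page, a symlink counts as "unsafe" when its target is an absolute path, is empty, or contains enough ".." components to climb out of the tree being copied. Here is a simplified sketch of that rule. It is not rsync's own code, and link_is_unsafe and its depth parameter are names I made up:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, a simplification of the rule in the man page.
 * target: the text stored in the symlink.
 * depth:  how many directories the link sits below the top of the copied tree.
 * Returns 1 if the target points outside the tree, 0 otherwise. */
int link_is_unsafe(const char *target, int depth)
{
    assert(depth >= 0);
    if (target == NULL || target[0] == '\0' || target[0] == '/')
        return 1;                       /* empty or absolute */

    const char *p = target;
    while (*p != '\0') {
        size_t len = strcspn(p, "/");   /* length of the next path component */
        if (len == 2 && p[0] == '.' && p[1] == '.') {
            if (--depth < 0)
                return 1;               /* climbed above the top of the tree */
        } else if (len > 0 && !(len == 1 && p[0] == '.')) {
            depth++;                    /* went down into a subdirectory */
        }
        p += len;
        if (*p == '/')
            p++;
    }
    return 0;
}
```

So a link in the top directory pointing to docs/readme is safe, while one pointing to /etc/passwd or to ../../etc/passwd from one level down is unsafe.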

Always back up on a dedicated partition or disk!

My home dir on a certain server was backed up by another company. I set up a soft link under my home to another location (several GB of data). What I didn't know: their backup script was rsync-based *with* the option --copy-unsafe-links. Rsync followed the link and backed up tons of stuff, clogging the backup hard disk. They used the same hard disk and the same partition for another function as well: the email server. So that soft link disrupted the email server! At first I thought I was to blame, but on second thought I understood that this is not 100% true: they should *never* have run the mail server on the same partition where they backed up that stuff!


Ruby and GUIs

Is there a good ruby GUI toolkit?

I want to write a simple app, with a simple DB backend, just to keep some notes. As I want it to run on my laptop, I don't want any webserver running, so no Rails app (and also because I want to do some GUI programming: for years I have only been working on console or web apps).

I started my Google quest to answer this question earlier this morning, and I have not gotten far yet. I am no expert in GUIs.

Some disorganized thoughts:

- Java of course is a good choice, isn't it? Yes, I want to learn Java again (I learned it 10 years ago and have not used it for a long time, so I guess my knowledge is totally out of date). But right now I just want to write a small app in that simple, lovely little Ruby language.

- I guess most Win apps are based on "native" widgets. For that maybe you need VS or the like, and at the moment I am in no mood to be an MS fan. And I want something cross-platform.

- There are libraries, and as always in IT a lot of names and acronyms, just to confuse stupid newbies like me. Qt is one, I think, then there are others (I think some GUI library flame war is also the reason why Gnome and KDE are 2 different desktops, isn't it?). So I guess you use one of those with some good Ruby bindings and you are set. Right?

- I read about Shoes: very simple, maybe a good idea, but probably just for beginners. I don't know if I want to spend my time on it. It looks like it is just something for kids learning programming?

About Me

Hamburg, Hamburg, Germany
Former molecular biologist and web developer (Rails) and currently research scientist in bioinformatics.