Lost Website

You Are Here

Compressing a year of timekeeping in 2 hours

without comments

I’m very bad at keeping track of the time I spend working. Timekeeping tends to require manual input, and something to remind me to do the input. The latter part is where I usually fail and lose interest. This meant that last week I had to enter a year's worth of timekeeping data, in a few hours, into the web application we use for that purpose.

This is not a problem as opaque as it might seem to some people. We use timekeeping at work to keep track of how much time is spent on specific projects, not to keep a precise account of who is working at a specific time.

The only place where that data is recorded is in our revision control system, Mercurial. It has a detailed log of the data that was committed to a repository and, if the commit message was good, an explanation of why. Scanning each repository (all 72 of them) with the default log command output would not have been feasible.

Luckily, Mercurial has a lesser-known feature which allows users to present log data in a terser way than the default. This is the --template switch, which is pretty well explained in the Mercurial manual.

The command I’m using in the script below is something like this:

hg log --template "{date|shortdate}  {author|email} {rev}\n"

Here is an excerpt of the output of this command.

...
2009-09-09  fdgonthier@kryptiva.com 1934
2009-09-21  fdgonthier@kryptiva.com 1935
2009-09-21  fdgonthier@kryptiva.com 1936
...

This shows some of the commits I made in a specific project during September 2009. From there it was trivial to extract that data from all the repositories to see what I was working on at what date. The following script loops over all my repositories and extracts from the log the dates in 2009 on which I committed something. Note that I have added another field to the template, which is the name of the directory containing the Mercurial repository. This will be used to distinguish between projects in the step after the data is obtained.

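#!/bin/sh

# Make sure the output directory exists.
mkdir -p ~/churn

# Note: directory names containing whitespace are not supported.
for i in $(find . -maxdepth 1 -type d | cut -c 3-); do
  # Only consider directories that are Mercurial repositories.
  if [ -e "$i/.hg" ]; then
    echo "Churning $i"
    (cd "$i" && \
       hg log \
         --template "{date|shortdate}  $i {author|email} {rev}\n" |\
      grep -E "^2009.*(fdgonthier)") > ~/churn/"$i"
  fi
done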
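
The grep filter in the loop can probably be replaced by Mercurial's own filtering. The sketch below assumes the --user and --date options of hg log behave as documented (a match on the user name, and an inclusive "DATE to DATE" range). churn_repo is a hypothetical helper name, and the year and user name are parameters rather than the values hard-coded above.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original script.
# $1: repository directory, $2: year, $3: user name (or part of it)
churn_repo() {
  # -R points hg at the repository without changing directory.
  hg log -R "$1" \
    --user "$3" \
    --date "$2-01-01 to $2-12-31" \
    --template "{date|shortdate}  $1 {author|email} {rev}\n"
}
```

Calling churn_repo libfoo 2009 fdgonthier > ~/churn/libfoo should then produce the same kind of lines as the loop, without depending on the exact layout of the output.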

From the files in the churn directory it’s then trivial to get a picture of everything that was worked on throughout the year. Just cat the files together and sort the whole set of lines by date.

> cd ~/churn && cat * | sort | less
...
2009-03-03  bar-daemon fdgonthier@kryptiva.com 1803
2009-03-03  bar-daemon fdgonthier@kryptiva.com 1804
2009-03-04  libfoo fdgonthier@kryptiva.com 5
2009-03-04  libfoo fdgonthier@kryptiva.com 6
2009-03-04  bar-daemon fdgonthier@kryptiva.com 1805
2009-03-04  bar-daemon fdgonthier@kryptiva.com 1806
...
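
Once the lines are merged, a little more shell can condense them into something closer to a timesheet. This is only a sketch: projects_per_day is a hypothetical helper, and it assumes the first two fields of each line are the date and the project name, as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: read churn lines on stdin and print one line per day,
# followed by the names of the projects that received commits that day.
projects_per_day() {
  # Keep only the date and project fields, dropping duplicate pairs.
  awk '{ print $1, $2 }' | sort -u |
    awk '
      # A new date starts a new output line; flush the previous one first.
      $1 != day { if (day != "") print line; day = $1; line = $1 }
      { line = line " " $2 }
      END { if (day != "") print line }
    '
}
```

Feeding it the merged files (cat ~/churn/* | projects_per_day) prints each date once, followed by the projects touched on that date.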

This will only be as accurate as your repositories are clean. For example, it might be difficult to extract only the changesets you made if you did not pay attention to correctly configuring your default commit name. It happened to me in some contexts. I also had to use the revision numbers from the log to look at the content of some commits because I could not remember which subproject they belonged to.

This is not something you want to have to do. It’s much more accurate and easier to properly feed the time tracking program on a daily basis. There is no excuse not to do it properly, but if you tend to forget that kind of thing, this trick can help.

Written by François-Denis Gonthier

January 21st, 2010 at 3:11 pm

Interfacing with Microsoft products

without comments

I work for Kryptiva, which is a small company creating security and collaboration tools. Our server products are deployed on Ubuntu and Debian Linux, but our client-side products are naturally available on Windows. We host some of our server products ourselves, but some of our servers are deployed inside client networks, which are usually Microsoft Windows networks.

Creating applications for Microsoft networks and for Microsoft Windows desktops meant we had to deal with several Microsoft technologies. This might come as a surprise to readers of a Linux technology blog, but I think that not all Microsoft technologies are nightmarish messes to deal with. To the extent of what we have needed to do, Microsoft did not fare any worse or any better than everything else I have dealt with in my programming career.

Microsoft Active Directory is the building block of all Microsoft Windows networks. It is a huge database with loads of information which can happily be accessed using the open LDAP protocol. Working with Active Directory is somewhat intimidating at first because the data it provides isn’t intuitively interpretable. For a Linux user, it takes a little while to get used to the terminology and the features of a Microsoft Windows network. Once the basics are acquired, it’s actually pretty easy to get Active Directory to return the information one wants. Of course, it has Microsoft-style quirks you have to deal with, but everything is pretty decently documented, especially since Microsoft was forced to release their documentation. The people working on Samba 4 would certainly have a lot more to say than me about Active Directory, but I honestly can’t say many bad things about it.

Microsoft is a big company that has to deal with an enormous amount of code and software interfaces internally, but they also need to answer to their clients, who all together have a massive quantity of code interfacing with Microsoft code. This means that deprecating age-old APIs is hard, and can’t always be handled in a very elegant way.

I’m not arguing that APIs should never be deprecated, but there are good ways and bad ways to deprecate an API. MAPI (Messaging API) is an age-old interface that has been in use for years to send and receive email messages on Windows machines. It’s a badly documented and quirky API, but since it’s the core of all messaging applications on Windows, it’s very powerful and difficult to do without. Sadly, Microsoft does not want people to use their shiny new .NET framework with the old MAPI. The reasons they give are very vague. To put it simply: it’s not compatible and will crash your program, but we can’t really explain why or when. The alternatives they suggest are simply not practical or not as powerful as directly dealing with MAPI.

This brings us to the worst. The interface exposed by Microsoft to plug into Outlook is just bad. It has become a kind of running gag that I have become a world expert in calculating the size of a message attachment in Microsoft Outlook 2003. The makers of the framework we use to develop Outlook plugins have even semi-acknowledged that (see the end of the post). In some contexts, getting the size of a file attached to an email message is very difficult and the only way to get it is through… MAPI. This is why it has been disconcerting to learn that Microsoft doesn’t want programmers to use MAPI in managed (.NET) code. Happily, this has been solved in later Outlook versions and Outlook 2007 exposes the size of attachments in all contexts. This means that some code using MAPI will eventually go away, along with support for Outlook 2003. Sadly, Microsoft has more surprises in store for Outlook 2010.

Details about the next Microsoft Outlook release, currently in beta, are being documented by Microsoft. In a recent newsletter, the makers of Addin Express linked to this overview of a change related to the way Outlook 2010 shuts down.

Starting in the Outlook 2010 Beta release, Outlook, by default, does not signal add-ins that it is shutting down.

Most add-ins use these events to release references to Outlook COM objects and clear memory that was allocated during the session.

Thanks Microsoft, but what about network connections? Opened databases? Plain old data files?

I can only guess why Microsoft decided that it is important that Outlook shuts down as quickly as possible, but it was apparently deemed more important than data integrity. This is something I find questionable. Microsoft has left a backdoor open for system administrators to revert to the old behavior, but in practice this is not the kind of thing you like to ask system administrators to do. There must be a handful of hackish and unsupported workarounds to solve this problem, and Outlook programmers around the world will find and document most of them, but I’m pretty sure they could all do without that…

Written by François-Denis Gonthier

January 16th, 2010 at 10:15 pm

Posted in Misc, Programming, Windows


LD_PRELOAD fun

without comments

Here is a welcome digression from my previous Twitter-oriented posts. I’m starting to play around with the LD_PRELOAD feature of the Linux dynamic linker. For those who might not know what this feature is, here is the description from ld.so (8).

      LD_PRELOAD
              A whitespace-separated list of additional,  user-specified,  ELF
              shared  libraries  to  be loaded before all others.  This can be
              used  to  selectively  override  functions   in   other   shared
              libraries.   For  setuid/setgid  ELF binaries, only libraries in
              the standard search directories that are  also  setgid  will  be
              loaded.

So in practical terms, any libraries you specify in the LD_PRELOAD environment variable will be loaded before any system libraries. This means that the dynamic symbols needed by a program being loaded will be searched for in those libraries first, before being searched for anywhere else. As a result, you can override any symbol you want that is defined in the standard libraries.

Let’s start with a rather juvenile example. This will change the behavior of the read (2) function in order to make the user believe a file has different content.

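#include <string.h>
#include <unistd.h>

/* Replaces read(2): the first call returns a canned string,
 * later calls report end of file. */
ssize_t read(int fd, void *buf, size_t count) {
    static int done = 0;
    if (!done) {
        static const char silly_str[] = "Haha you got overriden.\n";
        /* Do not hand the terminating NUL byte back to the caller. */
        size_t len = sizeof(silly_str) - 1;
        size_t s = count > len ? len : count;
        memcpy(buf, silly_str, s);
        done = 1;
        return s;
    }
    else return 0;
}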

If you compile this into a shared library called, for example, libread.so, you can test this code by running:

> /bin/cat /etc/fstab
# /etc/fstab: static file system information.
#
...
> LD_LIBRARY_PATH=. LD_PRELOAD=libread.so /bin/cat /etc/fstab
Haha you got overriden.
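
For reference, building such a library could look like the sketch below. It assumes gcc is the compiler. build_preload_lib is a hypothetical helper and read.c a hypothetical file name; neither comes from the original code.

```shell
#!/bin/sh
# Hypothetical helper: compile one C source file into a preloadable library.
# $1: C source file, $2: output library (for example libread.so)
build_preload_lib() {
  # -shared produces a shared object and -fPIC position-independent code.
  # -ldl only matters when the source calls dlsym(); it is harmless otherwise.
  gcc -shared -fPIC -o "$2" "$1" -ldl
}
```

With the override saved as read.c, running build_preload_lib read.c libread.so should produce the library used in the command above.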

That in itself is just a rather silly prank you can play on a friend’s computer if you happen to have access to it. Experienced programmers will start seeing potential uses for LD_PRELOAD. I am getting to that.

The subject of our next example will be the honorable ls (1). ls uses the opendir (3) function to open a directory and browse its files. It should react properly if it can’t open the directory. One way to test this is to make opendir() return NULL and observe how the caller reacts. You can do that using LD_PRELOAD.

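#include <dirent.h>
#include <stddef.h>

/* Always fail, as if the directory could not be opened. */
DIR *opendir(const char *name) {
    return NULL;
}

> LD_LIBRARY_PATH=. LD_PRELOAD=libls1.so /bin/ls /tmp
/bin/ls: cannot open directory /tmp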

What can you do now if you want to preserve part of the behavior of the function, or modify the result it returns? Your preloaded library then needs to use libdl to dynamically look up the function whose behavior it wants to modify.

The following example is a very simple override of the opendir (3) function which opens a different directory than the one the caller expects. I will explain this function in more detail below.

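#define _GNU_SOURCE   /* required for RTLD_NEXT */
#include <dirent.h>
#include <dlfcn.h>

DIR *opendir(const char *name) {
    DIR *(*libc_opendir)(const char *name);
    /* Look up the next definition of opendir after this library. */
    *(void **)(&libc_opendir) = dlsym(RTLD_NEXT, "opendir");
    return libc_opendir("/tmp");
}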

libdl is fortunately very simple to use. The naive approach would be to use dlopen (3) to open the C library, then get the pointer to the function you are calling using dlsym (3). In theory, this technique is valid and works, but it circumvents the LD_PRELOAD mechanism: preloaded libraries can be chained, and calling directly into the C library bypasses any other preloaded library that overrides the same function.

In practice, calling dlopen() on libc on an Ubuntu Karmic system made some programs crash and burn for reasons I will not attempt to explain. The next technique should be preferred on Linux systems, especially when dealing with the system C library.

dlsym() accepts a special handle that makes the Linux dynamic linker search for the next occurrence of the symbol, after the library making the call. This is RTLD_NEXT, which exists precisely for the purpose of writing wrappers around functions in other shared libraries. This leaves libdl the task of returning the pointer to the right symbol.

The next and final example of the use of LD_PRELOAD will still use the valiant ls. In time for Christmas, this will modify the output of ls by randomizing the d_type field returned in the dirent structure by readdir (3). If you use colorized ls output, and I believe most of you probably do, you should see a pretty display of colors whenever you list a directory with this function preloaded.

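#define _GNU_SOURCE   /* for RTLD_NEXT and struct dirent64 */
#include <dirent.h>
#include <dlfcn.h>
#include <stdlib.h>
#include <time.h>

struct dirent64 *readdir64(DIR *dir) {
    static struct dirent64 *(* libc_readdir64)(DIR *dir) = NULL;
    struct dirent64 *dent;
    static const unsigned char rnd_dtype[7] = { DT_UNKNOWN, DT_REG,
                                                DT_DIR, DT_FIFO,
                                                DT_SOCK, DT_CHR,
                                                DT_BLK };

    /* Resolve the real readdir64 and seed the generator on first use only. */
    if (libc_readdir64 == NULL) {
        *(void **)(&libc_readdir64) = dlsym(RTLD_NEXT, "readdir64");
        srand(time(NULL));
    }

    dent = libc_readdir64(dir);

    /* Scramble the entry type; NULL means end of directory or error. */
    if (dent != NULL)
        dent->d_type = rnd_dtype[rand() % 7];

    return dent;
}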

There is still a problem with this code on my new Ubuntu Hardy machine. The code from the preloaded library hangs before the program terminates. I do not understand why this happens and a search for this bug did not turn up anything. The problem doesn’t happen with Ubuntu Karmic.

There is nothing new about using LD_PRELOAD this way. Several very nice libraries have been built with the intention of modifying the behavior of typical libraries.

  • fakeroot: “fakeroot provides a fake root environment by means of LD_PRELOAD and SYSV IPC (or TCP) trickery.”
  • fakechroot: fakechroot provides a fake chroot environment to programs.
  • libtrash: “[...] the shared library which, when preloaded, implements a trash can under GNU/Linux”
  • cowdancer: cowdancer is a userland implementation of a copy-on-write filesystem.

There are 29 projects matching LD_PRELOAD on freshmeat.net. You might have used some of them.

The code I have written for this demonstration is available on BitBucket.

Written by fdgonthier

January 11th, 2010 at 10:10 pm

No post this week

without comments

There won’t be a blog post this week since my daughter has decided that sleeping is optional for her. I have not been able to sit and write text during the evenings. I have coded a bit though and hopefully you will see the outcome of that next week.

Written by fdgonthier

November 27th, 2009 at 2:22 pm

Posted in Misc

My experience with Adobe Air

without comments

Adobe Air is a new software platform from Adobe which mixes JavaScript and Flash technologies to enable developers to make rich Internet applications that can run on desktop computers. It is remarkable in the world of proprietary applications in the sense that it has included Linux support early on.

My previous post about Twitter clients might have hinted that I have had bad experiences with Air applications in general. In this post I will vent some of the grudges I have with this new platform on Linux.

This is, again, not a very fair review. I have not taken the time to investigate the odds and ends of the platform and will overlook the developer’s point of view on the platform. I have heard over the tubes that programmers working on applications for the Air platform appreciate it, but that’s as far as my investigation (or lack thereof) has taken me.

Air applications aren’t so cross-platform

I have tried 6 Adobe Air applications, mostly Twitter clients: Spaz, TweetDeck, Seesmic Desktop, DestroyTwitter, Tumbleweed, Twhirl. Of those 6 applications only the latter 3 worked out of the box on Ubuntu Hardy. The fact that cross-platform compatibility isn’t guaranteed by using Air seems to be well known to developers, and some of them will not officially support Linux as an operating platform. This is a very bad average for a technology that is supposed to be cross-platform.

Air applications don’t fail gracefully

The failure mode of each of the non-working applications also needs to be taken into account when judging the quality of cross-platform Air applications. The behavior of the applications I have tried is less than stellar. TweetDeck is supposed to work on Linux, but in the cases where it fails it shows a semi-helpful error message. Earlier versions of TweetDeck failed in the same way Seesmic Desktop fails. Seesmic Desktop works partially, but the main display of the application stays empty. Spaz shows nothing usable. Applications failing in such a way are very frustrating for users because they are left to figure out how to use applications that are put in an undetermined state because of holes in the runtime.

Air applications that work on Linux won’t work on all desktops

Adobe Air doesn’t support anything but GNOME and KDE. I usually have the core components of both KDE and GNOME installed on the computers I use, so running applications relying on either desktop environment is not a problem. This is not the case with Adobe Air applications since they apparently rely on some specific features of the GNOME or KDE window managers to work properly. This is something that is very seldom seen in the world of X Window applications. X Window applications relying on components of either desktop will usually be happy running under any window manager.

Despite this, Twhirl will work okay-ish in OpenBox with only some graphical glitches to show for it.

Adobe Air requires a compositing manager to look decent

I will admit that I’m rather old school and I have not joined the Compiz bandwagon. I do not use a compositing manager on any desktop I use. That means that all Adobe Air applications that use shaped windows, i.e. all the ones I have tried, looked awful until I caved in and enabled the limited compositing feature of KDE 3.5.

Air applications tend to focus on the look

Look and feel was never a strong point of Adobe Flash. Flash-based applications usually don’t try to blend in with the rest of the system. With Adobe Air, Adobe is obviously trying to enforce some Feel in applications, but on the Look side a lot is still left to be desired. All Air applications I have tried put a very strong emphasis on design and not so much on blending with the user’s desktop. DestroyTwitter was designed with fonts a lot smaller than the default size on my desktop. On my CRT screen, it made the application hardly usable. Even the option to use a larger font in DestroyTwitter did not do anything to improve the situation.

The worst offender in that category among the applications I have tried is without a doubt Tumblweed, an offline client for the Tumblr web blog system. I had hoped this application would be nicer than editing blog entries through a web site, which is something I dislike. I was mistaken…

Tumblweed: look ma! I can make huge borders!

The application was removed after a few seconds as it obviously did not focus very much on making editing pleasant.

Okay, I swear this will be my last Twitter inspired post…

Written by fdgonthier

November 18th, 2009 at 8:00 am

Posted in Linux, Reviews, Ubuntu


My review of Twitter clients

without comments

I have used multiple Twitter clients since I first started using the service. The reason for that is that most of them are in fact pretty bad and it took me a while to find the set of Twitter clients I can use at home, at work and on my mobile device, on Windows but mostly on Linux.

I will not even try to make a fair review. The worst clients I have used have resided just a few minutes on the computer where I installed them. This means the review will be essentially based on first impressions. Also keep in mind that I have a strong interest in Linux even if lately I have had to use Windows at work. I use Linux pretty much everywhere else, all the time.

To put things in perspective, I should point out that I do not have high expectations of a Twitter client. All I need are the following features, in order of priority:

  1. it must work on Linux; this sounds simple but most so-called portable clients don’t
  2. it must be visually pleasing, yet not look like an angry fruit salad; this is what shoots down a lot of Adobe Air applications
  3. have a reply feature; without this, people will lose track of what you are replying to when you reply to them
  4. have a retweet feature; because copying and pasting sucks

All additional features are more layers of icing. Features such as followers management, themability, direct messaging, twitpic support and identi.ca support are useful but not required. I would happily use a Twitter client with only the features I mentioned above.

So, here is my review of all the Twitter clients I have tried.

BAD!

Sadly, most clients I have tried fall in that category. I have used some of them for so little time that I had to go back to the list of Twitter clients on the Twitter Fan wiki to make this list.

Twitux is available in the Debian and Ubuntu software repositories, and that is its only quality. It’s also under-maintained compared to most Twitter clients available.

Choqok is a Twitter client for KDE4. This is a client that will eventually become good, but I wasn’t impressed by the version I have tried.

Gwibber has too many GNOME dependencies for me. It did not seem to be worth the time compiling it on my outdated Ubuntu Hardy desktop and I don’t have anything GNOME related on my laptop.

twittaré is not a flattering name in French. This client works but has no distinctive features.

DestroyTwitter is a classy Adobe Air application. It works surprisingly well on Linux but had one big problem at the time I tried it: the default font it used was so small that I had to squint every time I looked at it. It might be worth the time to retry it since it has all the features I want in a Twitter client.

Spaz is a major Twitter client, and an Adobe Air application. Like many Air applications I have tried, Spaz failed to work on Linux. I don’t even remember seeing the main window appear.

StickyTweet is made in plain C/C++ using the Win32 API so saying it is unappealing is an understatement. The guy that programmed that has probably lost a bet. Any high-schooler with a Delphi book could make a better looking application.

Ada is a minimal Twitter client for the Adobe Air platform. It’s very, very minimal. Too minimal. It worked fine on Windows. I have never used it on Linux.

twidge is a command line Twitter client from John “Real World Haskell” Goerzen. It does what it advertises and will be helpful to those more console-inclined than me.

twit.el is an Emacs script for Twitter. Even though I like Emacs a lot, I have never seriously considered using it. I have made it work on my laptop but never even tried the more advanced features it may support. It is certainly a good option if you spend most of your life in Emacs. I don’t.

Seesmic comes from the makers of my main Twitter client. I have never managed to make it work on Ubuntu Hardy. It starts fine but the user interface doesn’t work. It’s a well-known client and some people apparently love it. I don’t remember seeing anyone saying that it works on Linux.

TweetDeck is another popular Twitter client, an Adobe Air application, that I could never get to work on Ubuntu Hardy. The program starts but then fails to work, but at least it gives a clear warning saying that it won’t work on the computer. This is also a very popular and heavily featured program that many people are using on a daily basis. It is also reported that it works fine on Linux but it seems my current distribution isn’t supported.

Okay

This is the list of clients that are probably not bad but that I have not seriously used for some reason.

Twirssi is an irssi script. I have used irssi as my IRC client for a while now so it wasn’t illogical for me to try and see if I could merge Twitter and IRC. I have got it to work on my main computer and used it for a little while. I have not adopted it since installing it manually with CPAN has sent me into a dependency hell that I could not get out of.

Twitter Opera Widget is a sort of plugin which Opera calls a widget. It made a lot of sense to me to use an Opera plugin to use Twitter since I constantly have Opera loaded at home and at work. I actually used this client extensively when I started using Twitter. It had all the functionality I needed and was correctly maintained by its author. Sadly, this widget suffers from a pretty bad memory leak which made my desktop, let alone my browser, unusable after a few days. I grew tired of waiting for the leak to be fixed so I ended up switching to Twippera. This is what makes me put this client in the Okay category, otherwise I would still be using it.

The Twitter webpage itself is pretty poor in features. The retweet feature that is present in most well-known Twitter clients is currently being deployed to some users as a beta feature. This should put twitter.com down in the standings of Twitter clients in terms of features. It’s nonetheless a fact that most Twitter users start by using the Twitter website itself, and since it’s by far the most used of all clients, it probably gets the job done well enough. I have sometimes used Twitter.com to tweet and I am still using it to check on my follower list and my follower requests.

Good

Twippera is another plugin for the web browser Opera. It has fewer features than the Twitter Opera Widget but it has no memory leak crippling my browser. I only left this client to search for a client with more features. I’m still using it on my laptop because I have not yet looked for a replacement.

Twhirl is the client I use on my desktop computer. It is an Adobe Air application, one of the few which work fine out of the box on Ubuntu Hardy. It’s an excellent Twitter client which supports all the basic features I need and a lot more. It’s unfortunate that it is using the Adobe Air platform and that it is thus quite resource intensive and only works on GNOME or KDE.

Mauku is the only client I have tried for the Maemo platform on the Nokia N800, and it may be the only decent one. The Maemo 4.x version is missing several features I desire from a Twitter client but the display is good enough to work with and I’m using it whenever I’m away from my computers but close to my N800.

Brizzly is what Twitter.com should be. It is an excellent web client to access Twitter which includes several features that are available in desktop clients. Brizzly exposes its features in a single web page, which is better than what Twitter.com offers. The feature I love the most in Brizzly is the display of Twitter trends with explanations that are free to edit by Brizzly users. This has since been implemented in Twitter.com but in a way that is less efficient than what is shown by Brizzly.

Written by fdgonthier

November 13th, 2009 at 8:00 am

Posted in Reviews


Twitter privacy levels

without comments

I have been a Twitter user for nearly one year now. It has been a generally pleasant experience. Microblogging is now part of my array of information tools, which also includes RSS (through Google Reader), email and IRC. A more complete post about how I see Twitter as useful is something I need to do eventually.

I have recently made my Twitter page protected due to the rampant increase in spamming on Twitter. Doing this has bought me peace of mind, but it is pretty unfortunate that Twitter privacy is an all-or-nothing option. If you want to keep spammers away from your profile, you need to set your profile to protected and you have no other options. A protected account has the following restrictions:

  1. People that want to follow you need to be approved
  2. Your tweets don’t appear on search.twitter.com
  3. “@replies” to users not following you will not be seen
  4. You cannot share direct links to your tweets with others

For Twitter outsiders, it is interesting to know that, on Twitter, spam is not like the obscure, badly worded emails that sometimes slip through your email spam filter. A mom and pop plumbing shop hunting potential clients using keywords on Twitter search is also considered to be spamming, admittedly to a lesser degree. Some people will call this marketing, but it’s a pretty thin line between that and real spamming, especially if the shop is located in the central US where it’s unlikely I’ll ever end up travelling.

The third limitation is what has annoyed me the most about protecting my profile. I follow several people that have good reasons not to follow me and yet I would like to sometimes directly address those persons. For this purpose, I had to create another, unprotected account. This is not impractical as long as your Twitter client supports multiple Twitter accounts.

It seems to me that those problems would not happen if the privacy options were more fine-grained. I don’t need the full protection of a protected profile, but there are some features of it I can’t live without. Here is how I would split the privacy options.

Visibility

This would determine if only followers can see your tweets. If the user decides that his tweets are private then of course they should not be searchable.

Followability

It should be possible to screen people that want to follow you on Twitter. Most people I have seen using a protected profile will accept being followed by just any human being, and just the fact of making the profile protected puts it out of reach of the majority of spammers.

I understand why this is a feature reserved to protected accounts: if your tweets are public and searchable, then whoever really wants to follow you can do it manually without adding themselves to your followers list. So, without making the profile private, this option is next to useless for keeping your profile away from spammers.

Still, there is nothing keeping Twitter from making this feature available independently of the private profile. Weeding out of your Followers page the many spammers that latch onto any unprotected profile is something known to all Twitter users, and it’s not a fun task if you ever forget it for a few months.

Searchability

Some users might not want their profile to be found on search.twitter.com. I simply don’t need that level of privacy, but it’s not inconceivable that some particular users of Twitter would want to be out of the public view.

@replies to you

This would determine if you want to receive messages directly addressed to you from users you don’t follow. Since Twitter already has a Block feature which does that on a user per user basis, this would be like automatically blocking all users.

This could be useful for star Twitter users with tons of followers that want to limit who can address them. Most of those people probably don’t read all those tweets anyway, so why give people the illusion they can talk to them?

This of course would be of no use to me since there are just not enough people sending me tweets.

@replies to others

This would determine if you want your @replies to be automatically public. This would allow you to communicate with the people you want.

This is of particular interest to low-rank Twitter users like me who just want to send messages to Twitter microcelebrities that have few enough fans to read what they say to them and that sometimes might even reply.

Findability

Since we are into adding privacy levels, why not put that in too? Some users might not want to be found using the Twitter “Search User” feature. A privacy setting could be added to disable that.

I’m proposing this just because I thought of it… The Twitter Search User page is pretty crappy and already ensures a base level of privacy which suits me fine.

Explicitly public tweets?

I think most of the problems I have with Twitter privacy settings would be solved if tweets could be made public on an individual basis. It would then be possible to directly address people who don’t follow you and to take part in your favorite meme without leaving the comfy cloak of your protected account.

I must say that I have not checked what other microblogging platforms, like Jaiku and identi.ca, have to offer. I admit that what I’m suggesting in this post might already exist somewhere else.

Written by fdgonthier

November 10th, 2009 at 8:00 pm

Google Wave invites

with 6 comments

Edit: … and we are done. There are no more invites left. Thank you.


Edit: There is only 1 invite remaining. I’ve given away 5 of the remaining invites to my Twitter friend @balty for a contest on his blog Les 2 Geeks (French). You can participate in his contest until November 14th. There is still 1 invite up for grabs right now on this blog, so grab it before I give it to somebody else.


I have received 8 Google Wave invites to give away. I know this is still in demand, but it seems everyone in my immediate cyber-vicinity who was interested in Google Wave is already surfing the Wave.

So, I’m opening up my invites to the whole Internet and will give away an invite to each of the first 8 interested people who comment on this entry. Don’t forget to write a reachable email address.

I expect nothing in return, but if you want to spend a few minutes considering the rest of the posts on my blog, I’ll be grateful.

Written by fdgonthier

November 10th, 2009 at 12:04 pm

On String.intern()

without comments

Where the author realizes the significance of the String.intern() method

I might have hinted at it in my previous post on the subject of strings in Java, yet I did not realize the significance of the String.intern() method. The following code sample demonstrates the behavior of the String.intern() method, similar to what I demonstrated in that post.

public class TestClass2 {
    public static void main(String[] args) {
        String s1 = "hello";
        String s2 = new String("hello");

        // This is going to be false.
        if (s1 == s2) System.out.println("s1 == s2");

        // This is going to be true.
        if (s1 == s2.intern()) System.out.println("s1 == s2.intern()");
    }
}

It’s a didactic example at best. It’s when you consider that strings also come from input/output that String.intern() becomes a thing of interest.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class TestClass2 {
    public static void main(String[] args) {
        String s1 = "hello";
        String s2 = null;

        try {
            // Enter hello at this point.
            s2 = new BufferedReader(new InputStreamReader(System.in)).readLine();

            // This is going to be false.
            if (s1 == s2) System.out.println("s1 == s2");

            // This is going to be true.
            if (s1 == s2.intern()) System.out.println("s1 == s2.intern()");

        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

As you can see in that example, the String.intern() method returns a reference to the string “hello” already in the constant pool. The virtual machine maintains a table of string instances that can be shared between all the string references in the program.

An immediate and obvious benefit of this technique, called string interning, is a reduced memory footprint because of object reuse. Wikipedia also notes that the technique is used by programs that need to do fast string comparisons, such as compilers. This allows strings to be compared by simply comparing references instead of possibly scanning the full length of both strings.
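As a rough illustration of the comparison benefit, here is a minimal sketch. The class name InternCompare and the helper internAll() are made up for this example; new String(...) stands in for strings that arrive from input/output, as in the sample above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InternCompare {
    // Hypothetical helper: returns the interned version of every token,
    // so that equal tokens end up sharing a single String instance.
    static List<String> internAll(List<String> tokens) {
        List<String> result = new ArrayList<>();
        for (String token : tokens) {
            result.add(token.intern());
        }
        return result;
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with the same content.
        List<String> raw = Arrays.asList(new String("while"), new String("while"));
        List<String> interned = internAll(raw);

        // Distinct objects: the reference comparison fails. Prints false.
        System.out.println(raw.get(0) == raw.get(1));

        // Both references now point to the pooled instance. Prints true.
        System.out.println(interned.get(0) == interned.get(1));

        // String literals live in the same pool. Prints true.
        System.out.println(interned.get(0) == "while");
    }
}
```

Once every token has been interned, the == checks compare one reference each instead of walking both strings, which is the shortcut described above. It only pays off if the cost of interning is spread over many comparisons.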

The JDK documentation gives a better description of the behavior of the String.intern() method. It’s a surprise to me that I never took the time to understand this behavior of such a core class of the Java library.

Microsofties might also find it interesting that the .NET Framework has a String.Intern() method which behaves in approximately the same way.

Written by fdgonthier

November 6th, 2009 at 8:00 pm

Don't dust off your tinfoil hat for Skype just yet…

without comments

So Skype is evil because it’s proprietary?

It’s not hard to find rumors about spyware being deployed with the Skype VOIP software. What is hard to find amongst those rumors are concrete facts. Most of the rumors seem to be unsubstantiated, and some others are based on interpretations of the Skype EULA. I won’t bother with the latter case since legalese is not a language I speak.

This blog is one of the few around that seem to take the matter seriously, and it brings forward something looking like real proof that Skype may be stepping over the boundary of user privacy.

For the people who don’t read French, I will summarize the article. The author’s hypothesis is that when a new profile is registered through the Skype desktop client, the software accesses bookmarks stored in the user’s Mozilla Firefox profile. Since it’s not immediately obvious why Skype needs to be doing that, he concludes that the Skype software must be sending that information home for data warehousing, or some other shady practice.

As proof, he shows data that he obtained using the strace command on Linux. strace is a lovely, lovely utility I’ve learned to master in the last few years. It is a utility which shows the system calls made by a Linux application. strace is not hard to use, but its output can be very voluminous and difficult to decipher. This is not the case here.

…Naaah

The data he obtained looked inoffensive to my eyes after just 2 seconds of examining it (I won’t claim I’m the first to see that: several commenters have pointed it out to him).

The blogger singles out several calls to stat64(), a system call that returns information about a file, such as its size and its last modification or last access date.

[pid 23964] stat64("/home/phil/.mozilla/firefox/bstiq480.default/bookmarkbackups/bookmarks-2008-12-17.json", {st_mode=S_IFREG|0600, st_size=41718, ...}) = 0
[pid 23964] stat64("/home/phil/.mozilla/firefox/bstiq480.default/bookmarkbackups/bookmarks-2008-12-20.json", {st_mode=S_IFREG|0600, st_size=42052, ...}) = 0

A higher-level view of the data shows that Skype actually calls stat64() on all the files in the user’s Mozilla profile, calls open() on the directories it finds, then calls getdents() to obtain the list of entries in each directory, and so on… like any software recursively scanning the filesystem would do. The scan of the profile stops the moment the software finds the user preference file.
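To make that pattern concrete, here is a minimal Java sketch of such a recursive scan. It is not Skype’s code: the class and method names are hypothetical, and prefs.js is only my assumption for the name of the preference file the scan stops on. Files.walkFileTree hands each file’s metadata to the visitor without ever reading the file’s content.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

public class ProfileScan {
    // Hypothetical helper: walks root recursively, records every
    // non-directory entry it visits, and stops at the first entry
    // whose file name equals target.
    static List<Path> scanUntil(Path root, final String target) throws IOException {
        final List<Path> visited = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // attrs holds the metadata (size, timestamps); the content is not read.
                visited.add(file);
                return file.getFileName().toString().equals(target)
                        ? FileVisitResult.TERMINATE
                        : FileVisitResult.CONTINUE;
            }
        });
        return visited;
    }

    public static void main(String[] args) throws IOException {
        // Scans the directory given as first argument, or the current one.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : scanUntil(root, "prefs.js")) {
            System.out.println(p);
        }
    }
}
```

The order in which entries are visited is not specified, so how many files get touched before the target turns up depends on the filesystem. That would be consistent with a scan that happens to walk past the bookmark backups before it stops.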

This is easily explained: Skype tries to install a Firefox plugin. It seems the Windows version has an option in the installer to disable that plugin, but I have not found the same option in the Skype package.

So, Skype does search inside the user’s Firefox profile, but the only thing it does with the result is install a plugin for the user’s convenience. It’s not even useful to look for the place where it might be sending data, since there is no data to send other than what it gathered through its registration wizard.

The final nail can be driven into the coffin of this theory by simply listing all the files opened by Skype during registration. None of the files contain personal information. You can see the list of opened files I have extracted from the strace output at the end of this post.

Not evil on an evil operating system either…

Those results have been independently confirmed on Windows by DrFrakenstein, a twitterful but blogless Code Ninja. He used Process Monitor and confirmed to me roughly the same behavior, but targeted at Internet Explorer.

So, probably not evil…

I can’t conclude this post by saying that Skype doesn’t include spyware. I simply spent one hour examining very limited data on the activity of the software during registration. Yet, I’m confident enough about my result to keep recommending its use to my family. Use Free alternatives such as Ekiga if you give high importance to software freedom. It’s an opinion I respect. Just make sure you have something better than crappy strace analysis before dissing good but proprietary software.

See for yourself…

Here is the data I obtained by running strace during Skype account creation.

Since I love some good shell one-liner action, here is the command that extracts the list of opened files from the strace data.

grep open skype.trace | perl -ne '/\"(.*)\"/ && print $1."\n"' | sort | uniq
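For readers who would rather not decipher Perl, here is a rough Java equivalent of that one-liner, as a sketch only. The class and method names are hypothetical. Like the shell version, it keeps every line containing "open", extracts what sits between the first and the last double quote, and prints each distinct path once in sorted order.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OpenedFiles {
    // Same greedy pattern as the Perl expression: first quote to last quote.
    private static final Pattern QUOTED = Pattern.compile("\"(.*)\"");

    // Hypothetical helper: returns the distinct quoted arguments of all
    // trace lines containing "open", in sorted order.
    static Set<String> openedFiles(List<String> traceLines) {
        Set<String> paths = new TreeSet<>(); // sorts and removes duplicates
        for (String line : traceLines) {
            if (!line.contains("open")) {
                continue; // the "grep open" step
            }
            Matcher m = QUOTED.matcher(line);
            if (m.find()) {
                paths.add(m.group(1));
            }
        }
        return paths;
    }

    public static void main(String[] args) throws IOException {
        // skype.trace is the strace output file used in the command above.
        for (String path : openedFiles(Files.readAllLines(Paths.get("skype.trace")))) {
            System.out.println(path);
        }
    }
}
```

The TreeSet plays the role of sort | uniq. Its ordering is plain string ordering, so the output may be sorted slightly differently from what sort produces under some locales.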

Written by fdgonthier

November 3rd, 2009 at 8:00 pm