Lost Website

You Are Here

Archive for the ‘Programming’ Category

Don’t ever break MY trunk!

Nico’s last blog post touches on a subject that has been on my mind for some time now. I must first say that I am not writing this text strictly in reaction to Nico’s post, and that I have not checked with him whether he agrees with the points I’m about to make. During the time I’ve spent at Kryptiva, it was pretty common to see what I will call WiP (work-in-progress) commits pushed to our team’s Mercurial repositories. The reasons usually given for pushing broken or incomplete changesets are the ones cited by Nico: people need to back up big changes they are making, or want to complete those changes from another computer.

It would be unacceptable to commit WiP changes to a centralized source control system like Subversion or CVS, because the repository can be checked out by other users at any point in time. Those users tend to expect a working repository, even though checking out from a public repository always carries some risk that whatever you get will not work. The minimum expectation is that the checked-out copy will compile.

In a distributed version control system (DVCS) such as Git, everybody commits to their own copy of the repository. Changes get pushed across repositories in discrete bundles. Unless the programmer was careless, what ends up in the master repository is usually correct. So, even if programmers have committed broken changes at some point in the repository history, people who clone the repository will usually get a sound copy.

Committing broken code will rarely if ever hurt if all you work on are personal or small-scale, short-term projects. If you are a single programmer tracking changes to a project with Git and want to break your trunk every so often, then go on, be my guest. You are the only person who will suffer from your broken history. If you work in a group with several distributed repositories, then you need to read the rest of this post to understand why committing broken code to the trunk is a bad thing.

History

The history of a code repository is the documentation of all the changes that were ever made to a project during its lifetime. As such, it’s the only external documentation that programmers will continuously maintain. This is not obvious when working on projects that have a few dozen or maybe a few hundred commits. As long as the whole project fits in your head, it is unlikely that you will need to refer to the project change history. The need arises when the project stretches over a long period and has thousands of commits. The change history is also very useful when a project changes hands.

WiP commits come into this picture because they usually come with a commit message that is not very explicit: “work in progress”, “to be continued”, “I’m not done”, “Finishing tomorrow”, etc. Such a message is useless if you need to inspect the project history or a blame/annotate log.

In effect, the WiP changesets are separated from the documentation of the change, which usually arrives with the last commit made for the feature. Tracking down the reason for a change is never impossible, but it gets progressively more difficult as the project and the repository age.

Bisection

Bisection is a debugging technique mostly associated with DVCS tools. It is a way to find regressions in the repository history by testing past commits using a binary search pattern. At each step of the bisection procedure, the DVCS updates the working copy, putting it in the state represented by a past changeset. The automated bisection procedure then leaves the programmer to test the result. The programmer should at that point run automated tests or reproduce the problem manually.

Normal bisection

This graph represents a set of commits in a repository. The solid lines connect changesets in the project history. The dashed line represents the changesets touched by the bisection procedure. In this picture, the initial broken changeset is F and the first known good changeset is A. The changesets consulted are, in order, D, B, then C, which is found to be the changeset that introduced the bug.
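The search in the picture can be sketched in C. This is only an illustration under stated assumptions, not what hg bisect or git bisect actually run: the history is taken to be linear, the commits A to F are numbered 0 to 5, and is_bad is a hypothetical callback standing in for "update to this commit, build it, run the tests". The order in which a real tool probes commits may differ from the figure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical test callback: returns nonzero when the commit at `index`
 * exhibits the regression. */
typedef int (*commit_test_fn)(size_t index, void *ctx);

/* Find the first bad commit in a linear history.
 * Preconditions: commit `good` passes, commit `bad` fails, good < bad,
 * and every commit after the first bad one is also bad. */
size_t bisect_first_bad(size_t good, size_t bad,
                        commit_test_fn is_bad, void *ctx)
{
    assert(good < bad);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        if (is_bad(mid, ctx))
            bad = mid;   /* the regression is at mid or earlier */
        else
            good = mid;  /* the regression is after mid */
    }
    return bad;
}

/* Example callback: `ctx` points to the index of the commit that
 * introduced the bug; that commit and all later ones fail. */
int bug_introduced_at(size_t index, void *ctx)
{
    size_t first_bad = *(const size_t *)ctx;
    return index >= first_bad;
}
```

Calling bisect_first_bad(0, 5, bug_introduced_at, &c) with c set to 2 (changeset C) narrows the range down to C. With six commits, the loop never needs more than three tests, and that economy is what makes bisection attractive on long histories.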

Bad bisection

This graph illustrates what happens when a few WiP commits are introduced in the tree. WiP commits mean the project can’t be compiled at every point in its history, which can make it impossible to find a regression using bisection.
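To make the problem concrete, here is a hedged C sketch of a bisection that has to cope with commits that cannot be built. Everything in it is hypothetical (the verdict enum, the judge callback, the bitmask of untestable commits) and it assumes a linear history numbered from 0. It is not how any real DVCS implements skipping. When every commit left between the last good and first bad one is untestable, the search can only report a range.

```c
#include <assert.h>
#include <stddef.h>

enum verdict { COMMIT_GOOD, COMMIT_BAD, COMMIT_UNTESTABLE };

/* Hypothetical callback: build and test the commit at `index`. */
typedef enum verdict (*commit_verdict_fn)(size_t index, void *ctx);

struct bisect_result {
    size_t last_good;   /* last commit known to pass */
    size_t first_bad;   /* first commit known to fail */
};

struct bisect_result bisect_with_skips(size_t good, size_t bad,
                                       commit_verdict_fn test, void *ctx)
{
    assert(good < bad);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        int narrowed = 0;

        /* Try the midpoint first, then its neighbours at growing distance,
         * staying strictly between good and bad. */
        for (size_t dist = 0; !narrowed; dist++) {
            size_t probe[2];
            size_t n = 0;
            if (mid + dist < bad)
                probe[n++] = mid + dist;
            if (dist > 0 && mid >= good + 1 + dist)
                probe[n++] = mid - dist;
            if (n == 0)
                break;   /* every commit in between is untestable */
            for (size_t k = 0; k < n && !narrowed; k++) {
                enum verdict v = test(probe[k], ctx);
                if (v == COMMIT_BAD)  { bad = probe[k];  narrowed = 1; }
                if (v == COMMIT_GOOD) { good = probe[k]; narrowed = 1; }
            }
        }
        if (!narrowed)
            break;       /* cannot narrow the range any further */
    }
    return (struct bisect_result){ good, bad };
}

/* Example history: `first_bad` introduced the bug, and the bits set in
 * `untestable_mask` mark WiP commits that do not compile. */
struct history { size_t first_bad; unsigned untestable_mask; };

enum verdict judge(size_t index, void *ctx)
{
    const struct history *h = ctx;
    if (h->untestable_mask & (1u << index))
        return COMMIT_UNTESTABLE;
    return index >= h->first_bad ? COMMIT_BAD : COMMIT_GOOD;
}
```

With six commits, the bug introduced at commit 3 and commits 2 and 3 being WiP, the search stops with last_good = 1 and first_bad = 4. The regression is somewhere in three changesets, and the programmer is left to read all of them by hand.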

This is the most serious problem that can happen if you commit broken code to a repository used by a team. It can seriously hamper debugging in big shared repositories.

To be continued…

If you are not impressed by the 2 reasons I explain here, then you need to read my next post. I think the best reason not to commit broken code is that a DVCS offers you all the tools you need to make proper commits. I’ll explain how this is possible with Git and Mercurial in my next post on this subject.

Written by François-Denis Gonthier

June 20th, 2010 at 9:09 pm

Getting the handle of any Outlook window

When embedding a window inside Microsoft Outlook, it is understandable that at some point you need the handle of an Outlook window object, either an Inspector or an Explorer. The Outlook Object Model does not expose a method to obtain the handle of a window. What follows is based on some information from Dmitry Streblechenko.

Yes, but not in VBA: you need to QI the Inspector (or Explorer) object for the IOleWindow interface, then call IOleWindow::GetWindow()

If you work with the low-level parts of Microsoft Outlook, you will eventually find some information, very often forum posts, answered by Dmitry. You will also quickly learn that he is very often right.

I wrote the following code during my very last days of work on the EchoTracker. It was part of a refactoring I did not have time to finish, so this code is UNTESTED. It is based solely on my interpretation of Dmitry’s indications.

/// <summary>
/// Embed the Outlook panel in *any* Outlook explorer. Thanks Dmitry.
/// http://www.pcreview.co.uk/forums/thread-1837879.php
/// </summary>
public static MSOWindow GetOutlookWindow(Outlook.Explorer olExp)
{
  IntPtr olExpUnk = IntPtr.Zero;
  IntPtr oleWinPtr = IntPtr.Zero;
  IntPtr hWnd = IntPtr.Zero;
  Guid oleWinGuid = typeof(IOleWindow).GUID;
  IOleWindow oleWin = null;

  try
  {
    olExpUnk = Marshal.GetIUnknownForObject(olExp);

    if (Marshal.QueryInterface(olExpUnk, ref oleWinGuid, out oleWinPtr) != 0)
      throw new Exception("QueryInterface failed.");

    oleWin = (IOleWindow)Marshal.GetObjectForIUnknown(oleWinPtr);
    if (oleWin == null)
      throw new Exception("GetObjectForIUnknown failed.");

    // Store the handle in hWnd, which is what gets returned below.
    oleWin.GetWindow(out hWnd);
  }
  finally
  {
    // Drop the raw references taken by GetIUnknownForObject and QueryInterface.
    if (oleWinPtr != IntPtr.Zero) Marshal.Release(oleWinPtr);
    if (olExpUnk != IntPtr.Zero) Marshal.Release(olExpUnk);
    if (oleWin != null) Marshal.ReleaseComObject(oleWin);
  }
  return new MSOWindow(hWnd);
}

For this code to hopefully work, you need to have the COM interop declaration for the IOleWindow interface. You can find this information on pinvoke.net.

Please, if you stumble upon this code and happen to need it, use it or adapt it to your needs, and leave a comment on this post. I repeat that it is untested. I have no plan to test it, since I don’t have a machine on which I can develop for Microsoft Outlook.

Written by François-Denis Gonthier

April 27th, 2010 at 9:01 pm

Getting 2 C# Outlook addins to talk together

The Outlook Object Model (OOM) exposes the COMAddIns collection of COMAddIn objects, which Outlook plugins can use to communicate with each other. The communication needs to be done through a COM interface. C# and the .NET framework make this very easy.

You first need to make a ComVisible interface which the caller will use to communicate with the callee. The sample we will work with is a simple class that will call System.Windows.Forms.MessageBox.

using System;
using System.Runtime.InteropServices;

namespace MessageBox
{
    [ComVisible(true)]
    public interface IMessageBox
    {
        void MessageBox(String msg);
    }
}

Next you need to make the callee addin. This addin can be made using VSTO or without it. Addin Express would work too. The callee of course needs to implement the IMessageBox interface.

[ComVisible(true)]
[ComDefaultInterface(typeof(IMessageBox))]
public partial class ThisAddIn : IMessageBox
{
    public void MessageBox(String msg)
    {
        System.Windows.Forms.MessageBox.Show("MessageBox() call: " + msg);
    }

    protected override object RequestComAddInAutomationService()
    {
        return this;
    }
    /* ... snip ... */
}

This is not the full code of the addin. I have removed the boring part generated by VSTO.

There are 2 important things to notice in the above code. The first is the ComDefaultInterface attribute, which defines which default interface is exposed to COM by the addin. This is important because the ThisAddIn class derives from a non-COM-visible class. The page on the NonCOMVisibleBaseClass Managed Debugging Assistant (MDA) explains why this is important.

The next important thing is the implementation of the RequestComAddInAutomationService method, which returns an instance of the COM class to the caller. This is part of the communication protocol between addins. You can make this method return any instance of a COM-visible object. We’ve used the addin class itself to keep things simple.

Finally, the caller code amounts to accessing the COMAddins collection and finding the right object inside it to get the interface.

object msgBoxID = "MessageBox.Addin";
Office.COMAddIns addins = null;
Office.COMAddIn msgBox = null;
IMessageBox imsgBox = null;

try
{
    addins = Application.COMAddIns;
    msgBox = addins.Item(ref msgBoxID);
    imsgBox = (IMessageBox)msgBox.Object;

    // Actually use the other addin interface.
    imsgBox.MessageBox("Hello, this is a selection change.");
}
finally
{
    if (addins != null) Marshal.ReleaseComObject(addins);
    if (msgBox != null) Marshal.ReleaseComObject(msgBox);
}

This code is pretty straightforward. The 2 important things to notice are the ref object parameter to the COMAddIns.Item method and the fact that you need to use the ProgID of the callee addin when searching inside the COMAddIns collection. I put emphasis on this because I lost some time trying to find the ProgID of the IMessageBox interface. The ProgID of the addin, when using VSTO, is usually the name of the assembly.

This research was done in the context of the development of the Echotracker although it is still way too early to say what feature will be built on top of that.

Written by François-Denis Gonthier

March 26th, 2010 at 9:35 am

C# array marshalling with P/Invoke

This took me a while to figure out and nobody on the Internet seems to have figured out exactly this particular task.

Working on the Echotracker, I wanted to be able to manipulate an array of strings from the JScript runtime inside an embedded web browser. Microsoft script runtimes can easily manipulate ActiveX objects, and it is easy in C# to make an object visible to COM, so interoperability is generally not a problem.

This interoperability wasn’t so intuitive when the time came to deal with arrays. The offending C# declaration was the following.

string[] GetPhoneNumbers();

This is converted by C# into a SAFEARRAY containing BSTR objects. SAFEARRAY is the native name of VB-style multidimensional arrays and BSTR is, in general, the type used by COM to exchange strings. The declaration is nothing exotic as far as COM is concerned. With the MarshalAs attribute added, the declaration now looks like this.

[return: MarshalAs(UnmanagedType.SafeArray, SafeArraySubType=VarEnum.VT_BSTR)]
string[] GetPhoneNumbers();

This doesn’t work well with JScript. The first thing I found is that JScript arrays aren’t compatible with VB arrays. Microsoft provides a VBArray class to bridge the two; it is supposed to deal with SAFEARRAY. Sadly, using this class adds some semantic complexity to using arrays returned by ActiveX objects, because they can only be accessed through methods such as getItem(), ubound() and lbound() instead of the normal JScript array syntax.

Secondly, JScript can’t deal with a SAFEARRAY(BSTR) object. Calling the GetPhoneNumbers() method fails with the error message “VBArray expected”. The problem is that while JScript can deal with SAFEARRAY, it can’t deal with a SAFEARRAY of any element type other than VARIANT. VARIANT is the catch-all, magic datatype introduced by Microsoft for its dynamic, untyped languages like VB and JScript. This is how I came to test the next declaration.

[return: MarshalAs(UnmanagedType.SafeArray, SafeArraySubType=VarEnum.VT_VARIANT)]
string[] GetPhoneNumbers();

This is also not enough for the array to be accepted by JScript, which gives the same “VBArray expected” error message. The problem is in the way the array is marshaled by C#. This page on MSDN clued me in to the solution. The .NET marshaller automatically converts System.String objects into BSTR objects. In this case, the object type needed is a VARIANT object of subtype VT_BSTR. The declaration alone isn’t enough to make the marshaller convert the strings to the proper variant type.

[return: MarshalAs(UnmanagedType.SafeArray, SafeArraySubType=VarEnum.VT_VARIANT)]
object[] GetPhoneNumbers();

This last declaration makes the .NET marshaller properly convert the strings into VT_BSTR variant objects, and the array becomes usable by the JScript runtime. In fact, once you know how the .NET runtime converts System.Object arrays, you can do without the MarshalAs declaration.

object[] GetPhoneNumbers();

This is, by default, converted into a method returning a SAFEARRAY(VARIANT) object.

The lesson is: leave the .NET COM interop marshaller alone, it probably knows more about COM than you do.

Written by François-Denis Gonthier

March 24th, 2010 at 12:17 pm

Microsoft has 2 contact management APIs which you shouldn’t use

This day has not improved the problems I have with developing on Microsoft Windows. I was doing some research for future development of the Echotracker. I was looking at how to access the Windows address book features through an API and found there are only 3 options open to the programmer wanting to access address book features on Windows.

The main API we use to deal with Microsoft Outlook 200x, the Outlook Object Model (OOM), offers no way to directly access the address book provided by Outlook. This is bad enough by itself, because it means that access to the address book must go through the dreaded MAPI, which is unsupported with .NET.

Microsoft Windows XP, and probably some earlier versions, also had the Windows Address Book API. This is a COM API that gives access to the default Windows address book.

This API is deprecated and Microsoft explicitly says that programmers should not use it.

New applications should not use these interfaces. These interfaces exist for backward compatibility with legacy applications. These interfaces will be unavailable in the future.

The newer, Windows Vista and up, alternative is the Windows Contacts API, but here is what Microsoft has to say about this API.

New applications should not use these interfaces. These interfaces exist for backward compatibility with legacy applications. These interfaces will be unavailable in the future.

So Microsoft deprecated the old API, which isn’t necessarily unreasonable, but replaced it with an API that they ask programmers not to use. Of course, they provide neither an alternative nor a rationale for that message.

So where does that leave us? Nowhere. Programmers who want to use the Windows address book features will use those APIs, deprecated or not, and screw around until it works. Technical analysts making project decisions could also conclude that Windows has no contact management API. I’m not sure which path Microsoft wants people to take. I’m lost myself.

Written by François-Denis Gonthier

March 18th, 2010 at 2:53 pm

RAII COM wrapper

COM stands for Component Object Model. This is the technology behind most of the non-trivial things in Windows, the reason for the obscure HKEY_CLASSES_ROOT registry subtree, and the core of the infamous ActiveX technology.

Now, I don’t know much about COM programming so this won’t be a tutorial on the subject. I know just the minimum I need to safely work with MAPI in the context of what I do at Kryptiva.

COM programming includes many concepts of typical object-oriented programming languages. It’s just a lot more verbose. You can view COM interfaces as structures containing pointers to functions. COM classes which implement those interfaces are managed using reference counting. This means that whenever you carry a reference to a COM object around, you need to increment the reference count manually and release it when you are done.
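To picture the "structure of function pointers plus a reference count" idea, here is a minimal sketch in plain C. It is not real COM: the counter type, its table and every function name are made up for illustration, and things like QueryInterface, GUIDs and thread safety are left out entirely.

```c
#include <assert.h>
#include <stdlib.h>

struct counter;

/* Hypothetical stand-in for a COM interface: a table of function pointers. */
struct counter_vtbl {
    unsigned long (*add_ref)(struct counter *self);
    unsigned long (*release)(struct counter *self);
    int (*next)(struct counter *self);
};

/* The object starts with a pointer to its table, followed by its state. */
struct counter {
    const struct counter_vtbl *vtbl;
    unsigned long refs;
    int value;
};

static unsigned long counter_add_ref(struct counter *self)
{
    return ++self->refs;
}

static unsigned long counter_release(struct counter *self)
{
    assert(self->refs > 0);
    unsigned long left = --self->refs;
    if (left == 0)
        free(self);   /* last reference gone: the object destroys itself */
    return left;
}

static int counter_next(struct counter *self)
{
    return ++self->value;
}

static const struct counter_vtbl counter_vtbl_impl = {
    counter_add_ref, counter_release, counter_next
};

/* Returns a new object holding one reference, or NULL if allocation fails. */
struct counter *counter_create(void)
{
    struct counter *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->vtbl = &counter_vtbl_impl;
    c->refs = 1;
    c->value = 0;
    return c;
}
```

Every caller that keeps the pointer must pair one add_ref with one release, on every code path. Forgetting a release leaks the object, and an extra one frees it while someone else still uses it.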

This is something tedious and error-prone but unavoidable when you are using COM objects in C. Luckily, C++.NET isn’t C and it means we can use RAII, just like the code I presented in Naivety in C++.

Before

Before presenting the simple wrapper, let’s see what plain COM programming looks like without it.

IUnknown *pUnk;
IMAPIProp *pProp;
HRESULT r;

pUnk = (IUnknown*)Marshal::GetIUnknownForObject(mapiObj).ToPointer();

r = pUnk->QueryInterface(::IID_IMAPIProp, (void **)&pProp);
if (FAILED(r))
  Marshal::ThrowExceptionForHR(r);

r = pProp->SaveChanges(FORCE_SAVE | KEEP_OPEN_READWRITE);
if (FAILED(r))
  throw gcnew MapiException(pProp, r, "Failed to save attachment data");

pUnk->Release();
pProp->Release();

This is simple MAPI code that I hacked as a demonstration. It doesn’t do anything interesting.

This code is pretty simple, but it gives an idea of how hairy code that calls COM can get, especially if you throw in exception handling.

After

Here is the same code, using the simple wrapper that is listed below.

COM<IMAPIProp> ifProp;
COM<IUnknown> ifUnk;
HRESULT r;

ifUnk.Ptr = static_cast<IUnknown *>(Marshal::GetIUnknownForObject(mapiObj).ToPointer());

r = ifUnk.Ptr->QueryInterface(::IID_IMAPIProp, reinterpret_cast<void**>(&ifProp.Ptr));
if (FAILED(r))
   Marshal::ThrowExceptionForHR(r);

r = ifProp.Ptr->SaveChanges(FORCE_SAVE | KEEP_OPEN_READWRITE);
if (FAILED(r))
   throw gcnew MapiException(ifProp.Ptr, r, "Failed to save attachment data");

You can see that the COM wrapper releases the COM objects automatically, which is a precious feature as the calling procedure gets more complex. If the end result is a bit anticlimactic, it is because error handling is still in the way. The wrapper I have made for error handling is a lot less pretty, so I won’t be presenting it here.

The code itself

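#include <cassert>

// RAII friendly wrapper over COM interface usage.
template <class C> class COM
{
public:
   C *Ptr;

   // Takes over a pointer that already holds a reference (no AddRef).
   COM(C *pCom)
   {
      assert(pCom);
      Ptr = pCom;
   }

   COM()
   {
      Ptr = 0;
   }

   COM<C>& operator=(const COM<C>& com)
   {
      assert(com.Ptr != 0);
      if (this != &com)
      {
         // Reference the new object before dropping the old one.
         com.Ptr->AddRef();
         if (Ptr) Ptr->Release();
         Ptr = com.Ptr;
      }
      return *this;
   }

   COM(const COM<C>& com)
   {
      assert(com.Ptr != 0);
      Ptr = com.Ptr;
      // Call AddRef outside assert() so it still runs when NDEBUG is defined.
      ULONG refs = Ptr->AddRef();
      assert(refs >= 1);
      (void)refs;
   }

   ~COM()
   {
      // Release is called outside assert() for the same reason.
      if (Ptr) Ptr->Release();
      Ptr = 0;
   }
};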

There is not much to say about this code. It’s very simple and does the job. The copy constructor is there to increase the reference count in case an instantiated COM object is passed between caller and callee.

I have added assert calls to catch potential problems I may have missed but none of them have ever been triggered.

Written by François-Denis Gonthier

February 10th, 2010 at 9:40 pm

Compressing a year of timekeeping in 2 hours

I’m very bad at keeping track of the time I spend working. This tends to require manual input, and something to remind me to do the input. The latter part is where I usually fail and lose interest. This meant that last week I had to enter a year’s worth of timekeeping data in a few hours, in a web application made for that purpose.

This is not as hopeless a problem as it might seem to some people. We use timekeeping at work to keep track of how much time is spent on specific projects, not to keep a precise account of who is working or not at a specific time.

The only place where that data is recorded is in our revision control system, Mercurial. It has a detailed log of the data that was committed to a repository and, if the commit message was good, an explanation of why. Scanning each repository (all 72 of them) with the default log command output would have been unworkable.

Luckily, Mercurial has a lesser-known feature which allows users to present log data in a terser way than the default. This is the --template switch, which is pretty well explained in the Mercurial manual.

The command I’m using in the script below is something like this:

hg log --template "{date|shortdate}  {author|email} {rev}\n"

Here is an excerpt of the output of this command.

...
2009-09-09  fdgonthier@kryptiva.com 1934
2009-09-21  fdgonthier@kryptiva.com 1935
2009-09-21  fdgonthier@kryptiva.com 1936
...

So this shows some commits I made in a specific project during September 2009. It was then trivial to extract that data from all the repositories to see what I was working on at what date. The following script loops over all my repositories and extracts from the log the dates in 2009 on which I committed something. Note that I have added another field to the template, which is the name of the directory containing the Mercurial repository. This will be used to distinguish between projects in the step after the data is obtained.

#!/bin/sh

# Make sure the output directory exists before redirecting into it.
mkdir -p ~/churn

for i in $(find . -maxdepth 1 -type d | cut -c 3-); do
  if [ -e "$i/.hg" ]; then
    echo "Churning $i"
    (cd "$i" && \
       hg log \
         --template "{date|shortdate}  $i {author|email} {rev}\n" |\
      grep -E "^2009.*(fdgonthier)") > ~/churn/"$i"
  fi
done

From the files in the churn directory, it’s then trivial to get a picture of everything that was worked on throughout the year. Just cat the files together and sort the whole set of lines by date.

> cd ~/churn && cat * | sort | less
...
2009-03-03  bar-daemon fdgonthier@kryptiva.com 1803
2009-03-03  bar-daemon fdgonthier@kryptiva.com 1804
2009-03-04  bar-daemon fdgonthier@kryptiva.com 1805
2009-03-04  bar-daemon fdgonthier@kryptiva.com 1806
2009-03-04  libfoo fdgonthier@kryptiva.com 5
2009-03-04  libfoo fdgonthier@kryptiva.com 6
...
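The sorted lines still have to be condensed into "which project on which day" before they can be typed into a timekeeping tool. As a rough sketch of that last step, assuming the four-field line layout shown above, here is some C that parses a line and counts the distinct (date, project) pairs. The struct and function names are invented for the example; this is not the tool I used.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed log line in the "date  project email rev" layout used above. */
struct churn_entry {
    char date[11];      /* YYYY-MM-DD plus terminating NUL */
    char project[64];
    char email[128];
    long rev;
};

/* Parse one line; returns 1 on success, 0 if the line does not match. */
int parse_churn_line(const char *line, struct churn_entry *out)
{
    assert(line != NULL && out != NULL);
    return sscanf(line, "%10s %63s %127s %ld",
                  out->date, out->project, out->email, &out->rev) == 4;
}

/* Count distinct (date, project) pairs in lines that are already sorted,
 * i.e. the number of "worked on this project that day" records. */
size_t count_project_days(const char *const *lines, size_t n)
{
    struct churn_entry prev = {0}, cur;
    size_t days = 0;
    int have_prev = 0;

    for (size_t i = 0; i < n; i++) {
        if (!parse_churn_line(lines[i], &cur))
            continue;   /* skip "..." and other non-matching lines */
        if (!have_prev || strcmp(prev.date, cur.date) != 0 ||
            strcmp(prev.project, cur.project) != 0)
            days++;
        prev = cur;
        have_prev = 1;
    }
    return days;
}
```

The counting relies on the input being sorted so that commits for the same day and project sit next to each other, which is what piping through sort provides.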

This will only be as accurate as your repositories are clean. For example, it might be difficult to extract only the changesets you made if you did not pay attention to correctly configuring your default commit name. It happened to me in some contexts. I also had to use the revision number from the log to look up the content of some commits, because I could not remember which subproject they belonged to.

This is not something you want to have to do. It’s much more accurate and easier to properly feed the timetracking program on a daily basis. There is no excuse not to do it properly, but if you tend to forget that kind of thing, this trick can help.

Written by François-Denis Gonthier

January 21st, 2010 at 3:11 pm

Interfacing with Microsoft products

I work for Kryptiva, a small company creating security and collaboration tools. Our server products are deployed on Ubuntu and Debian Linux, but our client-side products are naturally available on Windows. We host some of our server products ourselves, but some of our servers are deployed inside client networks, which are usually Microsoft Windows networks.

Creating applications for Microsoft networks and for Microsoft Windows desktops meant we had to deal with several Microsoft technologies. This might come as a surprise to readers of a Linux technology blog, but I think that not all Microsoft technologies are nightmarish messes to deal with. To the extent of what we have needed to do, Microsoft fared neither worse nor better than everything else I have dealt with in my programming career.

Microsoft Active Directory is the building block of all Microsoft Windows networks. It is a huge database with loads of information, which can happily be accessed using the open LDAP protocol. Working with Active Directory is intimidating at first because the data it provides isn’t intuitively interpretable. For a Linux user, it takes a little while to get used to the terminology and the features of a Microsoft Windows network. Once the basics are acquired, it’s actually pretty easy to get Active Directory to return the information one wants. Of course, it has Microsoft-style quirks you have to deal with, but everything is pretty decently documented, especially since Microsoft was forced to release their documentation. The people working on Samba 4 would certainly have a lot more to say than me about Active Directory, but I honestly can’t say that many bad things about it.

Microsoft is a big company that has to deal with an enormous amount of code and software interfaces internally, but they also need to answer to their clients, who all together have a massive quantity of code interfacing with Microsoft code. This means that deprecating age-old APIs is hard, and can’t always be handled in a very elegant way.

I’m not arguing that APIs should never be deprecated, but there are good ways and bad ways to deprecate an API. MAPI (Messaging API) is an age-old interface that has been used for years to send and receive email messages on Windows machines. It’s a badly documented and quirky API, but since it’s the core of all messaging applications on Windows, it’s very powerful and difficult to do without. Sadly, Microsoft does not want people to use their shiny new .NET framework with the old MAPI. The reasons they give are very vague. To put it simply: it’s not compatible and will crash your program, but we can’t really explain why or when. The alternatives they suggest are simply not practical, or not as powerful as dealing with MAPI directly.

This brings us to the worst. The interface exposed by Microsoft to plug into Outlook is just bad. It has become a kind of running gag that I have become a world expert in calculating the size of a message attachment in Microsoft Outlook 2003. The makers of the framework we use to develop Outlook plugins have even semi-acknowledged that (see the end of the post). In some contexts, getting the size of a file attached to an email message is very difficult and the only way to get it is through… MAPI. This is why it has been disconcerting to learn that Microsoft doesn’t want programmers to use MAPI in managed (.NET) code. Happily, this has been solved in later Outlook versions, and Outlook 2007 exposes the size of attachments in all contexts. This means that some code using MAPI will eventually go away, along with support for Outlook 2003. Sadly, Microsoft has more surprises in store for Outlook 2010.

Details about the next Microsoft Outlook release, currently in beta, are being documented by Microsoft. In a recent newsletter, the makers of Addin Express linked to this overview of a change in the way Outlook 2010 shuts down.

Starting in the Outlook 2010 Beta release, Outlook, by default, does not signal add-ins that it is shutting down.

Most add-ins use these events to release references to Outlook COM objects and clear memory that was allocated during the session.

Thanks Microsoft, but what about network connections? Opened databases? Plain old data files?

I can only guess why Microsoft decided that it is important that Outlook shuts down as quickly as possible, but it was apparently deemed more important than data integrity. This is something I find questionable. Microsoft has left a backdoor open for system administrators to revert to the old behavior, but in practice this is not the kind of thing you like to ask system administrators to do. There must be a handful of hackish and unsupported workarounds to solve this problem, and Outlook programmers around the world will find and document most of them, but I’m pretty sure they could all do without that…

Written by François-Denis Gonthier

January 16th, 2010 at 10:15 pm

LD_PRELOAD fun

Here is a welcome digression from my previous Twitter oriented posts. I’m starting to play around with the LD_PRELOAD feature in the Linux dynamic linker. For those who might not know what this feature is, here is the description from ld.so (8).

      LD_PRELOAD
              A whitespace-separated list of additional,  user-specified,  ELF
              shared  libraries  to  be loaded before all others.  This can be
              used  to  selectively  override  functions   in   other   shared
              libraries.   For  setuid/setgid  ELF binaries, only libraries in
              the standard search directories that are  also  setgid  will  be
              loaded.

So in practical terms, any library you specify in the LD_PRELOAD environment variable will be loaded before any system library. This means that dynamic symbols in a loading program will be searched for in those libraries before being searched for anywhere else, so you can override any symbol defined in the standard libraries.

Let’s start with a rather juvenile example. This will change the behavior of the read (2) function in order to make the user believe a file might have a different content.

#include <string.h>
#include <unistd.h>

ssize_t read(int fd, void *buf, size_t count) {
    static int done = 0;
    if (!done) {
        char silly_str[] = "Haha you got overriden.\n";
        /* Leave the terminating NUL out of what is handed to the caller. */
        size_t len = sizeof(silly_str) - 1;
        size_t s = count > len ? len : count;
        memcpy(buf, silly_str, s);
        done = 1;
        return s;
    }
    else return 0;
}

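If you compile this into a shared library called, for example, libread.so (with GCC, something like gcc -shared -fPIC -o libread.so read.c, where read.c is whatever you named the source file), you can test this code by running: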

> /bin/cat /etc/fstab
# /etc/fstab: static file system information.
#
...
> LD_LIBRARY_PATH=. LD_PRELOAD=libread.so /bin/cat /etc/fstab
Haha you got overriden.

That in itself is just a rather silly prank you can play on your friend’s computer if you happen to have access to it. Experienced programmers will start seeing potential uses for LD_PRELOAD. I am getting to that.

The subject of our next example will be the honorable ls (1). ls uses the opendir (3) function to open a directory and browse its files. It should react properly if it can’t open the directory. One way to test this is to make opendir() return NULL and observe how the caller reacts. You can do that using LD_PRELOAD.

#include <dirent.h>
#include <stddef.h>

DIR *opendir(const char *name) {
    return NULL;
}

> LD_LIBRARY_PATH=. LD_PRELOAD=libls1.so /bin/ls /tmp
/bin/ls: cannot open directory /tmp

What can you do now if you want to preserve part of the behavior of the function, or modify the result it returns? Your preloaded library will then need to use libdl to dynamically look up the function whose behavior it wants to modify.

The following example is a very simple override of the opendir (3) function which opens a different directory than the one the caller expects. I will explain the details of this function below.

#define _GNU_SOURCE   /* for RTLD_NEXT */
#include <dirent.h>
#include <dlfcn.h>

DIR *opendir(const char *name) {
    DIR *(*libc_opendir)(const char *name);
    *(void **)(&libc_opendir) = dlsym(RTLD_NEXT, "opendir");
    return libc_opendir("/tmp");
}

libdl is fortunately very simple to use. The naive approach would be to use dlopen (3) to open the C library, then get the pointer to the function you are calling using dlsym (3). In theory, this technique is valid and working, but doing that circumvents the LD_PRELOAD mechanisme because preloaded libraries can be chained and calling directly into the C library prevents other caller to override our own function.

In practice, calling dlopen() on libc on an Ubuntu Karmic system made some program crash and burn for reasons I will not attempt to explain. The next technique should be preferred on Linux system, especially when dealing with the system C library.

dlsym() has an option that makes the dynamic linker do this search for you: the RTLD_NEXT pseudo-handle, which exists precisely for writing wrappers around functions in other dynamic libraries. With RTLD_NEXT, dlsym() returns the next occurrence of the symbol in the library search order after the current library, which leaves libdl the task of finding the pointer to the right symbol.
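
To make the RTLD_NEXT lookup more concrete, here is a minimal sketch of a wrapper that resolves the next opendir() once, keeps the pointer for later calls, and fails cleanly if the lookup does not succeed. The opendir_calls counter, the trace message, and the use of ENOSYS as the fallback error are assumptions made for illustration, not part of the original demonstration code.

```c
#define _GNU_SOURCE /* must precede every include so <dlfcn.h> exposes RTLD_NEXT */
#include <dirent.h>
#include <dlfcn.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical counter, only here to make the wrapper's effect observable. */
static unsigned long opendir_calls = 0;

DIR *opendir(const char *name) {
    /* Cached across calls so that dlsym() runs only once. */
    static DIR *(*next_opendir)(const char *) = NULL;

    if (next_opendir == NULL) {
        /* RTLD_NEXT: look for "opendir" in the objects loaded after this one. */
        *(void **)(&next_opendir) = dlsym(RTLD_NEXT, "opendir");
        if (next_opendir == NULL) {
            const char *err = dlerror();
            fprintf(stderr, "opendir wrapper: %s\n",
                    err != NULL ? err : "symbol not found");
            errno = ENOSYS; /* assumption: report the call as unavailable */
            return NULL;
        }
    }

    opendir_calls++;
    fprintf(stderr, "opendir(\"%s\")\n", name); /* trace, then forward unchanged */
    return next_opendir(name);
}
```

Because the wrapper forwards to whatever comes next in the search order instead of calling into libc directly, several preloaded libraries that wrap the same function can still be chained.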

The next and final example of the use of LD_PRELOAD will still use the valiant ls. In time for Christmas, this will modify the output of ls by randomizing the d_type field returned in the dirent structure by readdir (3). If you use colorized ls output, and I believe most of you probably do, you should see a pretty display of colors whenever you list a directory with this library preloaded.

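#define _GNU_SOURCE /* needed for RTLD_NEXT, readdir64() and struct dirent64 */
#include <dirent.h>
#include <dlfcn.h>
#include <stdlib.h>
#include <time.h>

struct dirent64 *readdir64(DIR *dir) {
    /* Resolved on the first call, then reused. */
    static struct dirent64 *(* libc_readdir64)(DIR *dir) = NULL;
    struct dirent64 *dent;
    unsigned char rnd_dtype[7] = { DT_UNKNOWN, DT_REG,
                                   DT_DIR, DT_FIFO,
                                   DT_SOCK, DT_CHR,
                                   DT_BLK };

    if (libc_readdir64 == NULL) {
        *(void **)(&libc_readdir64) = dlsym(RTLD_NEXT, "readdir64");
        srand(time(NULL));
    }

    dent = libc_readdir64(dir);

    /* Replace the real entry type with a random one. */
    if (dent != NULL)
        dent->d_type = rnd_dtype[rand() % 7];

    return dent;
}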

There is still a problem with this code on my new Ubuntu Hardy machine. The code from the preloaded library hangs before the program terminates. I do not understand why this happens, and a search for this bug did not turn up anything. The problem doesn’t happen with Ubuntu Karmic.

There is nothing new about using LD_PRELOAD this way. Several very nice libraries have been built with the intention of modifying the behavior of typical libraries.

  • fakeroot: “fakeroot provides a fake root environment by means of LD_PRELOAD and SYSV IPC (or TCP) trickery.”
  • fakechroot: fakechroot provides a fake chroot environment to programs.
  • libtrash: “[...] the shared library which, when preloaded, implements a trash can under GNU/Linux”
  • cowdancer: cowdancer is a userland implementation of a copy-on-write filesystem.

There are 29 projects matching LD_PRELOAD on freshmeat.net. You might have used some of them.

The code I have written for this demonstration is available on BitBucket.

Written by fdgonthier

January 11th, 2010 at 10:10 pm

On String.intern()

without comments

Where the author realizes the significance of the String.intern() method

I might have hinted at it in my previous post on the subject of strings in Java, yet I did not realize the significance of the String.intern() method. The following code sample demonstrates the behavior of the String.intern() method, similar to what I demonstrated in that post.

public class TestClass2 {
    public static void main(String[] args) {
        String s1 = "hello";
        String s2 = new String("hello");

        // This is going to be false.
        if (s1 == s2) System.out.println("s1 == s2");

        // This is going to be true.
        if (s1 == s2.intern()) System.out.println("s1 == s2.intern()");
    }
}

It’s a didactic example at best. It’s when you consider that strings also come from input/output that String.intern() becomes a thing of interest.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class TestClass2 {
    public static void main(String[] args) {
        String s1 = "hello";
        String s2 = null;

        try {
            // Enter hello at this point.
            s2 = new BufferedReader(new InputStreamReader(System.in)).readLine();

            // This is going to be false.
            if (s1 == s2) System.out.println("s1 == s2");

            // This is going to be true.
            if (s1 == s2.intern()) System.out.println("s1 == s2.intern()");

        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

As you can see in that example, the String.intern() method returns a reference to the string “hello” already in the constant pool. The virtual machine maintains a table of string instances that can be shared between all the string references in the program.

An immediate and obvious benefit of this technique, called string interning, is a reduced memory footprint because of object reuse. Wikipedia notes that the technique is also used by programs that need to do fast string comparisons, such as compilers. It lets them compare strings by simply comparing references instead of possibly scanning the full length of both strings.

The JDK documentation gives a better description of the behavior of the String.intern() method. It’s a surprise to me that I never took the time to understand this behavior of such a core class of the Java library.

Microsofties might also find it interesting that the .NET Framework has a String.Intern() method which behaves in approximately the same way.

Written by fdgonthier

November 6th, 2009 at 8:00 pm