One for the Morning Glory: June 2007

Thursday, June 21, 2007

Jealousy

I've been suffering for a while from mild jealousy of the people who have a spare week of vacation time to make it to Edinburgh for DebConf. (is there a support group for this? :) ) But now I read that some people apparently have so much time off that they travelled to Edinburgh a full week ahead of time, just because, y'know, they could. Darn Europeans.

But then, I only have to work for another year and a half before I qualify for another whole week of vacation time, raising the total to three weeks off per year. Maybe I can make DebConf 2009, assuming that President Hillary doesn't invade Luxembourg or something...

Tuesday, June 19, 2007

Fun with Functions

Pete Nuttall writes about the fnreduce function:

fnreduce :: [(a->a)]->a->a
fnreduce [] value = value
fnreduce (a:as) value = freduce as $ a value

A shorter, although not necessarily clearer, expression of this function would be

fnreduce as value = (foldr (flip (.)) id as) value

This just says that to apply a list of functions to a value, we first compose all the functions by folding down the list, then apply the result to the value. The important thing here is (flip (.)), which says to compose backwards; it means that fnreduce [f, g, h] returns (h . g . f . id) instead of (f . g . h . id). An interesting side note is that it doesn't matter (from a semantic point of view) whether I use foldl or foldr, since function composition is associative:

f . (g . (h . id)) = ((id . f) . g) . h

But "flip" here feels a little confusing. A clearer approach would be to reverse the list instead of reversing the operator:

fnreduce as value = foldr (.) id (reverse as) value

This might be less efficient, but it says in a much clearer way what I'm doing.

Yet another option is to eschew the use of function composition:

fnreduce as value = foldl (flip ($)) value as

Here ($) is the application function; (f$x) applies f to x. So if we reverse it (call this $$), we get (x$$f) applies f to x. Folding this from the left down the list [f, g, h] gives us (((value$$f)$$g)$$h), which is exactly what the original fnreduce did.

I think this is maybe the best fold-based solution to the original problem, because it models what's happening (repeatedly transforming a value with a series of functions) directly. To use a right-fold, you have to remove the flip and reverse as (the reason this is necessary is left as an exercise to the reader).

Pete also asks whether this can be done for heterogenous lists in a type-safe manner. The core problem is, what is the type of our list? It can't actually be a list: in order to check that it's OK to combine the output of one function with the input of another, we would need the elements of the list to have different types. You can't do this in Haskell, even a crazy ghc-extended Haskell.

So, we could try something like this:

module Test where

data TransformList a b where
    Nil   :: TransformList a a
    Cons  :: (a -> b) -> TransformList b c -> TransformList a c

apply :: TransformList a b -> a -> b
apply Nil         = id
apply (Cons f fs) = (apply fs) . f

This lets us do stuff like

> apply (Cons (\x -> x + 1) (Cons (\x -> show x) Nil)) 5
"6"

This is cute, but I don't know that it's really that useful. After all, how is it any better than the following?

> ((\x -> show x) . (\x -> x + 1)) 5
"6"

Everything the above code lets you do with TransformList could just as well be done with the humble compose operator. The only benefit would be if we could somehow do "list-like" stuff with a transform list. For instance, say we want to write the following function:


traceIntermediates Nil        x = (x, [])
traceIntermediates (Cons f l) x = let x'        = f x
                                      (x'', ss) = traceIntermediates l x'
                                  in
                                    (x'', (show x'):ss)

This won't compile, because we don't know that x' is a member of the Show typeclass. And there is no way to fix this in a reusable way using any technique that relies on building a type parameterized on the input and output types: the problem is that we need to say something about the types of the intermediate values, but their type variables are inaccessible.

In other words, there isn't any way to write down a constraint on the type "TransformList a b" that constrains the intermediates. So I've just wasted a lot of typing duplicating the compose operator. Yay.

On the other hand, it's entirely possible to do this if you don't mind some boilerplate code.

data ShowTransformList a b where
    Nil   :: (Show a) => ShowTransformList a a
    Cons  :: (Show a, Show b, Show c) => (a -> b) -> ShowTransformList b c -> ShowTransformList a c

traceIntermediates :: ShowTransformList a b -> a -> (b, [String])
traceIntermediates Nil        x = (x, [])
traceIntermediates (Cons f l) x = let x'        = f x
                                      (x'', ss) = traceIntermediates l x'
                                  in
                                    (x'', (show x'):ss)

apply :: ShowTransformList a b -> a -> b
apply Nil         = id
apply (Cons f fs) = (apply fs) . f

Now I can do this:

> traceIntermediates
    (Cons (\x -> x + 1)
          (Cons (\x -> x * 2)
                (Cons (\x -> x - 5)
                      (Cons show
                            (Cons (read :: String -> Double)
                                  Nil))))) 5
(7.0,["6","12","7","\"7\"","7.0"])

Sunday, June 17, 2007

Freudian slips in commit messages

changeset: 700:d8308a89c519

user: Daniel Burrows

date: Sun Jun 17 16:22:43 2007 -0700

summary: Call pre_package_state_changed() before undoing, to resent the dependency resolver; call package_state_changed() after an undo if necessary.

Saturday, June 16, 2007

aptitude development moves to Mercurial, back to Alioth

Short version: the aptitude development tree is now maintained using Mercurial, go here to fetch it.

aptitude development has been using darcs as a version control system for the last two years. Darcs is a distributed version control system with many great concepts -- but it has one huge, glaring, enormous flaw, which is that innocuous operations, like merging one branch into another, can require practically unbounded amounts of time and space. And once two branches get screwed up in this way, there is nothing, nothing at all, that you can do to get yourself unwedged; the cause is a design flaw in darcs that no-one seems to know how to fix or work around.

Recently, I made the mistake of changing some files in both the Debian branch of aptitude and the head branch, and when I tried to merge the two branches to upload a new upstream release, darcs went off into la-la land. So, I figured that it was probably time to switch version control systems.

I'd like to say that I did extensive research on the matter, but the truth is that I chose Mercurial based on the fact that (A) a lot of ex-darcs users have written nice things about it, (B) git and arch-derived systems always feel ugly to me, and (C) John Goerzen implemented support for mercurial on Alioth. Yes, I am so a follower.

So if you want the up-to-date aptitude source, you should go to http://hg.debian.org/hg/aptitude/head. You can also fetch an RSS feed from http://hg.debian.org/hg/aptitude/head?style=rss.

You can fetch a copy of the repository with "hg clone http://hg.debian.org/hg/aptitude/head aptitude-head". To update, I recommend putting the following lines in ~/.hgrc to enable the "fetch" extension:

[extensions]
hgext.fetch =

Once you've done this, just run "hg fetch" and it should download and merge the latest code. You may want to install kdiff3 to get a better merge tool as well.

To commit a patch locally, run "hg commit". Unfortunately, there doesn't seem to be any easy way for people to send patches back within the VCS like there is for darcs (one of many nice features I had to give up in favor of being able to actually get work done -- what, me bitter?), but it looks like you can get somewhere with "hg bundle <file>", then mailing the resulting file off.

Sunday, June 10, 2007

Dear Lazyweb: Mercurial Python API

So, in the midst of trying to convert some Darcs repositories to Mercurial repositories (see here for the reason), I hit an "interesting" bug: tailor processes "rename" patches by deleting the renamed file. A little debugging later, I get to this simulation of what tailor does:


>>> from mercurial import hg,ui
>>> uio = ui.ui(debug=True,verbose=True)
>>> r = hg.repository(ui=uio, path='.', create=True)
>>> f = file('A', 'w')
>>> f.write('Some data\n')
>>> f.close()
>>> r.add('A')
>>> r.commit(text='Add A')
A
'i\xe9\xb4\xc8\x9d\xb8\xeb<\xbe\xb1k\xae\xb1\x182\xa9ao\x80\xd0'
>>> r.copy('A', 'B')
B does not exist!

Huh? Of course B doesn't exist, that's why I want to create it! So, we look at the code in localrepo.py to handle copying...


    def copy(self, source, dest, wlock=None):
        p = self.wjoin(dest)
        if not os.path.exists(p):
            self.ui.warn(_("%s does not exist!\n") % dest)
        elif not os.path.isfile(p):
            self.ui.warn(_("copy failed: %s is not a file\n") % dest)
        else:
            if not wlock:
                wlock = self.wlock()
            if self.dirstate.state(dest) == '?':
                self.dirstate.update([dest], "a")
            self.dirstate.copy(source, dest)

Double huh?

Why is this method checking whether the destination exists before it does anything? It appears to me that this is actually necessary; just disabling this test leads to a crash later.
How does mercurial work at all with such huge breakage in its core code? Does /usr/bin/hg use a different implementation of the repository code than mercurial.hg?
Why, in the name of $DEITY, is this a warning and not a fatal exception? One example of what happens when you don't signal to your users that you're ignoring them is this bug: tailor happily goes ahead and removes the original file, safe in the knowledge that mercurial has copied it to the new location. Except, of course, that mercurial hasn't actually copied it. Whoops.

I'm hoping that someone will tell me that it's just too late at night for me and I'm being an idiot. Please tell me this. It would make me very sad if code like that was in my VCS.

:-( <--- sad Daniel

Closing bugs just because they're old is not cool

Two bug reports that I wrote were recently closed with the following explanation:

Last message was posted nearly three years ago. Considering abandoned and closing.

Whoops, I thought, I must have missed a request for more information. So I went to the bug page and, nope, the only message on the bug was my initial report.

Let me repeat that: the total sequence of interactions on these bugs was:

I report a bug in 2004
Someone closes it in 2007 because it's "too old".

Now, there are times when it's at least somewhat excusable to close old bugs without verifying that they're fixed (or even knowing that they might not be). When a package has hundreds of bug reports, closing old bugs that can't be reproduced can be a good thing: even if the reports are valid, keeping reports that can't contribute towards a fix around is arguably less useful than cleaning out the bug list so it's usable.

In my case, though, the package in question has only 20 or so bug reports, an easily managed number. Worse, both bugs I reported could be trivially reproduced. One of them was arguably not a bug, but the guy who closed my bugs did not argue that; as far as I can tell, he didn't even look at my bug report.

Notice that I haven't mentioned the maintainer of the package? That's because he didn't close the bug. In fact, the person who closed the bug is not a member of Debian, nor is he in the NM process. Evidently he's just a random user who decided that old bugs offend him, so they should be closed.

So please, everyone, exercise a modicum of common sense when it comes to closing bug reports. This will keep the BTS useful for all of us and keep blood pressures at a reasonable level. :)

Friday, June 08, 2007

Blogger annoyingness

Two days ago, I noticed that a post from last year about my wallet getting stolen had attracted over 60 spams. So I went in, deleted all of them individually (because I can't select multiple comments to delete in Blogger), then closed comments on the post.

This morning, I caught up on my Planet Debian email, and discovered that Blogger had decided to make that post "new", so it showed up on Planet again! Argh!

So don't worry, I didn't lose my wallet again. Blogger is just nuts.

One for the Morning Glory