Abridged git repository history

It is sometimes useful to have a repository that shows only the main points in a project's history, for example one commit per release. Starting from a git repository with the full history, you can create such an abridged history on a separate branch. Git makes this efficient because the new commits can reuse the original tree objects, so no file content is copied. This post walks through the process for the MediaWiki repository.

We start with a full repository:

# GitHub no longer serves the unauthenticated git:// protocol, so clone over https
git clone https://github.com/wikimedia/mediawiki-core.git
cd mediawiki-core
git config user.name "Bruno De Fraine"
git config user.email "..."

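The main import logic is a bash function. Given a tag, it reads the tree and author date from the tagged commit, creates a new commit with the same tree on top of the current branch, and records a tag command for later use: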

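function import_mw {
        local tag=$1 tree= adate= key val head parent commit
        local -a parent_args=()
        # Read the header of the tagged commit: we need its tree and author date
        while read -r key val; do
                case $key in
                        tree) tree=$val ;;
                        # Strip "Name <email> ", keeping "<timestamp> <timezone>"
                        author) adate="${val#*> }" ;;
                        # An empty line ends the header
                        "") break ;;
                esac
        done < <(git cat-file commit "$tag" 2>/dev/null)
        if [[ -z $tree || -z $adate ]]; then
                echo "Could not read tag commit info" 1>&2
                return 1
        fi
        # Chain onto the current branch tip, if the branch already exists
        head=$(git symbolic-ref HEAD)
        if parent=$(git rev-parse --verify "$head" 2>/dev/null); then
                parent_args=(-p "$parent")
        fi
        # Create a commit that reuses the original tree and author date
        if commit=$(GIT_AUTHOR_NAME="Wikimedia Foundation" \
           GIT_AUTHOR_EMAIL="wikimedia@github.com" \
           GIT_AUTHOR_DATE="$adate" \
           git commit-tree "$tree" "${parent_args[@]}" <<<"MediaWiki $tag"); then
                git update-ref "$head" "$commit"
                # Record the tag command for replay in the exported repository
                echo git tag "$tag" "$commit" >> ~/mw-tags
                echo "$commit"
        else
                echo "Commit failed" 1>&2
                return 1
        fi
}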
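To see what the header-parsing loop extracts, here is a minimal sketch that runs the same loop on a made-up commit header instead of the output of git cat-file. The function name parse_commit_header and the sample author and timestamps are invented for illustration; the tree ID is git's well-known empty tree.

```shell
#!/usr/bin/env bash
# Parse the header of a raw commit object the same way import_mw does.
# The sample text below is invented; real input would come from
# `git cat-file commit <tag>`.
parse_commit_header() {
        local key val
        tree=
        adate=
        while read -r key val; do
                case $key in
                        tree) tree=$val ;;
                        # "Name <email> 1300000000 +0000" -> "1300000000 +0000"
                        author) adate="${val#*> }" ;;
                        # the first empty line ends the header
                        "") break ;;
                esac
        done
}

parse_commit_header <<'EOF'
tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
parent 0000000000000000000000000000000000000001
author Jane Doe <jane@example.org> 1300000000 +0000
committer Jane Doe <jane@example.org> 1300000100 +0000

tree this line belongs to the message and is never reached
EOF

echo "tree:  $tree"
echo "adate: $adate"
```

Stopping at the first empty line matters: without it, a commit message line that happens to start with "tree" or "author" would overwrite the values read from the header.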

We start a completely independent branch for the abridged history and create one commit object on it for each release tag we want to keep:

git symbolic-ref HEAD refs/heads/ABR1_17
# Note: head ABR1_17 does not exist yet
import_mw 1.17.0
for tag in 1.17.{1,2,3,4,5}; do import_mw $tag; done

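To see the result so far, first bring the index and working tree in line with the new branch tip (the plumbing commands above did not touch them), then view the log: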

git reset --hard
git log --format=fuller
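One way to confirm that an abridged commit really reuses the original tree object is to compare tree IDs. The following is only a sketch in a throwaway repository; the file name, the tag v1.0 and the Demo identity are invented and have nothing to do with the MediaWiki clone.

```shell
#!/usr/bin/env bash
# Throwaway repository with one ordinary commit and a tag
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.org"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.org"

echo "v1" > file.txt
git add file.txt
git commit -q -m "full history commit"
git tag v1.0

# Create a new parentless commit that points at the tag's tree
tree=$(git rev-parse 'v1.0^{tree}')
abridged=$(git commit-tree "$tree" <<<"Demo v1.0")

# Both commits should resolve to the same tree object
orig_tree=$(git rev-parse 'v1.0^{tree}')
new_tree=$(git rev-parse "$abridged^{tree}")
echo "original tree: $orig_tree"
echo "abridged tree: $new_tree"
```

In the walkthrough, the equivalent check would compare the tree of a release tag with the tree of the matching commit on the abridged branch.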

We can fork another branch in our abridged history for a different major version:

# Our commit object for 1.17.0 is 35d4f10
git update-ref refs/heads/ABR1_18 35d4f1055b9254b3b3d915ec782d0c16230516e5
git symbolic-ref HEAD refs/heads/ABR1_18
for tag in 1.18.{0,1,2,3,4,5,6,7}; do import_mw $tag; done
git reset --hard
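# "git hist" is not a built-in command, presumably a personal alias;
# a built-in alternative is: git log --graph --oneline ABR1_17 ABR1_18
git hist ABR1_17 ABR1_18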

And so on for branches ABR1_19 and ABR1_20.
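The fork step is the same pair of plumbing commands each time, so for the remaining branches it could be wrapped in a small helper. This is a sketch only: fork_abr is a hypothetical name, and the demonstration runs in a throwaway repository with an empty tree rather than in the MediaWiki clone.

```shell
#!/usr/bin/env bash
# fork_abr BRANCH COMMIT: create BRANCH at COMMIT and make it the current
# branch, without touching the index or working tree.
fork_abr() {
        local branch=$1 start=$2
        git update-ref "refs/heads/$branch" "$start" || return 1
        git symbolic-ref HEAD "refs/heads/$branch"
}

# Throwaway demonstration
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.org"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.org"

# Write an empty tree object and a parentless commit to fork from
empty_tree=$(git mktree </dev/null)
base=$(git commit-tree "$empty_tree" <<<"base")

fork_abr DEMO_A "$base"
current=$(git symbolic-ref --short HEAD)
tip=$(git rev-parse HEAD)
echo "now on $current at $tip"
```

In the walkthrough, a call such as fork_abr ABR1_18 followed by the commit ID of the 1.17.0 import would stand in for the update-ref and symbolic-ref lines, with the import_mw loop run afterwards as before.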

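We can then export the abridged history to a separate repository. The target, ../mediawiki-core-abridged, must already exist (for example, created with git init). After pushing the branches, sourcing ~/mw-tags there replays the tag commands recorded by import_mw:

git remote add abr ../mediawiki-core-abridged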
git push abr ABR1_17
git push abr ABR1_18
git push abr ABR1_19
git push abr ABR1_20
cd ../mediawiki-core-abridged
source ~/mw-tags
git gc
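Because source executes every line of ~/mw-tags as a shell command, a cautious variant could first check that each line has the expected shape. The following sketch assumes 40-digit SHA-1 commit IDs and tag names made of letters, digits, dots, underscores and dashes; check_tag_file is a hypothetical helper and the sample files are created only for the demonstration.

```shell
#!/usr/bin/env bash
# check_tag_file FILE: succeed only if every line looks like
#   git tag <name> <40-hex-digit commit id>
check_tag_file() {
        local line
        local re='^git tag [A-Za-z0-9._-]+ [0-9a-f]{40}$'
        while IFS= read -r line; do
                if ! [[ $line =~ $re ]]; then
                        echo "Unexpected line: $line" 1>&2
                        return 1
                fi
        done < "$1"
}

# One well-formed file and one that should be rejected
good=$(mktemp)
bad=$(mktemp)
echo "git tag 1.17.0 35d4f1055b9254b3b3d915ec782d0c16230516e5" > "$good"
echo "rm -rf important-directory" > "$bad"

if check_tag_file "$good"; then good_result=ok; else good_result=rejected; fi
if check_tag_file "$bad" 2>/dev/null; then bad_result=ok; else bad_result=rejected; fi
echo "good file: $good_result, bad file: $bad_result"
```

With such a check in place, the last step could become check_tag_file ~/mw-tags && source ~/mw-tags.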
