I like git a lot. I really think it is a major improvement over most existing SCM solutions, especially the old ones.

I often reluctantly get into discussions regarding 'which SCM is best'. Most of the time the other contenders are the two other popular distributed SCM solutions out there, Bazaar and Mercurial.

To be frank, I don't have any particular complaints about Bazaar and Mercurial. They are both OK, and follow the same distributed principles that made Git appealing to me. The bottom line in most of these discussions is that Git is confusing, Bazaar is slow, and many other petty differences between the solutions exist. In the end, I think these differences are not that interesting.

What I like about git is that it's amazingly flexible. By exposing its internal mechanisms to the user (and we'll get to that in a second), it enables a wide variety of use cases, and puts you in control of what happens underneath.

The major disadvantage of this, and that isn't news to anyone, is that the learning curve for git is very steep. I will not argue about this, because it is true. Git is only easy to learn once you've learned it. I'm not talking about the really complex stuff, of course. I'm only talking about getting yourself out of tight spots and understanding what you're doing, and that is an absolute must in an SCM solution.

So what can be done? I think the problem is with git's current user base. Git is extremely biased towards its user base, a vast portion of which are Linux kernel developers or people who work very much like them.

For instance, git has many features aimed at communicating via patches, e-mails etc. This means that if you want to establish a central server for collaboration, people frown on it. I've seen many flames over this in IRC channels, especially concerning 'tweaks' in the workflow, for instance working with branches on a shared repository. It all boils down to the fact that the core user base, which is very proficient with git, is controlling the UI.

Git lovers often wave the notion of 'plumbing vs. porcelain', and I think that principle is right - the inner workings of Git don't have to be pretty - they just need to pave the way for a decent UI. The only problem is that the 'porcelain' that git presents is still too low-level, as I mentioned before.

Soon after I started with git I reached the logical conclusion: if git is that powerful, and you want something on top of it that makes more sense, write a higher-level porcelain! Unfortunately, that isn't so easy, because git is not that scriptable.

If you take a quick sample of commands (especially the git-submodule commands, git-checkout, and even git-log), you'll see many arguments, inconsistent error reporting, and a lot of voodoo. Scripting ain't pretty in that environment - try to script the behavior of git-status, and you'll run into either a parsing hell, or at least a serious headache from sifting through man pages. It's insane to program on top of that without a decent library to wrap away the ugly parts.
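To make that concrete, here is a minimal sketch of what wrapping git-status can look like. It assumes a git version that supports the machine-oriented `git status --porcelain` output, where each line is a two-character status code, a space, and a path. The names `StatusEntry`, `parse_porcelain` and `git_status` are mine and purely illustrative, and the sketch deliberately ignores paths that git quotes because they contain special characters.

```python
import subprocess
from typing import List, NamedTuple, Optional


class StatusEntry(NamedTuple):
    """One line of `git status --porcelain` output (hypothetical helper type)."""
    staged: str                      # first status column (the index)
    unstaged: str                    # second status column (the working tree)
    path: str
    orig_path: Optional[str] = None  # only set for renames and copies


def parse_porcelain(output: str) -> List[StatusEntry]:
    """Parse porcelain-style status text into entries.

    Assumes the 'XY PATH' / 'XY ORIG -> PATH' layout; quoted paths
    are not handled in this sketch.
    """
    entries = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # too short to hold 'XY ' plus a path
        staged, unstaged, rest = line[0], line[1], line[3:]
        orig_path = None
        # Renames and copies are reported as 'old -> new'.
        if staged in "RC" and " -> " in rest:
            orig_path, rest = rest.split(" -> ", 1)
        entries.append(StatusEntry(staged, unstaged, rest, orig_path))
    return entries


def git_status(repo_dir: str = ".") -> List[StatusEntry]:
    """Run git in repo_dir and return the parsed status entries."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)
```

Even in this friendlier format you still have to know what every two-letter code means, which is exactly the kind of detail a library should hide from you.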

Looking for existing libraries of this kind in Python turns up a single relevant result, pygit, which seems to be aimed more at browsing and inspecting than at editing and modifying. It also took the approach of using libgit, which means it is bound to break often, and will have binary dependencies...

This is why I've put together gitpy. It is a Python library for git, aiming to be a complete API, with an emphasis on completeness rather than speed or efficiency. It is also pure Python, which minimizes dependencies. It is far from perfect, but I work on it whenever I have the chance, and I have already implemented a set of tools on top of it, so over time it will improve.

If you want to help and/or use it, drop me an e-mail or collaborate using GitHub. Help will be appreciated.

