Easy Git External Dependency Management with Giternal

By Mike Gunderloy / November 12, 2008

Anyone building up a project with many dependencies (especially in the Ruby community, where so much functionality is wrapped up in gems and plugins) must face the issue of managing the situation in source code control. How do you maintain everything you need in your own repository while still being able to update your dependencies from their own repositories? How do you set things up so you can even contribute to the projects you depend on?

If you're using git, the right answer is often the subtree merge strategy - but remembering the necessary commands can be a nuisance, especially if you rarely use them. There are several projects out there designed to make this easier for you: Tim Dysinger published some rake tasks to handle subtrees, and Braid is a more full-featured tool to manage both git- and svn-based vendor branches. 37signals have also released cached_externals which provides a somewhat different solution to the problem using symbolic links and separated checkouts.
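
For reference, the subtree merge commands look roughly like the following. This is a minimal sketch based on git's own subtree-merge how-to, not on any of the tools above. The repository names (plugin, app), the remote name plugin_remote, and the vendor/plugins/plugin/ prefix are made up for illustration, and a throwaway local repository stands in for a real upstream. It assumes git 2.9 or later, which introduced --allow-unrelated-histories.

```shell
#!/bin/sh
# Sketch only: every name below is hypothetical, and both repositories
# live in a scratch directory so nothing real is touched.
set -e
work=$(mktemp -d)
cd "$work"

# A stand-in for the upstream plugin repository.
git init -q plugin
git -C plugin config user.email dev@example.com
git -C plugin config user.name "Example Dev"
echo "puts 'plugin loaded'" > plugin/init.rb
git -C plugin add init.rb
git -C plugin commit -q -m "plugin: first commit"
# Whatever the default branch is called on this machine (master, main, ...).
branch=$(git -C plugin symbolic-ref --short HEAD)

# The application that wants to vendor the plugin.
git init -q app
cd app
git config user.email dev@example.com
git config user.name "Example Dev"
echo "my app" > README
git add README
git commit -q -m "app: first commit"

# 1. Add the plugin as a remote and fetch its history.
git remote add plugin_remote ../plugin
git fetch -q plugin_remote
# 2. Start a merge that records the plugin's history but keeps our tree as-is.
git merge -q -s ours --no-commit --allow-unrelated-histories "plugin_remote/$branch"
# 3. Read the plugin's files into a subdirectory of the app.
git read-tree --prefix=vendor/plugins/plugin/ -u "plugin_remote/$branch"
# 4. Commit the merge.
git commit -q -m "Merge plugin as a subtree under vendor/plugins/plugin"

# List what the app now tracks.
git ls-files
```

The final git ls-files should list README and vendor/plugins/plugin/init.rb. According to the same how-to, later upstream changes are brought in with git pull -s subtree plugin_remote followed by the branch name.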

After trying all of those approaches, though, I've settled on Pat Maddox's giternal tool for my own work. With giternal, you add a YAML file with details on your project's dependencies, similar to this:

delayed_job:
  repo: git://github.com/tobi/delayed_job.git
  path: vendor/plugins
paperclip:
  repo: git://github.com/thoughtbot/paperclip.git
  path: vendor/plugins

After that, there are just three commands to remember: giternal update to update all of your dependencies, giternal freeze to create a self-contained deploy tag with all externals at a known version, and giternal unfreeze to go back to live subtrees. If you've been shying away from dealing with externals in your git repositories, give it a shot.
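
To picture what the unfrozen state amounts to, here is a rough plain-git sketch in which a dependency is kept as its own clone inside the project and ignored by the project's repository. It is an illustration under stated assumptions, not giternal's actual implementation: the repository names and the .gitignore entry are hypothetical, and a throwaway local repository stands in for the real upstream.

```shell
#!/bin/sh
# Sketch only: hand-rolled with plain git, names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# A stand-in for the upstream repository of a dependency.
git init -q delayed_job_upstream
git -C delayed_job_upstream config user.email dev@example.com
git -C delayed_job_upstream config user.name "Example Dev"
echo "delayed_job stand-in" > delayed_job_upstream/README
git -C delayed_job_upstream add README
git -C delayed_job_upstream commit -q -m "upstream: first commit"

# The project that depends on it.
git init -q app
cd app
git config user.email dev@example.com
git config user.name "Example Dev"
mkdir -p vendor/plugins

# The external is an ordinary clone with its own .git directory...
git clone -q ../delayed_job_upstream vendor/plugins/delayed_job
# ...and the project's repository is told to ignore it.
echo "vendor/plugins/delayed_job" > .gitignore
git add .gitignore
git commit -q -m "app: ignore the external checkout"

# The project sees a clean working tree (this prints nothing),
# while the external keeps its own independent history.
git status --porcelain
git -C vendor/plugins/delayed_job log --format=%s
```

Because the clone is a full repository, changes made inside it can be committed and pushed upstream independently of the project.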

Comments

Artūras Šlajus says:
November 12, 2008 at 3:08 pm
What about pushing back changes you made into (let's say plugin) original repo? It's the only thing I'm missing with braid.
phil says:
November 12, 2008 at 3:29 pm
does git submodules not solve this?
niko says:
November 12, 2008 at 3:41 pm
good question! +1
bruno says:
November 12, 2008 at 4:50 pm
Haven't looked at giternal yet, but git's submodules does a good job of this, and yes, it does allow you to push back into the project. I use it for all my CommunityEngine deploys.
grosser says:
November 12, 2008 at 5:05 pm
i use piston for that since it also works with svn
Mike Gunderloy says:
November 12, 2008 at 5:07 pm
Arturas - Yes, you can push back changes to the original plugin repo. With giternal, until you freeze, you're maintaining each plugin as its own little repo, ignored from your project's repo. giternal freeze then plays some games to merge everything together for deployment.

Phil et al - Yes, submodules also provide a solution for this same issue. But I happily admit that submodules fall into the "seldom used and hard to remember" bucket for me, and I haven't found a tool yet that wraps them up into a useful (and simple) strategy for those who aren't deeply into git commands.
Mike says:
November 12, 2008 at 5:12 pm
"I haven't found a tool yet that wraps them up into a useful (and simple) strategy for those who aren't deeply into git commands."

git submodule add path/to/submodule

git submodule init
git submodule update

Those aren't simple enough to remember?
Mike says:
November 12, 2008 at 5:13 pm
sorry used brackets in the last post and it came out wrong

git submodule add (remote repo) path/to/submodule
Tom says:
November 12, 2008 at 5:16 pm
Uh, does the word "submodule" mean anything to you?
Peter Cooper says:
November 12, 2008 at 5:24 pm
I know pretty much nothing about this topic, which is why I'm letting Mike take the lead here (note: Mike is a new Ruby Inside and Rails Inside writer, so do say hi!)

However, if Git submodules are already easy and suitable for this task (and I'm not arguing if they are or not) then what is the true motivation for giternal? On the project page it says giternal offers "non-sucky git externals".

Perhaps this post didn't elaborate on why giternal is "non-sucky" compared to Git's home-baked alternative, but it's clear someone / some people find a difference, so check it out anyway..
Mike Gunderloy says:
November 12, 2008 at 5:43 pm
I think we're talking about a number of different pain points here. As a Rails developer, one of my big pain points is always deployment - and that's one spot where (as far as I know) git submodules don't offer a great solution. While current Capistrano builds are submodule aware, using submodules requires Capistrano to scurry around and collect bits from every repository the project is connected to at deployment time. I don't like having that much external-server dependency. For me, any usable solution has to have some sort of freeze/unfreeze mechanism.

But if you don't need freeze/unfreeze, and don't find help in having a single file listing all the extra repo dependencies - then, yes, learning how to use submodules may be the way to go.

Mike, I take your point on the simplicity of the git submodule subcommands; I was misremembering the amount of fuss required. That may be because the official help on that command is as opaque as much of the rest of the git help.
Harry Seldon says:
November 12, 2008 at 7:00 pm
I am quite new to git and I agree with Mike. Anything simplifying the git submodules is good to take. So I will happily have a look at giternal. I tried the git submodules to help developing a plugin (namely OFC/rails) but I have not found yet some very convincing tutorials about it.

Peter, thanks a lot for rubyinside and railsinside. These blogs are awesome. I have known them only for 2 months but I wished I had known them for longer. They are really my number one resources to get news about Ruby and Rails.

About git, this question of submodules is quite "advanced". For the newcomers to git, I wrote a post that explains why git is so useful and how to use it along with github on an open source project: http://harryseldon.thinkosphere.com/2008/11/08/grand-gardening-with-git
Thanks
Mike says:
November 12, 2008 at 8:11 pm
Mike G & Harry - I agree, the official git docs are not always the easiest reads, especially for newcomers. I wrote a post about a month ago on setting up a rails project with submodules rather than script/plugin install.

http://mikecostanza.blogspot.com/2008/09/git-submodule-scriptplugin-install.html
Cristi Balan says:
November 12, 2008 at 11:45 pm
@mike: Thanks for the mention of braid (http://github.com/evilchelu/braid/wikis/home) and the roundup. It looks like giternal doesn't use subtree merges tho. It appears to just help with the quick case of dumping repositories in. The nice thing about braid is that you can just start hacking on a mirror directly in your own repo and still be able to update with code from upstream.

@arturas: Allowing to push from a braid mirror is planned for 0.6. We have the diff already and now we need to tweak the generated patch a bit so it can be sent.

@grosser: Braid has been able to mirror both git and svn repos since it was created. And because it uses git and git-svn, to take advantage of all the work done on those tools, it is intended to only be used if your main repository is on git.
Jacob Radford says:
November 13, 2008 at 3:10 am
Another alternative is ext - http://github.com/azimux/externals/tree/master

He has a good explanation of his problems with git submodules too - http://nopugs.com/2008/09/04/why-ext
Brandon says:
November 13, 2008 at 5:06 am
@Mike Gunderloy: I love your writing elsewhere so welcome to RubyInside! I'm sure you'll be a great addition to an already excellent site.

@Mike (#7):
Yeah, it's really simple once you have the commands in front of you, but are they things you'll use every day? week?

$ cheat git | grep submodule
turns up nothing, which is too bad. But that's easily remedied... okay, I've added the three commands you mentioned to the git cheat sheet. Please add any explanatory notes you think would be helpful.
Pat Maddox says:
November 13, 2008 at 6:03 am
The advantage of giternal over git submodules is that the references in giternal are weaker than with submodules. We experienced some problems on the RSpec project because submodules not only point to a repo, but also a particular version of the repo. This becomes a problem when active development occurs in the submodule, as we were doing with RSpec.

David was working on something, and I was working on another unrelated change. We both committed to the submodule locally, and then pushed out the changes. Then we updated the references in the superproject. Now my superproject says the submodule ref is at commit abc123, and David's says it's at def456. When one of us pulls from the other, we get a merge conflict _only about the submodule ref sha_. That is, we made completely valid, non-conflicting changes everywhere, but we *still* have to deal with a merge conflict here.

Another thing is that when you "git submodule update" a repo, it will just blow away the existing one, so any work you've done but not pushed goes *poof*

Now with giternal the only time you get conflicts is if you were to both freeze the external at different points, and then each commit. And you want that conflict anyway, because then you can unfreeze, merge your externals, and then freeze it back up.
Pat Maddox says:
November 13, 2008 at 6:06 am
Short answer, which I didn't really say, is that yes there are tools for tracking upstream changes, and yes there are some tools for tracking upstream changes as well as your own changes, but I've not seen another tool that allows you to collaborate on those changes as well. Giternal is all about tracking and collaborating on external repos.
Mike says:
November 13, 2008 at 3:35 pm
Pat - "The advantage of giternal over git submodules is that the references in giternal are weaker than with submodules. We experienced some problems on the RSpec project because submodules not only point to a repo, but also a particular version of the repo. This becomes a problem when active development occurs in the submodule, as we were doing with RSpec."

I thought pointing to a specific version of a submodule was one of the biggest advantages of using submodules in the first place. For example, I started a project using RSpec for testing when the latest version was 1.1.4. I wrote a bunch of integration tests using stories, but now the latest RSpec has deprecated stories into a separate gem. Since I already wrote a bunch of tests for this project, I don't feel like rewriting those tests to accommodate RSpec changes - I'll just stick with 1.1.4 for that project. Correct me if I'm wrong, but isn't that a more common scenario than a project with external dependencies that are in continuous development?
Pat Maddox says:
November 13, 2008 at 5:17 pm
@Mike - well you can use giternal for that as well. You simply don't update the external repo. In your case, I'd giternalize the rspec repo, check it out to the 1.1.4 tag, then freeze it. Done.

"Correct me if I'm wrong, but isn't that a more common scenario than a project with external dependencies that are in continuous development?"

Perhaps, but that's not my typical use case. And the good thing about giternal is that you can use it just to track externals, and then if you decide to develop on them at all you have that freedom. That's something that's possible with submodules, but is a major headache if you're making frequent changes to the externals
Cristi Balan says:
November 13, 2008 at 7:24 pm
Pat - "Short answer, which I didn't really say, is that yes there are tools for tracking upstream changes, and yes there are some tools for tracking upstream changes as well as your own changes, but I've not seen another tool that allows you to collaborate on those changes as well. Giternal is all about tracking and collaborating on external repos."

I wonder what do you mean by "allows you to collaborate on those changes as well".

Braid mirrors are committed as code in the project. Other people working on the project don't even need braid to change the mirrored code. They only need it if they want to run braid update to get new code from upstream.

So, yes, it definitely allows collaboration on those changes. If I understand the meaning of "those" correctly :).

Could you please clarify your use case? I'm interested in adding support for it in braid, if possible.
Cristi Balan says:
November 13, 2008 at 7:28 pm
@pat: Here's a brief example of braid usage between two developers.

There's nothing braid related here besides adding the mirror. Should there be?

@dev1:

braid add blah
echo 123 > blah/moo
git add . && git commit -m "moo" && git push

@dev2:

git pull
echo 456 >> blah/moo
git push

@dev1:

git pull
cat blah/moo # => 123\n456
Pat Maddox says:
November 13, 2008 at 7:33 pm
@Cristi - here's my use case. I have my Rails app, and I've used giternal to add rspec and rspec-rails. I commit to my Rails app, which is my code. I also commit to rspec and rspec-rails, and I push those rspec and rspec-rails changes upstream. So this way I can use the latest rspec code in my rails app, as well as push changes to rspec. Make sense?

giternal is not just about tracking dependencies, or even making changes in them. It's about having multiple complete, fully-functional git repos, and associating them together to create a full project.
Cristi Balan says:
November 14, 2008 at 12:00 pm
@Pat: Thanks. It's clear now. You want to track projects where you have commit access and be able to easily commit to them. I've been wanting that too :). However, that only works when you're a committer and know your patches will go in. For normal people, one would have to fork and use the fork with giternal. And then manually update their fork when they need upstream changes.

Indeed, you can't do this easily with braid. It will be possible tho once we do braid push. I'll also have a look and see if we can steal anything from giternal ;), tho I think the approaches are quite incompatible.

Otherwise, I'll try and see if one could easily convert mirrors from one style to the other, as both have their specific advantages.
Pat Maddox says:
November 14, 2008 at 4:04 pm
"You want to track projects where you have commit access and be able to easily commit to them."

Exactly

"However, that only works when you're a committer and know your patches will go in. For normal people, one would have to fork and use the fork with giternal. And then manually update their fork when they need upstream changes."

True. But it would be trivial to add another remote in the yaml file. I just haven't had the need for that quite yet. Maybe I'll do that today :)
Cristi Balan says:
November 15, 2008 at 4:59 am
@pat: Sure, you can use any remote you want :). But, my point was that you'd have to have a fork of each project you want to track with giternal. And I'm assuming people don't want to fork all 15 plugins they use in their rails app.

That's why I was thinking about a way to make it easy for people to use giternal for some mirrors, braid for other mirrors and then be able to switch between them.
ben says:
November 16, 2008 at 2:19 am
hi all,
I installed the giternal gem, but it complained "-bash: giternal: command not found".
what's the problem?

thanks in advance.
Fritzek says:
November 19, 2008 at 12:06 pm
@pat and @cristi: I just contribute on basis of a fork to bruno's communityengine. the submodule trick was a bit too tricky to me. tekkub @github shares this opinion. your both tools seem to me feasible to do my stuff. short explanation: part of the superproject is the ce as a plugin/engine, at the moment I develop both separately (as separate local repos) and just symlink the ce into superproject; not to mention that deployment is a bit ... How would you use your tool to get both parts together, manage separate dev and joined deployment?

@pat How to use giternal inside a cap recipe? As external system call?

Thanks in advance
Fritzek
Pat Maddox says:
November 19, 2008 at 5:22 pm
Ben - I think it's that the gem on rubyforge is old. Try installing from github instead. Will push an update to rubyforge soon
Cristi Balan says:
November 21, 2008 at 11:27 am
@fritzek: I'm assuming the following:

1. You have normalCE that has plugins in it. These are repos you don't have commit access to.
2. You have superCE that has normalCE in it.
3. You want to work on both.

IMO, the solution is to use braid to manage the plugins in normalCE and then use giternal to get normalCE into superCE.

If there are no plugins in normalCE and they are included directly in superCE, you can still use braid to manage those in superCE alongside using giternal to manage normalCE.

HTH
Cristi Balan says:
November 21, 2008 at 11:28 am
@peter: The autorefreshing the page just ate my comment and I had to retype it. Boo :'(