Easy Git External Dependency Management with Giternal

By Mike Gunderloy / November 12, 2008

Anyone building up a project with many dependencies - and in the Ruby community, with so much functionality wrapped up in gems and plugins, it's hard to imagine not having external dependencies! - must face the issue of managing the situation in source code control. How do you maintain everything you need in your own repository, while still being able to update your dependencies from their own repository? How do you set things up so you can even contribute to the projects you depend on?

If you're using git, the right answer is often the subtree merge strategy - but remembering the necessary commands can be a nuisance, especially if you rarely use them. There are several projects out there designed to make this easier for you: Tim Dysinger published some rake tasks to handle subtrees, and Braid is a more full-featured tool to manage both git- and svn-based vendor branches. 37signals have also released cached_externals, which provides a somewhat different solution to the problem using symbolic links and separate checkouts.
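
For reference, the subtree merge recipe that these tools wrap looks roughly like the sketch below. It is a minimal demonstration against two throwaway local repositories rather than a real project: the names `demo_plugin` and `plugin_remote` and the temporary paths are invented for illustration, and it assumes git 2.9 or later (for `--allow-unrelated-histories`).

```shell
#!/bin/sh
# Sketch of the subtree merge strategy using throwaway local repos.
# All repo names and paths here are hypothetical.
set -eu

# Identity for the demo commits, so the script runs on an unconfigured machine.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)

# 1. A stand-in for the upstream plugin repository.
git init -q "$work/demo_plugin"
echo "puts 'plugin v1'" > "$work/demo_plugin/init.rb"
git -C "$work/demo_plugin" add init.rb
git -C "$work/demo_plugin" commit -q -m "plugin v1"
branch=$(git -C "$work/demo_plugin" rev-parse --abbrev-ref HEAD)

# 2. Your own project.
git init -q "$work/app"
cd "$work/app"
echo "puts 'app'" > app.rb
git add app.rb
git commit -q -m "initial app commit"

# 3. Import the plugin, with its history, under vendor/plugins/demo_plugin.
git remote add -f plugin_remote "$work/demo_plugin"
git merge -s ours --no-commit --allow-unrelated-histories "plugin_remote/$branch"
git read-tree --prefix=vendor/plugins/demo_plugin/ -u "plugin_remote/$branch"
git commit -q -m "Merge demo_plugin as a subtree"

# 4. Upstream moves on...
echo "puts 'plugin v2'" > "$work/demo_plugin/init.rb"
git -C "$work/demo_plugin" commit -q -am "plugin v2"

# 5. ...and the change is merged into the vendored copy.
git fetch -q plugin_remote
git merge -q -X subtree=vendor/plugins/demo_plugin -m "Update demo_plugin" "plugin_remote/$branch"

# Show the vendored file after the update.
cat vendor/plugins/demo_plugin/init.rb
```

Steps 3 and 5 are the parts that are easy to forget: the first import needs the `ours` merge plus `read-tree`, while later updates only need a merge with the subtree prefix spelled out.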

After trying all of those approaches, though, I've settled on Pat Maddox's giternal tool for my own work. With giternal, you add a YAML file with details on your project's dependencies, similar to this:

delayed_job:
  repo: git://github.com/tobi/delayed_job.git
  path: vendor/plugins
paperclip:
  repo: git://github.com/thoughtbot/paperclip.git
  path: vendor/plugins

After that, there are just three commands to remember: giternal update to update all of your dependencies, giternal freeze to create a self-contained deploy tag with all externals at a known version, and giternal unfreeze to go back to live subtrees. If you've been shying away from dealing with externals in your git repositories, give it a shot.
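
As a rough sketch of that workflow, the script below writes the YAML shown earlier into a scratch project and then runs the three commands. The `config/giternal.yml` location is an assumption about where giternal looks for its configuration (check the project's README), the scratch directory exists only for this demo, and the giternal commands are skipped when the executable is not on your PATH.

```shell
#!/bin/sh
# Sketch of a giternal setup in a scratch project.
# ASSUMPTION: giternal reads config/giternal.yml; verify against its README.
set -eu

project=$(mktemp -d)   # hypothetical stand-in for your Rails app
cd "$project"
git init -q            # giternal works alongside your project's own repo
mkdir -p config vendor/plugins

# The dependency list from the article, one entry per external repo.
cat > config/giternal.yml <<'EOF'
delayed_job:
  repo: git://github.com/tobi/delayed_job.git
  path: vendor/plugins
paperclip:
  repo: git://github.com/thoughtbot/paperclip.git
  path: vendor/plugins
EOF

if command -v giternal >/dev/null 2>&1; then
  giternal update      # fetch or update every dependency listed in the YAML
  giternal freeze      # fold the externals in at a known version for a deploy tag
  giternal unfreeze    # go back to live, independently updatable externals
else
  echo "giternal not found on PATH; skipping update/freeze/unfreeze"
fi
```

In day-to-day use you would run `giternal update` on its own and reserve `giternal freeze` for the moment you cut a deploy; they are run back to back here only to show all three.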

Comments

  1. Artūras Šlajus says:

    What about pushing back changes you made into (let's say plugin) original repo? It's the only thing I'm missing with braid.

  2. phil says:

    does git submodules not solve this?

  3. niko says:

    good question! +1

  4. bruno says:

    Haven't looked at giternal yet, but git's submodules do a good job of this, and yes, they do allow you to push back into the project. I use them for all my CommunityEngine deploys.

  5. grosser says:

    i use piston for that since it also works with svn

  6. Mike Gunderloy says:

    Arturas - Yes, you can push back changes to the original plugin repo. With giternal, until you freeze, you're maintaining each plugin as its own little repo, ignored from your project's repo. giternal freeze then plays some games to merge everything together for deployment.

    Phil et al - Yes, submodules also provide a solution for this same issue. But I happily admit that submodules fall into the "seldom used and hard to remember" bucket for me, and I haven't found a tool yet that wraps them up into a useful (and simple) strategy for those who aren't deeply into git commands.

  7. Mike says:

    "I haven't found a tool yet that wraps them up into a useful (and simple) strategy for those who aren't deeply into git commands."

    git submodule add path/to/submodule

    git submodule init
    git submodule update

    Those aren't simple enough to remember?

  8. Mike says:

    sorry used brackets in the last post and it came out wrong

    git submodule add (remote repo) path/to/submodule

  9. Tom says:

    Uh, does the word "submodule" mean anything to you?

  10. Peter Cooper says:

    I know pretty much nothing about this topic, which is why I'm letting Mike take the lead here (note: Mike is a new Ruby Inside and Rails Inside writer, so do say hi!)

    However, if Git submodules are already easy and suitable for this task (and I'm not arguing if they are or not) then what is the true motivation for giternal? On the project page it says giternal offers "non-sucky git externals".

    Perhaps this post didn't elaborate on why giternal is "non-sucky" compared to Git's home-baked alternative, but it's clear someone / some people find a difference, so check it out anyway..

  11. Mike Gunderloy says:

    I think we're talking about a number of different pain points here. As a Rails developer, one of my big pain points is always deployment - and that's one spot where (as far as I know) git submodules don't offer a great solution. While current Capistrano builds are submodule aware, using submodules requires Capistrano to scurry around and collect bits from every repository the project is connected to at deployment time. I don't like having that much external-server dependency. For me, any usable solution has to have some sort of freeze/unfreeze mechanism.

    But if you don't need freeze/unfreeze, and don't find help in having a single file listing all the extra repo dependencies - then, yes, learning how to use submodules may be the way to go.

    Mike, I take your point on the simplicity of the git submodule subcommands; I was misremembering the amount of fuss required. That may be because the official help on that command is as opaque as much of the rest of the git help.

  12. Harry Seldon says:

    I am quite new to git and I agree with Mike. Anything that simplifies git submodules is welcome, so I will happily have a look at giternal. I tried git submodules to help develop a plugin (namely OFC/rails), but I have not yet found any very convincing tutorials about them.

    Peter, thanks a lot for rubyinside and railsinside. These blogs are awesome. I have known them for only 2 months, but I wish I had known them for longer. They are really my number one resources for news about Ruby and Rails.

    About git, this question of submodules is quite "advanced". For the newcomers to git, I wrote a post that explains why git is so useful and how to use it along with github on an open source project: http://harryseldon.thinkosphere.com/2008/11/08/grand-gardening-with-git
    Thanks

  13. Mike says:

    Mike G & Harry - I agree, the official git docs are not always the easiest reads, especially for newcomers. I wrote a post about a month ago on setting up a rails project with submodules rather than script/plugin install.

    http://mikecostanza.blogspot.com/2008/09/git-submodule-scriptplugin-install.html

  14. Cristi Balan says:

    @mike: Thanks for the mention of braid (http://github.com/evilchelu/braid/wikis/home) and the roundup. It looks like giternal doesn't use subtree merges tho. It appears to just help with the quick case of dumping repositories in. The nice thing about braid is that you can just start hacking on a mirror directly in your own repo and still be able to update with code from upstream.

    @arturas: Allowing to push from a braid mirror is planned for 0.6. We have the diff already and now we need to tweak the generated patch a bit so it can be sent.

    @grosser: Braid has been able to mirror both git and svn repos since it was created. And because it uses git and git-svn, to take advantage of all the work done on those tools, it is intended to be used only if your main repository is on git.

  15. Jacob Radford says:

    Another alternative is ext - http://github.com/azimux/externals/tree/master

    He has a good explanation of his problems with git submodules too - http://nopugs.com/2008/09/04/why-ext

  16. Brandon says:

    @Mike Gunderloy: I love your writing elsewhere so welcome to RubyInside! I'm sure you'll be a great addition to an already excellent site.

    @Mike (#7):
    Yeah, it's really simple once you have the commands in front of you, but are they things you'll use every day? week?

    $ cheat git | grep submodule
    turns up nothing, which is too bad. But that's easily remedied... okay, I've added the three commands you mentioned to the git cheat sheet. Please add any explanatory notes you think would be helpful.

  17. Pat Maddox says:

    The advantage of giternal over git submodules is that the references in giternal are weaker than with submodules. We experienced some problems on the RSpec project because submodules not only point to a repo, but also a particular version of the repo. This becomes a problem when active development occurs in the submodule, as we were doing with RSpec.

    David was working on something, and I was working on another unrelated change. We both committed to the submodule locally, and then pushed out the changes. Then we updated the references in the superproject. Now my superproject says the submodule ref is at commit abc123, and David's says it's at def456. When one of us pulls from the other, we get a merge conflict _only about the submodule ref sha_. That is, we made completely valid, non-conflicting changes everywhere, but we *still* have to deal with a merge conflict here.

    Another thing is that when you "git submodule update" a repo, it will just blow away the existing one, so any work you've done but not pushed goes *poof*

    Now with giternal the only time you get conflicts is if you were to both freeze the external at different points, and then each commit. And you want that conflict anyway, because then you can unfreeze, merge your externals, and then freeze it back up.
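
The pinning Pat describes can be seen in a small experiment. The sketch below uses two throwaway local repositories (the names `lib` and `super` and the `vendor/lib` path are made up for the demo) and shows that the superproject records one exact commit of the submodule, which is the value two developers end up conflicting over. The `protocol.file.allow` setting is there only because the demo clones from a local path, which recent git versions restrict for submodules.

```shell
#!/bin/sh
# Sketch: a superproject pins its submodule to one exact commit.
# Repo names and paths are hypothetical.
set -eu

export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)

# A stand-in for the external library.
git init -q "$work/lib"
echo "v1" > "$work/lib/README"
git -C "$work/lib" add README
git -C "$work/lib" commit -q -m "lib v1"
lib_v1=$(git -C "$work/lib" rev-parse HEAD)

# The superproject adds the library as a submodule.
git init -q "$work/super"
cd "$work/super"
git -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib
git commit -q -m "Add lib as a submodule"

# The library moves on, but nobody updates the superproject.
echo "v2" > "$work/lib/README"
git -C "$work/lib" commit -q -am "lib v2"
lib_v2=$(git -C "$work/lib" rev-parse HEAD)

# The superproject's tree holds a "commit" entry (mode 160000) for vendor/lib.
git ls-tree HEAD vendor/lib
pinned=$(git ls-tree HEAD vendor/lib | awk '{print $3}')

echo "pinned in superproject: $pinned"
echo "library v1:             $lib_v1"
echo "library v2:             $lib_v2"
```

The pinned value stays at the v1 commit until someone commits a new one in the superproject. When two people each commit a different value, git has to treat it as a conflict on that single entry.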

  18. Pat Maddox says:

    Short answer, which I didn't really say, is that yes there are tools for tracking upstream changes, and yes there are some tools for tracking upstream changes as well as your own changes, but I've not seen another tool that allows you to collaborate on those changes as well. Giternal is all about tracking and collaborating on external repos.

  19. Mike says:

    Pat - "The advantage of giternal over git submodules is that the references in giternal are weaker than with submodules. We experienced some problems on the RSpec project because submodules not only point to a repo, but also a particular version of the repo. This becomes a problem when active development occurs in the submodule, as we were doing with RSpec."

    I thought pointing to a specific version of a submodule was one of the biggest advantages of using submodules in the first place. For example, I started a project using RSpec for testing when the latest version was 1.1.4. I wrote a bunch of integration tests using stories, but now the latest RSpec has deprecated stories into a separate gem. Since I already wrote a bunch of tests for this project, I don't feel like rewriting those tests to accommodate RSpec changes - I'll just stick with 1.1.4 for that project. Correct me if I'm wrong, but isn't that a more common scenario than a project with external dependencies that are in continuous development?

  20. Pat Maddox says:

    @Mike - well you can use giternal for that as well. You simply don't update the external repo. In your case, I'd giternalize the rspec repo, check it out to the 1.1.4 tag, then freeze it. Done.

    "Correct me if I'm wrong, but isn't that a more common scenario than a project with external dependencies that are in continuous development?"

    Perhaps, but that's not my typical use case. And the good thing about giternal is that you can use it just to track externals, and then if you decide to develop on them at all you have that freedom. That's something that's possible with submodules, but is a major headache if you're making frequent changes to the externals

  21. Cristi Balan says:

    Pat - "Short answer, which I didn't really say, is that yes there are tools for tracking upstream changes, and yes there are some tools for tracking upstream changes as well as your own changes, but I've not seen another tool that allows you to collaborate on those changes as well. Giternal is all about tracking and collaborating on external repos."

    I wonder what do you mean by "allows you to collaborate on those changes as well".

    Braid mirrors are committed as code in the project. Other people working on the project don't even need braid to change the mirrored code. They only need it if they want to run braid update to get new code from upstream.

    So, yes, it definitely allows collaboration on those changes. If I understand the meaning of "those" correctly :).

    Could you please clarify your use case? I'm interested in adding support for it in braid, if possible.

  22. Cristi Balan says:

    @pat: Here's a brief example of braid usage between two developers.

    There's nothing braid related here besides adding the mirror. Should there be?

    @dev1:

    braid add blah
    echo 123 > blah/moo
    git add . && git commit -m "moo" && git push

    @dev2:

    git pull
    echo 456 >> blah/moo
    git push

    @dev1:

    git pull
    cat blah/moo # => 123\n456

  23. Pat Maddox says:

    @Cristi - here's my use case. I have my Rails app, and I've used giternal to add rspec and rspec-rails. I commit to my Rails app, which is my code. I also commit to rspec and rspec-rails, and I push those rspec and rspec-rails changes upstream. So this way I can use the latest rspec code in my rails app, as well as push changes to rspec. Make sense?

    giternal is not just about tracking dependencies, or even making changes in them. It's about having multiple complete, fully-functional git repos, and associating them together to create a full project.

  24. Cristi Balan says:

    @Pat: Thanks. It's clear now. You want to track projects where you have commit access and be able to easily commit to them. I've been wanting that too :). However, that only works when you're a committer and know your patches will go in. For normal people, one would have to fork and use the fork with giternal. And then manually update their fork when they need upstream changes.

    Indeed, you can't do this easily with braid. It will be possible tho once we do braid push. I'll also have a look and see if we can steal anything from giternal ;), tho I think the approaches are quite incompatible.

    Otherwise, I'll try and see if one could easily convert mirrors from one style to the other, as both have their specific advantages.

  25. Pat Maddox says:

    "You want to track projects where you have commit access and be able to easily commit to them."

    Exactly

    "However, that only works when you're a commiter and know your patches will go in. For normal people, one would have to fork and use the fork with giternal. And then manually update their fork when they need upstream changes."

    True. But it would be trivial to add another remote in the yaml file. I just haven't had the need for that quite yet. Maybe I'll do that today :)

  26. Cristi Balan says:

    @pat: Sure, you can use any remote you want :). But, my point was that you'd have to have a fork of each project you want to track with giternal. And I'm assuming people don't want to fork all 15 plugins they use in their rails app.

    That's why I was thinking about a way to make it easy for people to use giternal for some mirrors, braid for other mirrors and then be able to switch between them.

  27. ben says:

    hi all,
    I installed the giternal gem, but it complained "-bash: giternal: command not found".
    what's the problem?

    thanks in advance.

  28. Fritzek says:

    @pat and @cristi: I contribute to bruno's communityengine from a fork. The submodule trick was a bit too tricky for me; tekkub @github shares this opinion. Both of your tools seem feasible for what I need. Short explanation: part of the superproject is the ce as a plugin/engine. At the moment I develop both separately (as separate local repos) and just symlink the ce into the superproject; not to mention that deployment is a bit ... How would you use your tools to get both parts together, managing separate dev and joined deployment?

    @pat How to use giternal inside a cap recipe? As external system call?

    Thanks in advance
    Fritzek

  29. Pat Maddox says:

    Ben - I think it's that the gem on rubyforge is old. Try installing from github instead. Will push an update to rubyforge soon

  30. Cristi Balan says:

    @fritzek: I'm assuming the following:

    1. You have normalCE that has plugins in it. These are repos you don't have commit access to.
    2. You have superCE that has normalCE in it.
    3. You want to work on both.

    IMO, the solution is to use braid to manage the plugins in normalCE and then use giternal to get normalCE into superCE.

    If there are no plugins in normalCE and they are included directly in superCE, you can still use braid to manage those in superCE alongside using giternal to manage normalCE.

    HTH

  31. Cristi Balan says:

    @peter: The page autorefreshing just ate my comment and I had to retype it. Boo :'(
