Getting changes from upstream in Git
When you click the fork button on GitHub, you end up with a snapshot of the code. You think about making some changes for a bit, but you decide to leave it for the afternoon and come back to it later.
Then, you look back at the original repository, and suddenly see that those guys have made some changes, committed them, and pushed them online. You can tell, because it’s right there!
But your copy doesn’t have those changes. What gives?
Regardless of the reasoning behind it, it’s probably a good idea to pull the changes from the original repository before you make your own modifications. It’s possible to make your own changes beforehand, but that can make things a lot messier. Plus, if you want to contribute back to the original project, being able to pull those changes is essential.
So, if you open up your terminal and browse to your copy of the repository, you will discover that it has one remote, like this:
```bash
~/my-repo $ git remote
origin
```
The Git documentation has this to say about a remote:
Remote repositories are versions of your project that are hosted on the Internet or network somewhere. You can have several of them, each of which generally is either read-only or read/write for you. Collaborating with others involves managing these remote repositories and pushing and pulling data to and from them when you need to share work.
The `origin` remote is most likely your fork on GitHub, not the original repository. To get the updates, we need to add the original repository as a second remote, so we can pull its changes into our repo:
```bash
~/my-repo $ git remote add upstream https://github.com/megacorp/my-repo
~/my-repo $ git remote -v
origin https://github.com/me/my-repo (fetch)
origin https://github.com/me/my-repo (push)
upstream https://github.com/megacorp/my-repo (fetch)
upstream https://github.com/megacorp/my-repo (push)
```
You can substitute any name for `upstream`, but it’s always a good idea to use a descriptive one.
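If you want to try this without touching a real project, here is a minimal sketch that rebuilds the same layout in a throwaway directory. The `megacorp`, `fork.git` and `my-repo` directories are hypothetical local stand-ins for the GitHub URLs above, not anything GitHub creates for you.

```bash
set -e

# Throwaway sandbox; these local paths stand in for the GitHub URLs (assumption).
sandbox=$(mktemp -d)
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="me@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="me@example.com"

# "megacorp" plays the original project.
git init -q "$sandbox/megacorp"
git -C "$sandbox/megacorp" symbolic-ref HEAD refs/heads/master
echo "Readme" > "$sandbox/megacorp/README.md"
git -C "$sandbox/megacorp" add README.md
git -C "$sandbox/megacorp" commit -q -m "Initial commit"

# "fork.git" plays your fork on GitHub; "my-repo" is your local clone of it.
git clone -q --bare "$sandbox/megacorp" "$sandbox/fork.git"
git clone -q "$sandbox/fork.git" "$sandbox/my-repo"

# Add the original project as a second remote, then list both remotes.
git -C "$sandbox/my-repo" remote add upstream "$sandbox/megacorp"
remotes=$(git -C "$sandbox/my-repo" remote)
echo "$remotes"
```

After this runs, `my-repo` has two remotes: `origin` pointing at the stand-in fork and `upstream` pointing at the stand-in original.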
Now that we’ve done that, we can get those new changes with `git pull upstream master`. The order matters here: the first argument (`upstream`) is the remote we want to pull from, and the second (`master`) is the branch on that remote that we want to pull. The changes are merged into whichever branch you currently have checked out.
You should now have the latest changes to start working on things!
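It can help to know that a pull is roughly a fetch followed by a merge. The sketch below does the two steps separately in a sandbox, so you can look at what upstream changed before merging it. The local `megacorp`, `fork.git` and `my-repo` paths and the `CHANGELOG.md` file are made up for illustration.

```bash
set -e

# Throwaway sandbox; local paths stand in for the GitHub repositories (assumption).
sandbox=$(mktemp -d)
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="me@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="me@example.com"
upstream="$sandbox/megacorp"
mine="$sandbox/my-repo"

# Original project with one commit, a stand-in fork, and a local clone of the fork.
git init -q "$upstream"
git -C "$upstream" symbolic-ref HEAD refs/heads/master
echo "Readme" > "$upstream/README.md"
git -C "$upstream" add README.md
git -C "$upstream" commit -q -m "Initial commit"
git clone -q --bare "$upstream" "$sandbox/fork.git"
git clone -q "$sandbox/fork.git" "$mine"
git -C "$mine" remote add upstream "$upstream"

# Upstream moves ahead after we forked (hypothetical CHANGELOG.md).
echo "Changelog" > "$upstream/CHANGELOG.md"
git -C "$upstream" add CHANGELOG.md
git -C "$upstream" commit -q -m "Add changelog"

# Step 1: download the new commits without touching our files.
git -C "$mine" fetch -q upstream
# Optional: see which commits upstream has that we do not.
git -C "$mine" log --oneline master..upstream/master
# Step 2: merge them into the branch we have checked out (a fast-forward here).
git -C "$mine" merge -q upstream/master
```

Because the local `master` had no commits of its own, the merge just moves it forward to match `upstream/master`.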
# But I’ve already committed my files!
No need to panic! You just need to pull as you have before. If you’re lucky, the changes made will automatically merge. But you might run into this message:
```bash
~/my-repo $ git pull upstream master
remote: Counting objects: 3, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From https://github.com/megacorp/my-repo
5ccaa0e..60e80ce master -> upstream/master
Auto-merging README.md
CONFLICT (content): Merge conflict in README.md
Automatic merge failed; fix conflicts and then commit the result.
```
This can happen if you’ve edited the same file as the others, so you’ll need to resolve the conflict before you can keep going. If you open up `README.md`, you’ll see your file, but it has a few extra things:
```
<<<<<<< HEAD
Readme
=======
This is a readme from Megacorp
>>>>>>> 60e80ce224f9e6ebdfc38a9b0182cd1a35bedb4e
```
The lines from `<<<<<<<` to `=======` are the changes in your repo, and the lines from `=======` to `>>>>>>>` are the changes made upstream.
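While a merge is stopped on a conflict, Git also keeps both versions of the file in its index, so you can print each side without reading the markers. This sketch recreates the README conflict in a sandbox and prints the two sides. The local paths and the "Placeholder" starting content are assumptions made up for the example.

```bash
set -e

# Throwaway sandbox; local paths stand in for the GitHub repositories (assumption).
sandbox=$(mktemp -d)
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="me@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="me@example.com"
upstream="$sandbox/megacorp"
mine="$sandbox/my-repo"

# Shared starting point (hypothetical "Placeholder" content).
git init -q "$upstream"
git -C "$upstream" symbolic-ref HEAD refs/heads/master
echo "Placeholder" > "$upstream/README.md"
git -C "$upstream" add README.md
git -C "$upstream" commit -q -m "Initial commit"
git clone -q "$upstream" "$mine"
git -C "$mine" remote add upstream "$upstream"

# Both sides edit the same line, which is what causes the conflict.
echo "Readme" > "$mine/README.md"
git -C "$mine" commit -q -am "Write my readme"
echo "This is a readme from Megacorp" > "$upstream/README.md"
git -C "$upstream" commit -q -am "Write the Megacorp readme"

# The merge is expected to stop, so do not let `set -e` abort the script.
git -C "$mine" fetch -q upstream
git -C "$mine" merge -q upstream/master || echo "Merge stopped on a conflict"

# Stage 2 holds our version, stage 3 holds the version being merged in.
ours=$(git -C "$mine" show :2:README.md)
theirs=$(git -C "$mine" show :3:README.md)
echo "Ours:   $ours"
echo "Theirs: $theirs"
```

The `:2:` and `:3:` prefixes only resolve while the file is still marked as conflicted; once you `git add` the resolved file, those stages are gone.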
So, to fix this, we can:
- Keep our changes only
- Keep the changes from upstream, or
- Make a new change entirely
To go about any of these, we have to delete a few lines. The markers, for one, have to go. Say we want option 2 and keep only their changes. The modified file would then look like this:
```
This is a readme from Megacorp
```
Note that in this case, Megacorp decided to add an extra line at the end of the document.
Commit our changes, and we’re done!
```bash
~/my-repo $ git add README.md
~/my-repo $ git commit -m "Resolve readme conflict with upstream"
[master 4f8c41f] Resolve readme conflict with upstream
```
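If you already know you want one side wholesale, you don’t have to edit the markers by hand. Here is a rough sketch of option 2 (keep upstream’s version) from start to finish in a sandbox, using `git checkout --theirs`. The local paths and the "Placeholder" starting content are made up for the example, and `--no-rebase` is there because newer Git versions may refuse to pull diverged branches until you say how to reconcile them.

```bash
set -e

# Throwaway sandbox; local paths stand in for the GitHub repositories (assumption).
sandbox=$(mktemp -d)
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="me@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="me@example.com"
upstream="$sandbox/megacorp"
mine="$sandbox/my-repo"

# Shared starting point (hypothetical "Placeholder" content).
git init -q "$upstream"
git -C "$upstream" symbolic-ref HEAD refs/heads/master
echo "Placeholder" > "$upstream/README.md"
git -C "$upstream" add README.md
git -C "$upstream" commit -q -m "Initial commit"
git clone -q "$upstream" "$mine"
git -C "$mine" remote add upstream "$upstream"

# Both sides change the same line.
echo "Readme" > "$mine/README.md"
git -C "$mine" commit -q -am "Write my readme"
echo "This is a readme from Megacorp" > "$upstream/README.md"
git -C "$upstream" commit -q -am "Write the Megacorp readme"

# The pull is expected to stop on the conflict, so keep `set -e` from aborting.
git -C "$mine" pull -q --no-rebase upstream master || echo "Pull stopped on a conflict"

# List the files that still need resolving.
conflicted=$(git -C "$mine" diff --name-only --diff-filter=U)
echo "Conflicted: $conflicted"

# Option 2: take upstream's version of the file, then record the resolution.
git -C "$mine" checkout --theirs -- README.md
git -C "$mine" add README.md
git -C "$mine" commit -q -m "Resolve readme conflict with upstream"
resolved=$(cat "$mine/README.md")
echo "Resolved README: $resolved"
```

For option 1 you would use `--ours` instead. For option 3 there is no shortcut: edit the file by hand, then `git add` and commit as above.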
Hopefully this has helped with getting your work in!
# Other useful links
Here are some other resources I think will help you resolve your conflicts when pulling in the latest work from a repo:
- GitHub: Resolving a merge conflict using the command line: This covers what I’ve written, plus how to handle when a file gets removed entirely from a repository.
- GitHub: Merging a pull request: This specifically details the process of a pull request, if that’s how your workflow goes. A bit off-topic, but the people upstream will have to handle this!
- Pro Git: Git Branching - Remote Branching: Some more comprehensive reading into how remote branches work. They use `git fetch` instead of `git pull`, which some people may find confusing.
- Flight rules for Git: An excellent primer on how to do common things in Git, as well as ways out of tricky situations you may get into, such as deleting your branch before you’re done.