Version control with Git
Table of Contents
- 1. Motivation
- 2. Getting Started with Git
- 3. Git: Basic Concepts
- 3.1. Project and Versions
- 3.2. Lifecycle of a git project
- 3.3. Git project lifecycle diagram
- 3.4. Exercise: Create a project on Gitlab
- 3.5. Cloning
- 3.6. Exercise: Clone your project
- 3.7. Exercise: Editing and status
- 3.8. Exercise: Adding (staging) files
- 3.9. Exercise: Committing and commit logs
- 3.10. Exercise: Pushing commits
- 3.11. Checksums and Tagging
- 3.12. Exercise: Tag the initial commit
- 4. Git Branching and Merging
- 5. Git for Collaboration
- 6. Git Best Practices
- 7. Thank You
1 Motivation
1.1 The nature of software development
- software artefacts are documents
- programs, test cases, documents, makefiles, etc.
- software documents are in plain text
- S/W developers rarely write their programs in Excel or Word or pdf.
- a typical software project has many files
- Large projects have hundreds of files; many have thousands.
- and multiple people
- many users, distributed but working on the same code base.
- and multiple versions
- production version, bug-fix version, new-feature version, etc.
1.2 Example
- versions of file A.c
- A1.c, A2.c, A3.c
- versions of file B.c
- B1.c, B2.c, B3.c
- only some combinations make sense
- v1: A1.c, B1.c
- v2: A1.c, B2.c
- v3: A2.c, B3.c
- tracking gets worse with more files
- Tracking working combinations is too error-prone to do manually. The number of combinations explodes as the number of files increases.
- gets impossible with multiple users
- Hard to do by hand when users are working locally and asynchronously. The problem is much more acute when the documents are programs, because programs need to work correctly and are brittle.
1.3 What is version control?
- creation and management of project versions
- Each version corresponds to a checkpoint or milestone of your project.
- independent evolution of multiple versions
- Each version evolves along a branch.
- automatic merging of different versions
- Combine two versions intelligently and flag conflicts if any.
- access to all past versions
- Any past version may be accessed.
- sharing of versions by multiple users
- protocols to share versions between multiple users.
1.4 Are git and gitlab for version control difficult to use?
- quite simple for personal use
- If you plan to use it for just yourself, git can be very easy to use.
- could get tricky with multiple users
- With multiple users, you need to manage permissions, merges, merge requests, and occasional conflicts.
- demands expertise when managing a large project
- You will need to understand issue tracking and continuous integration, which are typical in large software or enterprise projects.
1.5 Do I need git version control?
You will find git version control invaluable when
- programming and writing software
- Programming involves multiple files: source code, build files, configuration files and test files. Version control is designed for programmers.
- dealing with lots of text files
- e.g., s/w development, scientific paper writing in \(\LaTeX\)
- working in a team
- e.g., writing a joint paper or executing a large software project.
- driving a complex workflow
- Git platforms like Gitlab and Github support workflows such as merge requests, issue tracking and continuous integration.
- backing up
- At the very least, Gitlab and Github are great backup systems that store all past versions of your project.
1.6 You might not need version control
You will not need version control if
- You have shared documents in google docs or Office 365
- Google docs (and similar services) work well for collaboratively editing a small set of documents.
- You don't write software
- You wouldn't want to worry
1.7 What is git?
- Commands
- A set of commands that manage the versioning of a project.
- Architecture
- A collection of entities and artefacts and their relationship that together achieve versioning.
- Workflows
- A collection of patterns normally employed by users collaborating on a project.
1.8 What are these other names?
- Gitlab
- An online platform and repository for projects versioned using git. Manages your projects, groups, permissions, etc. We will be using Gitlab for this tutorial.
- Github
- Like Gitlab, but hosts many more open source projects. We will not be using Github for this tutorial.
- Github classroom
- A platform for managing homeworks in a class. We will not be using it for this tutorial. With enough simple scripts, gitlab is sufficient for managing homeworks.
2 Getting Started with Git
2.1 Install a git client
OS | Command Line client (installation) | Other clients |
---|---|---|
Linux | apt-get install git-all | magit (for Emacs users); Github Desktop fork for Linux |
Windows | GitForWindows | Github Desktop; TortoiseGit |
MacOS | git --version | Github Desktop |
Other docs | Atlassian docs for git installation; Installation docs from the Git manual | Using Github desktop with Gitlab; List of GUI clients |
2.2 Exercise: create an account on Gitlab
- Gitlab
- Go to https://gitlab.com.
2.3 Exercise: install git clients
2.4 Learn git
- Atlassian git tutorials
- Atlassian is a company offering git hosting services. Start from the tutorial "What is version control?"
- Git tutorial from Pro Git
- A short tutorial on git. You'll find it more handy as a quick reference.
- Git Cheatsheet (pdf)
- A 2-page cheatsheet with a summary of git commands and a nice diagram. Here is an interactive cheat sheet.
- Pro Git book
- The standard online reference book. Comprehensive. Complements the git reference manual.
3 Git: Basic Concepts
3.1 Project and Versions
- A project is a set of files
- A project is a logical collection of files existing on a repository (like gitlab.com or your own system).
- A project evolves
- Files are created, modified or deleted during the lifetime of a project.
- A Version is a snapshot of your project
- The snapshot can be taken at any time during the evolution of your project.
- How is versioning done?
- Using a set of git commands described presently.
3.2 Lifecycle of a git project
- Creation
- A project is created
- Cloning
- A project is cloned into a directory on your machine, called the project workspace.
- Editing
- Files in your workspace are created and deleted. While they are there, they can also be edited multiple times.
- Adding (Staging)
- The current state of the specified file is captured in preparation for a new version.
- Committing
- The state of all your added files during their most recent add is captured as a single version and assigned an identifying number (a checksum).
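The lifecycle steps above can be sketched end to end in a terminal. This is a minimal sketch under stated assumptions, not part of the original exercises: a local bare repository (the hypothetical server.git) stands in for the project on Gitlab so that the commands run without a network connection, and the file name notes.txt and the user identity are made up for illustration.

```shell
# Creation: a bare repository stands in for the project on the server (assumption).
top=$(mktemp -d)
git init -q --bare "$top/server.git"

# Cloning: copy the project into a workspace directory on your machine.
git clone -q "$top/server.git" "$top/workspace"
cd "$top/workspace"
git config user.name "Demo User"           # hypothetical identity, needed for commits
git config user.email "demo@example.com"

# Editing: create a file in the workspace.
echo "first draft" > notes.txt

# Adding (staging): capture the current state of notes.txt.
git add notes.txt

# Committing: record all staged files as a single version.
git commit -q -m "Add notes.txt"

# List the versions recorded so far.
git log --oneline
```

Each step maps to one command, which is why the individual exercises that follow can be done one command at a time.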
3.3 Git project lifecycle diagram
3.4 Exercise: Create a project on Gitlab
- 1. Login to gitlab
- Go to https://gitlab.com and sign in with your username and password.
- 2. Choose "New Project"
- Click the green "New Project" button on the top right.
- 3. Name your project
- choose a name for your project
- 4. Add project description
- write a short blurb about your project.
- 5. Pick visibility level
- choose from "Private" or "Public".
- 6. Initialize with README
- click this option to have a README.md file automatically added.
- 7. Create project
- click the green button at the bottom to create the project.
- Result
- Your project is now created on gitlab.
3.5 Cloning
- What is cloning?
- Cloning is the process of getting a copy of your project on the gitlab server to your local machine.
- Where does the clone reside on my machine?
- The clone will be a directory on your machine.
- What is the command to clone?
git clone <url-of-project>
3.6 Exercise: Clone your project
- 1. Navigate to your project
- You can search for your project on your gitlab landing page after you login.
- 2a. Locate project url
- If you're at your project, it's the url in your browser's address bar, OR
- 2b. or use the Clone button
- click on the blue button on the right and choose HTTPS and copy the url.
- 3. Over to your terminal
Type the command
git clone <project url>
- Result
- Your project is now cloned as a directory on your machine. You may edit it by creating and modifying new files.
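To try cloning without a Gitlab account, the sketch below uses a local bare repository as a stand-in for the server. The name my-project is hypothetical; with a real project you would pass its HTTPS url to git clone instead of a local path.

```shell
# A local bare repository plays the role of the Gitlab project (assumption).
top=$(mktemp -d)
git init -q --bare "$top/my-project.git"

cd "$top"
git clone -q "$top/my-project.git"   # creates the directory my-project

cd my-project
git remote -v                        # the clone remembers its source under the name origin
```

By default the clone directory is named after the project, and the source url is stored under the name origin.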
3.7 Exercise: Editing and status
- Assumption
(a) Your current directory is your workspace.
(b) You have created and edited some files in the directory: a.txt and tmp.txt.
- Check status
run
git status
This lists the files that are untracked or modified but yet to be staged (added) for a commit.
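A minimal sketch of this exercise, assuming a fresh, empty repository in a temporary directory instead of your cloned project:

```shell
top=$(mktemp -d)
git init -q "$top/workspace"
cd "$top/workspace"

# Create the two files from the exercise.
echo "hello" > a.txt
echo "scratch" > tmp.txt

git status           # long form: both files are listed as untracked
git status --short   # compact form: untracked files are marked with ??
```

The short form is handy once you are familiar with the long one.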
3.8 Exercise: Adding (staging) files
- Add (stage) files
After some editing, stage the file a.txt:
git add a.txt
- Check status
- Check the status again using git status. The status indicates changes to be committed.
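The sketch below, again assuming a fresh repository in a temporary directory, shows that staging captures the file as it was at the time of the add. Later edits need another git add.

```shell
top=$(mktemp -d)
git init -q "$top/workspace"
cd "$top/workspace"
echo "hello" > a.txt
echo "scratch" > tmp.txt

git add a.txt        # stage only a.txt
git status --short   # a.txt is marked A (staged); tmp.txt is still ??

# Edit a.txt again after staging it.
echo "more text" >> a.txt
git status --short   # a.txt is now marked AM: staged, then modified again
```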
3.9 Exercise: Committing and commit logs
- Commit command
git commit -m "<nice commit message>"
- git log
To see a list of all your commits [additionally, in pretty printed short format]:
git log [--pretty=short]
- git status
- A git status command after the commit should tell you there are no staged changes left to commit; files you never added, such as tmp.txt, are still listed as untracked.
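A minimal sketch of a first commit, assuming a fresh repository in a temporary directory and a made-up user identity (git refuses to commit without one):

```shell
top=$(mktemp -d)
git init -q "$top/workspace"
cd "$top/workspace"
git config user.name "Demo User"           # hypothetical identity
git config user.email "demo@example.com"

echo "hello" > a.txt
git add a.txt
git commit -q -m "Add a.txt with a greeting"

git log --pretty=short   # commit checksum, author and message title
git status               # reports that there is nothing to commit
```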
3.10 Exercise: Pushing commits
- Commit version in workspace
- A git commit leaves the committed versions in your local workspace; the server does not have them yet.
- Push commits to server
git push
- Result
- The gitlab server now has the new version of your project.
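The difference between committing and pushing can be sketched with a local bare repository standing in for the Gitlab server (an assumption made so the example runs offline). The sketch uses git push origin HEAD because the stand-in server starts out empty; on a project initialized with a README, a plain git push is enough.

```shell
top=$(mktemp -d)
git init -q --bare "$top/server.git"              # stand-in for the server
git clone -q "$top/server.git" "$top/workspace"
cd "$top/workspace"
git config user.name "Demo User"                  # hypothetical identity
git config user.email "demo@example.com"

echo "hello" > a.txt
git add a.txt
git commit -q -m "Add a.txt"

# Until now the commit exists only in the local clone.
git push -q origin HEAD

# Ask the server which branches it has now.
git ls-remote --heads origin
```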
3.11 Checksums and Tagging
- Commits have clumsy names
- These are system-generated names (aka checksums). Try
git log --pretty=oneline
- Tags identify milestones
- Tagging is a way of giving human-readable names to your commits.
- Tagging
git tag <tag-name>
- Annotated tagging
git tag -a <tag> -m "<msg>"
- Listing tags
git tag
- Examining a particular tag
git show <tag>
- Sharing tags
git push origin --tags
or
git push origin <tag>
- Tagging an old commit
git tag -a <tag> <checksum-or-part-of-it>
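The tagging commands above can be tried on a throwaway repository. In this sketch the tag names (v0.1, v0.2, v0.2-annotated) and the file contents are made up, and -m is passed to git tag -a so that no editor opens.

```shell
top=$(mktemp -d)
git init -q "$top/workspace"
cd "$top/workspace"
git config user.name "Demo User"           # hypothetical identity
git config user.email "demo@example.com"
echo "one" > a.txt;  git add a.txt; git commit -q -m "First version"
echo "two" >> a.txt; git add a.txt; git commit -q -m "Second version"

git tag v0.2                                      # lightweight tag on the latest commit
git tag -a v0.2-annotated -m "Second milestone"   # annotated tag with its own message

# Tag the older commit by its (abbreviated) checksum.
old=$(git rev-parse --short HEAD~1)               # HEAD~1: one commit before the latest
git tag -a v0.1 -m "First milestone" "$old"

git tag            # list all tags
git show v0.1      # tag message, followed by the tagged commit
```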
3.12 Exercise: Tag the initial commit
- Find all commits
git log --pretty=oneline
- Locate the initial one
- The log message is probably "Initial commit".
- Tag the initial commit
git tag -a v-init-0.0 <checksum-prefix>
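Instead of reading the checksum off the log, the initial commit can also be located by asking git for the commit that has no parent. This is a sketch on a throwaway repository with made-up commits, not the only way to do the exercise.

```shell
top=$(mktemp -d)
git init -q "$top/workspace"
cd "$top/workspace"
git config user.name "Demo User"           # hypothetical identity
git config user.email "demo@example.com"
echo "one" > README.md;  git add README.md; git commit -q -m "Initial commit"
echo "two" >> README.md; git add README.md; git commit -q -m "Expand README"

git log --pretty=oneline                   # newest first: the initial commit is the last line

# The initial commit is the only one without a parent.
first=$(git rev-list --max-parents=0 HEAD)
git tag -a v-init-0.0 -m "Initial commit" "$first"

git log --pretty=oneline --decorate        # the tag now appears next to the initial commit
```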
4 Git Branching and Merging
4.1 What is a branch?
- Branch = evolutionary path
- The path is dotted with commits.
- Commits happen on branches
- A commit is always done within the context of a branch.
- Master branch
- This is the default branch on which all commits happen. Recent Gitlab projects may name this branch main instead of master.
- Multiple branches
- A project may evolve along multiple branches.
- Motivation
- You have a working version of your code on branch master. But you suddenly want to explore a separate idea. You create a new branch, say experimental. Later on you can merge the two branches.
4.2 Exercise: Commit to a different branch
- List existing branches
git branch
- Create a new branch
git branch <new-branch>
- Switch to the new branch
git checkout <newly-created-branch>
- Start working on this branch
- Create, edit and commit on this branch.
- Switch back to master
- If you created a file in the new branch, you won't see it on master.
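A minimal sketch of this exercise on a throwaway repository. The branch name exp and the file idea.txt are illustrative. The git symbolic-ref line only makes sure the first branch is called master, as in this tutorial, whatever the local default is.

```shell
top=$(mktemp -d)
git init -q "$top/workspace"
cd "$top/workspace"
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Demo User"           # hypothetical identity
git config user.email "demo@example.com"
echo "stable" > a.txt; git add a.txt; git commit -q -m "Working version"

git branch exp           # create a new branch
git checkout -q exp      # switch to it
echo "idea" > idea.txt
git add idea.txt
git commit -q -m "Try out an idea"

git checkout -q master   # switch back
ls                       # idea.txt is not listed: it exists only on exp
git branch               # lists exp and master; * marks the current branch
```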
4.3 Exercise: Switch to a tag on a new branch
- List tags
git tag
- Initial commit
- Its tag is v-init-0.0.
- Switch to this tag (on a new branch)
git checkout tags/v-init-0.0 -b <new-branch-name>
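The sketch below recreates the situation on a throwaway repository: two commits, with the first one tagged v-init-0.0. The branch name from-init is hypothetical.

```shell
top=$(mktemp -d)
git init -q "$top/workspace"
cd "$top/workspace"
git config user.name "Demo User"           # hypothetical identity
git config user.email "demo@example.com"
echo "one" > a.txt;  git add a.txt; git commit -q -m "Initial commit"
echo "two" >> a.txt; git add a.txt; git commit -q -m "Second version"
git tag -a v-init-0.0 -m "Initial commit" HEAD~1

git tag                                          # lists v-init-0.0
git checkout -q tags/v-init-0.0 -b from-init     # new branch starting at the tagged commit
cat a.txt                                        # the file as it was in the initial commit
git log --oneline                                # only the initial commit is on this branch
```

Creating a branch at the tag lets you commit new work from that point without disturbing the original branch.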
4.4 Merging
- What is merging
- It is the process of absorbing the changes of another branch into the current branch.
- Motivation
- Once your experiment is successful, you want to incorporate it into production, so you need to merge the experiment branch to production.
- Diffuse changes across branches
- Merging allows for changes to diffuse into other branches in a controlled way.
4.5 Exercise: Merge a branch with another
- Switch to master branch
- Use git checkout master if you're on a different branch.
- Note files in master
- Examine the files in master.
- Merge the experimental branch into master
git merge exp
(Here exp is the experimental branch.)
- Note files in master
- The changes introduced by exp are now incorporated into master.
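A minimal sketch of the merge exercise on a throwaway repository. The file idea.txt is made up, and --no-edit is an addition that keeps git from opening an editor for a merge message.

```shell
top=$(mktemp -d)
git init -q "$top/workspace"
cd "$top/workspace"
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Demo User"           # hypothetical identity
git config user.email "demo@example.com"
echo "stable" > a.txt; git add a.txt; git commit -q -m "Working version"

# Do some work on the experimental branch.
git branch exp
git checkout -q exp
echo "idea" > idea.txt; git add idea.txt; git commit -q -m "Try out an idea"

git checkout -q master
ls                            # only a.txt
git merge -q --no-edit exp    # absorb the changes from exp into master
ls                            # a.txt and idea.txt
```

Because master did not change while exp was being developed, git simply moves master forward to the latest commit on exp.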
4.6 Conflicts
- What is a conflict?
- A situation when two incompatible changes have been made to the same file. Normally, git is smart enough to merge changes to the same file in two different branches, but occasionally it is stumped.
- Locating a conflict
A conflict in a file looks like this:
<<<<<<< HEAD
content in your current branch
=======
content in the other branch
>>>>>>> other_branch
4.7 Resolving a conflict
- Edit the offending file
- Choose (which parts of) which versions should stay or go. You can also completely edit the file, adding whatever you like. Just make sure the conflict markers are gone.
- Home (our) branch overrides
git checkout --ours <conflicted-file>
- Incoming (their) branch overrides
git checkout --theirs <conflicted-file>
See the Stack Overflow discussions "Resolve merge conflicts: Force overwrite all files" and "git merge with force overwrite". Also see "merge conflicts during rebasing".
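The override commands can be tried on a throwaway repository. This sketch assumes two branches, master and exp, that each add a different conflict.txt; it merges master into exp and keeps the incoming (master) version. Swapping --theirs for --ours would keep the version from exp.

```shell
top=$(mktemp -d)
git init -q "$top/workspace"
cd "$top/workspace"
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Demo User"           # hypothetical identity
git config user.email "demo@example.com"
echo "base" > a.txt; git add a.txt; git commit -q -m "Base version"
git branch exp

echo "This is the master branch." > conflict.txt
git add conflict.txt; git commit -q -m "Add conflict.txt on master"

git checkout -q exp
echo "This is exp." > conflict.txt
git add conflict.txt; git commit -q -m "Add conflict.txt on exp"

git merge master || true               # stops and reports a conflict in conflict.txt
git checkout --theirs conflict.txt     # take the incoming version (from master)
git add conflict.txt                   # mark the conflict as resolved
git commit -q -m "Merge master into exp, keeping master's conflict.txt"
cat conflict.txt
```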
4.8 Exercise: introduce and resolve a conflict
- Switch to master
- Create a new file conflict.txt with a line "This is the master branch." Add and commit.
- Switch to exp branch
- Create a new file conflict.txt with a line "This is exp." Add and commit.
- Merge with master
git merge master
- Identify the conflict
- Examine conflict.txt. Fix the conflict (in any of the different ways) and commit.
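One possible run of this exercise, resolving the conflict by editing the file by hand. It is a sketch on a throwaway repository; the wording of the resolved line is made up.

```shell
top=$(mktemp -d)
git init -q "$top/workspace"
cd "$top/workspace"
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Demo User"           # hypothetical identity
git config user.email "demo@example.com"
echo "base" > a.txt; git add a.txt; git commit -q -m "Base version"
git branch exp

echo "This is the master branch." > conflict.txt
git add conflict.txt; git commit -q -m "Add conflict.txt on master"

git checkout -q exp
echo "This is exp." > conflict.txt
git add conflict.txt; git commit -q -m "Add conflict.txt on exp"

git merge master || echo "merge stopped because of a conflict"
cat conflict.txt     # both versions, separated by conflict markers

# Resolve by writing the content you want, then add and commit.
echo "This is exp, merged with master." > conflict.txt
git add conflict.txt
git commit -q -m "Merge master into exp, resolving conflict.txt"
```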
5 Git for Collaboration
5.1 Remotes
- git is peer-to-peer
- git is designed to allow peer-to-peer communication. You don't really need gitlab to work using git as a team.
- but git servers make it easier
- gitlab (or github) make collaboration easier by keeping track of shared projects and branches.
- communication needs protocols
- gitlab supports the ssh and https internet protocols. We will use only https.
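The peer-to-peer point can be illustrated without any server: one workspace is cloned directly from another. The directory names alice and bob are hypothetical, and a local path is used in place of an ssh or https url.

```shell
top=$(mktemp -d)
git init -q "$top/alice"
cd "$top/alice"
git config user.name "Alice"               # hypothetical identity
git config user.email "alice@example.com"
echo "shared notes" > notes.txt
git add notes.txt
git commit -q -m "Start notes"

# A peer clones directly from the other workspace; no server is involved.
git clone -q "$top/alice" "$top/bob"
cd "$top/bob"
git remote -v          # origin points at alice's directory
git log --oneline      # bob now has alice's history
```

A server such as Gitlab plays the same role as the alice directory here, with permissions and project management added on top.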
5.2 Fetching from a remote
5.3 Pulling from a remote
5.4 Controlling access to your project on gitlab
5.5 Creating a merge request
5.6 Approving a merge request on gitlab
6 Git Best Practices
6.1 Initialize and protect project
- Keep repositories private
- You can always open it up to others later.
- Add a README file
- If you plan to share your repository with others, add a README.md or README.txt file, describing the project.
6.2 Commit frequency and etiquette
- Do frequent commits!
- Commit often, so that your versions are close together and you can get back to the past version most closely resembling what you want.
- Commit only "consistent" states
- In a programming project, only commit code when your programs compile, and perhaps pass some of the test cases. When working in a team, you'll quickly lose friends if you commit code that doesn't compile.
- Write good commit messages
- Good messages let you and others know what changed and why.
6.3 Liberally use Tags and Branches
- Tag often!
- Tagging helps you identify important checkpoints that you can get back to.
- Create branches
- Experiment fearlessly in new branches, and commit and tag them as well. You or your team may want to work on multiple features of your project at the same time. Branches are indispensable for that.
7 Thank You
- Advanced topics
- Many advanced topics have been skipped.
- More sessions
- If there is interest, we could do more sessions.
- Meanwhile
- Complete the exercises, start using git and refer to the vast online documentation, tutorials and Stack Overflow discussions!