Git Guide I - Basic
Old schooler's guide to Git source control.
More Git Guide
In 2005, Source Code Management (SCM) got a huge boost made necessary by the growing complexity of managing large public projects like Linux. These tools have gradually matured enough to help almost any software project.
Overview
Why use Git?
Git (and source/version control management software in general) is mostly touted for coordinating work across teams, but it also has incredible value for solo coders and even infrequent coders.
The safety nets available in Git can speed up development, help you find and fix bugs faster, and provide usable change documentation. They help organize and archive your source code, and allow code patterns to be reused in other projects.
Adding cloud services like GitHub, Bitbucket, etc. isn't necessary, but it extends the usefulness of Git tremendously. These services provide a simple way to keep a centralized master copy and still retain all the advantages of local copies. In addition, they add a simple interface for teams, build/deployment and testing automation, code analysis, issue/bugfix management, documentation, etc.
- lightweight and portable
  - free command line interface (although GUIs are available)
  - portable across most vendors and operating systems
  - doesn't require constant internet access
  - can run directly from a thumb drive with no install
  - converts a directory into a repository using normal file structures
  - more information on installing Git
- code tools like grep, zip/tar, and distribution
  - search for text across the project with the familiar Unix grep command
  - zip (or tar) directory for releases
  - package repository with history into a .git file
  - keep single file backups that can directly clone new repositories
  - easily keep the Git software on the same media to preserve recovery
- short and long term snapshots
  - basic, unobtrusive, low overhead version and source control
  - your choice of simple housekeeping options, flexible workflows
  - automatic revision history with changes and notes
  - synchronized undo/redo across multiple files/directories
- easy branching
  - keeping a good working copy while making breaking changes
  - quickly stopping/starting work on a topic
  - working on multiple features simultaneously
  - experiment with much less risk
  - develop workflows that can be used with teams or solo
- GitHub, Bitbucket, etc.
  - off-site centralization/backup
  - small accounts are free on most of the bigger cloud platforms
  - synchronize working/testing source on multiple devices
  - easy access to automated cloud build/deploy
  - supports team workflows, public domain projects, and private projects
Git Basic Startup
- Installing Git software
- Setup repository
Installing Git
Downloads
Git comes pre-installed on most Linux systems. Installers for Windows, macOS, and Linux/Unix are on the Git downloads page.
Quick Config
At minimum, you should set up a few general things using the git config command. Your identity:
git config --global user.name "John Doe"
git config --global user.email johndoe@example.com
Your editor, if you don't want to use the system default (usually vi). Here is an editor list with setup instructions.
You can override any value in a specific repository by dropping the --global flag. For example:
git config user.email johndoe@some-project.com
More detailed information on installation and customization is in the second part of our guide, under the topic Customizing Installs.
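The override behaviour described above can be checked with a short sketch. It assumes a sandbox: HOME is pointed at a throwaway directory so your real global settings are untouched, and nano is only a stand-in for whichever editor you prefer.

```shell
# Sandbox: use a throwaway HOME so the real ~/.gitconfig is not modified.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

# Global settings (stored in $HOME/.gitconfig).
git config --global user.name "John Doe"
git config --global user.email johndoe@example.com
git config --global core.editor nano   # assumption: nano is the editor you want

# A value set without --global applies only to the current repository.
repo="$(mktemp -d)"
cd "$repo"
git init -q -b main                    # -b needs Git 2.28 or newer
git config user.email johndoe@some-project.com

git config --get user.email            # value in effect inside this repository
git config --global --get user.email   # global value, unchanged
```

Inside the repository the project address wins; everywhere else the global address still applies.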
initialize repository
There are two basic commands to initialize a repository:
To create a new project, or convert an existing project:
- cd /path/to/project
- git init
  - sets up git hidden files
- if new project, copy any files you want to start the project with
- git add .
  - tracks and adds the files to the staging index
- git commit -m "Initial commit"
  - commits files in the staging index with the title "Initial commit"
If your project generates files during build or other automated tasks, consider adding a .gitignore file, which lists files that shouldn't be tracked or stored with the source.
More information on staging and commit below.
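As a rough sketch of the steps above, the following creates a throwaway project, ignores a build directory, and makes the first commit. The file names (app.js, build/output.txt) are made up for illustration, and the identity is set locally only so the sketch runs on a machine with no global config.

```shell
# Throwaway project directory (stands in for /path/to/project).
project="$(mktemp -d)"
cd "$project"

git init -q -b main                    # -b needs Git 2.28 or newer
git config user.name "John Doe"
git config user.email johndoe@example.com

# Hypothetical starting files: one source file and one generated file.
printf 'console.log("hello");\n' > app.js
mkdir build
printf 'generated\n' > build/output.txt

# Keep generated files out of the repository.
printf 'build/\n' > .gitignore

git add .                              # stages .gitignore and app.js, skips build/
git commit -q -m "Initial commit"
git ls-files                           # lists the tracked files
```

Only .gitignore and app.js end up tracked; the build directory stays on disk but out of history.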
Status, log, show, diff, whatchanged
status
git status is probably my most used git command. It displays a summary of any files that have been changed, are untracked, or are staged up for the next commit. It also displays info about remote tracking branches that you may need to sync up with using push/pull.
log
git log --oneline --graph -20 is another frequently used command that lists the last 20 commits in history, one line each, with a graphical representation of the branch. It also indicates which commits have references (like branches or tags).
git log branch1..branch2
lists the commits that are on branch2 but not on branch1.
show
git show displays the info in the last commit. Display other commits by adding a commit id. The info also contains the diff from the previous commit.
diff
git diff displays, line by line, the changes in your working directory that have not been staged yet (use git diff HEAD to compare against the last commit, staged or not). You can also display the changes between any two commits by specifying up to two commit ids, or anything that can be translated into a commit id (such as branch names, tags, etc).
git diff --cached
displays what changes are staged and ready to commit.
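git diff "@{yesterday}"
displays what has changed since the point where the current branch was yesterday (looked up in the local reflog).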
whatchanged
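git whatchanged --since="2 weeks ago" lists the files that were changed by each commit in the last 2 weeks. The Git documentation encourages new users to use git log instead; git log --raw --no-merges --since="2 weeks ago" gives much the same listing.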
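To see these inspection commands side by side, here is a small sketch on a throwaway repository. The branch name feature and the file notes.txt are made up for the example.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q -b main                    # -b needs Git 2.28 or newer
git config user.name "John Doe"
git config user.email johndoe@example.com

printf 'first line\n' > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

git switch -q -c feature
printf 'second line\n' >> notes.txt
git commit -q -a -m "feature - add second line"

printf 'third line\n' >> notes.txt     # modified, not staged yet
git status --short                     # compact form of git status
git diff                               # shows the unstaged third line
git add notes.txt
git diff --cached                      # same change, now staged
git log --oneline --graph -20          # both commits, newest first
git log --oneline main..feature        # commits on feature that are not on main
git show --stat HEAD                   # last commit with a summary of its files
```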
Changes and Commits
Now that the repository is set up, it is ready to start recording changes. The frequency of snapshot recording is up to the individual developer. Common practice guidelines include:
- Commit early and often - don't wait to perfect things, save your work.
- Try to make single purpose commits - this makes it easier to review, to document, and reverse if necessary. It can also help you organize tasks and line them up with project management tools.
- Use meaningful commit titles. Personally, I include a category prefix to the title such as (fix), (feature), (chore) etc.
- In contrast to the first guideline, I prefer to only make public commits that compile/build. This may not include linting and testing steps, but helps things like bisect debugging, which is discussed later as an advanced topic.
To commit changes into history, you prepare the staging index to describe the files and changes involved, then define the commit with a descriptive title.
git add /path/to/file1
git add /path/to/file2
git commit -m "fix - corrected issue x"
Staging Overview
Staging offers a chance to organize the files that have changed into a commit, primarily via the git add command. The staging area is often called the index in git docs. From the docs:
The "index" holds a snapshot of the content of the working tree, and it is this snapshot that is taken as the contents of the next commit. Thus after making any changes to the working tree, and before running the commit command, you must use the add command to add any new or modified files to the index.
git status
shows which files have been changed since the last snapshot, along with which files have been added (and are untracked) and which have been deleted/renamed.
To add a file into the staging area, use the command
git add /path/to/filename
You can also add all modified/added files with the command
git add .
or to see what would have been added use
git add . --dry-run
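A short sketch of the staging commands above, assuming a throwaway repository with two made-up files:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q -b main                    # -b needs Git 2.28 or newer
git config user.name "John Doe"
git config user.email johndoe@example.com

printf 'one\n' > file1.txt
printf 'two\n' > file2.txt

git add . --dry-run        # reports what would be staged, stages nothing
git add file1.txt          # stage only one of the two files
git status --short         # "A  file1.txt" is staged, "?? file2.txt" is untracked
git commit -q -m "feature - add file1"
git status --short         # only "?? file2.txt" is left
```

The commit contains just file1.txt, because that is all the index held when commit ran.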
Using staging to cleanup
As mentioned above, creating a snapshot with commit does not necessarily update the branch with every file you have changed. You can group sets of non-interdependent changes together to make the change more descriptive.
For instance, every time I touch a React project, I run an npm update first and test it. I can then go ahead and start updating code, knowing that when I commit the changes I can stage package.json and package-lock.json into a separate commit, keeping each commit focused on related changes.
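A minimal sketch of that pattern follows. It does not run npm; the two edits are stand-ins for a dependency update and a code change, and the file contents are invented for the example.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q -b main                    # -b needs Git 2.28 or newer
git config user.name "John Doe"
git config user.email johndoe@example.com

mkdir src
printf '{ "version": "1.0.0" }\n' > package.json
printf 'console.log("v1");\n' > src/app.js
git add .
git commit -q -m "Initial commit"

# Two unrelated edits sitting in the working tree at the same time.
printf '{ "version": "1.0.1" }\n' > package.json
printf 'console.log("v2");\n' > src/app.js

# Stage and commit them separately so each commit has a single purpose.
git add package.json
git commit -q -m "chore - update dependencies"
git add src/app.js
git commit -q -m "feature - print v2"

git log --oneline --stat -2            # each commit touches exactly one file
```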
After commit touch ups
Modifying history after a commit is like changing the past in Git, so you will have to do a little extra work to keep any copies or remotes synced up. If you aren't a solo coder, rewriting history is often strongly discouraged, but to keep it simple we will ignore those complications for the moment.
If you want to make a quick touch up to a commit, go ahead and stage the changes, then use the command
git commit --amend --no-edit
This will use the same commit title and will replace the commit id with a new one.
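The effect of an amend can be seen in a small sketch on a throwaway repository (readme.txt is a made-up file): the title stays the same, the commit count stays at one, and the commit id changes.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q -b main                    # -b needs Git 2.28 or newer
git config user.name "John Doe"
git config user.email johndoe@example.com

printf 'hello\n' > readme.txt
git add readme.txt
git commit -q -m "chore - add readme"
before="$(git rev-parse HEAD)"

printf 'hello world\n' > readme.txt    # the touch up
git add readme.txt
git commit -q --amend --no-edit
after="$(git rev-parse HEAD)"

git log --oneline                      # still a single commit with the same title
echo "$before -> $after"               # the two ids differ
```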
There are a lot of reasons to do daily or regular small commits, and to break changes into many smaller commits during the main development. After the feature is largely done, it may be handy to consolidate all those changes into fewer commits that make the history more useful to read. This is a much larger touch up.
It is possible to do very complex history and commit revisions. I use the interactive rebase tool git rebase -i when things get very complicated.
More information on rebasing, and the issues (and solutions) involved in rewriting history, is in the next article Git II - Intermediate
Using Branches
Git is designed to help manage and organize snapshots without using much space or effort. One of the cleaner ways to do this is with branches.
Underneath the surface, branches are a way to organize sets of commits. Each branch is really only a 40-character pointer in a file (a commit id, which is a hash of the contents and history), but it represents a complete copy of the project directory, along with the commit history that got it to this point.
Each repository has a default branch. We use the name main, but older setups use master or prod. You are free to use any unique name that suits you (aside from a few reserved words like HEAD) and you can change it at will. You can set the default branch name, see [custom config](/blog/git-intermediate#custom)
When you create branches it looks like you are making a complete copy of the current state of the branch you are currently in.
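A quick way to see that a branch is just a pointer is to look at the file behind it. This sketch assumes a freshly created repository using Git's default file-based ref storage, where each branch is a small file under .git/refs/heads (older refs can later be packed into .git/packed-refs). The branch name experiment is made up.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q -b main                    # -b needs Git 2.28 or newer
git config user.name "John Doe"
git config user.email johndoe@example.com

printf 'hello\n' > readme.txt
git add readme.txt
git commit -q -m "Initial commit"

git branch experiment                  # new branch, no files are copied

cat .git/refs/heads/main               # the branch file holds just a commit id
git rev-parse main experiment          # both names resolve to that same id
```

Creating the branch costs one tiny file; the "complete copy" is only how it looks from the outside.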
Parallel branches
The concept of parallel branches provides a simple model to keep a clean working production branch while you are introducing features on a development branch. When you are ready to update the production branch, you can apply the commits from the development branch to it with a git merge branchname command.
                        ---> time
    init   current
    O------O                     Main Branch
            \------O------O      Development Branch
                   chg1   chg2
                          (locally current)

after the merge becomes

    init                  current
    O------O-------O------O      Main Branch (production)
            \------O------O      Development Branch
This is referred to as a fast-forward merge. The main branch didn't change (no commits on main that aren't on development) between the time the development branch diverged and was merged back.
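The fast-forward case can be reproduced with a short sketch. The branch name development and the commit titles chg1 and chg2 follow the diagram; the file app.txt is made up. The --ff-only flag is an extra safety choice here, not something the merge needs: it makes git merge refuse to do anything other than a fast-forward.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q -b main                    # -b needs Git 2.28 or newer
git config user.name "John Doe"
git config user.email johndoe@example.com

printf 'init\n' > app.txt
git add app.txt
git commit -q -m "init"

# Two commits on the development branch; main does not move meanwhile.
git switch -q -c development
printf 'chg1\n' >> app.txt
git commit -q -a -m "chg1"
printf 'chg2\n' >> app.txt
git commit -q -a -m "chg2"

git switch -q main
git merge --ff-only development        # main simply moves forward to chg2
git log --oneline --graph              # a straight line: chg2, chg1, init
```

After the merge both branch names point at the same commit, and no merge commit is created.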
You can have more than one development branch running in parallel. These development branches (if successful) are expected to eventually get merged into the main production branch. If other development branches have already been merged into main (or really if main has picked up any commits since branching off into the development branch) the merge is much more complicated behind the scenes, but to the user it is usually just the same git merge command. There are cases that the standard merge cannot resolve without help. These conflicts can be avoided for the most part with workflows, but can require manual intervention. This is discussed later in the advanced merge section.
Additional basic commands
git branch -vv
lists existing local branches extra verbosely
git diff main
shows the difference between the current branch and the branch named main.
More information on branches and related commands in the next article Git II - Intermediate
Basic Workflow Examples
Exploring with branches
Example: you get a crazy idea you want to try out in the code, but don't want to stomp on your working version. In the old days, I would copy /s the entire directory into a backup directory, and work on the code there. When I was done, I could copy the changed files to the regular project directory or just abandon the backup directory and go back to the regular project untouched.
Git simplifies this pattern: (starting from main branch)
- create a new branch with git switch -c crazy-idea-branch
- perform necessary code changes, add and remove files, etc.
- test it, organize/stage/commit changes to crazy-idea-branch
- go back to main for now with git switch main
- like the new changes?
  - yes: put changes into main with git merge crazy-idea-branch
  - no: ditch changes with git branch -D crazy-idea-branch (capital -D, because -d refuses to delete a branch that has not been merged)
- if you can't make up your mind yet, you can always hold on to the branch for a while, maybe start a new branch from main and see which solution you like better. Clean it up whenever.
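The pattern above can be walked through end to end on a throwaway repository. This sketch follows the "keep it" path; the file app.txt and its contents are invented for the example.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q -b main                    # -b needs Git 2.28 or newer
git config user.name "John Doe"
git config user.email johndoe@example.com

printf 'stable\n' > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Try the idea on its own branch.
git switch -q -c crazy-idea-branch
printf 'crazy idea\n' >> app.txt
git commit -q -a -m "feature - try crazy idea"

# Back on main, the working version is untouched.
git switch -q main
cat app.txt                            # only the stable line

# Keep the idea: merge it, then delete the now-merged branch.
git merge -q crazy-idea-branch
git branch -d crazy-idea-branch
# To abandon it instead, skip the merge and run: git branch -D crazy-idea-branch
```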
Checkpoints
Example: You can also do the equivalent of making a backup and then continuing to work on the current copy. This can be handy if you have things pointed to the current copy for testing or deployment and don't want to have to duplicate them to test or build in the backup copy. The simple pattern is to: (starting from main branch)
- start a new branch (but stay on main) with git branch milestone-1
- perform code changes, add and remove files, etc.
- test it
- like the changes?
  - organize/stage/commit changes to main
  - remove the checkpoint with git branch -d milestone-1
- don't like the changes?
  - move main back to checkpoint with git reset --hard milestone-1
Note that the last step git reset --hard milestone-1 doesn't require a branch name, and could use a tag, or a commit id (long or short) directly. This means that you could just as easily have done this without using a branch beforehand, simply by looking in the log for the commit id that you want to reset to.
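Here is a sketch of the "don't like the changes" path of the checkpoint pattern, on a throwaway repository with a made-up file. Keep in mind that git reset --hard also throws away any uncommitted changes to tracked files, so it is worth running git status first on a real project.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q -b main                    # -b needs Git 2.28 or newer
git config user.name "John Doe"
git config user.email johndoe@example.com

printf 'good\n' > app.txt
git add app.txt
git commit -q -m "Initial commit"

git branch milestone-1                 # checkpoint; we stay on main

printf 'bad experiment\n' >> app.txt
git commit -q -a -m "feature - experiment"

# The experiment did not work out: move main back to the checkpoint.
git reset -q --hard milestone-1
git branch -d milestone-1              # checkpoint is no longer needed
cat app.txt                            # back to the single "good" line
```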
Documenting with branches
The command git log --oneline --graph shows the merging history of branches. The pattern of adding features using a branch named after the feature, and chores like npm updates with simple commits, makes a nice self-documenting version history. There are times when I am making a change similar to one made before, and it is nice to be able to quickly find and view the previous changes.
More information on branches and workflows in the next article Git II - Intermediate