Introduction to Git: Version Control Fundamentals

A comprehensive guide to the Git version control system, covering basic concepts, commands, branching, merging, and advanced features for effective code management and collaboration.

What is Git?

Git is a distributed version control system designed to handle everything from small to very large projects with speed and efficiency. It allows multiple developers to collaborate on a project without interfering with each other’s work. Git tracks changes in the source code, maintaining a complete history of changes and enabling smooth teamwork.

Why is Git important?
Git is crucial for developers because it ensures:

  • Version Tracking: Every change to the codebase is saved.
  • Collaboration: Multiple developers can work on the same project.
  • Efficiency: Quick operations, even with large repositories.

What is a Version Control System?

A Version Control System (VCS) is a tool that helps manage changes to source code over time. It keeps track of every modification in a database, making it easy to revert to previous versions if mistakes occur.

Why do developers need a VCS?

  • To avoid losing progress during changes.
  • To collaborate seamlessly.
  • To maintain a history of changes for debugging or documentation purposes.

What Does “Distributed” in Git Mean?

In Git, “distributed” means every developer has a complete copy of the entire repository, including its full history. This differs from centralized systems where a single server hosts the repository.

Key Features of Distributed Git

  1. Local Repositories: Work offline with the full history available locally.
  2. Collaboration: Share changes through push and pull commands.
  3. Redundancy: No dependency on a central server—reduces the risk of data loss.

Understanding Local and Remote Repositories

Local Repository

A local repository in Git is a copy of a project’s entire repository that is stored on a developer’s local machine. It includes all the project’s files, history, branches, and commits. Developers can perform all Git operations (such as commit, branch, merge, etc.) locally without needing a network connection. Changes can later be synchronized with a remote repository.

Key Components of Local Repository:

  • Working Directory: Where files are modified.
  • Staging Area: Prepares changes for the next commit.
  • Local Repository (.git): Stores metadata and commit history.

Remote Repository

A remote repository in Git is a version of your project that is hosted on the internet or another network. It allows multiple developers to collaborate on the same project by pushing changes to it and pulling changes from it. Common platforms for hosting remote repositories include GitHub, GitLab, and Bitbucket.

Installing Git

On Ubuntu, run sudo apt update before installing a package so that apt works from up-to-date package lists.

To install Git on Ubuntu:

sudo apt update
sudo apt install git

What does sudo apt-get install git-man do?
It installs Git manual pages, which provide documentation for all Git commands, making it easier to learn and reference commands.

Initializing Git Repositories

Standard Repository

  1. Navigate to the project directory.
  2. Run git init. This creates a .git folder to track changes.

Example:

cd /path/to/project
git init

Verify Initialization:
Run ls -la to see the .git folder.

Bare Repository

What is a Bare Repository?
The git init --bare command initializes a new, empty Git repository, but unlike a standard repository, it does not have a working directory. This type of repository is typically used as a remote repository to share changes between different developers.

Command:

git init --bare /path/to/repository.git

Why use a bare repository?

  • To act as a central repository for pushing and pulling changes.
  • Avoids modifying files directly.
  • No working directory.
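
The difference between the two kinds of repository can be checked directly. The following is a minimal sketch that creates one of each in a scratch directory; the names project and shared.git are illustrative, not part of any real setup.

```shell
set -eu
work=$(mktemp -d)               # scratch directory for the demo
cd "$work"

git init -q project             # standard repository: working directory plus .git
git init -q --bare shared.git   # bare repository: history only

# A standard repository keeps its data in a hidden .git folder.
ls -a project

# A bare repository stores the same data at its top level.
ls shared.git

# Ask Git itself which kind each repository is.
standard=$(git -C project rev-parse --is-bare-repository)
bare=$(git -C shared.git rev-parse --is-bare-repository)
echo "project is bare: $standard"
echo "shared.git is bare: $bare"
```

The .git suffix on the bare repository is only a naming convention; Git does not require it.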

Configuring Git

Setting Username and Email

Why is this necessary?
Git associates each commit with a username and email, which is essential for collaboration and accountability.

Commands:

git config --global user.name "Your Name"
git config --global user.email "you@example.com"
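
To see which identity Git will use, the values can be read back. The sketch below sets them for a single scratch repository only (no --global), so it does not touch your real settings; the name and email are placeholders.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo

# Without --global the setting is written to this repository's .git/config only.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Read the effective values back.
name=$(git config user.name)
email=$(git config user.email)
echo "$name <$email>"

# List all settings together with the file each one comes from.
git config --list --show-origin
```

A repository-level value overrides the global one, which is handy when one project needs a different email address.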

Managing Files with Git

Staging and Committing Files

  1. Check Status:

    git status

    In a new repository with no files, the output looks something like this:

    On branch master
    No commits yet
    nothing to commit (create/copy files and use "git add" to track)

  2. Stage Changes:

    git add .

    This command stages all new and modified files in the current directory and its subdirectories.

  3. Commit Changes:

    git commit -m "Your Commit Message"

    This command records the staged changes in the repository as a new commit.
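
The three steps can be followed on a single file. This is a minimal sketch in a scratch repository; the file name notes.txt and the identity are made up for the demo.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > notes.txt

# New file: untracked, shown as "??" in the short status format.
before=$(git status --short)
echo "$before"

git add notes.txt

# Staged file: shown as "A" (added to the staging area).
staged=$(git status --short)
echo "$staged"

git commit -q -m "Add notes.txt"

# After the commit the working directory is clean, so the status is empty.
after=$(git status --short)
commits=$(git rev-list --count HEAD)
echo "commits so far: $commits"
```

Staging a single file by name, as done here, is often safer than git add . because it avoids committing files by accident.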

What is .gitignore?
A .gitignore file specifies files and directories Git should ignore. Example:

# Ignore all log files
*.log
# Ignore node_modules
node_modules/
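
To check that the patterns behave as intended, compare the status output with and without matching files. The sketch below uses the two patterns from the example; the file names are hypothetical.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo

# Same two patterns as in the example above.
printf '%s\n' '*.log' 'node_modules/' > .gitignore

mkdir node_modules
touch app.js debug.log node_modules/lib.js

# Ignored files do not appear in the status output.
visible=$(git status --short)
echo "$visible"

# check-ignore -v reports which pattern matched a given path.
git check-ignore -v debug.log node_modules/lib.js
```

Note that .gitignore only affects untracked files; a file that is already committed stays tracked even if a pattern matches it.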

Tracking Changes with Git Logging

Git logging refers to the process of viewing the commit history of a Git repository. It allows you to see what changes have been made, who made them, and when they were made. This is useful for tracking the progress of a project and understanding its history.

Key Commands

  1. View Full History:
    git log
    
  2. Condensed View:
    git log --oneline
    
  3. Graph View:
    git log --graph
    
  4. Filter by Author:
    git log --author="Author Name"
    
  5. Filter by Date:
    git log --since="2 weeks ago"
    
  6. Show File Names:
    git log --name-only
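
The options above can be combined. The following sketch builds a two-commit history in a scratch repository and runs a few of them; the author name and file names are placeholders.

```shell
set -eu
export GIT_PAGER=cat        # print directly instead of opening a pager
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "one" > a.txt
git add a.txt
git commit -q -m "Add a.txt"

echo "two" > b.txt
git add b.txt
git commit -q -m "Add b.txt"

# One line per commit, newest first.
git log --oneline

# One-line graph that also lists the files touched by each commit.
git log --oneline --graph --name-only

# Filters narrow the history; both commits match this author.
by_author=$(git log --author="Demo User" --oneline | wc -l)
subjects=$(git log --format=%s)
echo "commits by Demo User: $by_author"
```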
    

Git Diff

Git diff shows the differences between two states of a repository, such as the working directory and the staging area, two commits, or two branches. It is useful for reviewing changes before staging or committing them and for checking that changes were made correctly.

Key Commands

  1. Show Unstaged Changes (working directory vs. staging area):
    git diff
    
  2. Show Staged Changes (staging area vs. last commit):
    git diff --cached
    
  3. Show Differences Between the Working Directory and a Specific Commit:
    git diff <commit-hash>
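
Which command shows a change depends on whether the change has been staged. The sketch below edits one file in a scratch repository and checks both views before and after git add; notes.txt is a made-up file name.

```shell
set -eu
export GIT_PAGER=cat
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes.txt"

# Edit the file but do not stage it yet.
echo "second line" >> notes.txt

# Unstaged change: plain git diff lists it, git diff --cached does not.
unstaged=$(git diff --name-only)
staged=$(git diff --cached --name-only)
echo "unstaged: $unstaged | staged: $staged"

git add notes.txt

# After staging, the two commands swap roles.
unstaged_after=$(git diff --name-only)
staged_after=$(git diff --cached --name-only)
echo "unstaged: $unstaged_after | staged: $staged_after"

# Compare the working directory with the last commit, staged or not.
git diff HEAD
```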
    

Understanding Branches in Git

In Git, a branch is a lightweight movable pointer to a commit. Branches are used to develop features, fix bugs, or experiment with new ideas in isolation from the main codebase. This allows multiple developers to work on different parts of a project simultaneously without interfering with each other’s work.

Why Use Branches?

  • Isolation: Each branch can have its own set of commits, allowing you to work on different tasks independently.
  • Switching: You can switch between branches using the git checkout or git switch commands.
  • Merging: Changes from one branch can be merged into another branch using the git merge command.
  • Branch Creation: New branches can be created using the git branch command.

Common Commands

  1. List Branches:
    git branch
    
  2. Create a Branch:
    git branch new-branch
    
  3. Merge Branches:
    git merge feature-branch
    
  4. Checkout Branch:
    git checkout <branch-name>
    
  5. Delete Branch:
    git branch -d <branch-name>
    
  6. Switch Branch:
    git switch <branch-name>
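
Put together, the commands above form a small workflow. This is a sketch in a scratch repository; the branch name feature-login is hypothetical, and the initial branch is assumed to be called main.

```shell
set -eu
export GIT_PAGER=cat
work=$(mktemp -d)
cd "$work"
git init -q -b main demo    # -b needs Git 2.28 or newer
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Create a branch, then switch to it (git switch needs Git 2.23 or newer;
# git checkout feature-login does the same on older versions).
git branch feature-login
git switch -q feature-login

echo "login form" > login.txt
git add login.txt
git commit -q -m "Add login form"

# The new commit exists only on feature-login; main is unchanged.
git switch -q main
on_main=$(git rev-list --count main)
on_feature=$(git rev-list --count feature-login)
echo "main: $on_main commit(s), feature-login: $on_feature commit(s)"

# List branches; the current one is marked with *.
git branch

# -d refuses to delete a branch with unmerged commits, so merge first.
git merge -q feature-login
git branch -d feature-login
remaining=$(git branch --format='%(refname:short)')
```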
    

Merging Branches

Merging branches in Git is the process of combining the changes from one branch into another. This is typically done to integrate new features, bug fixes, or updates from a development branch into the main branch (often called main or master).

Key points about merging branches:

  • Fast-Forward Merge: If the target branch has not diverged from the source branch, Git simply moves the pointer forward. This happens when there are no new commits on the target branch since it was branched off.
  • Three-Way Merge: If there have been changes on both branches, Git performs a three-way merge, which involves creating a new commit that combines the changes from both branches.
  • Conflict Resolution: If there are conflicting changes in the branches being merged, Git will prompt you to resolve these conflicts manually.

Common Commands for Merging Branches

  1. Merge a Branch:
    git merge <branch-name>
    
  2. Merge without Fast-Forward:
    git merge --no-ff <branch-name>
    
  3. Squash Merge:
    git merge --squash <branch-name>
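
The difference between a fast-forward and a --no-ff merge shows up in the number of parents of the resulting commit. The sketch below tries both in a scratch repository; the branch names feature-a and feature-b are made up.

```shell
set -eu
export GIT_PAGER=cat
work=$(mktemp -d)
cd "$work"
git init -q -b main demo    # -b needs Git 2.28 or newer
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Case 1: main has not moved since the branch was created,
# so the merge is a fast-forward and no merge commit is created.
git switch -q -c feature-a
echo "a" > a.txt
git add a.txt
git commit -q -m "Add a.txt"
git switch -q main
git merge -q feature-a
parents_ff=$(git show -s --format=%P HEAD | wc -w)

# Case 2: --no-ff records a merge commit even though
# a fast-forward would have been possible.
git switch -q -c feature-b
echo "b" > b.txt
git add b.txt
git commit -q -m "Add b.txt"
git switch -q main
git merge -q --no-ff -m "Merge feature-b" feature-b
parents_noff=$(git show -s --format=%P HEAD | wc -w)

echo "parents after fast-forward: $parents_ff, after --no-ff: $parents_noff"
git log --oneline --graph
```

Teams that want every feature to stay visible as a unit in the history often prefer --no-ff for this reason.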
    

Advanced Concepts

HEAD in Git

In Git, HEAD is a reference to the current commit in the currently checked-out branch. It is essentially a pointer that tells you where you are in the repository’s history.

Key points about HEAD:

  • Current Branch: HEAD points to the latest commit on the current branch.
  • Detached HEAD: When HEAD points directly to a specific commit rather than a branch, it is called a “detached HEAD” state.
  • Navigation: You can use HEAD to navigate through commits, such as HEAD~1 to refer to the parent commit of HEAD.

Common Commands

  1. Moves HEAD to a specific commit:
    git checkout <commit-hash>
    
  2. Moves HEAD to the latest commit on the specified branch:
    git checkout <branch-name>
    
  3. Creates a new branch and moves HEAD to it:
    git checkout -b <new-branch-name>
    
  4. Moves the current branch (and HEAD with it) to a specific commit, discarding uncommitted changes in the working directory and staging area:
    git reset --hard <commit-hash>
    

Detached HEAD

A “detached HEAD” state in Git occurs when the HEAD pointer is not pointing to a branch but rather to a specific commit. This means that you are no longer on any named branch and are instead working with a single commit directly.

Why avoid a detached HEAD?
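Commits made in this state are not tied to any branch. Once you switch to another branch, nothing refers to them anymore, so they are hard to find and can eventually be removed by Git’s garbage collection. To keep such work, create a branch from it (for example with git switch -c <new-branch-name>) before switching away.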
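
The following sketch enters a detached HEAD state on purpose, makes a commit there, and then keeps that commit by giving it a branch. The branch name experiment and the file names are hypothetical.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q -b main demo    # -b needs Git 2.28 or newer
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "one" > a.txt
git add a.txt
git commit -q -m "First"
echo "two" > b.txt
git add b.txt
git commit -q -m "Second"

first=$(git rev-list --max-parents=0 HEAD)   # hash of the first commit

# Checking out a commit hash instead of a branch name detaches HEAD.
git checkout -q "$first"

# symbolic-ref fails while HEAD is detached because HEAD names no branch.
if git symbolic-ref -q HEAD >/dev/null; then
  state="attached"
else
  state="detached"
fi
echo "HEAD is $state"

# A commit made here belongs to no branch...
echo "experiment" > try.txt
git add try.txt
git commit -q -m "Experiment"

# ...so give it a branch name before switching away.
git switch -q -c experiment
current=$(git symbolic-ref --short HEAD)
echo "now on: $current"
```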

Introduction to Remote Repositories

1. Initializing a Remote Repository

To start using a remote repository, follow these steps:

  1. Create a Remote Repository:

    • Go to a Git hosting service like GitHub, GitLab, or Bitbucket.
    • Create a new repository and note its URL (e.g., https://github.com/username/repository.git).
  2. Initialize a Local Repository:

    • If you haven’t initialized a Git repository locally yet, do so with:
      git init
      
  3. Link Your Local Repository to the Remote:

    • Use the git remote add command to connect your local repository to the remote one:
      git remote add origin <repository-url>
      
  4. Push Local Changes to the Remote:

    • Push your local commits to the remote repository with:
      git push -u origin master
      
      What does -u mean?
      • The -u flag sets the remote branch (origin/master) as the upstream branch for your local branch. This means future git push or git pull commands will default to this branch without needing explicit specification.
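
The linking and pushing steps can be tried without a hosting account. In the sketch below a bare repository on the local disk stands in for the hosted remote, and the branch is assumed to be called main rather than master; all paths and names are illustrative.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Stand-in for a hosted repository: a bare repository on the local disk.
git init -q --bare remote.git

git init -q -b main project    # -b needs Git 2.28 or newer
cd project
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "hello" > README.md
git add README.md
git commit -q -m "Initial commit"

# Link the local repository to the remote and publish the branch.
git remote add origin "$work/remote.git"
git push -q -u origin main

# -u recorded origin/main as the upstream of the local main branch.
upstream=$(git rev-parse --abbrev-ref '@{upstream}')
echo "upstream: $upstream"

# Show the configured remote and its URLs.
git remote -v
```

With a real hosting service, only the URL passed to git remote add changes.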

2. Cloning a Remote Repository

Cloning creates a complete copy of a remote repository on your local machine.

  1. What does git clone do?

    • The git clone command downloads the repository, including all branches, tags, and commit history, to your local machine.
  2. Steps to Clone:

    • Obtain the repository URL from the hosting service.
    • Navigate to the directory where you want to clone the repository.
    • Run:
      git clone <repository-url>
      
      This creates a directory named after the repository and copies all its content.
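
What a clone contains can be inspected right after creating it. This sketch clones a small local repository instead of a hosted URL; the directory names source and copy are made up.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Build a small repository to clone from (a stand-in for a hosted URL).
git init -q source
git -C source config user.name "Demo User"
git -C source config user.email "demo@example.com"
echo "hello" > source/README.md
git -C source add README.md
git -C source commit -q -m "Initial commit"

# The optional second argument names the target directory.
git clone -q source copy

# The clone carries the history and a remote named origin
# that points back to where it was cloned from.
history=$(git -C copy rev-list --count HEAD)
remote=$(git -C copy remote)
echo "commits: $history, remote: $remote"
git -C copy remote -v
```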

3. Pull Requests

What is a Pull Request (PR)?

  • A PR is a feature provided by platforms like GitHub and GitLab that allows contributors to propose changes to a codebase. Team members can review, discuss, and merge the changes into the main branch.

Key Features:

  • Code Review: PRs enable detailed discussions about code changes.
  • Collaboration: They enable collaboration by allowing multiple developers to contribute to the same project and review each other’s work.
  • Continuous Integration: PRs can run automated tests or checks.
  • Merge Management: Keeps a history of proposed changes and their discussions.

Typical Workflow:

  1. Create a Branch: Use a descriptive name for your branch.
  2. Make and Commit Changes: Implement the changes and commit them.
  3. Push Changes: Push the branch to the remote repository.
  4. Create the PR: Use the hosting platform’s interface to open a PR.
  5. Review and Merge: Discuss and merge changes after approval.

4. Fetching vs. Pulling

What do git fetch and git pull do?

  1. git fetch:

    • Downloads updates (objects and refs) from the remote repository without merging them.
    • What are objects and refs?
      • Objects: Include data like commits, blobs (files), and trees (directory structure).
      • Refs: Short for references, these are pointers to commits (e.g., branches, tags).
    • To integrate the fetched changes into your local branch, run git merge afterwards:
      git merge origin/master
      
  2. git pull:

    • Combines git fetch and git merge in one step.
    • Fetches changes and integrates them into your current branch.

Commands:

git fetch origin
git merge origin/master

Or simply:

git pull

Summary

  • git fetch: Updates your local repository with changes from the remote repository without merging them. You can review the changes before merging.
  • git pull: Fetches changes from the remote repository and merges them into your current branch in one step.
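
The difference is easiest to see with two working copies of the same remote. The sketch below is one possible setup: a local bare repository plays the remote, and the directories alice and bob stand for two developers. All names are hypothetical and the branch is assumed to be main.

```shell
set -eu
export GIT_PAGER=cat
work=$(mktemp -d)
cd "$work"
git init -q --bare -b main remote.git    # -b needs Git 2.28 or newer

# Alice creates the project and publishes it.
git init -q -b main alice
cd alice
git config user.name "Alice"
git config user.email "alice@example.com"
echo "hello" > README.md
git add README.md
git commit -q -m "Initial commit"
git remote add origin "$work/remote.git"
git push -q -u origin main

# Bob clones it, then Alice publishes a second commit.
cd "$work"
git clone -q remote.git bob
cd alice
echo "more" >> README.md
git commit -q -am "Update README"
git push -q

# fetch downloads the new commit but leaves Bob's branch where it was.
cd "$work/bob"
git fetch -q origin
behind=$(git rev-list --count main..origin/main)
echo "main is behind origin/main by $behind commit(s)"

# Review what came in, then integrate it.
git log --oneline main..origin/main
git merge -q origin/main
after=$(git rev-list --count main..origin/main)

# pull does the fetch and the merge in one step.
cd "$work/alice"
echo "even more" >> README.md
git commit -q -am "Update README again"
git push -q
cd "$work/bob"
git pull -q --ff-only
total=$(git rev-list --count main)
echo "Bob now has $total commits"
```

The --ff-only flag used here makes git pull stop instead of creating a merge commit when the histories have diverged.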

5. Merge Conflicts

What are merge conflicts?

  • Merge conflicts occur when Git is unable to automatically resolve differences in code between two branches that are being merged. This typically happens when changes are made to the same lines of code in both branches or when one branch modifies a file that has been deleted in the other branch.

Steps to Resolve:

  1. Identify conflicts using Git’s notifications.
  2. Open the files with conflicts. Git marks conflicting areas like this:
    <<<<<<< HEAD
    // Current branch changes
    =======
    // Incoming branch changes
    >>>>>>> feature-branch
    
  3. Manually resolve conflicts.
  4. Stage the resolved files and commit to complete the merge:
    git add .
    git commit -m "Resolved merge conflicts"
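
A conflict can be reproduced in a few commands. The sketch below changes the same line on two branches, merges, and resolves the conflict; config.txt and its contents are invented for the demo.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q -b main demo    # -b needs Git 2.28 or newer
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "color = blue" > config.txt
git add config.txt
git commit -q -m "Add config"

# Change the same line differently on each branch.
git switch -q -c feature-branch
echo "color = green" > config.txt
git commit -q -am "Use green"
git switch -q main
echo "color = red" > config.txt
git commit -q -am "Use red"

# Git cannot choose between the two versions, so the merge stops.
if git merge feature-branch >/dev/null 2>&1; then
  merged="clean"
else
  merged="conflict"
fi
echo "merge result: $merged"

# Conflicted files are listed as unmerged ("UU" in the short status)
# and contain the conflict markers.
git status --short
cat config.txt
markers=$(grep -c '^<<<<<<<' config.txt)

# Resolve by writing the final content, then stage and commit.
echo "color = purple" > config.txt
git add config.txt
git commit -q -m "Resolved merge conflicts"
parents=$(git show -s --format=%P HEAD | wc -w)
```

If the resolution goes wrong before the final commit, git merge --abort returns the repository to its state before the merge.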
    

6. Fast-Forward Merges

What is a fast-forward merge?

  • A fast-forward merge occurs when the target branch has no new commits since the source branch was created from it.
  • In that case Git simply moves the target branch pointer forward to the latest commit on the source branch, without creating a new merge commit.

7. Forking a Repository

What is forking?

  • Forking is a feature of Git hosting platforms (e.g., GitHub, GitLab, Bitbucket) rather than of Git itself. It creates a personal copy of someone else’s repository under your own account, which allows you to freely experiment with changes without affecting the original repository. Forking is commonly used in open-source projects: you make changes in your fork and then submit a pull request to the original repository.

Key Points:

  • Personal Copy: A fork is a personal copy of a repository that resides in your own account.
  • Independent Development: You can make changes to your forked repository independently of the original repository.
  • Contributing Back: After making changes, you can contribute back to the original repository by creating a pull request.
  • Collaboration: Forking is a common workflow for collaborating on open-source projects.

Steps to Fork:

  1. Fork the repository on the hosting platform.
  2. Clone your fork locally.
  3. Make and push changes to your fork.
  4. Submit a PR to the original repository.

Rebasing and Cherry-picking

What is Rebasing?

Rebasing in Git is the process of moving or combining a sequence of commits to a new base commit. It is an alternative to merging and is often used to maintain a clean, linear project history.

Key Points:

  • Linear History: Rebasing ensures that the project history appears linear by replaying your changes on top of another branch.
  • Rewriting History: It modifies the commit history to make it look as if the changes were applied sequentially.
  • Avoid in Shared Branches: Do not rebase branches shared with others, as it rewrites history, potentially causing conflicts and confusion for collaborators.

Common Use Cases:

  1. Updating a Feature Branch: Keeping a feature branch up-to-date with the latest changes from the main branch.
  2. Squashing Commits: Combining multiple commits into a single one to clean up commit history before merging.

Commands:

  • Rebase a branch onto another branch:

    git checkout feature-branch
    git rebase main
    
  • Interactive Rebase:
    Interactive rebasing allows you to edit, reorder, squash, or drop commits during the process.

    git rebase -i HEAD~n
    

    Replace n with the number of commits you want to rebase interactively.

Example:

Before rebasing, the history might look like this (O is the commit where the two branches diverged):

  A---B---C feature-branch
 /
O---D---E---F main

After rebasing feature-branch onto main, it becomes:

              A'---B'---C' feature-branch
             /
O---D---E---F main

In this scenario, the commits from feature-branch (A, B, C) are replayed on top of main as new commits (A', B', C') with new hashes. The main branch itself does not move; it still points to F.
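
The same situation can be recreated in a scratch repository. The sketch below makes commits A, B, C on feature-branch and D, E, F on main, then rebases. The commit messages and file names are stand-ins, and the starting commit is labeled base.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q -b main demo    # -b needs Git 2.28 or newer
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "base"

# Three commits on the feature branch.
git switch -q -c feature-branch
for name in A B C; do
  echo "$name" > "file-$name.txt"
  git add "file-$name.txt"
  git commit -q -m "$name"
done

# Three commits on main.
git switch -q main
for name in D E F; do
  echo "$name" > "file-$name.txt"
  git add "file-$name.txt"
  git commit -q -m "$name"
done

old_tip=$(git rev-parse feature-branch)

# Replay the feature commits on top of main.
git switch -q feature-branch
git rebase -q main

# The feature branch has a new tip; main is untouched and is now
# an ancestor of feature-branch.
new_tip=$(git rev-parse feature-branch)
history=$(git log --format=%s | paste -sd ' ' -)
echo "feature-branch history (newest first): $history"
```

Because main is now an ancestor of feature-branch, merging the feature into main afterwards is a fast-forward.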

What is Cherry-picking?

Cherry-picking in Git involves selecting specific commits from one branch and applying them to another branch. It allows you to pick and choose individual changes without merging entire branches.

Key Points:

  • Selective Commit Application: Enables applying only specific commits from one branch to another.
  • Independent of Branch History: Does not merge the branch’s full history, making it useful for isolated changes.
  • Conflict Resolution: Similar to merging and rebasing, cherry-picking may result in conflicts that need manual resolution.

Commands:

  • Cherry-pick a specific commit:
    git cherry-pick <commit-hash>
    
  • Cherry-pick a range of commits (<commit-hash1> itself is excluded; use <commit-hash1>^..<commit-hash2> to include it):
    git cherry-pick <commit-hash1>..<commit-hash2>
    

Example Use Cases:

  1. Bug Fixes: Apply a critical bug fix from the development branch to the main branch without merging unrelated changes.
  2. Feature Porting: Transfer specific features or updates to another branch.

Example:

Imagine you have the following branches:

Main branch:

D---E---F main

Feature branch:

A---B---C feature-branch

If you only want to apply commit B from the feature-branch to the main branch, you can use:

git cherry-pick <hash-of-B>

After cherry-picking, the main branch gains a new commit B' that applies the same change as B but has its own hash:

D---E---F---B' main
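
This can be tried in a scratch repository. In the sketch below, commit B is looked up by its message, which is only a convenience for the demo; in practice the hash usually comes from git log. File names and messages are placeholders.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q -b main demo    # -b needs Git 2.28 or newer
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "base"

# Three commits on the feature branch, each adding one file.
git switch -q -c feature-branch
for name in A B C; do
  echo "$name" > "file-$name.txt"
  git add "file-$name.txt"
  git commit -q -m "$name"
done

git switch -q main

# Find the hash of commit B on the feature branch.
hash_of_b=$(git log feature-branch --format=%H --grep='^B$')

# Apply only that commit to main.
git cherry-pick "$hash_of_b" >/dev/null

# main received the change from B, but as a new commit with its own hash.
picked=$(git rev-parse HEAD)
subject=$(git log -1 --format=%s)
echo "picked '$subject' as $picked (original: $hash_of_b)"
ls
```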

Questions and Explanations

What is the difference between rebasing and merging?

  • Rebasing: Moves commits from one branch to another base, rewriting history to create a linear sequence of commits.
  • Merging: Combines changes from one branch into another without altering existing commits, creating a new merge commit unless a fast-forward is possible.

What does interactive rebase do?

Interactive rebase allows you to modify the commit history by:

  • Reordering commits.
  • Squashing multiple commits into one.
  • Editing commit messages.
  • Dropping unnecessary commits.

What happens if there are conflicts during rebasing or cherry-picking?

During rebasing or cherry-picking, Git may encounter conflicts when changes overlap. In such cases:

  1. Git pauses the operation and marks the conflicting files.
  2. You manually resolve conflicts by editing the files.
  3. Use git add to stage the resolved files.
  4. Continue the operation with:
    git rebase --continue
    
    or:
    git cherry-pick --continue
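
The steps above can be walked through with a deliberately conflicting rebase. This is a sketch under assumed names (config.txt, feature-branch). GIT_EDITOR=true is set only so that the demo does not open an editor for the commit message.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q -b main demo    # -b needs Git 2.28 or newer
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "color = blue" > config.txt
git add config.txt
git commit -q -m "Add config"

# Change the same line differently on each branch.
git switch -q -c feature-branch
echo "color = green" > config.txt
git commit -q -am "Use green"
git switch -q main
echo "color = red" > config.txt
git commit -q -am "Use red"

# Replaying "Use green" on top of main conflicts, so the rebase pauses.
git switch -q feature-branch
if git rebase main >/dev/null 2>&1; then
  result="clean"
else
  result="conflict"
fi
echo "rebase result: $result"
git status --short

# Resolve the conflict, stage the file, and continue.
echo "color = green" > config.txt
git add config.txt
GIT_EDITOR=true git rebase --continue >/dev/null

newest=$(git log --format=%s | head -n 1)
total=$(git rev-list --count HEAD)
echo "newest commit: $newest, total commits: $total"
```

To give up instead of resolving, git rebase --abort (or git cherry-pick --abort) restores the state from before the operation started.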
    

Conclusion

Git is a powerful tool that enhances collaboration and code management for developers. By understanding its core concepts and commands, you can effectively track changes, work with branches, resolve conflicts, and contribute to projects. Whether you’re working solo or in a team, Git’s distributed nature provides flexibility and reliability for your development workflow.

Remember that Git has a learning curve, but mastering it will significantly improve your productivity and collaboration abilities as a developer. Start with the basics, practice regularly, and gradually explore more advanced features as your comfort level increases.

Additional Resources

  1. Git Official Documentation
  2. GitHub Learning Lab
  3. Pro Git Book
  4. Atlassian Git Tutorials
