Git Introduction

Modified on Tue, 06 Oct 2020 at 02:45 PM


What is Git?

Git is a widely adopted distributed version control system (DVCS) that keeps track of the changes you make to a collection of files. It is an actively maintained open-source project used in a wide variety of commercial and non-commercial software projects. It enables your team members to work on different revisions of a project simultaneously, to manage their changes, and to discuss them.
The following subsections describe the advantages of using Git in ONE DATA Apps as well as its basic principles and terminology. The subsequent sections give an overview of the relevant features of GitLab and explain how to use Git with ONE DATA Apps.


Why should my team use Git?

  • Git tracks changes you or your team members make

Git tracks the changes made to your files, creating a record of what has been done and by whom.

  • Git makes collaboration easier 

Git allows creating so-called branches, i.e. copies of the project that you can work on without having to deal with concurrent changes by other people. After you're done with your work, you can bring back your changes to the original or main branch of the project, integrating your changes with those made by other people.

  • Git is great for quality-assurance

After you're done with your changes, you can create a so-called merge request. The changes can then be reviewed by another person before they are made available to others. This encourages a four-eyes principle, where only peer-reviewed and quality-assured changes get merged into your main branch.

  • Git helps to keep track of what is happening in your project

With Git, the complete history of your project can be viewed. Thus you can see for every line in your code who wrote or changed it and whom to ask if you have questions regarding that line. Also, you can see in what context a certain part of your code was originally written and the changes that have occurred in the past, allowing you to get a better understanding of it.

The following subsections will describe the basic principles and terminology of Git at a high level. Some of this information is not necessary to work on apps with Git, but helps to understand the underlying principles.


Repositories

A Git repository (or repo for short) is a directory that contains all files of your project and the entire history of these files. When you check out a certain version of your project, Git adjusts the contents of the directory to the state that was present when that version was created.

In most cases, the repository is stored in some central location, for example, a GitLab server. This is called the remote repository. You can clone the repository from that location, creating a copy of it on your local machine (called the local repository). Now you can make changes to the files of that local copy and push the changes back to the remote repository. This makes the changes available to other people, who can pull them into their own local repositories and continue working on this state of the project. This is why Git is called a distributed version control system: everyone works on a local copy of the repository and synchronizes it with the others via the central copy by pushing and pulling.
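
The clone, push, and pull cycle described above can be sketched in a few lines of Python. This is only a toy model under strong assumptions: the class Repository and its methods are hypothetical names made up for this illustration (they are not part of Git or GitLab), a history is reduced to a list of commit messages, and pushing or pulling only works when no merge is needed.

```python
class Repository:
    """Toy repository (hypothetical): the history is a list of commit messages."""

    def __init__(self, commits=None):
        self.commits = list(commits or [])

    def clone(self):
        # A clone is a full, independent copy of the history.
        return Repository(self.commits)

    def commit(self, message):
        # Record a new change in this repository only.
        self.commits.append(message)

    def push(self, remote):
        # Simplification: pushing only works if the remote history
        # is a prefix of the local history.
        if remote.commits != self.commits[:len(remote.commits)]:
            raise RuntimeError("remote has commits you do not have; pull first")
        remote.commits = list(self.commits)

    def pull(self, remote):
        # Simplification: pulling only works if the local history
        # is a prefix of the remote history (no merge needed).
        if self.commits != remote.commits[:len(self.commits)]:
            raise RuntimeError("histories diverged; a merge would be needed")
        self.commits = list(remote.commits)


remote = Repository(["initial commit"])   # the central copy
alice = remote.clone()                    # local repository of one person
bob = remote.clone()                      # local repository of another person

alice.commit("add login page")            # only Alice has this change so far
alice.push(remote)                        # now the remote has it
bob.pull(remote)                          # now Bob has it as well

print(bob.commits)
```

After the last step, all three repositories contain the same two commits, even though Bob never exchanged anything with Alice directly.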


Commits

You can think of the state of your project as the result of all changes made to it so far. Conceptually, Git keeps track of the sequence of changes made to the files rather than only their latest content, so it can reconstruct the state of the project at any point in time. (Internally, Git stores each commit as a snapshot of the project and computes the differences on demand, but thinking in terms of changes is a useful mental model.) You tell Git to record the changes you made to the project with a so-called commit. A commit can be viewed as a set of deletions and additions made to the files of your project since the last commit (the so-called parent commit). For every commit, you can specify a message, which should describe the changes it contains.
By making a commit you essentially create a snapshot of your work that should appear in the project's history and that you may want to jump back to later. It is not necessarily a stable state of the feature you're working on but can be some work-in-progress state you don't want to lose.

The entire series of commits made since the project was created is called the history. Because every commit describes the difference from its parent commit, the current state of the project can be restored by applying the commits one after the other.
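
The idea of restoring a state by applying one commit after the other can be sketched as follows. This is a simplified illustration, not Git's actual storage format: the Commit class and the replay function are hypothetical, and changes are tracked per file instead of per line.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    """Hypothetical commit: a message plus file-level additions and deletions."""
    message: str
    additions: dict = field(default_factory=dict)  # path -> new content
    deletions: set = field(default_factory=set)    # paths that were removed


def replay(history):
    """Rebuild the project state by applying the commits in order."""
    state = {}
    for commit in history:
        for path in commit.deletions:
            state.pop(path, None)
        state.update(commit.additions)
    return state


history = [
    Commit("add readme", additions={"README.md": "Hello"}),
    Commit("add config", additions={"app.json": "{}"}),
    Commit("remove readme", deletions={"README.md"}),
]

print(replay(history))      # state after all three commits
print(replay(history[:2]))  # state as it was after the second commit
```

Replaying only the first part of the history yields an earlier state of the project, which is what jumping back to an older commit amounts to in this model.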


Branches

There are potentially many people working on different features in your project. By synchronizing with the remote repository you will also get their commits and could wind up with work-in-progress changes. Ideally, you would want to work isolated from the changes made by others and only bring your changes back to the main codebase once they are stable and quality-assured.
To allow such behavior, Git provides a functionality called branching. By branching away from the main history of a repository you are creating a copy of the original codebase in a new branch. Branches are clones of the main history of the project that can be worked on in parallel without being impacted by changes made to other branches or the main history of the project (which is itself a branch). Once you're done with your work, you can integrate the history of your branch back into the main history.

The following figure depicts the concept of branching graphically. The lines represent branches and the circles represent the commits made on them. A branch (green) is created starting at a specific commit on the source branch (red). Then commits are made to this new branch. The history of that branch now consists of these new commits and the old commits of the source branch up to the point where the branch was created. Newer commits made to the source branch after this point do not affect the history of the new branch. In the end, the new branch is brought back to the source branch, integrating its changes into the history of the source branch. Now the changes made to both branches are available in the source branch.
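
The branching behavior described above can also be sketched in code. In Git itself, a branch is a lightweight name that points to a commit, and every commit knows its parent. The sketch below models exactly that; the Commit class, the history function, and the commit messages A to D are hypothetical and chosen only for illustration.

```python
class Commit:
    """Hypothetical commit that only knows its message and its parent."""

    def __init__(self, message, parent=None):
        self.message = message
        self.parent = parent


def history(commit):
    """Follow the parent links back to the first commit, oldest first."""
    messages = []
    while commit is not None:
        messages.append(commit.message)
        commit = commit.parent
    return list(reversed(messages))


branches = {}
branches["main"] = Commit("A")
branches["main"] = Commit("B", parent=branches["main"])

# Create a new branch starting at the current commit of the source branch.
branches["feature"] = branches["main"]

# Commits on the new branch do not move the source branch, and vice versa.
branches["feature"] = Commit("C", parent=branches["feature"])
branches["main"] = Commit("D", parent=branches["main"])

print(history(branches["feature"]))  # ['A', 'B', 'C']
print(history(branches["main"]))     # ['A', 'B', 'D']
```

Both branches share the commits A and B that existed when the branch was created, while D on the source branch does not show up in the history of the new branch.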


Merging

Once you have created a branch and finished your work, you want to bring these changes back to the main branch. This is done by merging. While you did your work on a branch, changes may have been made to the main branch. These changes can potentially affect the same files you were working on. So when integrating the changes from your branch, the changes from both branches must be joined. Git does this automatically if the changes do not affect the same spot in a file. As Git needs to change the files to bring the changes together, a so-called merge commit is created (see figure above).
If the same part of a file was changed in both branches, the changes cannot be merged automatically, as Git does not know which change to prefer. In this case, a merge conflict is reported and a person needs to decide which of the conflicting changes should be kept or how they can be unified. You can do this in any text editor or use one of the many merge tools available.
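
The rule for when changes can be joined automatically can be sketched as a line-by-line comparison against the common starting point of both branches. This is a strongly simplified illustration and not Git's real merge algorithm: the function merge_lines is hypothetical, and it assumes that all three versions of the file have the same number of lines, whereas Git first aligns the versions with a diff.

```python
def merge_lines(base, ours, theirs):
    """Merge two edited versions of a file, line by line.

    base   -- the file as it was when the branch was created
    ours   -- the file on our branch
    theirs -- the file on the other branch
    Returns the merged lines and the indices of conflicting lines.
    """
    merged, conflicts = [], []
    for index, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:
            merged.append(o)         # both sides agree
        elif o == b:
            merged.append(t)         # only the other branch changed this line
        elif t == b:
            merged.append(o)         # only our branch changed this line
        else:
            merged.append(None)      # both changed it differently: conflict
            conflicts.append(index)
    return merged, conflicts


base = ["title", "color: red", "size: 10"]
ours = ["title", "color: blue", "size: 10"]     # we changed line 1
theirs = ["title", "color: red", "size: 12"]    # they changed line 2

print(merge_lines(base, ours, theirs))
# (['title', 'color: blue', 'size: 12'], [])

theirs_conflicting = ["title", "color: green", "size: 10"]  # same line as ours
print(merge_lines(base, ours, theirs_conflicting))
# (['title', None, 'size: 10'], [1])
```

In the second call both branches changed the same line in different ways, so the sketch reports that line as a conflict instead of picking one side.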

The GitLab article describes how conflicts can be resolved via the GitLab web interface.

GitFlow

GitFlow is a branching model that is widely used in the Git repositories of many software projects. It encourages a clean separation of feature development, which improves clarity and reduces the risk of conflicts. GitFlow tries to model the process of software development with branches. It defines conventions on which branches to use, how they are related, and what purpose they have. A GitFlow repository has the following branches:

  • master: This is the live branch of your project containing the production code. Only quality-assured code that is ready to go live (e.g. into production) should be pushed to it.
  • develop: This branch contains the current development state that is not live yet. Only finished and quality-assured work should be pushed to this branch. It should always be in a state that is ready to go live.
  • feature/*: The feature branches in the repository contain the development changes for a certain feature. For each feature or bugfix that should be implemented, a feature branch is created starting from the develop branch. Then the changes for the work item are made on it. Once the work is finished and quality-assured, the feature branch is merged back to develop. The name of a feature branch always starts with feature/ followed by the name or description of the feature that it includes.
  • release: Once you want to go live with the state of your current develop branch, you create a release-branch from develop. This allows you to freeze the state that should be released while features are still being merged to develop. The release-branch is typically tested again and then merged to master once everything is fine. Afterwards the master branch is merged back to the develop branch in order to make any changes made on the release branch also available in the develop branch.
  • hotfix/*: If you need to apply a change in your production code and do not want to wait until develop is released again, you can use a hotfix. A hotfix-branch is created from master and is merged back directly into master.
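
The conventions in the list above can be summarized as a small lookup table. The sketch below is only an illustration of these conventions: the names FLOW and describe are hypothetical, and the branch names used in the example calls are made up.

```python
# Where each kind of temporary branch starts and where it is merged,
# as described in the list above.
FLOW = {
    "feature": {"created_from": "develop", "merged_into": "develop"},
    # After a release, master is merged back into develop as well.
    "release": {"created_from": "develop", "merged_into": "master"},
    "hotfix": {"created_from": "master", "merged_into": "master"},
}


def describe(branch_name):
    """Return the GitFlow rule for a branch name such as 'feature/login'."""
    kind = branch_name.split("/", 1)[0]
    if kind not in FLOW:
        raise ValueError(f"not a feature, release or hotfix branch: {branch_name}")
    return FLOW[kind]


print(describe("feature/login"))  # hypothetical feature branch
print(describe("hotfix/typo"))    # hypothetical hotfix branch
```

A helper like this could, for example, be used to check that a merge request targets the branch that the convention expects.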

The following figure illustrates how the branches in the GitFlow-model are related to each other:


For more details on GitFlow please refer to this documentation and this cheatsheet.
This article described the basic principles and terminology of Git at a high level. If you need more details or want to learn about the many other features of the version control software, please refer to the reference manual or the book Pro Git by Scott Chacon and Ben Straub, which can be read online for free.

