Getting Started

Using PyDriller is very simple. You only need to create a Repository object: this class takes the path to the repository as input, and its traverse_commits() method returns a generator that iterates over the commits. For example:

from pydriller import Repository

for commit in Repository('path/to/the/repo').traverse_commits():
    print('Hash {}, author {}'.format(commit.hash, commit.author.name))

This prints the hash and the author's name for each commit.

Inside Repository, you configure which projects to analyze, which commits to include, which date range to cover, and so on. For all the possible configurations, have a look at Repository.
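As a rough sketch of what such a configuration might look like, the helper below collects a few filters into keyword arguments for Repository. The option names since, to, only_in_branch and only_no_merge are assumed to match the Repository options and should be checked against the Repository reference. The helper names commit_filters and filtered_commits are hypothetical and not part of PyDriller.

```python
from datetime import datetime


def commit_filters(since=None, to=None, branch=None, skip_merges=False):
    """Return Repository keyword arguments for the filters that were set.

    Hypothetical helper: the keys are assumed to match Repository's
    `since`, `to`, `only_in_branch` and `only_no_merge` options.
    """
    options = {}
    if since is not None:
        options["since"] = since            # keep commits after this date
    if to is not None:
        options["to"] = to                  # keep commits up to this date
    if branch is not None:
        options["only_in_branch"] = branch  # restrict to a single branch
    if skip_merges:
        options["only_no_merge"] = True     # drop merge commits
    return options


def filtered_commits(path, **filters):
    """Yield the commits of `path` that pass the given filters."""
    # Imported here so commit_filters() works even without PyDriller installed.
    from pydriller import Repository

    yield from Repository(path, **commit_filters(**filters)).traverse_commits()
```

With this sketch, filtered_commits('path/to/the/repo', since=datetime(2020, 1, 1), skip_merges=True) would iterate over the non-merge commits made after 1 January 2020.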

We can also pass a list of repositories (both local and remote), and PyDriller will analyze them sequentially. In the case of a remote repository, PyDriller will clone it into a temporary folder and delete it afterwards. For example:

urls = ["repos/repo1", "repos/repo2", "", "repos/repo3", ""]
for commit in Repository(path_to_repo=urls).traverse_commits():
    print("Project {}, commit {}, date {}".format(
           commit.project_path, commit.hash, commit.author_date))
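When several repositories are analyzed in one run, it can help to group the results by project. The sketch below counts commits per repository using the commit.project_path attribute shown above. The function name commits_per_project is hypothetical; it accepts any iterable of commit objects, so it can be fed directly from traverse_commits().

```python
from collections import Counter


def commits_per_project(commits):
    """Count commits per repository, keyed by commit.project_path."""
    return Counter(commit.project_path for commit in commits)
```

For example, commits_per_project(Repository(path_to_repo=urls).traverse_commits()) would return a mapping from each project path to its number of commits.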

Let’s look at another example: printing all the modified files for every commit. The following does the magic:

for commit in Repository('path/to/the/repo').traverse_commits():
    for file in commit.modified_files:
        print('Author {} modified {} in commit {}'.format(commit.author.name, file.filename, commit.hash))

That’s it!

Behind the scenes, PyDriller opens the Git repository and extracts all the necessary information. Then, the framework returns a generator that can iterate over the commits.
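Because the commits come from a generator, iteration can be stopped early without walking the whole history. A minimal sketch, using itertools.islice from the standard library; the function name first_commits is hypothetical:

```python
from itertools import islice


def first_commits(commits, limit):
    """Return at most `limit` commits, pulling no more items than needed."""
    return list(islice(commits, limit))
```

For example, first_commits(Repository('path/to/the/repo').traverse_commits(), 10) would return the first ten commits the generator yields.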

Furthermore, PyDriller can calculate structural metrics of every file changed in a commit. To calculate these metrics, PyDriller relies on Lizard, a powerful tool that can analyze source code in many different programming languages, at both the class and method level:

for commit in Repository('path/to/the/repo').traverse_commits():
    for file in commit.modified_files:
        print('{} has complexity of {}, and it contains {} methods'.format(
              file.filename, file.complexity, len(file.methods)))
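These per-file metrics can also be aggregated across commits. The sketch below keeps the highest complexity seen for each filename and returns the top entries. The function name most_complex_files is hypothetical, and the check for None assumes that complexity may be missing for files that cannot be analyzed.

```python
def most_complex_files(commits, top=5):
    """Return up to `top` (filename, complexity) pairs, highest complexity first.

    For each filename, the highest complexity seen across all commits is kept.
    """
    highest = {}
    for commit in commits:
        for file in commit.modified_files:
            # Assumption: complexity may be None when a file cannot be analyzed.
            if file.complexity is None:
                continue
            previous = highest.get(file.filename, 0)
            highest[file.filename] = max(file.complexity, previous)
    return sorted(highest.items(), key=lambda item: item[1], reverse=True)[:top]
```

For example, most_complex_files(Repository('path/to/the/repo').traverse_commits(), top=3) would list the three files that reached the highest complexity. Note that files are keyed by filename only, so files with the same name in different folders are merged in this sketch.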