User Guide: CLI: Git Commit History Processing Commands


These commands allow users to process Git commit history, including:

Converting Git commit logs to DV8 dependency files;

Generating CSV-formatted change lists;

Generating a CSV-formatted target list using regular expressions.

Convert Git Commit Logs to DV8 Dependency Files

This command converts a Git log file (in text format) to a DV8 dependency matrix file. The Git log can be exported using the following Git command:

git log --numstat --date=iso

Usage

dv8-console scm:history:gittxt:convert-matrix [-h] [-matrix <MATRIX>] [-maxCochangeCount <MAXCOCHANGECOUNT>] [-outputFile <OUTPUTFILE>] [-paramsOutputFile <PARAMSOUTPUTFILE>] [-start <DATETIME>] [-stop <DATETIME>] INPUT_FILE

Input

A text Git log exported from your software revision history. You can use the following command to get the log from your Git repository:

git log --numstat --date=iso
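
Before converting, it can help to confirm that the exported file has the expected shape: commit headers followed by tab-separated --numstat lines. The following is a minimal sketch and not part of DV8. It builds a small stand-in log (the hashes, author, and file names are made up) and counts commits and file-change lines. On a real repository, redirect the output of the git log command above into a file such as gitlog.txt and point the checks at that file instead.

```shell
# Build a tiny stand-in for an exported log (all values below are made up).
workdir=$(mktemp -d)
log="$workdir/gitlog.txt"
{
  printf 'commit 1111111111111111111111111111111111111111\n'
  printf 'Author: Jane Doe <jane@example.com>\n'
  printf 'Date:   2017-07-08 10:15:30 +0000\n'
  printf '\n'
  printf '    PDFBOX-1234: fix page rendering\n'
  printf '\n'
  printf '3\t1\tsrc/Renderer.java\n'
  printf '10\t0\tsrc/Page.java\n'
  printf '\n'
  printf 'commit 2222222222222222222222222222222222222222\n'
  printf 'Author: Jane Doe <jane@example.com>\n'
  printf 'Date:   2017-09-01 08:00:00 +0000\n'
  printf '\n'
  printf '    PDFBOX-2001: font fallback\n'
  printf '\n'
  printf '2\t2\tsrc/Renderer.java\n'
} > "$log"

# Commit headers start with "commit " followed by the hash.
commit_count=$(grep -c '^commit [0-9a-f]' "$log")

# --numstat lines look like "<added><TAB><deleted><TAB><path>";
# binary files show "-" in place of the two numbers.
numstat_count=$(awk -F '\t' '
  NF == 3 && $1 ~ /^([0-9]+|-)$/ && $2 ~ /^([0-9]+|-)$/ { n++ }
  END { print n + 0 }' "$log")

echo "commits: $commit_count, file changes: $numstat_count"
```

A file-change count of zero on a real export usually means the log was produced without --numstat.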

Options

matrix: Path to a structure matrix that models the file dependencies of a particular snapshot. When this option is given, files in the revision history that are not part of the structure matrix are excluded. Without this option, DV8 extracts a history DSM containing every file in the revision history log, which can take a long time and require significant memory for projects with a long history.

maxCochangeCount: Maximum number of co-changed files per commit in the history (default: 1000). Lower this number if you run into performance problems caused by hardware limitations.

outputFile: Path of the output dependency matrix file to create (*.dv8-dsm). By default, the file is created in the current working directory with the same file name as the input history file.

paramsOutputFile: Path of the output file (*.json) that records the parameters used when executing the command. If no parameters (such as start and stop dates) are provided, the entire history in the log file is processed and the resulting date range is saved in this file.

start: Defines the start of the date/time range to convert (in ISO-8601 format), e.g., 2017-07-08T00:00:00Z.

stop: Defines the end of the date/time range to convert (in ISO-8601 format), e.g., 2018-01-08T00:00:00Z.

Dextended.config: Path to the extended configuration file, passed on the command line as "-Dextended.config=<FILE>". To match file names across different data sources, DV8 provides a preprocessor that fixes inconsistencies by removing configured prefix strings. Configure the prefixes to remove in the extended configuration file (dv8-console\samples\dv8-extended-config.xml); to remove several prefixes, add one <value> block per prefix.
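
The start and stop values above follow the pattern YYYY-MM-DDThh:mm:ssZ. The following is a small sketch, not part of DV8, that checks a timestamp against that one shape before it is passed on the command line; is_iso_utc is a hypothetical helper name. Whether DV8 also accepts other ISO-8601 variants (time zone offsets, fractional seconds) is not stated here, so the check is deliberately narrow.

```shell
# Succeeds only for timestamps shaped like 2017-07-08T00:00:00Z.
is_iso_utc() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$'
}

start="2017-07-08T00:00:00Z"
stop=$(date -u +%Y-%m-%dT%H:%M:%SZ)   # current time, expressed in UTC

for ts in "$start" "$stop"; do
  if is_iso_utc "$ts"; then
    echo "ok: $ts"
  else
    echo "unexpected format: $ts" >&2
  fi
done
```

The check only validates the layout of the string; it does not reject impossible dates such as month 13.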

Example

dv8-console scm:history:gittxt:convert-matrix -outputFile history.dv8-dsm "-Dextended.config=dv8-extended-config.xml" gitlog.txt
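
The options listed above can be combined in one invocation. As a sketch under stated assumptions, the snippet below assembles a command that restricts the history to a structure matrix and a date range and records the parameters used. The file names structure.dv8-dsm and params.json are hypothetical placeholders, and the command is only printed for review, so it can be tried without DV8 installed.

```shell
# Hypothetical file names; replace them with your own paths.
matrix="structure.dv8-dsm"   # structure matrix of the snapshot of interest
input="gitlog.txt"           # exported Git log

# Options come first, the input file last, as in the usage line.
cmd="dv8-console scm:history:gittxt:convert-matrix"
cmd="$cmd -matrix $matrix"
cmd="$cmd -start 2017-07-08T00:00:00Z -stop 2018-01-08T00:00:00Z"
cmd="$cmd -outputFile history.dv8-dsm -paramsOutputFile params.json"
cmd="$cmd $input"

# Print the command instead of executing it.
echo "$cmd"
```

Paths that contain spaces would need quoting before the printed command is run.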

Generate CSV-formatted Change List from Git Commit History

This command generates a CSV-formatted change list from a Git log file. You can export the log using the following command:

git log --numstat --date=iso

Usage

dv8-console scm:history:gittxt:generate-changelist [-h] [-outputFolder <OUTPUTFOLDER>] [-start <DATETIME>] [-stop <DATETIME>] INPUT_FILE

Input

A text file of the Git log exported from your software revision history. The file should be encoded in UTF-8 format.

Options

outputFolder: Specifies the output folder for the change list, which includes change frequency and churn.

start: Defines the start of the date/time range to include in the change list (in ISO-8601 format), e.g., 2017-07-08T00:00:00Z.

stop: Defines the end of the date/time range to include in the change list (in ISO-8601 format), e.g., 2018-01-08T00:00:00Z.

Dextended.config: Same as described above for fixing file path inconsistencies using the extended configuration file.

Example

dv8-console scm:history:gittxt:generate-changelist -outputFolder list "-Dextended.config=dv8-extended-config.xml" gitlog.txt
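
To make the terms change frequency and churn concrete, the sketch below computes both from a few made-up --numstat lines. It assumes that change frequency means the number of commits touching a file and that churn means lines added plus lines deleted; DV8's exact definitions and the column layout of its CSV output may differ, and the header row printed here is illustrative only.

```shell
# Sample --numstat lines: <added><TAB><deleted><TAB><path> (made-up data).
numstat=$(printf '3\t1\tsrc/Renderer.java\n10\t0\tsrc/Page.java\n2\t2\tsrc/Renderer.java\n')

# A path appears once per commit that touches it, so counting its
# occurrences gives the change frequency; churn sums added + deleted.
summary=$(printf '%s\n' "$numstat" | awk -F '\t' '
  NF == 3 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
    freq[$3]++
    churn[$3] += $1 + $2
  }
  END {
    for (path in freq) printf "%s,%d,%d\n", path, freq[path], churn[path]
  }' | sort)

echo "file,frequency,churn"   # illustrative header, not DV8's
echo "$summary"
```

With the sample data, src/Page.java has frequency 1 and churn 10, and src/Renderer.java has frequency 2 and churn 8. Binary files, which --numstat reports with "-" instead of numbers, are skipped by this sketch.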

Generate CSV-formatted Target List from Regular Expression

This command generates a CSV-formatted target list file from a regular expression and a Git log file. The Git log can be exported using the following command:

git log --numstat --date=iso

Usage

dv8-console scm:history:gittxt:generate-targetlist [-h] [-outputFolder <OUTPUTFOLDER>] [-regex <REGEX>] [-start <DATETIME>] [-stop <DATETIME>] [-targetissuecsv <TARGETISSUECSV>] INPUT_FILE

Input

A text file of the Git commit log exported from your revision history. The file should be encoded in UTF-8 format.

Options

outputFolder: Specifies the output folder for the generated target list, including target frequency and churn.

regex: A regular expression used to extract issue IDs from commit messages in the revision history.

start: Defines the start of the date/time range to include in the target list (in ISO-8601 format), e.g., 2017-07-08T00:00:00Z.

stop: Defines the end of the date/time range to include in the target list (in ISO-8601 format), e.g., 2018-01-08T00:00:00Z.

targetissuecsv: Path to the CSV file containing the target issue IDs, where the first column is the issue ID.

Dextended.config: Same as described above for handling file path inconsistencies using the extended configuration file.
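
As an illustration of the targetissuecsv input, the sketch below writes a small CSV whose first column holds the issue ID and then lists the IDs it contains. The file name target-issues.csv, the issue IDs, and the second column are made up. This guide does not say whether a header row or additional columns are expected, so the sketch omits a header and treats everything after the first column as optional.

```shell
workdir=$(mktemp -d)
csv="$workdir/target-issues.csv"   # hypothetical file name

# One issue per row, issue ID in the first column. The second column is
# a free-form note added here only for illustration.
printf '%s\n' \
  'PDFBOX-1234,rendering bug' \
  'PDFBOX-2001,font fallback' > "$csv"

# List the issue IDs the file contains (first column only).
issue_ids=$(cut -d, -f1 "$csv")
echo "$issue_ids"
```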

Example

dv8-console scm:history:gittxt:generate-targetlist -regex "PDFBOX-[0-9]+" -outputFolder list "-Dextended.config=dv8-extended-config.xml" gitlog.txt
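
Before running the command on a full log, the regular expression can be tried against a few commit messages. The sketch below uses grep as a stand-in for DV8's own matching, with made-up commit subjects; for a simple pattern like this one the results should agree, but grep's extended syntax may differ from the regex dialect DV8 uses for more advanced patterns.

```shell
# Made-up commit subjects standing in for messages in the Git log.
messages=$(printf '%s\n' \
  'PDFBOX-1234: fix page rendering' \
  'Update README' \
  'PDFBOX-77 and PDFBOX-2001: font fallback')

# Quoting the pattern keeps the shell from treating [0-9] as a glob.
regex='PDFBOX-[0-9]+'

# -o prints each match on its own line; -E enables extended regex syntax.
matches=$(printf '%s\n' "$messages" | grep -oE "$regex")
echo "$matches"
```

With these sample messages the pattern extracts PDFBOX-1234, PDFBOX-77, and PDFBOX-2001, and the message without an issue ID contributes nothing.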