
SVN Log Parser from XML to CSV using Perl

I recently completed a project at work that uses a Perl script to convert SVN logs to CSV format for reading in Excel or any other spreadsheet application.  I found a few other SVN log parsers online, but most of them were too narrowly targeted at specific tasks and none met my requirements, so I wrote a log parser from scratch and would like to share it with you here.  My only requirement was to get the SVN log from the server in XML format and convert it to CSV so that I could share the information with other developers on my team.  Management will most likely also be interested in using it to get an idea of the code churn in our SVN repository.

In this article I would like to step through the process I used to develop the Perl script and describe the details of each portion of the code that leads us to the final CSV file output.  By understanding how the script works, and with a little bit of Perl power, you should be able to customize the script to do other tasks.  If you run into any issues with the details please feel free to contact me and I will be happy to add more detail.

Script Installation and Usage

Installation Requirements:

  • SVN client installed and in your path
  • Perl installed
  • xml2csv.exe in the same directory as the svn_log_parser.pl script

Usage: svn_log_parser.pl [SVN URL] [from_date] [to_date - Optional]

 

Download

svn_log_parser.pl

 

Script Details and Descriptions

Defining Paths to Perl and Variables Passed from the Command Line

To begin, we start the script with a line that defines the path to your Perl executable.  This script must be run on Windows, since the tool that converts the SVN log file from XML to CSV is a Windows executable.

#!C:\perl\bin\perl

On Unix-like systems this line tells the shell which executable to use to run the script. The Windows command shell does not read it and relies on the .pl file association instead, although Perl itself still scans the line for switches. You can omit this line if you like; if the command shell does not know how to run .pl files, execute the script as 'perl svn_log_parser.pl' so that Perl parses the code in the script.

Next we need to pull in the repository path from the command line and set it up as a variable for use later on in the script.

my $repo = $ARGV[0];

$ARGV[0] holds the first argument that you specify on the command line when running the script, which for this script is the repository URL. @ARGV is a special Perl array that contains the command-line arguments passed to the script.
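To make the behaviour of @ARGV a little more concrete, here is a minimal sketch that copies the arguments into named fields. The helper name parse_args is hypothetical and is not part of the downloadable script.

```perl
use strict;
use warnings;

# Hypothetical helper: split a list of command-line arguments into
# named fields. Arguments that were not supplied come back as undef.
sub parse_args {
    my @args = @_;
    my ($repo, $from, $to) = @args;
    return {
        repo  => $repo,
        from  => $from,
        to    => $to,
        count => scalar @args,
    };
}

# @ARGV holds only the arguments; the script name itself is in $0.
my $parsed = parse_args(@ARGV);
print "Script name: $0\n";
print "Argument count: $parsed->{count}\n";
print "Repository URL: ",
    (defined $parsed->{repo} ? $parsed->{repo} : '(none)'), "\n";
```

Run without arguments, this reports an argument count of 0 and a repository URL of (none).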

We also need to pull the dates from the command line into variables so that when we run the 'svn log' command we can specify the date range that we want log information for.

my $from_date = '{'.$ARGV[1].'}';

my $to_date;
if ($ARGV[2]) {
    $to_date = '{'.$ARGV[2].'}';
} else {
    $to_date = 'HEAD';
}

You'll notice here that defining the $to_date variable takes some extra work since it is an optional field. The variable is declared before the if statement so that it is still visible afterwards; a variable declared with my inside the braces would disappear at the closing brace. The code checks whether a to_date was given on the command line. If it was, $to_date is set to that date wrapped in braces, which is how Subversion expects dates in a revision range; otherwise it is set to 'HEAD'.  The HEAD value tells Subversion that we want the latest revision available in the repository.

Getting the Log Data from Subversion

Now that we have all of the information we need, we can execute the svn log command to pull the log data from our Subversion repository.

if ((@ARGV) && ($ARGV[0]) && ($ARGV[1])) {
$xml = `svn log $repo -v -r$from_date:$to_date --xml`;
} else {
die("
-------------------------------------------------------------------------
USAGE: perl svn_log_parser.pl [SVN URL] [from_date] [to_date - Optional]

Dates are formatted like:
yyyy-mm-dd
------------------------------------------------------------------------

");
}

This code checks that the user has passed the SVN URL and the from date to the script by first testing that the @ARGV array is not empty and then that its first two values, $ARGV[0] and $ARGV[1], are set.  If the test passes, we execute the command and store its output in a variable called $xml; otherwise we stop and send the user some usage information describing how the program should be used.
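If you would like stricter checking than a simple "is it set" test, the following sketch validates the yyyy-mm-dd date format and builds the svn command line as a string without running it. The helper names is_valid_date and build_svn_command and the example URL are my own assumptions for illustration; they are not in the downloadable script.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the string looks like yyyy-mm-dd.
# This checks the shape only, not whether the date exists.
sub is_valid_date {
    my ($date) = @_;
    return (defined $date && $date =~ /\A\d{4}-\d{2}-\d{2}\z/) ? 1 : 0;
}

# Hypothetical helper: return the svn command line as a string, or
# undef when a required argument is missing or a date is malformed.
sub build_svn_command {
    my ($repo, $from, $to) = @_;
    return undef unless defined $repo && length $repo;
    return undef unless is_valid_date($from);

    my $end = 'HEAD';                      # default when no to_date is given
    if (defined $to && length $to) {
        return undef unless is_valid_date($to);
        $end = '{' . $to . '}';
    }
    return 'svn log "' . $repo . '" -v -r{' . $from . '}:' . $end . ' --xml';
}

# The URL below is a placeholder, not a real repository.
my $cmd = build_svn_command('http://svn.example.com/repo', '2010-01-01');
print(defined($cmd) ? "$cmd\n" : "invalid arguments\n");
```

Building the string separately from executing it makes the argument checks easy to try out without a Subversion server at hand.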

Yay! Now we should have our log data from Subversion stored in a variable called $xml that we can use.

Writing the XML Data to a File

Now we need to write this newly retrieved log data to a file so that the xml2csv.exe program has access to it in order to convert it to a CSV file.

$file = "svn.xml";
open FH, "+>", $file or die $!;
print FH "$xml";
close FH;

This code defines the name of the file that we want to create and write our XML data to. Once the name is defined, we open the file using the Perl open function.  The FH after the open function is called a file handle, and it is what we use to access the file in later commands.  The +> tells Perl that we want to open the file for reading and writing and to empty the file if it already has data in it.  If you wanted to keep the existing data you would use +< instead.  Finally, we print our XML data to the file handle and close the file.
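For comparison, here is a minimal sketch of the same step written with a lexical file handle and plain write mode, which is enough when the file is only written and never read back through the same handle. The helper name write_file is hypothetical, and the demo writes to a temporary file instead of svn.xml.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write a string to a file, replacing any existing
# content, and die with a readable message if something goes wrong.
sub write_file {
    my ($path, $content) = @_;
    open(my $fh, '>', $path) or die "Cannot open $path for writing: $!";
    print {$fh} $content;
    close($fh) or die "Cannot close $path: $!";
    return length $content;
}

# Demo: write a tiny XML document to a temporary file.
my ($tmp_fh, $tmp_path) = tempfile(SUFFIX => '.xml', UNLINK => 1);
close($tmp_fh);
my $written = write_file($tmp_path, "<log></log>\n");
print "Wrote $written characters to $tmp_path\n";
```

A lexical handle such as $fh is closed automatically when it goes out of scope, but closing it explicitly lets us catch write errors.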

Converting XML to CSV using xml2csv.exe

Now that we have our xml data written to a file we can use xml2csv.exe to convert that file to CSV for viewing in a spreadsheet application.

@args = ("xml2csv.exe","svn.xml","svn.csv","revision,author,date,msg,path","-Q");
system(@args);

What we have done here is take the command line that we would normally type in a shell and place it into an array, with each space-separated word becoming its own quoted, comma-separated element.  The system function takes this array of arguments and executes it on the system, and the xml2csv.exe program does the rest of the work for us.  It takes the data in the svn.xml file that we created earlier and restructures it into a CSV format that we can view with any spreadsheet application.
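The call above does not look at the result of system, so a failed conversion would go unnoticed. As a hedged sketch, the exit code can be checked like this. The helper name run_command is hypothetical, and the demo runs the current Perl interpreter ($^X) in place of xml2csv.exe so that it works without that tool installed.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command given as a list of arguments and
# return its exit code, or -1 if the program could not be started.
sub run_command {
    my @cmd = @_;
    my $status = system(@cmd);
    return -1 if $status == -1;   # the program could not be started
    return $status >> 8;          # exit code reported by the program
}

# Demo: ask the Perl interpreter to exit with code 3.
my $code = run_command($^X, '-e', 'exit 3');
print "Exit code: $code\n";
```

In the real script you would pass the @args array to the helper and stop with die if the returned code is not 0.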

In Conclusion

I hope this tutorial on my svn_log_parser.pl script has helped you to better understand the inner workings of the script.  From here you should be able to add anything you like to the script so that it works in a custom fashion for your particular environment.  This was a great project and has saved me a great deal of time that I used to spend answering questions about code changes to my SVN repository.  Now I can refer people to the script and focus more on release engineering rather than tracking down information for other developers and management.

 

NOTE: This script has been updated!

The following additions are included in the current download.

  • A bug fix has been added to allow proper alignment of the comment column with the revision number.
  • A feature has been added to limit the number of files in a result set. 
    • This feature became a requirement in order to counter a limitation of Microsoft Excel that limits the number of characters allowed in a cell.
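The idea behind the file limit can be sketched as follows. This is not the code from the download, only an illustration: the helper names limit_paths and paths_cell are hypothetical, and the constant reflects Excel's documented limit of 32,767 characters per cell.

```perl
use strict;
use warnings;

use constant EXCEL_CELL_LIMIT => 32_767;   # maximum characters in one cell

# Hypothetical helper: keep at most $max paths and append a marker
# saying how many were left out.
sub limit_paths {
    my ($max, @paths) = @_;
    return @paths if @paths <= $max;
    my $omitted = @paths - $max;
    return (@paths[0 .. $max - 1], "... ($omitted more)");
}

# Hypothetical helper: join the limited paths into the text for one
# cell, cutting it off if it is still longer than the cell limit.
sub paths_cell {
    my ($max, @paths) = @_;
    my $text = join('; ', limit_paths($max, @paths));
    return length($text) > EXCEL_CELL_LIMIT
        ? substr($text, 0, EXCEL_CELL_LIMIT)
        : $text;
}

print paths_cell(2, '/trunk/a.pl', '/trunk/b.pl', '/trunk/c.pl'), "\n";
```

With a limit of 2 and three paths, the cell text becomes "/trunk/a.pl; /trunk/b.pl; ... (1 more)".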
 
