Building from Source Tar Files

Originally written on 04 February, 2010 01:46 AM for the MOSS Magazine Issue #2 (08 February, 2010). I’m republishing it here so that it will be in the public domain as well.

This how-to explains what “.tar.gz” files are and how they are generally used. We received an email request from one of our readers asking for an article on how to work with “.tar.gz” files and how applications distributed as tar files can be installed and used.

A “.tar.gz” file, often simply called a “tar file” or “tarball”, is an archive. The tar format itself only bundles files together; the archive is then usually compressed with a tool commonly available on a GNU/Linux system, such as gzip, bzip2 or lzma. A command line program called “tar” exists for creating and handling tar files. Simply put, a “.tar.gz” file serves the same purpose as a “.zip” archive.
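To make the archive idea concrete, here is a minimal sketch that packs a throwaway folder into a tarball and lists its contents. The folder and file names (“demo”, “notes.txt”) are made up for illustration, and the sketch assumes the tar and gzip programs found on a typical GNU/Linux system.

```shell
# Work in a scratch directory so nothing in your home folder is touched.
workdir=$(mktemp -d)
mkdir "$workdir/demo"
echo "hello" > "$workdir/demo/notes.txt"

# c = create, z = compress with gzip, f = write to the named archive file.
# -C switches into $workdir first, so paths inside the archive start at "demo/".
# (Using j in place of z would compress with bzip2 instead.)
tar -czf "$workdir/demo.tar.gz" -C "$workdir" demo

# t = list the contents of the archive without extracting anything.
tar -tzf "$workdir/demo.tar.gz"
```

The listing shows the folder and the file inside it, much like opening a “.zip” file in a file browser.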

GNU/Linux programs are typically distributed in this format. A common convention is to use a name like “program-name_1.0.1_src.tar.gz” for the source code archive and “program-name_1.0.1.tar.gz” for the compiled binaries.

Let’s begin with using these files on the latest version of Ubuntu. We’ll also download a small tool as a sandbox to have a look at how programs are built from source code on these platforms. At this point, it is necessary to know how to distinguish source tar files from binary tar files. It also helps to check first whether a binary package has already been generated for the distribution you’re using. For example, on Debian and Ubuntu derivatives, programs are packaged as “.deb” files, which means that you do not need to download the source tarball, extract it, configure it and build it from scratch.

Let’s download a small utility, httperf, that lets you test the performance of websites. The tool is developed and provided by HP. Download the source code archive. Once done, navigate to the download folder from the command line, e.g. /home/user/Downloads/. Issue the following commands to extract the archive and build the program. Note that this is a very basic way of building programs on GNU/Linux, and it should be almost the same for most programs out there.
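Pieced together from the walk-through that follows, the command sequence would look roughly like this. The archive name “httperf-0.9.0.tar.gz” is an assumption inferred from the name of the extracted folder, so adjust it to match the file you actually downloaded.

```shell
# Assumed archive name, inferred from the extracted folder "httperf-0.9.0".
tar -zxvf httperf-0.9.0.tar.gz   # 1. extract the archive
cd httperf-0.9.0                 # 2. enter the extracted source tree
mkdir build                      # 3. create a separate build folder
cd build                         # 4. enter it
../configure                     # 5. prepare the build for this system
make                             # 6. compile the source code
sudo make install                # 7. copy the results into the system paths
```

Building inside a separate “build” folder keeps the generated files apart from the original source files.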

Let’s go through what’s happening above. The first line runs the “tar” program, which handles tar files. The second part, “-zxvf”, is a set of command line options that tell the tar program what to do. The third argument is the name of the tar file to perform the actions on. You can do the same by right clicking on the file in the Gnome file browser (Nautilus) and selecting “Extract Here”.

The command line options are:

  • z: filter the archive through gzip. This is needed because the tar file is compressed with the gzip compression format, which is indicated by the second file extension, “.gz”.
  • x: extract files from an archive.
  • v: verbosely list files processed (optional). This displays a list of the files that are in the archive and which were extracted.
  • f: use the given archive file or device. The file is named by the argument following the options; in our case, the third argument, the name of the tar file.
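The same options can be combined in other ways. As a hedged sketch, the following builds a small stand-in archive (the name “sample-1.0” is hypothetical), lists it with “t” before extracting, and then extracts it into a chosen folder with “-C”. It assumes the tar program found on a typical GNU/Linux system.

```shell
# Build a tiny stand-in archive to experiment with.
scratch=$(mktemp -d)
mkdir "$scratch/sample-1.0"
echo "placeholder" > "$scratch/sample-1.0/README"
tar -czf "$scratch/sample-1.0.tar.gz" -C "$scratch" sample-1.0

# t instead of x: show what is inside without extracting.
tar -tzf "$scratch/sample-1.0.tar.gz"

# z, x, v and f as described above; -C picks the folder to extract into.
mkdir "$scratch/out"
tar -zxvf "$scratch/sample-1.0.tar.gz" -C "$scratch/out"

# The extracted folder now sits under "out".
ls "$scratch/out/sample-1.0"
```

Listing an archive first is a cheap way to see whether it unpacks into its own folder or scatters files into the current one.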

Now if you do an “ls” or browse to the Downloads folder on your system through Nautilus, you will find a folder named “httperf-0.9.0”.

The second command “cd” changes the current working directory to the newly extracted directory. We then create a folder named “build” with the third command. We change into that newly created “build” directory with the fourth command.

The fifth command is special, in that it runs the “configure” script, which prepares the source code to be built on your specific system. Since different systems have different file system layouts, installed libraries and build tools, the script checks for these differences and sets up the build accordingly.

The sixth command tells the system to start compiling the source code into binary files that can be executed on the system. At this point, if you view the “build” directory you will find new files and a “src” folder. In the “src” folder you will find various intermediate build files used by the “make” program and the actual executable, named “httperf”. We can run the program from there by issuing “./httperf --help”, which displays the help information for the program.

The last line is also special, in that it copies the necessary files to the system paths. It installs the executable in the system’s directory for locally built binaries, “/usr/local/bin/”, does the same for the “idleconn” program, and finally installs the man page (manual page) in “/usr/local/share/man/man1/”, which can be viewed by executing “man httperf” on the command line.
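One way to confirm where the install step put the program is to ask the shell where it finds it. This is only a sketch; the helper name “report_location” is made up for illustration.

```shell
# Print where the shell finds a program, or say that it is missing.
report_location() {
    # command -v prints the full path of a program found on PATH.
    if location=$(command -v "$1"); then
        echo "$1 is installed at $location"
    else
        echo "$1 was not found on PATH"
    fi
}

report_location httperf
```

If the program is reported as missing right after installing, opening a new terminal usually makes the shell pick it up.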

There you have it. You’ve successfully built and installed a program on your system. Now, at any time, you can run the “httperf” program from your command line. As mentioned before, this is a typical build process. The program can be uninstalled by executing “sudo make uninstall” from the same build directory (“/home/user/Downloads/httperf-0.9.0/build/”).

Now for the difference between a source tar file and a binary tar file. You can extract any type of “.tar.gz” file with the first command mentioned above. If you list the extracted files with the command “ls -l”, it displays the directory as a list with the file permissions, owner, group, file size and date modified as columns. If any files are shown in bold or have an “x” in the file permission block, they can be executed. All you have to do is type “./program-name” and the program will run. A non-source tarball will typically not have the “configure” script or files like “install” or “Makefile.*”.
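The checks described above can be sketched as a small shell helper. This is a rough heuristic rather than a reliable test: the function name “describe_tree” and the demo folder names are made up for illustration, and real tarballs will not always fit either pattern.

```shell
# Guess whether an extracted folder holds source code or ready-to-run binaries.
describe_tree() {
    dir=$1
    if [ -f "$dir/configure" ] || [ -f "$dir/Makefile" ]; then
        # Build scripts are present, so this looks like source code.
        echo "source"
    elif [ -n "$(find "$dir" -type f -perm -u+x | head -n 1)" ]; then
        # No build scripts, but at least one file has the "x" permission.
        echo "binary"
    else
        echo "unknown"
    fi
}

# Two stand-in folders to try the helper on.
demo=$(mktemp -d)
mkdir "$demo/src-tree" "$demo/bin-tree"
touch "$demo/src-tree/configure"
touch "$demo/bin-tree/program-name"
chmod +x "$demo/bin-tree/program-name"

describe_tree "$demo/src-tree"
describe_tree "$demo/bin-tree"
```

The first call reports the folder with a “configure” script as source, and the second reports the folder that contains only an executable file as binary.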

You can find out more about working with tar files by searching the web, which will turn up various sites and weblogs that explain how to work with tar files as well as how to build and run programs distributed in them.