.. _ch02-fortran-debugging:

=============================================================
Fortran debugging
=============================================================

Print statements
----------------

Adding print statements to a program is a tried-and-true method of debugging
and one of the most commonly used. This is not because it is the best method,
but because it is sometimes the simplest way to examine what is going on at a
particular point in a program.

Print statements can be added almost anywhere in a Fortran code to print
values to the terminal window as the program runs.

You might want to put some special symbols in debugging statements to flag
them as such. This makes it easier to tell which output is your debug output,
and also makes it easier to find the statements again later and remove them
from the code. For example, you might use "+++" or "DEBUG".

There is another decent way of using print statements for debugging. Remember
that the upper-case file extension ``.F90`` tells the compiler to run the
preprocessor, which lets you use C-style macros. In this way, you can write
code that prints useful debugging statements, as in the following:

.. literalinclude:: ./codes/segfault.F90
   :language: fortran
   :linenos:

You can then turn the print statements on and off by selectively commenting
out the first two macros.

Compiling with various gfortran flags
-------------------------------------

There are a number of flags you can use when compiling your code that will
make it easier to debug. Here's a generic set of options you might try::

    $ gfortran -g -W -Wall -fbounds-check -pedantic-errors \
          -ffpe-trap=zero,invalid,overflow,underflow program.f90

See :ref:`ch02-fortran-flags` or the `gfortran man page `_ for more
information. Note that in recent gfortran versions ``-fbounds-check`` is a
deprecated alias for ``-fcheck=bounds``.

Most of these options tell the compiler to issue warnings, or tell the program
to die at run time, if certain bad things happen.
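As a minimal sketch of what "die if certain bad things happen" means in
practice, consider the small program below. It is a hypothetical example (call
it ``fpe_demo.f90``; it is not one of the course codes) that divides by a zero
which is only known at run time, and it marks its debug output with "+++" as
suggested earlier.

.. code-block:: fortran

   ! fpe_demo.f90: hypothetical example, not one of the course codes.
   program fpe_demo
     implicit none
     ! volatile keeps the compiler from folding 1.0/y at compile time,
     ! so the division really happens when the program runs.
     real, volatile :: y
     real :: x

     y = 0.0
     print *, '+++ DEBUG: about to divide by y = ', y
     x = 1.0 / y          ! division by zero happens here
     print *, '+++ DEBUG: x = ', x
   end program fpe_demo

Compiled plainly with ``gfortran fpe_demo.f90``, the program should run to
completion and print an infinite value for ``x``. Compiled with
``gfortran -g -ffpe-trap=zero fpe_demo.f90``, it should instead stop at the
division with a floating-point exception, at least on platforms that support
floating-point traps.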
The ``-g`` flag tells the compiler to generate and save information during
compilation that a debugger such as ``gdb`` or ``totalview`` can use to help
debug the code. You generally have to compile with this option to use a
debugger.

The ``gdb`` debugger
--------------------

`GDB `_ is the GNU open source debugger for GNU compilers such as gfortran.
Unfortunately it often works very poorly for Fortran, especially on Mac (GDB
works better on Linux). You may find that `lldb `_ works better on Mac. See
also `Youtube-lldb `_.

See more on `GDB commands `_.

Consider the following example:

.. literalinclude:: ./codes/segfault1.f90
   :language: fortran
   :linenos:

First compile the code with::

    $ gfortran segfault1.f90

and run it. You should see something like::

    $ ./a.out
    Program received signal SIGILL: Illegal instruction.
    Backtrace for this error:
    #0 0x101aa24f2
    #1 0x101aa2cae
    #2 0x7fff92243529
    #3 0x101a9ceb9
    #4 0x101a9cf08
    Illegal instruction: 4

Now compile it with the ``-g`` flag::

    $ gfortran -g segfault1.f90

and run it again under gdb. You now see the following (this example was done
on the Grape Linux machine, not on Mac)::

    $ cd /lectureNote/chapters/chapt02/codes
    $ gfortran -g segfault1.f90
    $ gdb a.out
    (gdb) break segfault1.f90:7
    Breakpoint 1 at 0x40061b: file segfault1.f90, line 7.
    (gdb) r
    Starting program: /ams209/lectureNote/chapters/chapt02/codes/a.out
    Breakpoint 1, segfault1 () at segfault1.f90:7
    7     do i = 1, 12
    Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.166.el6_7.3.x86_64 libgcc-4.4.7-16.el6.x86_64 libgfortran-4.4.7-16.el6.x86_64
    (gdb) display a
    1: a = (-nan(0x7fe658), 4.59163468e-41, -nan(0x7fe648), 4.59163468e-41, 1.40129846e-45, 0, -4894.375, 8.68805048e-44, -2574.67188, 8.68805048e-44)
    (gdb) display i
    2: i = 62
    (gdb) s
    8        a(i) = i
    2: i = 1
    1: a = (-nan(0x7fe658), 4.59163468e-41, -nan(0x7fe648), 4.59163468e-41, 1.40129846e-45, 0, -4894.375, 8.68805048e-44, -2574.67188, 8.68805048e-44)
    (gdb) s
    7     do i = 1, 12
    2: i = 1
    1: a = (1, 4.59163468e-41, -nan(0x7fe648), 4.59163468e-41, 1.40129846e-45, 0, -4894.375, 8.68805048e-44, -2574.67188, 8.68805048e-44)
    (gdb) s
    8        a(i) = i
    2: i = 2
    1: a = (1, 4.59163468e-41, -nan(0x7fe648), 4.59163468e-41, 1.40129846e-45, 0, -4894.375, 8.68805048e-44, -2574.67188, 8.68805048e-44)
    (gdb) s
    7     do i = 1, 12
    2: i = 2
    1: a = (1, 2, -nan(0x7fe648), 4.59163468e-41, 1.40129846e-45, 0, -4894.375, 8.68805048e-44, -2574.67188, 8.68805048e-44)

Continuing this, you will see::

    2: i = 10
    1: a = (1, 2, 3, 4, 5, 6, 7, 8, 9, 8.68805048e-44)
    (gdb) s
    7     do i = 1, 12
    2: i = 10
    1: a = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    (gdb) s
    8        a(i) = i
    2: i = 11
    1: a = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    (gdb) s
    7     do i = 1, 12
    2: i = 11
    1: a = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    (gdb) s
    8        a(i) = i
    2: i = 12
    1: a = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    (gdb) s
    7     do i = 1, 12
    2: i = 1094713344
    1: a = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    (gdb) s
    8        a(i) = i
    2: i = 1094713345
    1: a = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    (gdb) s
    Program received signal SIGBUS, Bus error.
    0x0000000000400639 in segfault1 () at segfault1.f90:8
    8        a(i) = i
    2: i = 1094713345
    1: a = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

This at least reveals that the error happened at line 8, when the program
tried to write to ``a(i)`` past the end of the 10-element array ``a``, and it
lets you print the value of ``i`` when the program died. Note that right after
the write to ``a(12)``, ``i`` jumps to 1094713344. That number is the bit
pattern of the single-precision value 12.0 read as an integer, which suggests
that the out-of-bounds store overwrote ``i`` itself. The loop then continues
with this huge index, and the next store fails with ``SIGBUS``. Using the
``display`` and ``s`` (step) commands, you can monitor the local variable
``i`` and the array ``a``, which can lead you to where the bug is.

A very useful table comparing GDB and LLDB commands is available here:
`table-gdb-lldb `_

On the other hand, if you compile the code with both the ``-g`` and
``-fbounds-check`` flags::

    $ gfortran -g -fbounds-check segfault1.f90

and run it again (even without gdb), you now see::

    $ ./a.out
    At line 8 of file segfault1.f90
    Fortran runtime error: Index '11' of dimension 1 of array 'a' above upper bound of 10

Valgrind
--------

`Valgrind `_ is a freely available open source programming tool for detecting
memory leaks and other memory bugs, and for profiling. It was originally
designed as a free memory debugging tool for Linux; it now also runs on
macOS, Solaris, and Android.

To use Valgrind for debugging, take the following steps:

1. Install Valgrind. There are two options:

   a. Use a package manager (e.g., using Homebrew: ``brew install valgrind``,
      followed by ``brew link valgrind`` if needed). If this fails with
      complaints about a Ruby update, please follow the instructions in
      `article 1 `_ and `article 2 `_.

   b. Download a recent Valgrind release (e.g., 3.13.0, released on
      June 14th, 2017) from the `website `_. Untar it (e.g.,
      ``tar xvf valgrind-3.13.0.tar.bz2``), open the README, and follow the
      steps therein::

          $ ./configure --prefix=/usr/local/opt/valgrind
          $ make
          $ make install

      If the last command, ``make install``, fails because of permissions,
      try ``sudo make install``.

2. To use Valgrind, it is important to compile your code with debugging
   flags. For example, consider the previous example again:

   .. literalinclude:: ./codes/segfault1.f90
      :language: fortran
      :linenos:

   First compile the code with various useful debugging flags, then run the
   executable under Valgrind::

       $ gfortran -g -Wall -Wextra -Wimplicit-interface -fPIC -fmax-errors=1 -fcheck=all -fbacktrace segfault1.f90 -o segfault1.exe
       $ valgrind --leak-check=full --track-origins=yes ./segfault1.exe

   On macOS, it is suggested to include ``--dsymutil=yes``::

       $ valgrind --leak-check=full --dsymutil=yes --track-origins=yes ./segfault1.exe

   If you simply compile the code without the flags above (``-g`` is the most
   important one)::

       $ gfortran segfault1.f90 -o segfault1.exe

   then you will probably not get useful information (e.g., line numbers in
   the source files) from running Valgrind.

Totalview
---------

Totalview is a commercial debugger that works quite well on Fortran codes
together with various compilers, including gfortran. It also works with other
languages and supports parallel computing.

See `Rogue Wave Software -- totalview family `_.