MDjeep: a software tool for Distance Geometry

MD-jeep is a software tool for solving discretizable distance geometry problems (DDGPs).
Current release: MDjeep 0.3.2

Since the release 0.3.0, MD-jeep is hosted on GitHub.
Copyright (C) 2020, Mucherino, Goncalves, Lavor, Liberti, Lin, Maculan.
GNU General Public License v.3.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. See <https://www.gnu.org/licenses/> for additional information.

Since version 0.3.0, MDjeep is able to solve instances containing both exact and interval distance values. Although initially written for problems arising in the context of structural biology, MDjeep is a general solver capable to solve instances from various applications. General "graph" notions are therefore employed (vertex, distance edge).

Discretizable Distance Geometry consists of a subclass of problems for which the search space can be discretized and reduced to a tree. Given a graph G=(V,E,d), with vertex set V, edge set E indicating whether the distance between two vertices is known or not, and a weight function d providing the numerical values for such distances, an instance of this problem falls in the discretizable subclass where there exist a vertex order on V such that:

  1. the first 3 vertices in the order form a clique with exact distances;
  2. for all other vertices with rank i > 3, there must exist three reference vertices j1, j2 and j3, such that:
    j1 < i, j2 < i, j3 < i, (j1,i) ∈ E, (j2,i) ∈ E, (j3,i) ∈ E.

In this version, we suppose that only one of the three distances d(j1,i), d(j2,i) and d(j3,i) can be represented by an interval, while the others are supposed to be exact (ie, its lower and upper bounds are closer than the predefined error tolerance).

Two methods are currently implemented in MDjeep for the solution of the instances. The Branch-and-Prune (BP) algorithm is specifically designed to solve instances satisfying the discretization assumptions given above. The Spectral Projected Gradient (SPG) is an algorithm for local optimization, which may be either run alone, or as a refinement step in BP. For more information, please refer to our publications.

Since version 0.3.2, MDjeep accepts in input MDfiles (with mdf extension). These are text files containing some main specifications for loading the problem instances, and for running the solution methods:

   syntax: ./mdjeep [options] mdfile.mdf

The MDfile is supposed to contain the specifications for a certain number of predefined "fields". Every field key-word is followed by its name; key-word and name need to be separated by a colon (:). Every value given in MDfiles appears on a single line and needs to respect the following syntax (blank characters and tabs cannot be contained in names or values, as they both act as separators):

   name [colon] value

After the definition of a field's name, every new line starting with the key-word "with" allows to set up one of the attributes of the field. Some attributes may have default values, so that it is not strictly necessary to specify them in the MDfile; other attributes are mandatory and their absence in the MDfile will cause the termination of MDjeep with code 1.

In the MDfile, the first mandatory field is "instance". Any string of characters is a valid name for the instance. The attributes "file", "format" and "separator" need to be specified in the MDfile in the subsequent lines starting with the key-word "with":

  • file: it's the path and name of the distance file, where distances are arranged line by line
  • format: it's the format that MDjeep expects to find for the distance file
  • separator: this is the character that serves as a separator in the distance file
The format can include the following elements:
  • Id1 (nonnegative integer, mandatory), identifier of vertex 1 (in the line)
  • Id2 (nonnegative integer, mandatory), identifier of vertex 2 (in the line)
  • groupId1 (integer), group identifier of vertex 1
  • groupId2 (integer), group identifier of vertex 2
  • Name1 (char string), the name of vertex 1
  • Name2 (char string), the name of vertex 2
  • groupName1 (char string), the group name of vertex 1
  • groupName2 (char string), the group name of vertex 2
  • lb (double, mandatory), the lower bound for the distance between vertex 1 and 2
  • ub (double, mandatory), the upper bound for the distance between vertex 1 and 2
Notice that:
  • if the distance file contains additional information that MDjeep does not need to load, the format element "ignore" can be used to skip this information;
  • the integer vertex labels need to be consecutive, but the smallest label is not supposed to be equal to 0 (neither to 1); the only constraint for the smallest label is that it needs to be nonnegative;
  • the format compatible with MDjeep versions 0.1 and 0.2 is "Id1 Id2 lb ub Name1 Name2 groupName1 groupName2";
  • the format introduced in MDjeep 0.3.0 is "Id1 Id2 groupId1 groupId2 lb ub Name1 Name2 groupName1 groupName2";
  • the separator is one single character, and it needs to be specified between single quotes; blank characters and tabs are always separators, so if not specified, the default separators are all blank characters and tabs.

Another mandatory field of the MDfile is the "method". Two method names can be specified in the current version of MDjeep: either "bp", or "spg" (see above). In both cases, a predefined set of attributes can then be specified on the following lines of the MDfile with the key-word "with". The reader can refer to the examples of MDfile provided with our instances to discover the several attributes that can be set up. Many of such attributes have default values: if not specifed in the MDfile, the default value are automatically used. Other attributes are mandatory: when using SPG as a main method, for example, the path and name of the text file containing the starting point (attribute "startpoint"), as well as the maximum number of iterations (attribute "maxit"), both need to be specified. In this version of MDjeep, it is mandatory for the bp method to have a refinement method: this can be specified via the field "refinement". Since only bp and spg are currently implemented in MDjeep, the only option for bp for a refinement method is currently spg. The key-word "with" can be invoked multiple times for the same attribute in the same MDfile: in such a case, the last specified value is the one that will actually be considered.

Notice that it is possible to include comments in the MDfiles: very line starting with the character '#' is ignored by MDjeep. Even if not specified as a separator, blank characters and tabs cannot be part of attribute values, and therefore they work as sort of "general separators".

More information about the MDfiles and MDjeep options can be found on the GitHub repository (see README file).
If you refer to this sofware in your publications, please cite the appropriate paper(s).


Back Home