Script options

The following are the options of the two main scripts. These options are printed using the -h flag, for example:

$> ./NodeSimilarity.pl -h

Options for NodeSimilarity.pl

Generic call:

$> ./NodeSimilarity.pl -w $Option1 -d $Option2 -t $Option3 -f $File

Options:

 FLAGS: -h  Prints a help message.
        -w  0 if the network is not weighted, 1 otherwise (required).
        -d  0 if the network is undirected, 1 otherwise (required).
        -t  0 if all links are of the same type, 1 if they are of different types (required).
        -f  Path to the input file (required).

 INPUT: A TAB-separated file describing a network with the format:
        [1] If the network is undirected the general format is: 
                   NodeA   NodeB    Weight  Type
            where: 
            ... "NodeA" is the source node and "NodeB" the target node.
            ... "Weight" is a real value indicating the strength of the link  (can be positive or negative, if
                negative the absolute value will be taken in the computation of the Tanimoto coefficients).
            ... "Type" is a string indicating the type of the link (e.g. mutualistic=0, competitive=1).
            ... The script accepts an indefinite number of header lines starting with # 

        [2] If the network is directed the general format is the same, but it is assumed that
            the direction is encoded in the order in which the names of the nodes appear, i.e.
            NodeA   NodeB  vs. NodeB   NodeA. There is no specific assumption about which node
            is the source and which the target, since the algorithm only needs to know that the
            order matters in order to treat the two orderings as different types of links. This
            means that it is also possible to encode directed links using the field Type, as
            explained in section [4].

        [3] If the network has both directed and undirected links, the flag -d 1 should be
            used (i.e. as if the network were directed), and each pair of nodes joined by
            an undirected link should appear twice, once in each direction, with the same weight:
                            NodeA    NodeB    Weight   Type
                            NodeB    NodeA    Weight   Type
            An alternative way to encode this situation is explained in section [4].

        [4] An alternative way to encode directed links is to treat each direction
            (or the lack of direction, if there are also undirected links) as an attribute of the field Type.
            If each link also has a qualitative attribute of its own, both must be combined in that field.
            For instance, suppose you have directed and undirected links with an
            attribute that can be White or Black:
                       NodeA NodeB Weight White   (undirected)
                       NodeB NodeC Weight White   (directed)
                       NodeA NodeC Weight Black   (directed)
            The field Type could then be transformed as follows:
                       NodeA NodeB Weight UndirWhite
                       NodeB NodeC Weight DirWhite
                       NodeA NodeC Weight DirBlack
            and the options for an undirected network (section [1]) are then used.

        [5] If the network has no weights, either use the general format (with -w 1) with all weights equal to one,
            or use -w 0, in which case the file can simply be formatted as:
                            NodeA   NodeB   Type

        [6] If the network has no types, either use the general format (with -t 1) with all types the same,
            or use -t 0, in which case the file can simply be formatted as:
                            NodeA   NodeB   Weight

        [7] If the network has neither types nor weights, either use the general format (with -t 1 and -w 1)
            with all types the same and all weights equal to one, or use -t 0 and -w 0,
            in which case the file can simply be formatted as:
                            NodeA   NodeB
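
The encodings described in sections [3] and [4] can be produced mechanically from a list of links. The Python sketch below is only illustrative: the function names, the in-memory tuple layout, the example weights and the `Dir`/`Undir` prefixes are assumptions made for this example and are not part of NodeSimilarity.pl.

```python
# Illustrative sketch only; names and data layout are hypothetical.
# Each link is (node_a, node_b, weight, link_type, is_directed).

def mirror_undirected(links):
    """Encoding of section [3] (for use with -d 1): every undirected
    link is written twice, once in each direction, with the same weight."""
    rows = []
    for a, b, weight, link_type, directed in links:
        rows.append((a, b, weight, link_type))
        if not directed:
            rows.append((b, a, weight, link_type))
    return rows


def encode_direction_in_type(links):
    """Encoding of section [4] (for use with -d 0): the direction, or the
    lack of it, is folded into the Type field as a prefix."""
    rows = []
    for a, b, weight, link_type, directed in links:
        prefix = "Dir" if directed else "Undir"
        rows.append((a, b, weight, prefix + link_type))
    return rows


def to_tsv(rows):
    """Serialise rows as TAB-separated lines, one link per line."""
    return "\n".join("\t".join(str(f) for f in row) for row in rows) + "\n"


# Hypothetical example data matching the White/Black example of section [4].
links = [
    ("NodeA", "NodeB", 1.0, "White", False),
    ("NodeB", "NodeC", 0.5, "White", True),
    ("NodeA", "NodeC", 2.0, "Black", True),
]

print(to_tsv(mirror_undirected(links)), end="")
print(to_tsv(encode_direction_in_type(links)), end="")
```

Either output can be written to a file and passed to the script with -f, using -d 1 for the first encoding and -d 0 for the second.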

 OUTPUT: A file describing a similarity matrix of the format:
         NodeA   NodeB   TanimotoCoeff  JaccardCoeff
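
As a point of reference for the two output columns, the sketch below gives the textbook definitions of the Jaccard coefficient of two sets and the Tanimoto coefficient of two real-valued vectors. How NodeSimilarity.pl builds the neighbour sets and weight vectors of each node is not described here, so the inputs (a set of neighbours, and a dictionary from neighbour to weight) and the convention of returning 0.0 for empty inputs are assumptions of this example. Absolute values of the weights are taken, as stated in the input description.

```python
# Illustrative sketch of the standard definitions; not the script's code.

def jaccard(neighbors_a, neighbors_b):
    """|A intersect B| / |A union B| for two sets of neighbours."""
    a, b = set(neighbors_a), set(neighbors_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen for this sketch
    return len(a & b) / len(union)


def tanimoto(weights_a, weights_b):
    """x.y / (|x|^2 + |y|^2 - x.y) for two sparse weight vectors,
    given as dicts {neighbour: weight}. Negative weights are replaced
    by their absolute value."""
    x = {k: abs(v) for k, v in weights_a.items()}
    y = {k: abs(v) for k, v in weights_b.items()}
    dot = sum(x[k] * y[k] for k in x.keys() & y.keys())
    denom = sum(v * v for v in x.values()) + sum(v * v for v in y.values()) - dot
    if denom == 0:
        return 0.0  # convention chosen for this sketch
    return dot / denom


# Hypothetical neighbourhoods of two nodes.
print(jaccard({"B", "C"}, {"C", "D"}))
print(tanimoto({"B": 1.0, "C": -2.0}, {"C": 2.0, "D": 1.0}))
```

When all weights are 0 or 1, the Tanimoto coefficient defined this way reduces to the Jaccard coefficient.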

 EXAMPLE USAGE: ./NodeSimilarity.pl -w 1 -d 1 -t 1 -f path2network

 COMMENTS: In addition, if you want to change the order of the input columns, you can do so
        by editing the function "readParameters".

Options for NodeLinkage.pl

Generic call:

$> ./NodeLinkage.pl -fs path2SimilarityMatrix -fn path2OriginalNetwork -s $option1 -v $option2 -a $option3 -c $option4

Options:

  INPUT: All the inputs, required and optional, are introduced with a flag: 
  - Required arguments 
  
       -fs path_to_file 
           A tab-separated matrix in long format with the all-against-all topological similarities between nodes,
           as computed by NodeSimilarity.pl, with the format:
                  NodeA NodeB  Similarity1(A,B) Similarity2(A,B) .... 
                  NodeA NodeC  Similarity1(A,C) Similarity2(A,C) .... 
                  .... 
  
       -fn  path_to_file 
          The tab-separated network from which the above similarity matrix was derived, i.e. the input of NodeSimilarity.pl, with the format:
                  NodeA NodeB   Weight
          If the name of the file is "Network"-label, "label" will be used in
          the names of the output files; otherwise default names are used.
  
        Note: Both input files accept a header starting with the character # 
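
To make the long format concrete, here is a minimal Python sketch that reads such a similarity file, skipping header lines that start with # and picking one similarity column. The function name `read_similarities`, the `column` parameter (1-based, mirroring the column numbering used by the -c option) and the sample data are assumptions of this example, not part of NodeLinkage.pl.

```python
import io


def read_similarities(handle, column=3):
    """Read a TAB-separated long-format similarity file.

    Returns {(node_a, node_b): similarity}. `column` is 1-based:
    3 is the first similarity column, 4 the second.
    """
    sims = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        sims[(fields[0], fields[1])] = float(fields[column - 1])
    return sims


# Hypothetical file content; a real file would be opened with open(path).
example = io.StringIO(
    "# NodeA\tNodeB\tTanimoto\tJaccard\n"
    "n1\tn2\t0.80\t0.50\n"
    "n1\tn3\t0.10\t0.25\n"
)
sims = read_similarities(example, column=4)
print(sims)
```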
  
   - Optional arguments: 
  
      -h 
          Prints this help and exits 
  
      -c integer 
          The column of the similarity file that holds the similarity measure to be used. It
          is an option because the output of NodeSimilarity.pl provides the Tanimoto and Jaccard
          coefficients in different columns. Defaults to column 3 (Tanimoto); column 4 is Jaccard.
  
      -a method 
          Where method determines the clustering method. Valid arguments are "Average" for Average linkage, 
          "Single" for single linkage and  "Complete" for complete linkage. Default is Average. 
  
      -s stop_criteria 
          Where stop_criteria is a string determining the criterion used to stop the clustering. It may be a threshold on
          the similarity (argument "thres"), a stopping step (argument "step"), or no criterion at all, in which case
          the clustering continues until there is a single cluster (argument "none"). Default value is "none".
  
      -v value 
          If a stop criterion is given, this flag must be used to provide its value: either a
          similarity threshold (for "thres") or a clustering step (for "step").
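
The interplay between the linkage method and the stop criterion can be illustrated with a generic agglomerative clustering on similarities. This is a sketch under stated assumptions, not the algorithm implemented in NodeLinkage.pl: the function `cluster`, the treatment of missing pairs as similarity 0.0, the step counter starting at zero, and the rule that "thres" stops as soon as the best available similarity falls below the value are all choices made for this example.

```python
from itertools import combinations

# Linkage rules expressed on similarities (higher means closer):
# single linkage uses the most similar pair of members, complete
# linkage the least similar pair, average linkage the mean.
LINKAGE = {
    "Single": max,
    "Complete": min,
    "Average": lambda values: sum(values) / len(values),
}


def cluster(nodes, sim, method="Average", stop="none", value=None):
    """Merge the two most similar clusters until the stop criterion holds.

    sim maps frozenset({a, b}) -> similarity; missing pairs count as 0.0
    (assumption of this sketch).
    """
    link = LINKAGE[method]
    clusters = [frozenset([n]) for n in nodes]
    step = 0
    while len(clusters) > 1:
        if stop == "step" and step >= value:
            break
        best_pair, best_score = None, None
        for c1, c2 in combinations(clusters, 2):
            score = link([sim.get(frozenset((a, b)), 0.0) for a in c1 for b in c2])
            if best_score is None or score > best_score:
                best_pair, best_score = (c1, c2), score
        if stop == "thres" and best_score < value:
            break
        c1, c2 = best_pair
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        step += 1
    return clusters


# Hypothetical similarities between four nodes.
nodes = ["a", "b", "c", "d"]
sim = {
    frozenset("ab"): 0.9, frozenset("cd"): 0.8, frozenset("bc"): 0.4,
    frozenset("ac"): 0.1, frozenset("ad"): 0.1, frozenset("bd"): 0.1,
}
print(cluster(nodes, sim, method="Single", stop="step", value=2))
```

With this data, the first two merges are the same for all three methods; the methods differ in how they score the final merge of the two resulting clusters, which is what a threshold criterion acts on.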
  
   OUTPUT:  
      A list of files: 
      - If no stop criterion is given:
           HistExtend.NoStop.InputFile
           HistCompact.NoStop.InputFile
      - With a stop criterion:
           HistExtend.StopCriteria.InputFile 
           HistCompact.StopCriteria.InputFile 
           Clusters.StopCriteria.InputFile  
           Partition.StopCriteria.InputFile 
      where:
        -- HistExtend file: explicitly describes the clustering process, showing the two clusters
           that are joined (their nodes, edges, etc.).
        -- HistCompact file: provides only the relevant quantities of the two clusters joined (number of edges,
           number of elements contained) and of the clustering (step, partition densities, etc.).
        -- Clusters file: describes the different clusters at the stopping point.
        -- Partition file: a vector assigning to each node the id of the cluster it belongs to.
  EXAMPLE USAGE:  
   ./NodeLinkage.pl -fs path2SimilarityMatrix -fn path2OriginalNetwork -s step -v 145 -a Single -c 4 
  
