SEA is based on the idea that two targets are similar if their ligand sets are similar to one another. The similarity of two ligand sets is computed as the sum of the ligand pair similarities that exceed a certain threshold, where ligand pair similarity is measured by Tanimoto similarity. To correct for bias from set size or chemical composition, a correction technique is introduced that is based on the similarity obtained from randomly drawn ligand sets. This leads to z-scores for the similarity between the sets. It is argued that the z-scores follow an extreme value distribution. Using this extreme value distribution, the probability that a compound is active on a certain target is calculated by assuming that one of the two ligand sets consists only of the compound to be predicted. We implemented the SEA method efficiently for use on a multi-core supercomputer, enabling us to compare it to the other target prediction methods. … Similarity Ensemble Approach (SEA)
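The scoring steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation referred to in the text: fingerprints are modelled as plain sets of integer feature indices, the threshold value of 0.5 is arbitrary, the background is estimated by naive resampling from a hypothetical `pool` of ligands, and the extreme value distribution is assumed to be a standardized Gumbel distribution (zero mean, unit variance). All function and parameter names are illustrative.

```python
import math
import random
from statistics import mean, pstdev

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of feature indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def raw_score(set_a, set_b, threshold=0.5):
    """Sum of all ligand pair similarities that exceed the threshold."""
    total = 0.0
    for x in set_a:
        for y in set_b:
            s = tanimoto(x, y)
            if s > threshold:
                total += s
    return total


def z_score(set_a, set_b, pool, threshold=0.5, n_random=200, seed=0):
    """Z-score of the raw score against randomly drawn ligand sets.

    Assumption: the background is estimated by drawing random sets of the
    same sizes from `pool`; this is a simplification chosen for brevity.
    """
    rng = random.Random(seed)
    observed = raw_score(set_a, set_b, threshold)
    background = [
        raw_score(rng.sample(pool, len(set_a)), rng.sample(pool, len(set_b)), threshold)
        for _ in range(n_random)
    ]
    mu, sigma = mean(background), pstdev(background)
    if sigma == 0:
        return 0.0  # degenerate background: no information
    return (observed - mu) / sigma


def gumbel_p_value(z):
    """P(Z > z) for a Gumbel distribution standardized to mean 0, variance 1.

    Assumption: this particular extreme value form is used for illustration.
    """
    x = math.exp(-z * math.pi / math.sqrt(6) - EULER_GAMMA)
    return -math.expm1(-x)  # equals 1 - exp(-x), computed stably
```

To score a single compound against a target, one would pass a one-element list as `set_a` and the target's ligand set as `set_b`, then convert the resulting z-score with `gumbel_p_value`.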