Contact Prediction ToolKit
A Python Interface to Contact Predictions
NEW: Now ConKit is also compatible with residue-residue distance predictions
ConKit is a Python library that provides a data object hierarchy and associated routine operations to work with and manipulate residue-residue contact prediction data. Main features shipped with this library include:
- Parsers for Multiple Sequence Alignment, contact prediction and residue distance prediction files
- Analysis functions for Multiple Sequence Alignment, contact prediction data and residue distance prediction data
- Visualisation of Multiple Sequence Alignment, contact prediction data and residue distance prediction data
- Validation of models based on residue distance predictions
- Python wrappers for contact-prediction-related software
For a general overview of ConKit, watch this video.
For an overview of model validation with ConKit, watch this video.
Changelog
[0.13.3]
Added
- conkit.core.ContactMap.match_naive method for contact map matching when no sequence alignment is required
- Examples on how to use conkit-validate in documentation at conkit.org
- Examples on how to do file conversions for distances in documentation at conkit.org
- Examples on how to plot residue distances in documentation at conkit.org
Changed
- Update requirements.txt to include versions of biopython and sklearn compatible with CCP4 8.0
- Update requirements list in documentation at conkit.org
Fixed
- Resolve contact map match when one of the input maps is empty
[0.13.2]
Fixed
- Further fixes to pip install package
[0.13.1]
Fixed
- Minor fix for pip install and cython extension
[0.13]
Added
- Added support for distance prediction files
- Added new visualisation plots for distance files
- Added new command line tool for model validation conkit-validate
Changed
- Remove support for Python3.6
- Add support for Python3.9
[0.12]
Fixed
- Resolve plotting of small contact maps
Changed
- Remove support for Python2.7
- Remove support for Python3.5
- Add support for Python3.8
[0.11.3]
Fixed
- Test cases ensure file removal regardless of failure
Changed
- Code formatting to adapt [Black](https://black.readthedocs.io/en/stable/) formatting
Added
- [map_align](https://github.com/sokrypton/map_align) contact file parser
- AppVeyor and TravisCI runs against Python3.7
[0.11.2]
Fixed
- Bug fix to avoid rare ZeroDivision
[0.11.1]
Changed
- conkit/core/ext/c_sequencefile.pyx: removed print statement
[0.11]
Added
- conkit.io routines now accept keyword arguments
- SAINT2 and ROSETTA distance restraints can now be written, format keywords are saint2 and rosetta
- StructureSelector added to score protein structures by contact satisfaction
[0.10.2]
- MANIFEST.ini file required by PyPi
[0.10.1]
- Critical bug fix in installation procedure and Cython-code compilation
[0.10]
Added
- Support for Python 3.7
- Cython added as dependency and SciPy removed
- conkit.misc.deprecate decorator for easier tagging
- ContactMap.match provides keyword to add_false_negatives found in the reference but not in contact map
- ContactMap.remove_false_negatives allows convenient removal of false negatives
- ContactMap.recall to calculate the recall of a contact map
- SequenceFile.summary for quick alignment summaries
- A2mParser to read HH-suite A2M alignment files
- Automatic sphinx-apidoc generation for up-to-date index
- ClustalParser to read CLUSTAL formatted files
Changed
- SequenceFile.calculate_freq backend changed from numpy to Cython for faster computation
- SequenceFile.calculate_weights backend changed from numpy to Cython for faster computation
- SequenceFile.filter backend changed from numpy to Cython for faster computation
- SequenceFile.filter_gapped backend changed from numpy to Cython for faster computation
- SequenceFile.calculate_weights renamed to SequenceFile.get_weights
- SequenceFile.compute_freq renamed to SequenceFile.get_frequency
- ContactMap.singletons backend changed from numpy to Cython for faster computation
- Bandwidth backend changed from numpy to Cython for faster computation
- ContactMap.short_range_contacts renamed to ContactMap.short_range
- ContactMap.medium_range_contacts renamed to ContactMap.medium_range
- ContactMap.long_range_contacts renamed to ContactMap.long_range
- ContactMap.calculate_scalar_score renamed to ContactMap.set_scalar_score
- ContactMap.calculate_contact_density renamed to ContactMap.get_contact_density
- ContactMap.calculate_jaccard_index renamed to ContactMap.get_jaccard_index
- ContactMatchState provides options for true positive, true negative, false positive and false negative, which can be added to contacts in the map at will
- Contact.is_match and Contact.define_match renamed to attribute Contact.true_positive
- Contact.is_mismatch and Contact.define_mismatch renamed to attribute Contact.false_positive
- Contact.is_unknown and Contact.define_unknown renamed to attribute Contact.status_unknown
- Entity, Gap and Residue classes made public
Fixed
- Bug fix in SequenceFile.filter to remove Sequence entries reliably
- Bug fix in ContactMapMatrixFigure when gap variable was less than 1
Removed
- Python 3.4 support
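The recall and Jaccard index mentioned in this release are standard set-based measures. The following is a minimal sketch of both, assuming contact maps are reduced to plain collections of residue index pairs. The function names `as_pair_set`, `recall` and `jaccard_index` are hypothetical helpers written for illustration and are not the ConKit implementation.

```python
def as_pair_set(pairs):
    """Normalise residue pairs so that (i, j) and (j, i) count as one contact."""
    return {tuple(sorted(pair)) for pair in pairs}


def recall(predicted, reference):
    """Fraction of reference contacts that also appear in the prediction."""
    pred, ref = as_pair_set(predicted), as_pair_set(reference)
    if not ref:
        # Assumption for this sketch: recall is 0.0 when there is no reference.
        return 0.0
    true_positives = len(pred & ref)
    false_negatives = len(ref - pred)
    return true_positives / (true_positives + false_negatives)


def jaccard_index(map_a, map_b):
    """Size of the intersection divided by the size of the union."""
    a, b = as_pair_set(map_a), as_pair_set(map_b)
    union = a | b
    if not union:
        # Assumption for this sketch: two empty maps are treated as identical.
        return 1.0
    return len(a & b) / len(union)


predicted = [(1, 5), (2, 8), (3, 9)]
reference = [(5, 1), (2, 8), (4, 10), (6, 12)]
print(recall(predicted, reference))
print(jaccard_index(predicted, reference))
```

In this example two of the four reference contacts are recovered, and the two maps share two of five distinct contacts.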
[0.9]
Added
- conkit.plot subpackage refactored to allow matplotlib access of Figure instances. This provides functionality similar to seaborn, so matplotlib.Axes can be provided into which a plot is drawn.
- ContactMap.as_list function to represent the contact map as a 2D-list of residue indexes
- conkit.misc.normalize function to apply Feature scaling normalization
- CONTRIB.rst file to list all contributors
- SequenceFile.diversity property defined by \(\sqrt{N}/L\)
- ContactMap.reindex to reindex a contact map given a new starting index
- ContactMap.singletons returns a copy of the contact map with singleton contacts, i.e. ones without neighbors
- Sequence.seq_encoded to allow turning a sequence into an encoded list
- Sequence.encoded_matrix to give the entire alignment as encoded matrix
- SequenceFile.filter_gapped to filter sequences with a certain threshold of gaps
- SequenceFile.to_string and ContactMap.to_string methods
- ContactMapMatrixFigure added to illustrate prediction signal of entire ContactMap
- Added support for nebcon contact prediction format
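Two of the additions above are simple formulas: the alignment diversity \(\sqrt{N}/L\) and feature scaling. The sketch below illustrates both in plain Python, assuming N is the number of sequences in the alignment, L is the sequence length, and "feature scaling" means min-max rescaling into a target interval. The names `alignment_diversity` and `min_max_scale` are hypothetical and do not mirror the ConKit signatures.

```python
import math


def alignment_diversity(n_sequences, sequence_length):
    """Return sqrt(N) / L for an alignment of N sequences of length L."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    return math.sqrt(n_sequences) / sequence_length


def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: nothing to spread out, so map them to new_min.
        return [new_min for _ in values]
    span = hi - lo
    return [new_min + (v - lo) * (new_max - new_min) / span for v in values]


print(alignment_diversity(400, 100))
print(min_max_scale([2.0, 4.0, 6.0]))
```

Handling the constant-input case explicitly avoids a division by zero; mapping such input to `new_min` is a choice made for this sketch.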
Changed
- Changed API interface for conkit.plot in accordance to necessary changes for above
- ContactMapFigure now accepts lim parameters for axes limits
- ContactMapFigure and ContactMapChordFigure improved to better space marker size
- Typos corrected in documentation
- THREE_TO_ONE and ONE_TO_THREE dictionaries modified to Enum objects
- SequenceFile.neff renamed to SequenceFile.meff
- ContactMapChordFigure.get_radius_around_circle moved to conkit.plot.tools.radius_around_circle
- AmiseBW.curvature renamed to AmiseBW.gauss_curvature
Fixed
- A3mParser keyword argument mismatch sorted
[0.8.4]
Added
- Entity.top property to always return the first child in the list
- ContactMap.find function accepts strict keyword argument to find contact pairs with both residues in register
- PdbParser takes a distance cutoff of 0 to include all Cb-Cb contacts in the protein structure
- ContactMatchState enumerated type for definitions of state constants for contact
- SequenceAlignmentState enumerated type for definitions of state constants for each sequence file
- NcontParser added to extract contact pairs identified by NCONT (CCP4 Software Suite)
Changed
- Optimized some functions and comparisons according to the recommended Python optimization instructions
- ContactMap.match does __not__ modify other by default anymore. Specify match_other=True as kwarg!
- ContactMap.calculate_kernel_density renamed to ContactMap.calculate_contact_density
- ContactDensityFigure draws domain boundary lines instead of symbols
[0.8.3]
Added
- requirements.txt file re-added for easier dependency installation
- LinearBW calculator added for linear bandwidth calculation in analysis
- seq_ascii property to Sequence for encoded sequence
- ascii_matrix property to SequenceFile for encoded alignment
- SequenceFile and ContactFile classes have new empty properties
- flib format for ContactFile classes to allow easier conversions for the Flib-Coevo fragment picking library
Changed
- Distance definitions accept floating point values
- _BandwidthCalc class renamed to BandwidthBase
- Abstractified BandwidthBase, and Parser with all subparser classes
- Refactored conkit/io/__init__.py to avoid duplication of code
Fixed
- PconsParser class accepts negative raw_score values
- SequenceFile.neff returns float instead of int
- CCMpredParser.read() returns empty ContactFile when matrix file empty
[0.8.2]
Added
- Test function skipping added for SequenceFile.filter() when SciPy not installed
Changed
- Renamed conkit/io/tests files for filenames to agree with modules in conkit/io
- Performance of write() in parsers improved by construction of string and single call to write() of filehandle
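The write() optimisation described above can be sketched as follows. This is an illustration of the general technique only, assuming a simple three-column contact record. The functions `write_contacts_per_line` and `write_contacts_joined` are hypothetical and are not taken from the ConKit parsers.

```python
import io


def write_contacts_per_line(handle, contacts):
    """Baseline: one write() call on the file handle per contact."""
    for res1, res2, score in contacts:
        handle.write(f"{res1} {res2} {score:.6f}\n")


def write_contacts_joined(handle, contacts):
    """Build the full output string first, then call write() once."""
    content = "".join(
        f"{res1} {res2} {score:.6f}\n" for res1, res2, score in contacts
    )
    handle.write(content)


contacts = [(1, 10, 0.95), (2, 15, 0.80), (7, 30, 0.42)]

per_line, joined = io.StringIO(), io.StringIO()
write_contacts_per_line(per_line, contacts)
write_contacts_joined(joined, contacts)

# Both approaches produce identical output; only the number of write() calls differs.
print(per_line.getvalue() == joined.getvalue())
```

The single-call variant trades a little memory for fewer calls into the file object, which tends to matter most for large files.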
Fixed
- Critical bug fix for automated opening of filehandle in Python2.7
[0.8.1]
Changed
- Revoked catching of SystemExit(0) exception in scripts when invoked with --help flag
Fixed
- Bug fix relating to Python3 automatic opening of file handles - Thanks to Miguel Correa for reporting this bug
[0.8]
Added
- Logging message coloring according to message level
- filter() function added for redundancy/distant homolog removal from SequenceFile
- License text added to each module
- io sub-package caches modules and imports upon request
Changed
- Default value in calculate_meff() and calculate_weights() changed from 0.7 to 0.8 [more commonly used in literature]
- core classes extracted to individual module files
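To show why the default matters, here is a minimal sketch of a common sequence weighting scheme. It assumes the changed default is a sequence identity cutoff, that each sequence is weighted by one over the number of sequences at or above that cutoff (itself included), and that the number of effective sequences is the sum of the weights. The function names are hypothetical, and the exact comparison used by ConKit may differ.

```python
def sequence_identity(seq_a, seq_b):
    """Fraction of aligned positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


def sequence_weights(alignment, identity=0.8):
    """Weight each sequence by 1 / (number of sequences within the identity cutoff)."""
    weights = []
    for seq in alignment:
        # The sequence always counts itself, so the neighbour count is at least 1.
        neighbours = sum(
            1 for other in alignment if sequence_identity(seq, other) >= identity
        )
        weights.append(1.0 / neighbours)
    return weights


def effective_sequences(alignment, identity=0.8):
    """Number of effective sequences: the sum of all sequence weights."""
    return sum(sequence_weights(alignment, identity))


alignment = ["AAAAAAAAAA", "AAAAAAAAAC", "AAAAAAACCC", "CCCCCCCCCC"]
print(effective_sequences(alignment, identity=0.7))
print(effective_sequences(alignment, identity=0.8))
```

In this toy alignment the looser 0.7 cutoff clusters more sequences together, so it yields a smaller effective sequence count than 0.8.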
Fixed
- Bug fix with PyPi installation where requirements.txt not found; fix includes removal of requirements.txt and addition of install_requires to setup.py instead - Thanks to Miguel Correa for reporting this bug