Contact Prediction ToolKit
A Python Interface to Contact Predictions
NEW: Now ConKit is also compatible with residue-residue distance predictions
ConKit is a Python library that provides a data object hierarchy and associated routines for working with and manipulating residue-residue contact prediction data. The main features shipped with this library include:
- Parsers for Multiple Sequence Alignment, contact prediction and residue distance prediction files
- Analysis functions for Multiple Sequence Alignment, contact prediction data and residue distance prediction data
- Visualisation of Multiple Sequence Alignment, contact prediction data and residue distance prediction data
- Validation of models based on residue distance predictions
- Python wrappers for contact-prediction-related software
For a general overview of ConKit, watch this video.
For an overview of model validation with ConKit, watch this video.
Changelog
[0.13.3]
Added
- conkit.core.ContactMap.match_naive method for contact map matching when no sequence alignment is required
- Examples on how to use conkit-validate in documentation at conkit.org
- Examples on how to perform file conversions for distances in documentation at conkit.org
- Examples on how to plot residue distances in documentation at conkit.org
Changed
- Update requirements.txt to include versions of biopython and sklearn compatible with CCP4 8.0
- Update requirements list in documentation at conkit.org
Fixed
- Resolve contact map match when one of the input maps is empty
[0.13.2]
Fixed
- Further fixes to pip install package
[0.13.1]
Fixed
- Minor fix for pip install and cython extension
[0.13]
Added
- Added support for distance prediction files
- Added new visualisation plots for distance files
- Added new command line tool for model validation conkit-validate
Changed
- Remove support for Python3.6
- Add support for Python3.9
[0.12]
Fixed
- Resolve plotting of small contact maps
Changed
- Remove support for Python2.7
- Remove support for Python3.5
- Add support for Python3.8
[0.11.3]
Fixed
- Test cases ensure file removal regardless of failure
Changed
- Code formatting adapted to [Black](https://black.readthedocs.io/en/stable/) style
Added
- [map_align](https://github.com/sokrypton/map_align) contact file parser
- AppVeyor and TravisCI runs against Python3.7
[0.11.2]
Fixed
- Bug fix to avoid rare ZeroDivision
[0.11.1]
Changed
- conkit/core/ext/c_sequencefile.pyx removed print statement
[0.11]
Added
- conkit.io routines now accept keyword arguments
- SAINT2 and ROSETTA distance restraints can now be written, format keywords are saint2 and rosetta
- StructureSelector added to score protein structures by contact satisfaction
[0.10.2]
- MANIFEST.in file required by PyPi
[0.10.1]
- Critical bug fix in installation procedure and Cython-code compilation
[0.10]
Added
- Support for Python 3.7
- Cython added as dependency and SciPy removed
- conkit.misc.deprecate decorator for easier tagging
- ContactMap.match provides keyword to add_false_negatives found in the reference but not in contact map
- ContactMap.remove_false_negatives allows convenient removal of false negatives
- ContactMap.recall to calculate the recall of a contact map
- SequenceFile.summary for quick alignment summaries
- A2mParser to read HH-suite A2M alignment files
- Automatic sphinx-apidoc generation for up-to-date index
- ClustalParser to read CLUSTAL formatted files
Changed
- SequenceFile.calculate_freq backend changed from numpy to Cython for faster computation
- SequenceFile.calculate_weights backend changed from numpy to Cython for faster computation
- SequenceFile.filter backend changed from numpy to Cython for faster computation
- SequenceFile.filter_gapped backend changed from numpy to Cython for faster computation
- SequenceFile.calculate_weights renamed to SequenceFile.get_weights
- SequenceFile.compute_freq renamed to SequenceFile.get_frequency
- ContactMap.singletons backend changed from numpy to Cython for faster computation
- Bandwidth backend changed from numpy to Cython for faster computation
- ContactMap.short_range_contacts renamed to ContactMap.short_range
- ContactMap.medium_range_contacts renamed to ContactMap.medium_range
- ContactMap.long_range_contacts renamed to ContactMap.long_range
- ContactMap.calculate_scalar_score renamed to ContactMap.set_scalar_score
- ContactMap.calculate_contact_density renamed to ContactMap.get_contact_density
- ContactMap.calculate_jaccard_index renamed to ContactMap.get_jaccard_index
- ContactMatchState provides options for true positive, true negative, false positive and false negative, which can be added to contacts in the map at will
- Contact.is_match and Contact.define_match renamed to attribute Contact.true_positive
- Contact.is_mismatch and Contact.define_mismatch renamed to attribute Contact.false_positive
- Contact.is_unknown and Contact.define_unknown renamed to attribute Contact.status_unknown
- Entity, Gap and Residue classes made public
Fixed
- Bug fix in SequenceFile.filter to remove Sequence entries reliably
- Bug fix in ContactMapMatrixFigure when gap variable was less than 1
Removed
- Python 3.4 support
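The recall and Jaccard index quantities named in the 0.10 entries can be illustrated with a minimal, library-independent sketch. It does not use or reproduce ConKit's implementation: contacts are represented here as plain sets of residue-index pairs, the function names `recall` and `jaccard_index` are hypothetical helpers, and the return values chosen for empty inputs are assumptions made for this example.

```python
def normalise_pairs(pairs):
    """Store each contact as an ordered (low, high) tuple so (3, 9) == (9, 3)."""
    return {tuple(sorted(pair)) for pair in pairs}


def recall(predicted, reference):
    """Fraction of reference contacts that also occur in the prediction."""
    pred, ref = normalise_pairs(predicted), normalise_pairs(reference)
    if not ref:
        # Assumption for this sketch: recall of an empty reference is 0.0
        return 0.0
    return len(pred & ref) / len(ref)


def jaccard_index(map_a, map_b):
    """Size of the intersection divided by the size of the union."""
    a, b = normalise_pairs(map_a), normalise_pairs(map_b)
    union = a | b
    if not union:
        # Assumption for this sketch: two empty maps count as identical
        return 1.0
    return len(a & b) / len(union)


predicted = [(1, 10), (2, 12), (5, 20)]
reference = [(10, 1), (2, 12), (7, 30), (8, 40)]
print(recall(predicted, reference))
print(jaccard_index(predicted, reference))
```

In this example two of the four reference contacts are recovered, and the two maps share two of five distinct contacts.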
[0.9]
Added
- conkit.plot subpackage refactored to allow matplotlib access of Figure instances. This provides functionality similar to seaborn, so matplotlib.Axes can be provided into which a plot is drawn.
- ContactMap.as_list function to represent the contact map as a 2D-list of residue indexes
- conkit.misc.normalize function to apply Feature scaling normalization
- CONTRIB.rst file to list all contributors
- SequenceFile.diversity property defined by \(\sqrt{N}/L\)
- ContactMap.reindex to reindex a contact map given a new starting index
- ContactMap.singletons returns a copy of the contact map with singleton contacts, i.e. ones without neighbors
- Sequence.seq_encoded to allow turning a sequence into an encoded list
- Sequence.encoded_matrix to give the entire alignment as encoded matrix
- SequenceFile.filter_gapped to filter sequences with a certain threshold of gaps
- SequenceFile.to_string and ContactMap.to_string methods
- ContactMapMatrixFigure added to illustrate prediction signal of entire ContactMap
- Added support for nebcon contact prediction format
Changed
- Changed API interface for conkit.plot in accordance with necessary changes for above
- ContactMapFigure now accepts lim parameters for axes limits
- ContactMapFigure and ContactMapChordFigure improved to better space marker size
- Typos corrected in documentation
- THREE_TO_ONE and ONE_TO_THREE dictionaries modified to Enum objects
- SequenceFile.neff renamed to SequenceFile.meff
- ContactMapChordFigure.get_radius_around_circle moved to conkit.plot.tools.radius_around_circle
- AmiseBW.curvature renamed to AmiseBW.gauss_curvature
Fixed
- A3mParser keyword argument mismatch sorted
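Two of the 0.9 additions are defined by simple formulas: the alignment diversity \(\sqrt{N}/L\) and feature scaling normalization. The sketch below illustrates both in plain Python. It is not ConKit's code: `alignment_diversity` and `feature_scale` are hypothetical names, N is assumed to be the number of sequences in the alignment and L the alignment length, and feature scaling is assumed to mean min-max rescaling into a target interval.

```python
import math


def alignment_diversity(n_sequences, alignment_length):
    """Return sqrt(N) / L for an alignment (hypothetical helper)."""
    if alignment_length <= 0:
        raise ValueError("alignment_length must be positive")
    return math.sqrt(n_sequences) / alignment_length


def feature_scale(values, vmin=0.0, vmax=1.0):
    """Min-max scale values into [vmin, vmax] (hypothetical helper)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to the lower bound
        return [vmin for _ in values]
    return [vmin + (v - lo) * (vmax - vmin) / (hi - lo) for v in values]


# 400 sequences of length 100 give sqrt(400) / 100
print(alignment_diversity(400, 100))
# Raw scores rescaled so the smallest maps to 0 and the largest to 1
print(feature_scale([2.0, 4.0, 6.0]))
```

Under these assumptions, a deeper alignment raises the diversity value while a longer target sequence lowers it.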
[0.8.4]
Added
- Entity.top property to always return the first child in the list
- ContactMap.find function accepts strict keyword argument to find contact pairs with both residues in register
- PdbParser takes a distance cutoff of 0 to include all Cb-Cb contacts in the protein structure
- ContactMatchState enumerated type for definitions of state constants for contact
- SequenceAlignmentState enumerated type for definitions of state constants for each sequence file
- NcontParser added to extract contact pairs identified by NCONT (CCP4 Software Suite)
Changed
- Optimized some functions and comparisons according to the recommended Python optimization instructions
- ContactMap.match does __not__ modify other by default anymore. Specify match_other=True as kwarg!
- ContactMap.calculate_kernel_density renamed to ContactMap.calculate_contact_density
- ContactDensityFigure draws domain boundary lines instead of symbols
[0.8.3]
Added
- requirements.txt file re-added for easier dependency installation
- LinearBW calculator added for linear bandwidth calculation in analysis
- seq_ascii property to Sequence for encoded sequence
- ascii_matrix property to SequenceFile for encoded alignment
- SequenceFile and ContactFile classes have new empty properties
- flib format for ContactFile classes to allow easier conversions for the Flib-Coevo fragment picking library
Changed
- Distance definitions accept floating point values
- _BandwidthCalc class renamed to BandwidthBase
- Abstractified BandwidthBase, and Parser with all subparser classes
- Refactored conkit/io/__init__.py to avoid duplication of code
Fixed
- PconsParser class accepts negative raw_score values
- SequenceFile.neff returns float instead of int
- CCMpredParser.read() returns empty ContactFile when matrix file empty
[0.8.2]
Added
- Test function skipping added for SequenceFile.filter() when SciPy not installed
Changed
- Renamed conkit/io/tests files for filenames to agree with modules in conkit/io
- Performance of write() in parsers improved by construction of string and single call to write() of filehandle
Fixed
- Critical bug fix for automated opening of filehandle in Python2.7
[0.8.1]
Changed
- Revoked catching of SystemExit(0) exception in scripts when invoked with --help flag
Fixed
- Bug fix relating to Python3 automatic opening of file handles - Thanks to Miguel Correa for reporting this bug
[0.8]
Added
- Logging message coloring according to message level
- filter() function added for redundancy/distant homolog removal from SequenceFile
- License text added to each module
- io sub-package caches modules and imports upon request
Changed
- Default value in calculate_meff() and calculate_weights() changed from 0.7 to 0.8 [more commonly used in literature]
- core classes extracted to individual module files
Fixed
- Bug fix with PyPi installation where requirements.txt not found; fix includes removal of requirements.txt and addition of install_requires to setup.py instead - Thanks to Miguel Correa for reporting this bug