UniProt client

protmapper.uniprot_client.get_chains(protein_id)[source]

Return the list of cleaved chains for the given protein.

Parameters:protein_id (str) – The UniProt ID of the protein whose cleaved chains are to be returned.
Returns:A list of Feature named tuples representing each chain.
Return type:list of Feature
protmapper.uniprot_client.get_entrez_id(protein_id)[source]

Return the Entrez ID given the protein id of a human protein.

Parameters:protein_id (str) – UniProt ID of the human protein
Returns:Entrez ID of the human gene or None if not available.
Return type:str or None
protmapper.uniprot_client.get_family_members(family_name, human_only=True)[source]

Return the HGNC gene symbols which are the members of a given family.

Parameters:
  • family_name (str) – Family name to be queried.
  • human_only (bool) – If True, only human proteins in the family will be returned. Default: True
Returns:

gene_names – The HGNC gene symbols corresponding to the given family.

Return type:

list

protmapper.uniprot_client.get_feature_by_id(feature_id)[source]

Return a Feature based on its unique feature ID.

Parameters:feature_id (str) – A Feature ID, of the form PRO_*.
Returns:A Feature with the given ID.
Return type:Feature or None
protmapper.uniprot_client.get_feature_of(feature_id)[source]

Return the UniProt ID of the protein to which the given feature belongs.

Parameters:feature_id (str) – A Feature ID, of the form PRO_*.
Returns:A UniProt ID corresponding to the given feature, or None if not available (generally shouldn’t happen, unless the feature ID is invalid).
Return type:str or None
protmapper.uniprot_client.get_features(protein_id)[source]

Return a list of features (chains, peptides) for a given protein.

Parameters:protein_id (str) – The UniProt ID of the protein whose features are to be returned.
Returns:A list of Feature named tuples representing each Feature.
Return type:list of Feature
protmapper.uniprot_client.get_function(protein_id)[source]

Return the function description of a given protein.

Parameters:protein_id (str) – The UniProt ID of the protein.
Returns:The function description of the protein.
Return type:str
protmapper.uniprot_client.get_gene_name(protein_id, web_fallback=True)[source]

Return the gene name for the given UniProt ID.

Parameters:
  • protein_id (str) – UniProt ID to be mapped.
  • web_fallback (Optional[bool]) – If True and the offline lookup fails, the UniProt web service is used to do the query.
Returns:

gene_name – The gene name corresponding to the given UniProt ID.

Return type:

str
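The web_fallback behaviour described above follows an offline-first pattern: consult a local table and query a remote service only if that lookup fails and the flag allows it. The following is a minimal sketch of that pattern, not protmapper's implementation. The helper lookup_with_fallback, the local_names table and the fake_web_lookup function are hypothetical stand-ins introduced only for illustration.

```python
from typing import Callable, Dict, Optional


def lookup_with_fallback(
    key: str,
    local_table: Dict[str, str],
    web_lookup: Callable[[str], Optional[str]],
    web_fallback: bool = True,
) -> Optional[str]:
    """Return the locally known value, optionally falling back to a web query."""
    value = local_table.get(key)
    if value is not None:
        # The offline lookup succeeded, so no web query is made.
        return value
    if web_fallback:
        # The offline lookup failed and the caller allows a web query.
        return web_lookup(key)
    return None


# Hypothetical local table and web service stand-in (illustration only).
local_names = {"P00533": "EGFR"}


def fake_web_lookup(protein_id: str) -> Optional[str]:
    return {"P04637": "TP53"}.get(protein_id)


print(lookup_with_fallback("P00533", local_names, fake_web_lookup))
print(lookup_with_fallback("P04637", local_names, fake_web_lookup,
                           web_fallback=False))
```

Passing web_fallback=False keeps the lookup fully offline, which can matter in batch jobs where network access is slow or unavailable.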

protmapper.uniprot_client.get_gene_synonyms(protein_id)[source]

Return a list of synonyms for the gene corresponding to a protein.

Note that synonyms here also include the official gene name as returned by get_gene_name.

Parameters:protein_id (str) – The UniProt ID of the protein to query
Returns:synonyms – The list of synonyms of the gene corresponding to the protein
Return type:list[str]
protmapper.uniprot_client.get_hgnc_id(protein_id)[source]

Return the HGNC ID given the protein id of a human protein.

Parameters:protein_id (str) – UniProt ID of the human protein
Returns:hgnc_id – HGNC ID of the human protein
Return type:str
protmapper.uniprot_client.get_id_from_entrez(entrez_id)[source]

Return the UniProt ID given the Entrez ID of a human gene.

Parameters:entrez_id (str) – Entrez ID of the human gene
Returns:UniProt ID of the human protein or None if not available.
Return type:str or None
protmapper.uniprot_client.get_id_from_mgi(mgi_id)[source]

Return the UniProt ID given the MGI ID of a mouse protein.

Parameters:mgi_id (str) – The MGI ID of the mouse protein.
Returns:up_id – The UniProt ID of the mouse protein.
Return type:str
protmapper.uniprot_client.get_id_from_mnemonic(uniprot_mnemonic)[source]

Return the UniProt ID for the given UniProt mnemonic.

Parameters:uniprot_mnemonic (str) – UniProt mnemonic to be mapped.
Returns:uniprot_id – The UniProt ID corresponding to the given UniProt mnemonic.
Return type:str
protmapper.uniprot_client.get_id_from_rgd(rgd_id)[source]

Return the UniProt ID given the RGD ID of a rat protein.

Parameters:rgd_id (str) – The RGD ID of the rat protein.
Returns:up_id – The UniProt ID of the rat protein.
Return type:str
protmapper.uniprot_client.get_ids_from_refseq(refseq_id, reviewed_only=False)[source]

Return UniProt IDs from a RefSeq ID.

Parameters:
  • refseq_id (str) – The RefSeq ID of the protein to map.
  • reviewed_only (Optional[bool]) – If True, only reviewed UniProt IDs are returned. Default: False
Returns:

A list of UniProt IDs corresponding to the RefSeq ID.

Return type:

list of str
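The effect of reviewed_only can be pictured as a filter over the mapped IDs. The sketch below is an assumption about the behaviour, not the library's code: filter_reviewed is a hypothetical helper, the IDs are placeholders, and the is_reviewed argument stands in for a predicate such as the is_reviewed function documented on this page.

```python
from typing import Callable, Iterable, List


def filter_reviewed(
    uniprot_ids: Iterable[str],
    is_reviewed: Callable[[str], bool],
    reviewed_only: bool = False,
) -> List[str]:
    """Keep every ID by default, or only the reviewed ones when requested."""
    if not reviewed_only:
        return list(uniprot_ids)
    return [up_id for up_id in uniprot_ids if is_reviewed(up_id)]


# Placeholder IDs and a stand-in predicate (illustration only).
mapped_ids = ["ID_REVIEWED", "ID_UNREVIEWED"]
reviewed = {"ID_REVIEWED"}

print(filter_reviewed(mapped_ids, reviewed.__contains__))
print(filter_reviewed(mapped_ids, reviewed.__contains__, reviewed_only=True))
```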

protmapper.uniprot_client.get_length(protein_id)[source]

Return the length (number of amino acids) of a protein.

Parameters:protein_id (str) – UniProt ID of a protein.
Returns:length – The length of the protein in amino acids.
Return type:int
protmapper.uniprot_client.get_mgi_id(protein_id)[source]

Return the MGI ID given the protein id of a mouse protein.

Parameters:protein_id (str) – UniProt ID of the mouse protein
Returns:mgi_id – MGI ID of the mouse protein
Return type:str
protmapper.uniprot_client.get_mnemonic(protein_id, web_fallback=False)[source]

Return the UniProt mnemonic for the given UniProt ID.

Parameters:
  • protein_id (str) – UniProt ID to be mapped.
  • web_fallback (Optional[bool]) – If True and the offline lookup fails, the UniProt web service is used to do the query.
Returns:

mnemonic – The UniProt mnemonic corresponding to the given UniProt ID.

Return type:

str

protmapper.uniprot_client.get_mouse_id(human_protein_id)[source]

Return the mouse UniProt ID given a human UniProt ID.

Parameters:human_protein_id (str) – The UniProt ID of a human protein.
Returns:mouse_protein_id – The UniProt ID of a mouse protein orthologous to the given human protein
Return type:str
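The ortholog lookup can be chained with the species-specific ID functions, for example to go from a human UniProt ID to the MGI ID of the mouse ortholog. The sketch below takes the client as an argument so that it runs without protmapper installed. human_to_mgi is a hypothetical helper, the stub client and its IDs are invented, and the handling of None is an assumption, since the return types above are documented as str.

```python
from types import SimpleNamespace


def human_to_mgi(client, human_protein_id):
    """Map a human UniProt ID to the MGI ID of its mouse ortholog, or None."""
    mouse_id = client.get_mouse_id(human_protein_id)
    if mouse_id is None:
        # Assumption: a missing ortholog is reported as None.
        return None
    return client.get_mgi_id(mouse_id)


# Stand-in client with placeholder IDs (illustration only).
stub_client = SimpleNamespace(
    get_mouse_id=lambda pid: {"HUMAN_ID": "MOUSE_ID"}.get(pid),
    get_mgi_id=lambda pid: {"MOUSE_ID": "MGI_PLACEHOLDER"}.get(pid),
)

print(human_to_mgi(stub_client, "HUMAN_ID"))
print(human_to_mgi(stub_client, "UNKNOWN_ID"))
```

With the library installed, the protmapper.uniprot_client module itself could presumably be passed as the client argument, though that is not verified here.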
protmapper.uniprot_client.get_primary_id(protein_id)[source]

Return a primary entry corresponding to the UniProt ID.

Parameters:protein_id (str) – The UniProt ID to map to primary.
Returns:primary_id – If the given ID is primary, it is returned as is. Otherwise the primary IDs are looked up. If there are multiple primary IDs then the first human one is returned. If there are no human primary IDs then the first primary found is returned.
Return type:str
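The selection rule described in the return value can be sketched as follows. This illustrates the stated rule under simplifying assumptions and is not the library's code: choose_primary is a hypothetical helper, the mapping is a plain dictionary with placeholder IDs, and an ID absent from the mapping is treated as already primary.

```python
from typing import Callable, Dict, List


def choose_primary(
    protein_id: str,
    secondary_to_primary: Dict[str, List[str]],
    is_human: Callable[[str], bool],
) -> str:
    """Pick a primary ID: the first human one if any, else the first found."""
    primaries = secondary_to_primary.get(protein_id)
    if not primaries:
        # Assumption: an ID with no mapping is already primary.
        return protein_id
    for candidate in primaries:
        if is_human(candidate):
            return candidate
    return primaries[0]


# Placeholder IDs (illustration only).
mapping = {
    "SEC_1": ["PRIM_MOUSE", "PRIM_HUMAN"],
    "SEC_2": ["PRIM_RAT", "PRIM_MOUSE"],
}
human_ids = {"PRIM_HUMAN"}

print(choose_primary("SEC_1", mapping, human_ids.__contains__))
print(choose_primary("SEC_2", mapping, human_ids.__contains__))
print(choose_primary("PRIM_HUMAN", mapping, human_ids.__contains__))
```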
protmapper.uniprot_client.get_protein_synonyms(protein_id)[source]

Return a list of synonyms for a protein.

Note that this function returns protein synonyms as provided by UniProt. The get_gene_synonyms function returns synonyms given for the gene corresponding to the protein, and get_synonyms returns both.

Parameters:protein_id (str) – The UniProt ID of the protein to query
Returns:synonyms – The list of synonyms of the protein
Return type:list[str]
protmapper.uniprot_client.get_rat_id(human_protein_id)[source]

Return the rat UniProt ID given a human UniProt ID.

Parameters:human_protein_id (str) – The UniProt ID of a human protein.
Returns:rat_protein_id – The UniProt ID of a rat protein orthologous to the given human protein
Return type:str
protmapper.uniprot_client.get_rgd_id(protein_id)[source]

Return the RGD ID given the protein id of a rat protein.

Parameters:protein_id (str) – UniProt ID of the rat protein
Returns:rgd_id – RGD ID of the rat protein
Return type:str
protmapper.uniprot_client.get_signal_peptide(protein_id, web_fallback=True)[source]

Return the position of a signal peptide for the given protein.

Parameters:
  • protein_id (str) – The UniProt ID of the protein whose signal peptide position is to be returned.
  • web_fallback (Optional[bool]) – If True the UniProt web service is used to download information when the local resource file doesn’t contain the right information.
Returns:

A Feature named tuple representing the signal peptide.

Return type:

Feature

protmapper.uniprot_client.get_synonyms(protein_id)[source]

Return synonyms for a protein and its associated gene.

Parameters:protein_id (str) – The UniProt ID of the protein to query
Returns:synonyms – The list of synonyms of the protein and its associated gene.
Return type:list[str]
protmapper.uniprot_client.is_human(protein_id)[source]

Return True if the given protein id corresponds to a human protein.

Parameters:protein_id (str) – UniProt ID of the protein
Returns:True if the protein_id corresponds to a human protein, otherwise False.
Return type:bool
protmapper.uniprot_client.is_mouse(protein_id)[source]

Return True if the given protein id corresponds to a mouse protein.

Parameters:protein_id (str) – UniProt ID of the protein
Returns:True if the protein_id corresponds to a mouse protein, otherwise False.
Return type:bool
protmapper.uniprot_client.is_rat(protein_id)[source]

Return True if the given protein id corresponds to a rat protein.

Parameters:protein_id (str) – UniProt ID of the protein
Returns:True if the protein_id corresponds to a rat protein, otherwise False.
Return type:bool
protmapper.uniprot_client.is_reviewed(protein_id)[source]

Return True if the UniProt ID corresponds to a reviewed entry.

Parameters:protein_id (str) – The UniProt ID to check.
Returns:True if it is a reviewed entry, False otherwise.
Return type:bool
protmapper.uniprot_client.is_secondary(protein_id)[source]

Return True if the UniProt ID corresponds to a secondary accession.

Parameters:protein_id (str) – The UniProt ID to check.
Returns:True if it is a secondary accession, False otherwise.
Return type:bool
protmapper.uniprot_client.query_protein(protein_id)[source]

Return the UniProt entry as an RDF graph for the given UniProt ID.

Parameters:protein_id (str) – UniProt ID to be queried.
Returns:g – The RDF graph corresponding to the UniProt entry.
Return type:rdflib.Graph
protmapper.uniprot_client.query_protein_xml(protein_id)[source]

Retrieve the XML entry for a given protein.

Some information is only available in the XML entry for UniProt proteins (not RDF), therefore this endpoint is necessary.

Parameters:protein_id (str) – The UniProt ID of the protein to look up.
Returns:An ElementTree representation of the XML entry for the protein.
Return type:xml.etree.ElementTree
protmapper.uniprot_client.verify_location(protein_id, residue, location)[source]

Return True if the residue is at the given location in the UP sequence.

Parameters:
  • protein_id (str) – UniProt ID of the protein whose sequence is used as reference.
  • residue (str) – A single character amino acid symbol (Y, S, T, V, etc.)
  • location (str) – The location on the protein sequence (starting at 1) at which the residue should be checked against the reference sequence.
Returns:

True if the given residue is at the given position in the sequence corresponding to the given UniProt ID, otherwise False.
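Because location is a string and positions start at 1, the check amounts to converting to a zero-based index before comparing. The following is a minimal sketch of that idea on a literal sequence, not the library's implementation. residue_at is a hypothetical helper and the short sequence is invented for illustration.

```python
def residue_at(sequence: str, residue: str, location: str) -> bool:
    """Return True if `residue` occupies the 1-based `location` in `sequence`."""
    try:
        index = int(location) - 1  # convert 1-based position to 0-based index
    except (TypeError, ValueError):
        return False
    if index < 0 or index >= len(sequence):
        # Positions outside the sequence can never match.
        return False
    return sequence[index] == residue


toy_sequence = "MSTY"  # invented sequence, not a real protein

print(residue_at(toy_sequence, "S", "2"))
print(residue_at(toy_sequence, "T", "2"))
print(residue_at(toy_sequence, "Y", "5"))
```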

protmapper.uniprot_client.verify_modification(protein_id, residue, location=None)[source]

Return True if the residue at the given location has a known modification.

Parameters:
  • protein_id (str) – UniProt ID of the protein whose sequence is used as reference.
  • residue (str) – A single character amino acid symbol (Y, S, T, V, etc.)
  • location (Optional[str]) – The location on the protein sequence (starting at 1) at which the modification is checked.
Returns:

True if the given residue is reported to be modified at the given position in the sequence corresponding to the given UniProt ID, otherwise False. If location is not given, we only check if there is any residue of the given type that is modified.
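The two modes of the check (with and without a location) can be illustrated on a small in-memory collection of modified sites. This sketch assumes the sites are available as (residue, location) pairs, which is a simplification for illustration and not how the library stores them. has_modification and the example sites are hypothetical.

```python
from typing import Optional, Set, Tuple


def has_modification(
    sites: Set[Tuple[str, str]],
    residue: str,
    location: Optional[str] = None,
) -> bool:
    """Check a set of (residue, location) pairs for a reported modification."""
    if location is None:
        # No position given: any modified residue of this type is enough.
        return any(site_residue == residue for site_residue, _ in sites)
    return (residue, str(location)) in sites


# Invented modification sites (illustration only).
toy_sites = {("S", "10"), ("Y", "42")}

print(has_modification(toy_sites, "S", "10"))
print(has_modification(toy_sites, "S", "11"))
print(has_modification(toy_sites, "Y"))
print(has_modification(toy_sites, "T"))
```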