What's new?

Dear User, we are constantly adding new features to STRING, even (or mostly) between releases. Most of them are not announced and, partially by design, are easy to miss. This section lists smaller and larger features that were added to STRING in the last release, in order from the oldest to the newest. If you have a request for a particular feature, please contact our helpdesk. The list was last updated in November 2022.

Proteome Annotation / Adding new species to STRING

You can now add your own proteome to STRING. On the input page, under “Annotate your proteome”, you can add a completely new (or existing) species. The only data required are a FASTA file with the proteins of your organism (one protein per gene) and the clade name of the added proteome. STRING will then compute both functional and physical interaction networks, as well as functional annotations (GO and KEGG, among others) for all the proteins in your proteome. Your proteome can then be browsed through the web interface in the same way as any other species in STRING. The data can also be accessed via the API or downloaded from STRING’s download page. You can share your proteome with anyone using the provided species identifier (which starts with the letters “STRG”).

New node design options

In the “Settings” tab, under “network display options”, you can find new network customization options. These include changing the look of the bubbles to the new 2D (flat) design, centering the node label on the node, and changing the font size of the label. In addition, you can now access the node coloring mode (see below) from the “display options”.

Monarch Phenotypes enrichment

We have added Monarch Phenotypes as a new enrichment category. It includes human, mouse, zebrafish, fruit fly, and C. elegans phenotype data. The human data include both Human Phenotype Ontology (HP) terms and Experimental Factor Ontology (EFO) terms. All terms are projected to their parent terms according to their ontology relationships.

New way to color your network

STRING now (finally) has a simple way to color your nodes: click on a node and, in the pop-up that appears, you will find the button "Enable node coloring mode". If you click it, a new pop-up will appear instead, with a color palette and the current node's color preselected. You can now click on any node in your network to change its color to the selected color. You can also assign a single color to all the nodes.

Protein annotation download file

You can now download all the functional annotations STRING has for your set of input proteins, not only the enriched subset of terms. The file is available in the "Exports" tab under "functional annotation". PMIDs are not included in this file, as they would make it far too large. Note that the file is still quite big because STRING resolves all the parent-child relationships of all the ontologies. For example, if a protein has a specific GO annotation like "selenocysteine lyase activity", you will find it resolved to all its parents, including non-specific terms like "catalytic activity".
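To illustrate why resolving parent-child relationships inflates the file, here is a minimal sketch of propagating one annotation up an ontology. The tiny graph below is a simplified, illustrative subset and not the real Gene Ontology structure, and the function name `with_ancestors` is hypothetical; it does not describe how STRING implements this internally.

```python
# Toy "is_a" graph: child term -> list of direct parent terms.
# Simplified for illustration; the real Gene Ontology has more intermediate terms.
PARENTS = {
    "selenocysteine lyase activity": ["lyase activity"],
    "lyase activity": ["catalytic activity"],
    "catalytic activity": ["molecular_function"],
    "molecular_function": [],
}


def with_ancestors(terms, parents):
    """Return the given terms plus every term reachable through parent links."""
    resolved = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term in resolved:
            continue  # already visited via another path
        resolved.add(term)
        stack.extend(parents.get(term, []))
    return resolved


annotations = with_ancestors(["selenocysteine lyase activity"], PARENTS)
print(sorted(annotations))
```

A single specific annotation becomes four rows in this toy case; across all proteins and all ontologies the effect is much larger.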

Physical network in STRINGdb R-package

In the STRINGdb Bioconductor package you can now choose whether to use the full (functional) STRING network or its physical subset. You can specify this using the "network_type" parameter when initializing the STRINGdb object.

string_db <- STRINGdb$new(version="11.5", species=9606, network_type="physical")

Confidence of the experimental data

All experimental data imported into STRING are benchmarked before being included in the database. STRING assigns confidence based on the experimental method used as well as on the experiment's power to predict the functional network. This confidence is now visualized in the experimental evidence view with colored tags (high/medium/exploratory) next to each of the experiments.

Comma-separated input

STRING now (finally) accepts comma-separated protein lists for “multi-protein” inputs. The input has to be on a single line; otherwise, each line will be treated as a single identifier. If your identifiers contain commas (,), quote them using double quotes (").
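One way to prepare such a line is sketched below using Python's standard `csv` module, which wraps any field containing a comma in double quotes. This assumes STRING follows ordinary CSV quoting conventions; the helper name and the identifier "my,odd,name" are hypothetical.

```python
import csv
import io


def to_comma_separated_line(identifiers):
    """Join identifiers into one line, double-quoting any that contain commas."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL)
    writer.writerow(identifiers)
    # csv.writer appends a line terminator; strip it to keep a single line.
    return buffer.getvalue().rstrip("\r\n")


line = to_comma_separated_line(["TP53", "BRCA1", "my,odd,name"])
print(line)
```

Identifiers that themselves contain double quotes would be escaped by doubling the quote character, which the text above does not address, so test such cases before relying on them.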

Node degree download file

In the "Export" section of STRING, you can now find a file that lists the node degree (number of links) of all the proteins in your current network. A link is counted if its score is larger than your specified score cut-off (that is, if you can see the link in your network). Try submitting this node-degree file to the "proteins with values/ranks" input: you will be able to visualize the node degree as a halo around each node.
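The node degree described above can be sketched as a simple count over an edge list. The edges, scores, and function name below are made up for illustration, and the strict "larger than" comparison follows the wording above; whether a score exactly equal to the cut-off counts is an assumption.

```python
from collections import Counter


def node_degrees(edges, score_cutoff):
    """Count, per protein, the links whose score exceeds the cut-off."""
    degrees = Counter()
    for protein_a, protein_b, score in edges:
        # Register both proteins so that unconnected nodes report degree 0.
        degrees[protein_a] += 0
        degrees[protein_b] += 0
        if score > score_cutoff:
            degrees[protein_a] += 1
            degrees[protein_b] += 1
    return dict(degrees)


# Hypothetical network: (protein, protein, combined score between 0 and 1).
example_edges = [("A", "B", 0.9), ("A", "C", 0.5), ("B", "C", 0.3)]
degrees = node_degrees(example_edges, score_cutoff=0.4)
print(degrees)
```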

Visualizing additional data on the network

If you have a set of proteins with a value attached to each, you can submit them through the "proteins with value/ranks" input. If the set is smaller than a few hundred proteins, you will be prompted to submit it to "gene set enrichment" rather than to the permutation-based enrichment, which uses your submitted set as the background. If you click on "gene set enrichment", your input values will be visualized as halos around each of the proteins in your input. The color of the halo corresponds to the protein's value. If your values lie on both sides of zero (0), STRING will detect this and assign different colors to values above and below zero.
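The zero-detection logic can be pictured roughly as follows. This is only a sketch: the function names, the "red"/"blue" hues, and the intensity scaling are all hypothetical and are not taken from STRING.

```python
def choose_color_scheme(values):
    """Use two colors only when the values lie on both sides of zero."""
    has_negative = any(v < 0 for v in values)
    has_positive = any(v > 0 for v in values)
    return "diverging" if has_negative and has_positive else "sequential"


def halo_color(value, values):
    """Return a (hue, intensity) pair for one value; hues are placeholders."""
    largest = max(abs(v) for v in values) or 1.0  # avoid dividing by zero
    intensity = round(abs(value) / largest, 2)    # 0.0 (faint) to 1.0 (strong)
    if choose_color_scheme(values) == "diverging":
        hue = "red" if value > 0 else "blue" if value < 0 else "neutral"
    else:
        hue = "red"
    return hue, intensity


fold_changes = [-2.0, 0.5, 1.0]  # hypothetical input values
print(choose_color_scheme(fold_changes))
print([halo_color(v, fold_changes) for v in fold_changes])
```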

Visualize a single network cluster

When you cluster the STRING network using one of the clustering methods in the "Cluster" tab, STRING generates a simple table of your clusters, each annotated with a color and a list of the proteins belonging to that cluster. The rows are clickable and take you to the network of that cluster, with the nodes retaining the cluster's color. The analysis tab will automatically come into focus, where you can analyse which functions are specific to your cluster.

Excel sheet is now a valid input

STRING now (finally) accepts Excel/OpenOffice sheets as a valid input format for the "multi-protein" input and the "proteins with values/ranks" input. For the "multi-protein" input, only the first column will be parsed for identifiers.

Physical interactions mode

STRING now allows you to view the physical-only interaction network. The physical network only displays edges between proteins for which we have evidence that they bind or form a physical complex. The network can be viewed and used in the same way as the default (functional association) STRING network. It can be filtered by score or by evidence channel and is fully accessible via the API.
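As a rough sketch of API access to the physical network, the snippet below only builds a request URL and does not contact the server. The endpoint path and the parameter names (`identifiers`, `species`, `network_type`, `required_score`) are written as I recall them from the public STRING API documentation and should be checked there before use; the helper name `network_url` is hypothetical.

```python
from urllib.parse import urlencode

STRING_API_BASE = "https://string-db.org/api"


def network_url(identifiers, species, network_type="physical",
                required_score=400, output_format="tsv"):
    """Build a URL for the network endpoint (parameter names are assumptions)."""
    params = {
        # Multiple identifiers are joined with a carriage return (encoded as %0D).
        "identifiers": "\r".join(identifiers),
        "species": species,
        "network_type": network_type,
        "required_score": required_score,
    }
    return f"{STRING_API_BASE}/{output_format}/network?{urlencode(params)}"


url = network_url(["TP53", "MDM2"], species=9606)
print(url)
```

The resulting URL can then be fetched with any HTTP client; switching `network_type` to "functional" would request the default network instead.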

When you upload a set of proteins with values ("Protein with Values/Ranks" input field), STRING searches for any enrichments in your dataset using your whole dataset as a background. In addition, STRING now correlates your input values with various whole-genome statistical measures, including:

  • average protein abundance,
  • protein length,
  • number of publications mentioning the protein in PubMed,
  • predicted average protein disorder,
  • average GC content of the encoding transcript.

All the plots are generated automatically, so you can spot at a glance potential trends in your data that may help explain your enrichment results.
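To give a feel for what such a correlation measures, the sketch below computes a Pearson coefficient between input values and protein length. The text above does not say which correlation statistic STRING uses, so Pearson is an assumption here, and all numbers are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical data: one input value and one protein length per protein.
input_values = [1.0, 2.0, 3.0, 4.0]
protein_lengths = [100, 200, 300, 400]
print(pearson(input_values, protein_lengths))
```

A coefficient near 1 or -1 would suggest that the input values track the measure closely, which is worth knowing before interpreting any enrichment.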

STRING cluster enrichment

Each time you query STRING with a set of proteins, we automatically check in the background for any possible enrichment in your data. This includes KEGG pathways, UniProt keywords, Gene Ontology terms, etc., and now also STRING local neighborhood clusters. These clusters are derived by hierarchically clustering the full STRING network. Such clustering generates sets of functionally associated proteins on multiple levels of the hierarchy, ranging from small groups of 5 proteins to large assemblies of 200 proteins. Each of these overlapping clusters is then named based on the annotations of its proteins. One advantage STRING clusters have over traditional enrichment categories is that they cover less studied, potentially not well-annotated parts of the network.

Query names in the network

A long-requested feature is the ability for STRING to display the user-provided gene names in the network view, instead of the official names (or locus identifiers if an official name is not available) that STRING parses from the source databases. In the “Settings” tab, you can now choose to have STRING display your query names instead of the default names. For proteins that are not part of your query (such as added interaction partners), STRING will default to the official gene names.