Author | Message |
[VENETO] boboviz
Joined: 1 Dec 05 Posts: 1994 Credit: 9,633,537 RAC: 7,232
|
Dear CASP Participants,
As most of you know, there will be a CASP14 experiment in 2020.
The first CASP14 targets will be released at the beginning of April (earlier than in the past). Registration and server testing will begin in March. As usual, the experiment will culminate with a conference in December, this time in Europe.
Prediction categories will be as in CASP13 (see the CASP13 section of predictioncenter.org for details). Three points of note:
1. We hope again to work closely with CAPRI on the modeling of complexes, and will strive to obtain more and better targets. Clearly this is an area where deep learning may next have a major impact.
2. The data assisted category will focus more on large proteins and particularly complexes with SAXS and cross-linking data.
3. A planned change is the addition of inter-residue distance prediction alongside inter-residue contact prediction. We will be consulting with the participant community about the best way to do that.
Given the tremendous progress of the last two experiments and the clear potential for further progress, we look forward to an exciting experiment. Success will primarily depend on two things: your participation and the identification of suitable targets, particularly for complexes. Please, please, help find targets - talk to your colleagues now.
|
|
[VENETO] boboviz
Joined: 1 Dec 05 Posts: 1994 Credit: 9,633,537 RAC: 7,232
|
Timetable
March 2020 - Start of the registration for CASP14 prediction experiment.
April 1, 2020 - Start of the testing of server connectivity ("dry run" for server predictors).
April 15, 2020 - Release of the first CASP14 modeling targets.
May/June 2020 - Early bird registration for the December CASP14 conference.
July 16, 2020 - Last date for releasing regular targets.
July 31, 2020 - End of the regular modeling season.
August 21, 2020 - End of the refinement and data-assisted modeling season.
September 2020 - Collection of abstracts describing the methods used in CASP14.
October/November 2020 - Invitations to groups with the most accurate models and the most interesting methods to give talks at the CASP14 conference.
November 2020 - Program of the conference finalized.
December 2020 - CASP14 Conference.
|
|
dcdc
Joined: 3 Nov 05 Posts: 1832 Credit: 119,664,803 RAC: 11,191
|
Can anyone explain what is expected from this year's CASP?
Is Alphafold entering again, and will they be utilising Rosetta again?
Does AI at the start and then Rosetta (or similar) seem to be the immediate future of all protein modelling, or does it only apply to determining existing structures rather than generating new ones?
|
|
[VENETO] boboviz
Joined: 1 Dec 05 Posts: 1994 Credit: 9,633,537 RAC: 7,232
|
Is Alphafold entering again, and will they be utilising Rosetta again?
Alphafold is constantly being developed and updated, so I'm curious about this second test of AI in CASP.
Does AI at the start and then Rosetta (or similar) seem to be the immediate future of all protein modelling, or does it only apply to determining existing structures rather than generating new ones?
Prediction is the first (very important) stage. The next step will be... who knows?
P.S.
In the past, the R@H team released WUs (work units) for CASP to test new apps and new protocols.
But the R@H code is now 2 years old, so I don't know how we volunteers can help with this competition.
|
|
[VENETO] boboviz
Joined: 1 Dec 05 Posts: 1994 Credit: 9,633,537 RAC: 7,232
|
I need to test about trRosetta next...
trRosetta seems very interesting:
- it's open
- it runs on GPU (from the paper: "we train 5 networks with random 95/5% training/validation splits and use the average over the 5 networks as the final prediction. Training a single network takes ∼9 days on one NVIDIA Titan RTX GPU").
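The ensembling idea in that quote (several networks trained on different random 95/5% splits, then averaged) can be sketched in a few lines of plain Python. This is only an illustration of the general technique, not the trRosetta code: `split_train_val` and `average_predictions` are hypothetical helper names, and the "predictions" are made-up probability lists standing in for real network outputs.

```python
import random

def split_train_val(examples, val_fraction=0.05, seed=0):
    """Randomly split examples into (train, validation) lists."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]

def average_predictions(predictions):
    """Element-wise mean over equally sized prediction vectors."""
    n = len(predictions)
    return [sum(values) / n for values in zip(*predictions)]

# Five different random 95/5 splits, one per (hypothetical) network.
splits = [split_train_val(range(1000), seed=s) for s in range(5)]

# Made-up per-network outputs for three residue pairs (assumption:
# each network returns one probability per pair).
network_outputs = [
    [0.9, 0.2, 0.5],
    [0.8, 0.3, 0.5],
    [0.7, 0.1, 0.5],
    [0.9, 0.2, 0.5],
    [0.7, 0.2, 0.5],
]
final_prediction = average_predictions(network_outputs)
```

Averaging like this mainly smooths out the differences caused by each network seeing a slightly different training set.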
|
|
dcdc
Joined: 3 Nov 05 Posts: 1832 Credit: 119,664,803 RAC: 11,191
|
Does anyone know how much of the process is done through Alphafold on GPU vs the fine tuning at the end by Rosetta, which is presumably still on CPU?
And is trRosetta the same?
Does this mean R@h is likely to switch to needing top-end NVIDIA GPUs? Or isn't it suitable to run on a distributed platform?
|
|
[VENETO] boboviz
Joined: 1 Dec 05 Posts: 1994 Credit: 9,633,537 RAC: 7,232
|
Does anyone know how much of the process is done through Alphafold on GPU vs the fine tuning at the end by Rosetta, which is presumably still on CPU?
The latest news about Rosetta and Alphafold is on the Foldit blog.
And is trRosetta the same?
trRosetta is a "personalized" version of Alphafold developed by the Baker lab.
This is the presentation of the tool.
Does this mean R@h is likely to switch to needing top-end NVIDIA GPUs? Or isn't it suitable to run on a distributed platform?
Who knows?
I think it's important to see how the various methods perform during this CASP.
|
|
[VENETO] boboviz
Joined: 1 Dec 05 Posts: 1994 Credit: 9,633,537 RAC: 7,232
|
News from CASP
The first ten CASP-covid targets are all difficult modeling challenges, and as expected there is often considerable variation between model structures. But visual comparison of the first-round models shows very interesting patterns and alternatives. For the larger targets, some consistent domain boundaries are apparent and there are specific alternative topologies for some domains. All of which suggests a second round of modeling that builds on these results will be rewarding. A Microsoft Teams site to discuss the results has been set up, with a channel for each target. Invitations will be sent out to active participants shortly.
We propose to run the second round of modeling as follows: Two weeks for community analysis and discussion of the models, followed by a two-week window for model revision, with model submission between May 14th and 17th (CASP server testing between May 4th and 13th restricts the submission window). First and second round models will not be considered CASP14 submissions. However, we do propose to include CASP-covid targets as official refinement targets early in the CASP14 prediction season (now scheduled to begin on Monday, May 18). There is no restriction on starting with the selected model, and additional submissions will be strongly encouraged. Of course, we don’t know which of these structures will be obtained experimentally within the CASP14 season, but expect that some will.
We hope that this overall four-step process (initial models; accuracy estimations, comparison and discussion; revised models; and refinement) will be an effective framework for leveraging our advantages as a community. We anticipate that the results will form the basis of a joint publication (all participants as authors) and a session at the CASP meeting.
We are working on preparing an additional set of targets with host proteins. Because of the timing, these will be released as CASP14 targets. We would welcome suggestions on best target choice (individual and complexes). We would also very much welcome feedback on this plan. Contributions can be communicated directly to the organizing committee or through the general Microsoft Teams discussion.
|
|
Admin Project administrator
Joined: 1 Jul 05 Posts: 4805 Credit: 0 RAC: 0
|
We have a version of trRosetta (a model much like the published version but also including PDB templates) that has been benchmarked and continues to be benchmarked on CAMEO as a hidden server. For medium to hard targets, it undoubtedly performs better than the current prediction method used by Robetta, which is consistently the best-performing public server. We plan to add the latest protocols that will be tested in this coming CASP to Robetta in the near future and open it up to the public. This will not only improve the prediction quality for most targets, but will also significantly reduce the CPU computing requirements. We are also looking into the possibility of bringing these ML models and minimization modeling strategies to R@h, which would make use of GPUs. This may require the ability to run Python, TensorFlow, and PyRosetta on BOINC clients.
|
|
[VENETO] boboviz
Joined: 1 Dec 05 Posts: 1994 Credit: 9,633,537 RAC: 7,232
|
We plan to add the latest protocols that will be tested in this coming CASP to Robetta in the near future and open it up to the public. This will not only improve the prediction quality for most targets, but will also significantly reduce the CPU computing requirements. We are also looking into the possibility of bringing these ML models and minimization modeling strategies to R@h, which would make use of GPUs. This may require the ability to run Python, TensorFlow, and PyRosetta on BOINC clients.
That's VERY interesting!!
|
|
monk_duck
Joined: 17 Nov 09 Posts: 11 Credit: 284,039 RAC: 0
|
Very cool, look forward to it.
|
|
dcdc
Joined: 3 Nov 05 Posts: 1832 Credit: 119,664,803 RAC: 11,191
|
The trRosetta and Alphafold work sounds really interesting. Are you likely to push a version of trRosetta out to Ralph in the future?
|
|
[VENETO] boboviz
Joined: 1 Dec 05 Posts: 1994 Credit: 9,633,537 RAC: 7,232
|
Registration for CASP14 is under way, and as of now we already have 159 methods (including 59 servers) registered.
We will start checking connectivity and correctness of prediction format for servers in the RR (contacts) and TS (tertiary structure and assembly) categories on Monday, May 4. Testing of model accuracy servers (QA) will start after TS predictions are collected and collated into tarballs for the EMA estimation (May 10). If you plan to participate in the server track of CASP14 but have not registered your server(s) yet - we advise you to do that at your earliest opportunity. We will be sending your servers a request to predict a test target, which, obviously, will not be a part of the CASP14 experiment.
We want to emphasize that the RR format for contact predictions has changed. Now you can submit the old-style contact predictions in a 3-column format (cf. 5-column format in CASP13), or you can submit contact predictions AND distance probabilities in a 13-column format. Please consult our format page (http://predictioncenter.org/casp14/index.cgi?page=format) for prediction format details.
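For anyone scripting around the two RR variants, a rough sketch of telling them apart by column count might look like the following. Only the column counts (3 and 13) come from the announcement. The whitespace separation, the idea that the first two columns are residue indices, and the function name `classify_rr_line` are all assumptions for illustration, so check the format page before relying on any of it.

```python
def classify_rr_line(line):
    """Classify one RR data line by its number of columns.

    Assumptions (not confirmed by the announcement): columns are
    whitespace-separated, the first two are residue indices, and the
    remaining columns are numeric probabilities.
    """
    fields = line.split()
    if len(fields) not in (3, 13):
        raise ValueError(f"expected 3 or 13 columns, got {len(fields)}")
    i, j = int(fields[0]), int(fields[1])
    values = [float(x) for x in fields[2:]]
    kind = "contact" if len(fields) == 3 else "contact+distance"
    return kind, i, j, values

# Made-up example lines, not taken from any real submission.
print(classify_rr_line("1 9 0.70"))
print(classify_rr_line("1 9 0.70 " + " ".join(["0.1"] * 10))[0])
```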
|
|
[VENETO] boboviz
Joined: 1 Dec 05 Posts: 1994 Credit: 9,633,537 RAC: 7,232
|
After consultations with our community and the CASP14 free modeling assessors, we are announcing a second round of prediction for tentative domains of six first-round targets. The suggested targets and their domain boundaries are as follows:
From nsp2: C1901d1: 1-359; C1901d2: 360-499; C1901d3: 500-638
From nsp4: C1902d1: 1-32 + 279-400; C1902d2: 33-278; C1902d3: 401-500
From nsp6: C1903d1: 1-220; C1903d2: 221-290
From PL-PRO: C1904d1: 1-151; C1904d2: 152-317; C1904d3: 318-686
From ORF3a: C1905d1: 1-130; C1905d2: 131-275
From M-protein: C1906d1: 1-103; C1906d2: 104-222
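Note that C1902d1 is discontinuous (two segments joined with "+"). A small sketch of how such boundary strings could be turned into sub-sequences is below. It assumes the numbers are 1-based and inclusive, which is the usual convention but is not stated in the announcement. The helper names are hypothetical and the test sequence is a placeholder, not the real nsp4 sequence.

```python
def parse_segments(spec):
    """Parse '1-32 + 279-400' into [(1, 32), (279, 400)]."""
    segments = []
    for part in spec.split("+"):
        start, end = part.strip().split("-")
        segments.append((int(start), int(end)))
    return segments

def domain_length(spec):
    """Number of residues covered (1-based, inclusive ranges assumed)."""
    return sum(end - start + 1 for start, end in parse_segments(spec))

def extract_domain(sequence, spec):
    """Concatenate the residues of every segment of a domain."""
    return "".join(sequence[start - 1:end] for start, end in parse_segments(spec))

# Boundaries for the nsp4 target, copied from the list above.
NSP4_DOMAINS = {
    "C1902d1": "1-32 + 279-400",
    "C1902d2": "33-278",
    "C1902d3": "401-500",
}

for name, spec in NSP4_DOMAINS.items():
    print(name, domain_length(spec))

# Placeholder sequence of 500 residues, only to show the slicing.
placeholder = "A" * 500
print(len(extract_domain(placeholder, NSP4_DOMAINS["C1902d1"])))
```

Under that assumption the three nsp4 domains cover 154, 246 and 100 residues, which together account for residues 1-500 with no overlap.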
Please note that the second round is open to everyone and not only the participants of the first round. We ask the community to start modeling the suggested domains immediately. Since we are currently working with curators of CASP14 servers on their servers' connectivity and format correctness, the system for accepting predictions will not be immediately available. Once we are done with testing servers (May 14), we will open the CASP_Commons infrastructure for accepting CASP-covid predictions. The window for submitting CASP-covid predictions will be open for three and a half days until May 17, 11:59pm PDT. The regular CASP14 prediction season will start on May 18 at 9am PDT.
We want to inform the community that there were interesting and informative discussions of the first-round targets and predictions at two Zoom conferences last week. Some of the insights from the participants have been posted to target-related channels of the CASP-covid Microsoft Teams. The overview of targets and results presented by John Moult at these conferences is available through the 'How to proceed with the COVID-19 initiative' channel ('Files' tab). Invitations to the CASP-covid MS Teams site have been sent to registered participants of the CASP-covid experiment. If you did not participate in the first round, but are planning on participating in the second and want access to the discussion forum - please request it by replying to this email.
|
|
[VENETO] boboviz
Joined: 1 Dec 05 Posts: 1994 Credit: 9,633,537 RAC: 7,232
|
Dear CASPers,
1. We will stop accepting stage2 CASP-covid structural predictions at 6 am PDT on Monday, May 18. All collected predictions will be posted to the Prediction Center's Data Archive the same day. The mechanism for submitting stage2 CASP-covid QA predictions will be announced shortly.
2. The CASP14 prediction season will start at 9 am on Monday, May 18. In the first week, we are releasing seven targets - one each on Monday, Wednesday and Friday, and two each (including one server-only) on Tuesday and Thursday. Please remember that server-only targets are usually selected from among easier-to-predict structures, so they are intended for prediction by server groups only. Please also remember that contact predictions (RR) are typically assessed on harder targets only, so there is no need to submit contact predictions on server-only targets even for server groups.
3. Plans for the second week of CASP14 will be announced next Friday.
|
|
[VENETO] boboviz
Joined: 1 Dec 05 Posts: 1994 Credit: 9,633,537 RAC: 7,232
|
We have posted predictions from the second stage of CASP-covid experiment at https://predictioncenter.org/download_area/CASPCOMMONS/2020_COVID-19/TS_predictions/stage2
We will be collecting model accuracy estimations on the submitted structure models from Thursday, May 21 until Saturday, May 23. Please submit your QA estimates using the CASP_commons prediction submission page - please make sure that you use the CASP-covid (and not the CASP14) infrastructure to submit the CASP-covid QA predictions.
|
|
[VENETO] boboviz
Joined: 1 Dec 05 Posts: 1994 Credit: 9,633,537 RAC: 7,232
|
Have you missed CASP? We bet you did. Even amidst the coronavirus disruption of normal life, we are seeing a strong registration for CASP14, with 240 prediction groups declaring their intent to participate. 65 of these groups (including 54 servers) have already sent us their first predictions. In all, by the end of the first prediction week we have already collected over 1,300 tertiary structure predictions and 200 contact predictions on seven single-unit targets. The real time statistics of the CASP14 experiment can be found at https://predictioncenter.org/casp14/numbers.cgi.
Announcements.
1. We stopped accepting CASP-covid QA predictions for the stage2-generated models today (Saturday, May 23).
2. Also today (at 12:15 pm PDT), we sent the first queries to the registered EMA servers. We will be collecting model accuracy estimates on the first CASP14 target, T1024, through May 25 (stage1 QA), and May 27 (stage2 QA).
3. The majority of server tertiary structure predictions (150 models selected for QA-stage2) will be posted at our data archive (https://predictioncenter.org/download_area/CASP14) on the 7th day after the target release (i.e., May 25 for T1024). All server models (including tertiary structure and contact) will be posted at the same place two days later.
4. Next week we are planning to release 10 single-sequence targets (some homo-multimers) and one heteromer. Two of these targets are selected for the CAPRI-50 experiment by the CAPRI organizers. These targets will be marked with an asterisk next to the target name in the CASP Target List (https://predictioncenter.org/casp13/targetlist.cgi).
5. Five odd-numbered targets in the next week's release (i.e., T1031, 33, ...) are constituent domains of a large RNA polymerase (over 2,000 residues). CASP organizers are already in possession of coordinates for this structure, and together with the FM assessors analyzed it and cut it into domains for prediction. We want to emphasize that a preprint of the paper describing the structure is available at BioRxiv. However, the organizers and the FM assessors concluded that the ribbon diagrams shown there cover only about 550 residues of the structure, leaving about 1,600 largely uncompromised. We decided to proceed with the prediction of the selected domains, disclosing the available information to all predictors. A link to the BioRxiv paper will be provided in the Additional information for every prediction target emanating from this protein.
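The posting rule in item 3 (QA-stage2 server models on the 7th day after release, all server models two days later) is simple date arithmetic. A minimal sketch follows, assuming T1024 was released on Monday, May 18, 2020, the announced start of the season. The function name `posting_dates` is hypothetical.

```python
from datetime import date, timedelta

def posting_dates(release):
    """Return (QA-stage2 models posted, all server models posted)."""
    qa_stage2_models = release + timedelta(days=7)
    all_server_models = qa_stage2_models + timedelta(days=2)
    return qa_stage2_models, all_server_models

# Assumed release date of T1024 (start of the CASP14 season).
t1024_release = date(2020, 5, 18)
stage2, everything = posting_dates(t1024_release)
print(stage2.isoformat(), everything.isoformat())  # 2020-05-25 2020-05-27
```

The same function would give the posting dates for any later target once its release date is known.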
|
|
[VENETO] boboviz
Joined: 1 Dec 05 Posts: 1994 Credit: 9,633,537 RAC: 7,232
|
05/26
Dear predictors,
Tomorrow we will release the first CASP14 heteromeric target - H1036, a virus protein (trimer) with bound Fab fragments. This complex was also picked as CAPRI target T165. Please note that subunits of this target are either known structures or expected to be reliably modeled with template-based modeling. The main modeling challenge for this complex is to get the Fab binding sites right and correctly model the relative orientation of subunits. Due to the nature of this target, we are not asking predictors for intra-chain contact predictions; however, you may consider submitting inter-chain contacts.
All in all, there will be 3 targets tomorrow: another domain from an RNA polymerase (T1035); a multimeric complex H1036; and its first subunit (chain A) T1036s1, which will be designated a server-only target and thus will not require RR prediction.
We will have only one target on Thursday (yet another domain from the polymerase), and two more targets on Friday.
CASP organizers
|
|