Message boards : Rosetta@home Science : Feedback, .. bandwidth usage :-(
Author | Message |
---|---|
SwZ Send message Joined: 1 Jan 06 Posts: 37 Credit: 169,775 RAC: 0 |
http://www.7-zip.org/download.html Thanks! I believe 7zip is open source and available for any platform, but I'm tired of arguing the point :( |
Nothing But Idle Time Send message Joined: 28 Sep 05 Posts: 209 Credit: 139,545 RAC: 0 |
"Thanks! I believe 7zip is open source and available for any platform, but I'm tired of arguing the point :(" Don't be discouraged; your efforts are neither wasted nor unnoticed. |
SwZ Send message Joined: 1 Jan 06 Posts: 37 Credit: 169,775 RAC: 0 |
:-) I wrote a small program for encoding/decoding Rosetta files and got about 1.5 MB in a binary file, about the same as 7zip, but without preserving the information exactly. So 7zip is better! ;-) And optionally increasing the computational cost per WU would be better still! |
blackbird Send message Joined: 4 Nov 05 Posts: 15 Credit: 93,414 RAC: 0 |
Preprocessing the WU file before compression reduces the size by about 20%. E.g.:
aa1r69_09_05.400_v1_3 8609797 bytes (Uncompressed)
aa1r69_09_05.400_v1_3.gz 2783855 bytes
aa1r69_09_05.400_v1_3.7z 1029164 bytes (-mx7)
After preprocessing with the program described below:
aa1r69_09_05.400_v1_3 8609797 bytes (Uncompressed original file)
aa1r69_09_05.400_v1_3.cr 2289588 bytes (Uncompressed coordinates in integers)
aa1r69_09_05.400_v1_3.ot 1038759 bytes (Uncompressed other information, can be reduced)
Compressed with 7z (-mx7 -mlc=4 -mlp=2):
aa1r69_09_05.400_v1_3.cr.7z 663003 bytes
aa1r69_09_05.400_v1_3.cr.ot 164619 bytes
Thus, 827731 bytes after converting versus 1029164 bytes (-19.5%). Of course, the scheduler should assign only one type of WU to a host when work is requested, not 8 different ones with 20 MB of traffic! |
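The size comparison in the post above can be rechecked with a few lines of arithmetic. This is only a sketch that reuses the byte counts quoted in the post; the `saving` helper is a name introduced here for illustration. Note that the two compressed parts as listed add up to 827,622 bytes, slightly less than the 827,731 quoted, and the relative saving comes out at roughly 19.6% either way.

```python
# Byte counts copied from the post above.
original = 8_609_797        # uncompressed work-unit file
gzip_size = 2_783_855       # .gz
sevenzip_size = 1_029_164   # .7z with -mx7
coords_7z = 663_003         # preprocessed coordinate part, after 7z
other_7z = 164_619          # preprocessed "other information" part, after 7z


def saving(before: int, after: int) -> float:
    """Percentage reduction when going from `before` bytes to `after` bytes."""
    return 100.0 * (before - after) / before


split_total = coords_7z + other_7z

print(f"gzip vs. original:        {saving(original, gzip_size):.1f}% smaller")
print(f"7z vs. original:          {saving(original, sevenzip_size):.1f}% smaller")
print(f"7z vs. gzip:              {saving(gzip_size, sevenzip_size):.1f}% smaller")
print(f"split + 7z vs. plain 7z:  {saving(sevenzip_size, split_total):.1f}% smaller")
```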
SwZ Send message Joined: 1 Jan 06 Posts: 37 Credit: 169,775 RAC: 0 |
Blackbird, this is good and very interesting information, but most likely the authors of the Rosetta code are not following this discussion. I don't see any feedback from them. |
BennyRop Send message Joined: 17 Dec 05 Posts: 555 Credit: 140,800 RAC: 0 |
Perhaps it's time to ask which person on the project deals with the client-server communications programming, and if they've seen this and other recent discussions on the matter and are willing to comment on what we're suggesting. Have them post a list of objections or problems they see that we can then counter, overcome, or make further suggestions on possibilities that will help the project. |
FluffyChicken Send message Joined: 1 Nov 05 Posts: 1260 Credit: 369,635 RAC: 0 |
If you read the earlier posts, Jack Schonbrun (Project Developer & Scientist, btw) has been reading and posting in this thread. Team mauisun.org |
SwZ Send message Joined: 1 Jan 06 Posts: 37 Credit: 169,775 RAC: 0 |
If you open the page http://staff.washington.edu/laidig/ you see the BOINC server for the Rosetta@HOME project listed. So maybe he (Keith E. Laidig) answers for the project, and maybe we can send him an e-mail? laidig@u.washington.edu My next questions: 1) Can we optionally control the computational cost of a WU (assuming constant data transfer per WU)? 2) Can we change from gzip to 7zip packing? 3) Who is translating pages into Russian? |
SwZ Send message Joined: 1 Jan 06 Posts: 37 Credit: 169,775 RAC: 0 |
"If you read the earlier posts Jack Schonbrun has been doing the reading/posting to this post (Project Developer & Scientist btw)" Aha, thanks! So, Jack Schonbrun. Very nice! It may be worth placing contact information (an e-mail address, for example) on the main page. At least in a small font :) |
SwZ Send message Joined: 1 Jan 06 Posts: 37 Credit: 169,775 RAC: 0 |
From Jack Schonbrun's profile: "In addition of my regular research on protein folding, I've always been interested in the power of visualization to help us understand abstract concepts. So I've been helping out on developing the graphics for the screen saver portion of Rosetta@home." 4) How about my suggestion about the graphics in thread https://boinc.bakerlab.org/rosetta/forum_thread.php?id=849 "I see problems with the rotating view of proteins. a) The views are not synchronized. The algorithm that evaluates RMSD finds a rotation matrix and translation vector, which could be used to rotate all structures on the screen to one point of view. b) The center of rotation is not the protein's center of gravity. c) The rotation is "not right". It is made around the model axes, but it is usually more comfortable to rotate around the scene axes (as if we move the mouse in the screen plane and the protein ball rolls under it :-)" |
blackbird Send message Joined: 4 Nov 05 Posts: 15 Credit: 93,414 RAC: 0 |
As for me, I'm temporarily switching to P@H until the scheduler and traffic issues are resolved. Jack? |
David Baker Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0 |
Preprocessing of the WU file for compression gains about 20% of size. This makes a lot of sense. Does anybody know if BOINC allows this? |
SwZ Send message Joined: 1 Jan 06 Posts: 37 Credit: 169,775 RAC: 0 |
"This makes a lot of sense. Does anybody know if BOINC allows this?" These operations (packing/unpacking) are done by the Rosetta server and client application, independently of BOINC. So it is enough to change the gzip algorithm to the 7zip algorithm. |
River~~ Send message Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 |
hi David, if BOINC doesn't allow it, you could zip the files before they go on the server, and unzip them from the app when the app starts. If you take that approach you may well want to wait to test the new pre-zipped wu until the Ralph (Rosetta Alpha) sub-project is running - good as it would be to have better compression, none of us (project or donors) is in the mood for another series of bad wu just now ;-) I also think a fair point has been made that wu should be matched to the files a user has on their hard drive - Einstein do this, so you may like to ask your colleagues to talk to the folks at Einstein to see if you can do it as well. This will save your bandwidth as well as users'. This also entails making absolutely sure that if a file has the same name it has the same contents. Both these changes would make a big difference to people on a metered connection (about 75% of UK internet users are either on modem or on ADSL with a usage cap and financial penalties for overrun). River~~ |
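The approach River~~ describes (compress before the file goes on the server, decompress inside the application at startup, and make sure a given file name always means the same contents) might look roughly like the sketch below. Everything here is an assumption for illustration: the sample data is synthetic rather than a real fragment file, `pack` and `unpack_verified` are hypothetical names, and Python's standard `lzma` module writes the `.xz` container, which uses the same LZMA family of algorithms as 7-Zip but is not the `.7z` archive format.

```python
import bz2
import gzip
import hashlib
import lzma


def make_sample(n_lines: int = 20000) -> bytes:
    """Synthetic stand-in for a text input file: fixed-width numeric columns."""
    lines = [
        f"{i % 400:5d} {(i * 37) % 1000 / 10:8.3f} {(i * 91) % 3600 / 10 - 180:9.3f}\n"
        for i in range(n_lines)
    ]
    return "".join(lines).encode("ascii")


def pack(data: bytes) -> tuple[bytes, str]:
    """Server side: compress once and record a digest of the original contents."""
    return lzma.compress(data, preset=7), hashlib.sha256(data).hexdigest()


def unpack_verified(blob: bytes, expected_digest: str) -> bytes:
    """Client side: decompress at startup and refuse files whose contents changed."""
    data = lzma.decompress(blob)
    if hashlib.sha256(data).hexdigest() != expected_digest:
        raise ValueError("file contents do not match the recorded digest")
    return data


sample = make_sample()
blob, digest = pack(sample)
restored = unpack_verified(blob, digest)

# Compare the three standard-library codecs on the same synthetic input.
for name, size in [
    ("raw", len(sample)),
    ("gzip", len(gzip.compress(sample))),
    ("bz2", len(bz2.compress(sample))),
    ("lzma", len(blob)),
]:
    print(f"{name:5s} {size:8d} bytes")
```

The ratios printed for synthetic data say little about real work-unit files; the point is only that the round trip is lossless and that the digest check catches a file whose name stayed the same while its contents changed.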
Strop Send message Joined: 2 Nov 05 Posts: 6 Credit: 305,041 RAC: 0 |
I agree that the bandwidth used is a lot. Even I, on cable with an upload limit of 4.5 GB per 30 days, am having problems. Don't forget, people use their internet to surf, download, read e-mails and so on... I had to stop Rosetta on my computers at home because of the danger of being put on smallband. There is no extra payment here, but you are put on smallband. I can certainly understand that people on paid connections will have much bigger problems, like paying extra or even being cut off from the net completely. So, again, if you could, can you make this one of your priorities :-) It is really holding back a lot of people, in my opinion. BOINC.BE The team for Belgians who love the smell of glowing red cpu's in the morning. |
Moderator7 Volunteer moderator Send message Joined: 27 Dec 05 Posts: 10 Credit: 0 RAC: 0 |
I don't think the decision has been made as to whether to better compress the files, or assign result types based on the host to reduce the number of files downloaded, or both - but the bandwidth issue is on the "to-do" list, near or in the "top ten". I don't know how long it will take to get to it though. Limited number of people to do the work... |
David E K Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
The suggestions mentioned in this thread are great and we've thought about these issues at the start of project development. Keep in mind that our staff is small and we are limited in development time. Also, some of these suggestions have to be implemented in boinc rather than rosetta (i.e. changes in the scheduler and compressing the executable without having to provide another app).

The easiest way for us to reduce bandwidth would be:
1. to use a better compression method like bzip2 or 7zip. We'll look into this but it will take time to develop and test. At the start of project development, we added zlib to rosetta specifically for boinc and chose it over other packages because of cross-platform issues, available wrappers/support libraries, and ease of use.
2. use smaller fragment files (the _v1_3 files). We are currently testing whether larger fragment files improve results and the jury is still out but if they improve results by a small fraction do we continue to use them?
3. to implement locality scheduling provided by boinc. This would require changes to rosetta to read in a single large fragment file. We'll look into this.

some comments:
Due to the nature of our application and the large input files that are required, R@h is not suitable for dial-up users who mind the long wait required for downloads.
The application cannot be compressed unless we provide another application that uncompresses it which adds a bit of complexity since we would have to use a compound application. This can be done but would require development and testing. A better option would be to add compression to boinc so that it would always send compressed data to the clients and back but this is a task for boinc developers.
Download files are considered immutable so they should always have the same content - we follow this boinc rule.
We could easily make the large fragment files "sticky" like we do with the rosetta database files (which are not work unit specific) but then we would use up client disk space and would have to manage the files on every client somehow to make sure they are removed when work unit batches are finished since they are specific to each batch. It would be better to use locality scheduling which is provided by boinc and should handle file management, and as mentioned above, we will look into modifying rosetta for this.

IF THERE ARE ANY DEVELOPERS OUT THERE INTERESTED IN ADDING A BETTER COMPRESSION METHOD LIKE 7zip or bzip2 THAT IS CROSS-PLATFORM COMPATIBLE, PLEASE SEND ME AN EMAIL AND I CAN PROVIDE YOU WITH OUR COMPRESSION WRAPPER CODE. THE WRAPPER CODE CAN BE USED, EXTENDED, AND TESTED INDEPENDENT OF ROSETTA. dekim at u dot washington dot edu

The best we could do in the short term is to use a better compression method and use locality scheduling but this would still require some development. |
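The idea behind locality scheduling, as discussed in this thread, is to hand a host work that uses input files it has already downloaded. The sketch below only illustrates that idea; it is not BOINC's scheduler and does not use any BOINC API. `WorkUnit`, `bytes_to_download`, `pick_work`, the job names and the second file name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class WorkUnit:
    name: str
    input_files: FrozenSet[str]


def bytes_to_download(wu: WorkUnit, host_files: FrozenSet[str],
                      file_sizes: Dict[str, int]) -> int:
    """Bytes the host would still have to fetch before it can run `wu`."""
    missing = wu.input_files - host_files
    return sum(file_sizes[name] for name in missing)


def pick_work(candidates: List[WorkUnit], host_files: FrozenSet[str],
              file_sizes: Dict[str, int], count: int) -> List[WorkUnit]:
    """Prefer work units whose inputs are already on the host (ties broken by name)."""
    ranked = sorted(
        candidates,
        key=lambda wu: (bytes_to_download(wu, host_files, file_sizes), wu.name),
    )
    return ranked[:count]


# One size is taken from the thread; the other file and its size are made up.
file_sizes = {
    "aa1r69_09_05.400_v1_3.gz": 2_783_855,
    "aa_other_09_05.400_v1_3.gz": 2_500_000,
}
queue = [
    WorkUnit("other_job_1", frozenset({"aa_other_09_05.400_v1_3.gz"})),
    WorkUnit("1r69_job_1", frozenset({"aa1r69_09_05.400_v1_3.gz"})),
    WorkUnit("1r69_job_2", frozenset({"aa1r69_09_05.400_v1_3.gz"})),
]
host_files = frozenset({"aa1r69_09_05.400_v1_3.gz"})

chosen = pick_work(queue, host_files, file_sizes, count=2)
print([wu.name for wu in chosen])
```

A real scheduler also has to balance deadlines, batch priorities and file cleanup on the client, which this sketch ignores.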
blackbird Send message Joined: 4 Nov 05 Posts: 15 Credit: 93,414 RAC: 0 |
UPX can be helpful for compressing the Rosetta application:
rosetta_4.80_i686-pc-linux-gnu 8323696 (uncompressed)
rosetta_4.80_i686-pc-linux-gnu 3257302 (compressed)

From my basic understanding of how Rosetta works, I suspect that the best way to decrease the traffic would be to assign a lot of random points for one protein. E.g. the host downloads one protein and then 20 WUs with random points, instead of loading 20 different proteins, thus reducing traffic (this would be the best solution).

Another way to decrease the traffic is WU compression. As I have mentioned before, deep knowledge of the WU structure is required to find the most appropriate compression method. In fact, if a 2-digit mantissa for the trajectories is enough for the computations, then the trajectories can be stored as words, which can give about 30% smaller compressed files.

Traffic is a problem scientists often forget when intranet computations are moved to an internet-based solution. I believe it is a very important problem, because more users mean more bandwidth for the servers. In fact, you must choose between building a new server and optimising transfers. |
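Storing values "as words" could be sketched as fixed-point packing into 16-bit integers. The sketch assumes that "2 digit mantissa" means two decimal places and that the values fit in the range of a signed 16-bit word at that scale (about ±327.67); the sample values are random angles, not real trajectory data, and `pack_values` / `unpack_values` are hypothetical names.

```python
import random
import struct

SCALE = 100  # assumption: two decimal digits are enough


def pack_values(values):
    """Store each value as a little-endian signed 16-bit integer (value * SCALE)."""
    ints = [round(v * SCALE) for v in values]
    return struct.pack(f"<{len(ints)}h", *ints)


def unpack_values(blob):
    """Inverse of pack_values: read 16-bit integers and rescale to floats."""
    count = len(blob) // 2
    return [i / SCALE for i in struct.unpack(f"<{count}h", blob)]


random.seed(0)
angles = [round(random.uniform(-180.0, 180.0), 2) for _ in range(1000)]

as_text = "\n".join(f"{a:.2f}" for a in angles).encode("ascii")
as_words = pack_values(angles)
restored = unpack_values(as_words)

print(f"text: {len(as_text)} bytes, 16-bit words: {len(as_words)} bytes")
```

This only compares raw sizes. How much the packed form gains after 7zip or gzip, and whether two decimal digits are really sufficient for the computations, would have to be measured on actual work-unit files.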
SwZ Send message Joined: 1 Jan 06 Posts: 37 Credit: 169,775 RAC: 0 |
"IF THERE ARE ANY DEVELOPERS OUT THERE INTERESTED IN ADDING A BETTER COMPRESSION METHOD LIKE 7zip or bzip2 THAT IS CROSS-PLATFORM COMPATIBLE, PLEASE SEND ME AN EMAIL AND I CAN PROVIDE YOU WITH OUR COMPRESSION WRAPPER CODE. THE WRAPPER CODE CAN BE USED, EXTENDED, AND TESTED INDEPENDENT OF ROSETTA." Please send this wrapper code to <gionov@mail.ru>. With best wishes, Gennady Ionov. |
FluffyChicken Send message Joined: 1 Nov 05 Posts: 1260 Credit: 369,635 RAC: 0 |
"IF THERE ARE ANY DEVELOPERS OUT THERE INTERESTED IN ADDING A BETTER COMPRESSION METHOD LIKE 7zip or bzip2 THAT IS CROSS-PLATFORM COMPATIBLE, PLEASE SEND ME AN EMAIL AND I CAN PROVIDE YOU WITH OUR COMPRESSION WRAPPER CODE. THE WRAPPER CODE CAN BE USED, EXTENDED, AND TESTED INDEPENDENT OF ROSETTA." You need to send an email his way: "PLEASE SEND ME AN EMAIL AND I CAN PROVIDE YOU WITH OUR COMPRESSION WRAPPER CODE. THE WRAPPER CODE CAN BE USED, EXTENDED, AND TESTED INDEPENDENT OF ROSETTA. dekim at u dot washington dot edu" As for application compression, does anyone have contacts at CPDN? They seem to do it. Team mauisun.org |
©2024 University of Washington
https://www.bakerlab.org