{ "cells": [ { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [ "remove-input" ] }, "outputs": [], "source": [ "#import sys\n", "#sys.path.append('./helpers')\n", "\n", "#from helpers import settings\n", "#import settings\n", "\n", "###\n", "from myst_nb import glue\n", "from IPython.display import IFrame\n", "from IPython.display import Markdown\n", "# Additional styling ; should be moved into helpers\n", "#from IPython.core.display import display, HTML\n", "#HTML(''.format(open('styler.css').read()))\n", "#import pandas as pd\n", "from datetime import datetime\n", "from IPython.display import Markdown as md\n", "updated = datetime.now()\n", "now = updated.strftime(\"%A, %B %d %Y at %H:%M\" )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to scientific computing & computational modeling" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "remove-input" ] }, "outputs": [], "source": [ "md(f\"-Last updated on {now}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "````{panels}\n", ":column: col-lg-12 p-2\n", "\n", "```{image} img/sna-pop-3.png\n", ":width: 300px\n", ":name: cglipid\n", ":align: right\n", ":class: rounded-circle\n", "```\n", "\n", "**Learning goals:** \n", "- To have a basic idea of what scientific computing and computational modeling are.\n", "- To get some historical background on computing.\n", "- To understand the relation between experiments, theory and computation.\n", "- To understand the need for different models and their relations. Several methods and models that will be discussed later in the course will be mentioned. The aim is to make the students acquainted with the terminology and comfortable in using it little by little over the course. 
\n", "\n", "**Keywords:** scientific computing, multiscale modeling, high performance computing, history of computing \n", "\n", "\n", "**Note:** To be more concrete, some of the examples are drawn from physics, chemistry and chemical engineering. The discussion is quite general and doesn't assume background in them. \n", "\n", "````\n", "\n", "
\n", "\"Computational chemistry is a branch of chemistry that uses computer simulation to assist in solving chemical problems. It uses methods of theoretical chemistry, incorporated into efficient computer programs, to calculate the structures and properties of molecules and solids. It is necessary because, apart from relatively recent results concerning the hydrogen molecular ion (dihydrogen cation, see references therein for more details), the quantum many-body problem cannot be solved analytically, much less in closed form. While computational results normally complement the information obtained by chemical experiments, it can in some cases predict hitherto unobserved chemical phenomena. It is widely used in the design of new drugs and materials. \" \n", "\n", "\n", "Although the field of computational chemistry didn't exist in the 1830s, [Auguste Comte](https://en.wikipedia.org/wiki/Auguste_Comte), the philosopher behind [positivism](https://en.wikipedia.org/wiki/Positivism), stated that it is impossible and destructive to try to study chemistry using mathematics (and, by extrapolation, by computers):\n", "\n", "
\n", "\"Every attempt to employ mathematical methods in the study of chemical\n", "questions must be considered profoundly irrational and contrary to the\n", "spirit of chemistry.... if mathematical analysis should ever hold a\n", "prominent place in chemistry - an aberration which is happily almost\n", "impossible - it would occasion a rapid and widespread degeneration of that\n", " science.\"\n", " -Auguste Comte, Cours de philosophie positive, 1830\n", "\n", "\n", "This view is, of course, very fitting with the idea of [positivism](https://en.wikipedia.org/wiki/Positivism) and wrong (or '[not even wrong](https://en.wikipedia.org/wiki/Not_even_wrong)') in terms of science as it denies the existence of predictive theories. \n", "\n", "\n", "\n", "Rather than trying to provide a strict definition or classification of what computational chemistry is, let's list methods and approaches used by computational chemists - after all, computational chemistry includes various simulation techniques, structural analysis, using computers to design or predict new molecules and/or molecular binding, and data mining from experimental and/or computational results. It is also hard - and unnecessary - to draw a strict boundary between computational chemistry, physics or, say, chemical engineering or pharmacy/drug design. Although these fields all have their own special characteristics, from the computational perspective they share many of the same methods and methodologies; especially when it comes to algorithms and computational techniques, these come from applied mathematics and computer science. Right now, we restrict ourselves to providing a list, and we will discuss some of the methods in detail later. For the most part, we focus on simulation, and we will also discuss some analysis and machine learning techniques. What is also common to the computational disciplines is that one needs in-depth knowledge across traditional disciplines. 
\n", "\n", "- Quantum mechanical methods: *ab initio* methods, density functional theory, Car-Parrinello simulations\n", "- Classical molecular dynamics, steered molecular dynamics, non-equilibrium molecular dynamics\n", "- Coarse-grained molecular dynamics\n", "- Lattice-Boltzmann simulations\n", "- Phase field simulations\n", "- Finite element methods\n", "- Monte Carlo techniques\n", "- Optimization techniques\n", "- Machine learning methods, Bayesian techniques, Markov models, deep learning\n", "- Free energy calculations\n", "\n", "... and the list goes on. This (by necessity incomplete) list aims to convey the idea that the field is very broad with no clear boundaries." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See also the short interview with one of the pioneers of computational chemistry / physics / chemical engineering:\n", "\n", "\n", "\n", "**Beauty in Science: Biophysicist Klaus Schulten & his Computational Microscope**\n", "\n", "
\n", "\"Today the computer is just as important a tool for chemists as the test tube. Simulations are so realistic that they predict the outcome of traditional experiments.\" \n", "\n", " -Pressmeddelande, Kungliga Vetenskapsakademien\n", "\n", " \n", "Like Kohn, Martin Karplus was born in Vienna, Austria and moved to the USA to escape the Nazis. He did his PhD in chemistry at Caltech with the two-time Nobel Laureate [Linus Pauling](https://en.wikipedia.org/wiki/Linus_Pauling). He is the originator of the [CHARMM](https://www.charmm.org/) molecular simulation package. Karplus was at Université de Strasbourg and Harvard at the time of the Prize. Arieh Warshel was a postdoc (1970) with Karplus at Harvard in Chemical Physics. He went to the Weizmann Institute in Israel for a faculty job but later moved to the University of Southern California after [Weizmann didn't grant him tenure](https://www.ynetnews.com/articles/0,7340,L-4438751,00.html). [Michael Levitt](https://en.wikipedia.org/wiki/Michael_Levitt) was born in Pretoria, South Africa and he did his PhD in Biophysics at Cambridge. At the time of the Prize he was at Stanford where he is a professor of Structural Biology. As a side note on tenure and Nobel Prizes, Nobel winners that were not given tenure include at least [Tom Sargent](https://en.wikipedia.org/wiki/Thomas_J._Sargent) (Economics 2011; University of Pennsylvania didn't give tenure) and [Lars Onsager](https://en.wikipedia.org/wiki/Lars_Onsager) (Chemistry 1968; dismissed/let go by both Johns Hopkins and Brown universities).\n", " \n", "\n", "\n", "````{panels}\n", ":card: shadow-none, border-0\n", ":column: col p-0 m-0\n", "\n", "```{figure} img/kohn-cc.jpg\n", " :width: 200px\n", " :name: figure-kohn\n", " [Walter Kohn](https://www.nobelprize.org/prizes/chemistry/1998/kohn/biographical/). 
Photo from Wikipedia, Creative Commons license.\n", " ```\n", "---\n", "```{figure} img/pople-cc.png\n", " :width: 200px\n", " :name: figure-pople\n", " [John Pople](https://www.nobelprize.org/prizes/chemistry/1998/pople/biographical/). Photo from Wikipedia, Creative Commons license.\n", " ```\n", "---\n", "```{figure} img/karplus-cc.jpg\n", " :width: 200px\n", " :name: figure-karplus\n", " [Martin Karplus](https://www.nobelprize.org/prizes/chemistry/2013/karplus/biographical/). Photo from Wikipedia, Creative Commons license.\n", " ```\n", "---\n", "```{figure} img/warshel-cc.jpg\n", " :width: 200px\n", " :name: figure-warshel\n", " [Arieh Warshel](https://www.nobelprize.org/prizes/chemistry/2013/warshel/biographical/). Photo from Wikipedia, Creative Commons license.\n", " ```\n", "---\n", "```{figure} img/levitt-cc.jpg\n", " :width: 200px\n", " :name: figure-levitt\n", " [Michael Levitt](https://www.nobelprize.org/prizes/chemistry/2013/levitt/biographical/). Photo from Wikipedia, Creative Commons license.\n", " ```\n", "````\n", "\n", "\n", "\n", "**More:**\n", "\n", "- [Models of success](https://www.chemistryworld.com/features/models-of-success/6701.article) from [Chemistry World](https://www.chemistryworld.com/)\n", "- [Lindau Nobel Laureate Meetings](https://www.lindau-nobel.org/)\n", "- [Nobel Prize facts from the Nobel Foundation](https://www.nobelprize.org/prizes/facts/nobel-prize-facts/). Includes a lot of interesting information, including prizes to married couples, mother/father & child and so on.\n", "- [Nobel Prize controversies](https://en.wikipedia.org/wiki/Nobel_Prize_controversies)\n", "\n", " \n", "### Other notable prizes in computational chemistry & physics\n", "\n", "Since we got into the Nobel Prizes, let's list a few others.\n", "Probably the best known is the [Berni J. 
Alder CECAM Prize](https://www.cecam.org/awards); the acronym [CECAM](https://www.cecam.org/) comes from *Centre Européen de Calcul Atomique et Moléculaire* and it is currently headquartered at the École polytechnique fédérale de Lausanne (EPFL) in Switzerland. CECAM is a pan-European organization and on the practical side it is well-known for its [workshops, summer schools and conferences](https://www.cecam.org/program?type=all&month=all&location=all). The Prize, named after [Berni Alder](https://en.wikipedia.org/wiki/Berni_Alder), one of the pioneers of molecular simulations, is granted every three years. The prize was established in 1999 and has thus far been granted to\n", "\n", "- 2019: [Sauro Succi](https://en.wikipedia.org/wiki/Sauro_Succi), Italian Institute of Technology, Center for Life Nanosciences at La Sapienza in Rome, [for his pioneering work in lattice-Boltzmann (LB) simulations](https://www.cecam.org/award-details/2019-sauro-succi).\n", "- 2016: [David M Ceperley](https://en.wikipedia.org/wiki/David_Ceperley), Department of Physics, University of Illinois at Urbana-Champaign and Eberhard Gross, Max Planck Institute of Microstructure Physics, Halle for \"[fundamental ground-breaking contributions to the modern field of electronic structure calculations](https://www.cecam.org/award-details/2016-david-m-ceperley-and-eberhard-k-u-gross)\".\n", "- 2013: [Herman J.C. 
Berendsen](https://en.wikipedia.org/wiki/Herman_Berendsen) and [Jean-Pierre Hansen](https://en.wikipedia.org/wiki/Jean-Pierre_Hansen), Groningen and Cambridge for [\"outstanding contributions to the developments in molecular dynamics and related simulation methods\"](https://www.cecam.org/award-details/2013-herman-jc-berendsen-and-jean-pierre-hansen)\n", "- 2010: [Roberto Car](https://en.wikipedia.org/wiki/Roberto_Car) (Princeton) and [Michele Parrinello](https://en.wikipedia.org/wiki/Michele_Parrinello) (ETH Zürich) for [\"their invention and development of an ingenious method that, by unifying approaches based on quantum mechanics and classical dynamics, allows computer experiments to uncover otherwise inaccessible aspects of physical and biological sciences\"](https://www.cecam.org/award-details/2010-roberto-car-and-michele-parrinello)\n", "- 2007: [Daan Frenkel](https://en.wikipedia.org/wiki/Daan_Frenkel), Cambridge. [](https://www.cecam.org/award-details/2007-daan-frenkel)\n", "- 2004: [Mike Klein](https://en.wikipedia.org/wiki/Michael_L._Klein), Temple University. 
The committee wrote: [\"Mike Klein’s leadership has been crucial in the development of a variety of computational tools such as constant-temperature Molecular Dynamics, Quantum simulations (specifically path-integral simulations), extended-Lagrangian methods and multiple-timestep Molecular Dynamics\"](https://www.cecam.org/award-details/2004-mike-klein)\n", "- 2001: [Kurt Binder](https://en.wikipedia.org/wiki/Kurt_Binder), University of Mainz [\"for pioneering the development of the Monte Carlo method as a quantitative tool in Statistical Physics and for catalyzing its application in many areas of physical research\"](https://www.cecam.org/award-details/2001-kurt-binder)\n", "- 1999: [Giovanni Ciccotti](https://en.wikipedia.org/wiki/Giovanni_Ciccotti), University of Rome La Sapienza for [\"pioneering contributions to molecular dynamics\"](https://www.cecam.org/award-details/1999-giovanni-ciccotti)\n", "\n", "Other prizes include:\n", "\n", "- The APS [Aneesur Rahman Prize for Computational Physics](https://www.aps.org/programs/honors/prizes/rahman.cfm)\n", "- [ACS Award for Computers in Chemical and Pharmaceutical Research](https://www.acscomp.org/awards/acs-award-for-computers-in-chemical-and-pharmaceutical-research)\n", "\n", "The reason for listing all these is simply to indicate how diverse the field is." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## A few examples\n", "\n", "Before going any further, let's take a look at some examples from simulations to get a more concrete view of what simulations can do. Further examples will be provided as we progress. These are taken from our own simulations to avoid any copyright issues. Use the blue arrow to see more movies.\n", "" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "slideshow": { "slide_type": "skip" }, "tags": [ "remove-input" ] }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "
\n", "Overall, simulations can cast light on the basic architectural principles of molecules that explain fundamental cellular mechanics and might eventually direct the design of proteins with desired mechanical properties. Most importantly, beyond the understanding gained regarding the molecular mechanisms of force-bearing proteins, these examples serve to demonstrate that incessant advancements of MD methodology bring about discoveries stemming from simulation rather than from experiment. Molecular modeling, while useful as a means to complement many experimental methodologies, is rapidly becoming a tool for making accurate predictions and, thereby, discoveries that stand on their own. In other words, it is becoming a computational microscope. \n", " \n", "\n", " \n", "This summarizes the situation very well. Although computation has finally gained recognition as an independent paradigm of research alongside theory and experiments, every now and then one still hears the comment *but it is just a simulation* or *it is just a model*. While that can be politely dismissed, such an ignorant comment deserves an equally polemic answer: *utter rubbish from a petrified mind.* Regardless of whether one is a theorist, experimentalist or computational scientist, it is imperative to have a reasonable understanding of what the other methods of research can and cannot do. \n", "\n", "\n", "\n", "\n", "## Modeling & simulations: The 3rd paradigm of research\n", "\n", "```{image} img/heart-simplified.svg\n", ":width: 300px\n", ":name: heart\n", ":align: right\n", "```\n", "\n", "\n", "\n", "With the above potentially provoking statements, it is important to keep in mind that while we are dealing with *empirical sciences* and any prediction has to be proven by experiments, theory and computation provide predictions and explanations for observations in a manner that is often impossible for experimentation alone. 
In addition, experiments have their own problems and to get a signal out of any system, one must poke it, that is, one has to perturb the system to get a response. Simulations don't have such a limitation but can actually be used to study what the effects of probes or perturbations are on a given system. All in all, experiments, theory and simulations all have their advantages and problems and one should not consider any of them to be superior to the others. With the rise of big data, one could even extend the methods of research to *data-intensive analysis*.\n", "\n", "- Simulations can be predictive and guide experiments.\n", "- Can tell what kind of perturbations experimental probes cause.\n", "- Like any other approach: has its own problems\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "**More:**\n", "\n", "- [Deep Data Dives Discover Natural Laws](https://cacm.acm.org/magazines/2009/11/48443-deep-data-dives-discover-natural-laws/fulltext)\n", "- [Vision 2020: Computational Needs of the Chemical Industry](https://www.ncbi.nlm.nih.gov/books/NBK44988/)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Good practices\n", "\n", "The list below repeats some of the matters discussed earlier, but it is well worth going through them again.\n", "\n", "\n", "* Keep notes, that is, have a lab book. That saves a lot of time and reduces the number of errors. Document the procedures and commands; they can sometimes be somewhat tricky, and finding them quickly without proper notes may be difficult and/or lead to errors. A lab book will help you to avoid mistakes and speed up your work tremendously. One good way is to use something like a document on GitHub. That is easy to maintain and it is available from any computer. Other reasonable ways: Google Docs, OneDrive, Dropbox, etc. \n", "\n", "* Always remember to back up your critical files! 
There are great tools for doing that including GitHub, Zenodo and such.\n", "* When running simulations, ensure that you will not fill up your computer's disk (estimate the amount of data that will be generated and is needed for analysis).\n", "* Check for viruses\n", "* Always, always, always visualize\n", "* Always, always, always verify your simulations & system setup against known results from theory, other simulations and experiments.\n", "* Let the computer do the work! Use [shell scripts](https://en.wikipedia.org/wiki/Shell_script), Python, and so on to let the computer do the repetitive work." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dangers\n", "\n", "The biggest problem of any computational modeling is the user: the computer will do what it is told to do.\n", "\n", "We have stated in our paper \"The good, the bad and the user in soft matter simulations\"{cite}`Wong_ekkabut_2016`:\n", "\n", "
\n", "It may sound absurd and somewhat provoking, but available simulation \n", "software (i.e., not home grown programs) becoming very user friendly and easy to \n", "obtain can almost be seen as a negative development!\n", " \n", "\n", " \n", "Not that long ago all simulations were based on house-written programs. Such programs had obvious limitations: They were not available to others, the codes were not maintained for longer periods of time, and they were often difficult to use for all but the person who wrote the code. It was also difficult to improve their performance due to the very limited number of developers (often just one). Modern codes are well maintained, have long-term stability and extensive error checking due to the large number of users, and they offer excellent performance that is impossible to reach for any individual developer or even a small group. They are also very easy to use. \n", "\n", "Similarly to experiments, it is easy to train a person to use the software and even produce data using the built-in analysis methods. That is, however, a precarious path. It is absolutely imperative to have a very strong background in the application field and its underlying theories and methods to be able to produce something meaningful. One should never use a software package as a black box!\n", "\n", "**More on the topic:**\n", "\n", "- [Rampant software errors may undermine scientific results](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4629271/)\n", "- [Three ways researchers can avoid common programming bugs](https://www.natureindex.com/news-blog/three-ways-researchers-science-can-avoid-common-programming-bugs-errors)\n", "- [Computational science: ...Error](https://www.nature.com/news/2010/101013/full/467775a.html)\n", "- [A Scientist's Nightmare: Software Problem Leads to Five Retractions](https://www.semanticscholar.org/paper/A-Scientist%27s-Nightmare%3A-Software-Problem-Leads-to-Miller/dcbf02005884bf79d80315b250b8d70b7a021a21)\n", "- [Scientists Make Mistakes. 
I Made a Big One.](https://elemental.medium.com/when-science-needs-self-correcting-a130eacb4235)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "```{figure} img/1280-scales-fs.svg\n", ":width: 700px\n", ":name: scales1\n", ":align: left\n", "*Some common computer simulation methods and their typical time scales. A rough idea of system sizes in terms of atoms and lengths is given when appropriate. The typical time steps in classical molecular dynamics and coarse-grained MD are also shown.*\n", "```\n", "\n", "## Terminology: Scales\n", "\n", "\n", "The figure above shows different methods and the typical time scales associated with them. It is common to use the following terminology when talking about them. It is also important to notice that the physical time and length scales given below are somewhat arbitrary, depend on the field, and overlap to some extent: \n", "\n", "**Macroscale:** Roughly speaking, denotes time and length scales observable by the naked eye.\n", "\n", "- Typical times >0.1 sec and lengths >1 mm\n", "- Typical associated phenomena: instabilities, pattern formation, phase separation\n", "- Some simulation methods often used to study these scales: phase field models, FEM, Monte Carlo\n", " \n", "**Mesoscale:** Intermediate scales that are not quite observable directly by eye, but typically accessible by many experimental techniques. 
This is a very broad concept.\n", "- Typical times from $10^{-7}$ sec up to $10^{-1}$ sec and lengths ~$10^{-7}$-$10^{-3}$ m.\n", "- Typical phenomena: instabilities, pattern formation, aggregation \n", "- Some commonly used simulation methods: phase field models, lattice Boltzmann, coarse-grained molecular dynamics, Monte Carlo\n", "\n", "**Atomistic scale:** Typically indicates molecular scales.\n", "- Typical times in the range of $10^{-12}$-$10^{-7}$ sec and lengths $10^{-9}$-$10^{-7}$ m.\n", "- Typical phenomena: microscopic mechanisms and interactions such as hydrogen bonding\n", "- Some commonly used methods: classical molecular dynamics, Monte Carlo\n", "\n", "**Subatomistic scale:** Typically used to denote quantum scales.\n", "- Time and length scales below the atomistic\n", "- Typical phenomena: electronic structure, excitations, chemical reactions\n", "- Some commonly used methods: *ab initio* methods, Green's functions, Monte Carlo, density functional theory\n", "\n", "\n", "\n", "\n", "\n", "\n", " \n", "### Hybrid models\n", "\n", "{numref}`scales1` lists some common models. It is important to notice that it is possible to combine, or hybridize, them. The most common hybridization is perhaps the QM/MM methods, that is, a combination of a quantum mechanical method with a method from the classical MD level; the acronym MM stands for molecular mechanics and it is often used synonymously with classical MD although strictly speaking that is not the case. The principle is simple: In QM/MM, a small part of the system is treated using quantum mechanics and the rest with classical (Newtonian) mechanics. Although the idea is simple, there are many complications and limitations.\n", "\n", "Similarly to QM/MM, it is possible to combine classical MD and coarse-grained MD. Another fairly common combination is [lattice-Boltzmann](https://en.wikipedia.org/wiki/Lattice_Boltzmann_methods) (LB) and molecular dynamics. 
There, the difficulties are different since the lattice-Boltzmann method uses a grid to solve fluid motion using densities (=no explicit particles), but classical MD brings in particles. This means that the two must be coupled in a consistent way. Often that means using composite particles instead of 'normal' atoms.\n", "\n", "### Implicit vs explicit water\n", "\n", "When simulating biomolecular systems as well as many others, water is present. Water is very challenging for simulations since in practice most of the simulation time goes into simulating water, as it is the most abundant component in any given system. This is quite easy to understand: In addition to the molecules of interest, say, proteins, lipids or polymers, the simulation box needs to be filled with water at the *correct density* and, to reach proper *solvation*, enough water is needed. It is not uncommon that over 90% of the simulation time goes to simulating water. \n", "\n", "While keeping water in is naturally the most correct choice, methods have been developed to include water only implicitly - such models are called implicit solvent models. There are several strategies for doing so, but the *Generalized Born model* is probably the most commonly used one.\n", "\n", "### Coarse-graining\n", "\n", "\n", "\n", "Coarse-graining is a process in which one tries to reduce the number of *degrees of freedom* in order to be able to simulate larger systems for longer times. There are many approaches to coarse-graining but, as a general idea, one can consider a molecule such as a lipid in the Figure and somehow determine new, larger *superatoms*. It may sound like magic, but there is a solid theoretical procedure to do that (Henderson's theorem), although - like with everything - there are limitations and complexities. \n", "\n", "Here is a brief list of some coarse-grained MD methods:\n", "\n", "- Direct coarse graining using the Henderson theorem.\n", "- Adaptive Resolution Simulations (AdResS). 
\n", "- The MARTINI model. This is the most popular approach. The name MARTINI does not refer to the drink but to the Gothic-style church steeple in the city of Groningen, the Netherlands. The tower was completed in 1482 and it is the most famous landmark in the city. The developers of the model, Marrink and co-workers, are based at the University of Groningen. As an unrelated note (but a Canadian connection), the city of Groningen was liberated by Canadian troops in World War II.\n", "- Dissipative Particle Dynamics (DPD). The DPD method is another popular coarse-grained method. It was originally developed as a fluid flow solver at Shell laboratories by Hoogerbrugge and Koelman in 1991. The original method had, however, an error: It could not produce the correct equilibrium state with the Boltzmann distribution; this is a strict requirement that arises from statistical mechanics (and it will be discussed in detail later). This problem was fixed through the theoretical work of Espanol and Warren in 1995, and their formulation is the one that is universally used. \n", "- PLUM model. The PLUM model resembles MARTINI in the sense that the level of coarse-graining is the same. The philosophies of the two models are, however, different. PLUM was developed by Bereau and Deserno. \n", "- ELBA model. The ELBA model was developed by Orsi and Essex in 2011, primarily for lipid simulations. Whereas many other CG models ignore electrostatic interactions (or most of them), ELBA is unique in including them. The name ELBA comes from 'electrostatics-based'.\n", "- SIRAH model. The name comes from 'Southamerican Initiative for a Rapid and Accurate Hamiltonian' and it has its origin in the Institut Pasteur de Montevideo in Uruguay. SIRAH is based on a top-down approach.\n", "\n", "\n", "### Concurrent vs sequential coarse-graining\n", "\n", "As the number of different models and the existence of various hybrid models suggest, there are different approaches to coarse-graining. 
These will not be discussed in detail, but it is good to understand that there are two philosophies: *sequential* and *concurrent*. Sequential means that data is first extracted from a higher-accuracy model. That data is then used to construct a coarse-grained model. In *concurrent* coarse-graining, CG is done on-the-fly.\n", "\n", "\n", "\n", "\n", "### Coarse-graining and software compatibility\n", "\n", "It is not possible to run all of the above (or others) using just any software. Below are some notes. The list is not extensive but is updated regularly.\n", "\n", "- Direct coarse graining: There is no unified tool.\n", "- AdResS: Works with Espresso++\n", "- MARTINI: Works with Gromacs\n", "- DPD: Works with LAMMPS\n", "- PLUM: Works with a special version of Gromacs\n", "- ELBA: Works with LAMMPS\n", "- SIRAH: Works with Gromacs and Amber" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Beyond particles: Different representations and methods\n", "\n", "Not all coarse-grained simulation methods use particles. The most common ones are the finite element method (FEM), phase field models and the lattice-Boltzmann method. In all of these cases the space is meshed, that is, the space is discretized. This discretization is typically regular, such as based on squares, cubes, triangles or something else, but it can also be random or adaptive. \n", "\n", "**Monte Carlo:** Purely stochastic; total momentum is not conserved. This is a general methodology and does not imply anything about the application. Monte Carlo can be used for problems as different as optimizing railway network traffic and studying quantum phenomena.\n", "\n", "\n", "**FEM:** The finite element method is essentially a method for solving partial differential equations (PDE). The idea is simple: discretize the space into *elements*. It then uses methods based on calculus of variations to solve the problem at hand. 
As the above implies, it is a very general method and not limited to any particular field. Typical application fields include structural analysis and heat flow. The idea of elements is obvious if one considers a bridge: To model various designs, one can consider the trusses to be the finite elements that have, for example, elastic properties. The elements are meshed and the PDEs (what exactly they are depends on the problem at hand) are then solved numerically.\n", "\n", "**Phase field:** In the case of phase field models, the space is typically discretized using a regular grid and the relevant PDEs are solved using *finite difference methods* on a grid. The PDEs arise from the physical properties of the system: The system is described by an *order parameter*. The order parameter is a central concept in fields such as *physical chemistry* and *condensed matter physics* and it describes the emergence of order in a system. In the disordered state (high temperature), the order parameter is zero. When the system undergoes a phase transition (temperature is lowered), the order parameter becomes small and finite, and it is bounded from above by the value of one. Thus, an order parameter of one describes full order. The difficulty is that there is no unique way of determining the order parameter; its choice depends on the system. One thus needs deep knowledge of the system and its behavior. Examples of order parameters are the density in the case of a liquid-gas transition and the wavefunction in the case of superconductivity. \n", "\n", "**Lattice-Boltzmann:** In this case the space is again meshed using a regular mesh. A fluid is then described by a density at each of the nodes. The idea is to solve for the density and flow by approximating the Boltzmann transport equation. The simplest single-relaxation-time approximation to the Boltzmann transport equation is the so-called Bhatnagar-Gross-Krook (BGK) equation and it is commonly used in LB. 
This method can be hybridized with particle models. \n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Why do we need so many methods?\n", "\n", "The above examples and discussion are mostly from computational chemistry / physics / chemical engineering. Even when restricted to those fields and just simulations, the number of methods is very large - and we haven't even talked about algorithms and data analysis yet! Why do we need so many methods? There are various reasons, so let's take a look at some. First, let's look at the number of atoms/components - after all, in the context of computational chemistry / physics / chemical engineering they are the fundamental units. Analogously, in other problems one has to identify the basic units and behaviours to be modelled and analyzed.\n", "\n", "A typical classical MD simulation has 10,000-100,000 atoms. The atoms are treated as entities interacting via pair potentials, and electrons (or nuclei) are not taken into account separately. Without considering technical details, let's naively assume that there is some relatively straightforward way of treating electrons and nuclei. A system of 10,000 atoms would then contain many more entities. If the system has only hydrogens, then the total number of particles would be 20,000. Let's consider carbon-12. It has 6 protons, 6 neutrons and 6 electrons: Each atom has 18 smaller entities. This would turn the 10,000-atom system into a system of 180,000 particles. How about gold? It has 79 protons, 118 neutrons and 79 electrons. This means that each Au atom has 276 smaller entities, and a system of 10,000 atoms would then consist of 2,760,000 particles. The number alone is overwhelming, and the methods needed to treat them make it an even tougher problem. Here are some issues that one needs to consider. 
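Before turning to those issues, the particle bookkeeping above is simple enough to verify in a few throwaway lines:

```python
# Count sub-atomic entities per atom: protons + neutrons + electrons.
def entities(protons, neutrons, electrons):
    return protons + neutrons + electrons

n_atoms = 10_000
print(n_atoms * entities(1, 0, 1))      # hydrogen-1: 20000
print(n_atoms * entities(6, 6, 6))      # carbon-12: 180000
print(n_atoms * entities(79, 118, 79))  # gold: 2760000
```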
\n", "\n", "- Classical treatment: [Newton's equation of motion](https://en.wikipedia.org/wiki/Newton%27s_laws_of_motion), positions and velocities \n", "- Quantum mechanics: [Schrödinger equation](https://en.wikipedia.org/wiki/Schr%C3%B6dinger_equation), [wavefunction](https://en.wikipedia.org/wiki/Wave_function)\n", "- Discrepancy in velocities: The fastest motion determines the time step.\n", " - Consider this: In hydrogen, electrons (1s) move at about 0.7% of the speed of light!\n", " - In classical MD, the time step is typically 1-2 fs (or $1-2 \times 10^{-15}$ s)\n", " - In coarse-grained MD, the time step depends on the level of description: In the popular MARTINI model the time step is usually 10-40 fs, and in dissipative particle dynamics about 10 ps." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Programming languages\n", "\n", "One typical question is: Does one need to be able to program? The simple answer is yes. One needs to know some programming independent of whether one does computation, theory or experiments. Below is a brief discussion.\n", "\n", "**Which programming language should I use/learn?** It depends on the needs, the level of knowledge and the application(s). For example, Python is a general purpose language that is particularly useful for analyzing and visualizing data, independent of whether the data comes from simulations, experiments or some database. Python is very quick to learn, it has an amazing amount of libraries and routines for almost any imaginable task, and the community is large and very supportive. Python is an interpreted language, and it has become a must-learn independent of field and application; it is particularly important in machine learning. We will also use Python.\n", "\n", "C and C++ have a different nature than Python. They have to be compiled and they have a much more rigidly defined structure. 
C/C++ is great for writing high-performance simulation codes; for example, in the field of molecular simulations, Gromacs, LAMMPS and NAMD are written in C/C++. As for syntax, C/C++ and Python have lots of similarities.\n", "\n", "Fortran is an older programming language, and it was originally designed to be very efficient for numerical calculations. The name Fortran comes from *For*mula *Tran*slation. Although there has been a shift to C/C++, many HPC codes such as CP2K, Orca and Gaussian are written in Fortran.\n", "\n", "Graphics processing units (GPUs) are used increasingly in high performance computing. They are programmed using languages such as CUDA for NVIDIA GPUs, HIP for AMD GPUs and OpenCL, which is vendor agnostic. In high performance computing CUDA is the dominant one, while HIP is starting to gain ground. Syntactically, they have a very high resemblance to C/C++.\n", "\n", "As for computers, the Linux operating system is dominant in the world of high performance computing. As for programming, it can be done in any of the common environments, that is, Windows, macOS and Linux, or even Android or ChromeOS." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bibliography\n", "\n", "```{bibliography} ../../references.bib\n", ":filter: docname in docnames\n", ":style: plain\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] } ], "metadata": { "jupytext": { "formats": "ipynb,md:myst" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }