This lecture aims to provide a brief history of the development of computers and computer programming languages, and of their mutual interaction with the rapid developments in the natural sciences in the 20th century.

The rise of the machines

In the years following the First World War and during the Second World War, many fields of science and engineering grew rapidly. In particular, two fields of the mathematical and physical sciences, mathematical programming (a term commonly used in place of mathematical optimization, not to be confused with computer programming!) and Monte Carlo methods, witnessed exponential growth in both theory and practical applications. In parallel with this progress in the natural sciences, a new field of science and technology, computer science, began to rise during the years of the Second World War, partly in response to the needs of war, but mostly in response to the exponential growth of the natural sciences and engineering in the postwar era.

The history of modern computer programming probably begins with the development of ENIAC (Electronic Numerical Integrator And Computer), one of the earliest electronic general-purpose computers.

A plot of relative word-usage frequency, illustrating the exponential growth of computer technology in the mid-20th century, as well as developments in the fields of deterministic and stochastic optimization techniques, which ultimately led to the emergence of computational modeling as the third pillar of science. Advances in computational methods and technology also led to the gradual rise in popularity of Bayesian techniques in mathematical modeling towards the end of the 20th century, and to the emergence of an important subfield of computational modeling, now known as Uncertainty Quantification. Note that a straight line with positive slope on this semi-logarithmic plot indicates exponential growth.
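The last remark in the caption can be checked numerically. The following is a minimal sketch with made-up numbers (the starting value `y0`, the growth rate `k`, and the range of years are illustrative assumptions, not data read from the plot). It shows that for a quantity growing as y(t) = y0 exp(k t), the logarithm of y is a straight line in t with slope k.

```python
import math

# Hypothetical parameters, chosen only for illustration (not taken from the plot).
y0 = 2.0   # value at the starting year
k = 0.15   # exponential growth rate per year
years = list(range(1940, 1961, 5))

# Exponentially growing quantity: y(t) = y0 * exp(k * (t - t0)).
values = [y0 * math.exp(k * (t - years[0])) for t in years]

# On a semi-logarithmic plot, the vertical axis shows log(y).
log_values = [math.log(v) for v in values]

# Slopes between consecutive points, measured on the semi-log axes.
slopes = [
    (log_values[i + 1] - log_values[i]) / (years[i + 1] - years[i])
    for i in range(len(years) - 1)
]

# Every slope matches k up to rounding, so the curve is a straight line.
print([round(s, 6) for s in slopes])
```

A constant slope on the logarithmic axis is therefore the signature of exponential growth, and the value of that slope is the growth rate.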

Programming language generations

Initially, computers had to be programmed in what is called machine code or machine language: a set of instructions for the Central Processing Unit (CPU) of the computer, written as a long sequence of binary digits (zeros and ones). Even today, any program must first be converted to machine code before the computer hardware can execute it. Coding directly in this language is, however, very tedious, time-consuming, and non-portable. As a result, soon after the development of the first generation of computers, the first programming languages above machine code came into existence, most notably assembly language in the late 1940s. Many more programming languages have since been developed that provide higher and higher levels of abstraction, hiding more and more of the details of how machine code interacts with the hardware from the programmer and from front-end software.

A diagram tracing the history of computer languages throughout the history of computer science.

Depending on their levels of abstraction, programming languages are classified into different generations:

  • First generation: The first-generation languages, or machine languages, are the lowest-level computer programming languages and provide no abstraction from the computer hardware. Programs in these languages are executed directly by the hardware, so no compiler or assembler is needed to translate them.
  • Second generation: The second-generation languages sit one level of abstraction above the machine hardware, meaning that they require an assembler to translate the code into machine code. The most prominent and, to my understanding, the sole language of this generation is Assembly, which is the closest-possible programming language to computer hardware short of machine code itself.
  • Third generation: The third-generation languages, or high-level programming languages, provide an even higher abstraction level than the second-generation languages. Third-generation languages make programming almost platform-independent, meaning that the content of the code does not depend directly on the hardware being used. This helps the programmer focus on the problem of interest, rather than spending time understanding the details of the specific computer and hardware being used. Examples of third-generation programming languages are: Fortran, ALGOL, COBOL, BASIC, C, C#, C++, Java, Pascal.
  • Fourth generation: The definition of the fourth generation and beyond is not very clear; however, it is generally taken to be the set of languages that provide an even higher level of abstraction from the hardware and closer proximity to the user (programmer). Some prominent examples of this category include Python, Perl, Ruby, IDL, R, S.
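The step from the first to the second generation can be made concrete with a toy example. The following is only a sketch under stated assumptions: the 8-bit instruction format, the three mnemonics (`LOAD`, `ADD`, `HALT`), and the single-accumulator machine are all hypothetical and do not correspond to any real CPU. The `assemble` function plays the role of an assembler, turning mnemonics into binary words, and `run` plays the role of the hardware executing those words.

```python
# Hypothetical 8-bit instruction format: a 4-bit opcode followed by a 4-bit operand.
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010, "HALT": 0b1111}


def assemble(source):
    """Translate mnemonic lines (second generation) into binary words (first generation)."""
    program = []
    for line in source.strip().splitlines():
        parts = line.split()
        mnemonic = parts[0]
        operand = int(parts[1]) if len(parts) > 1 else 0
        program.append((OPCODES[mnemonic] << 4) | (operand & 0b1111))
    return program


def run(program):
    """Execute binary words on a toy machine with a single accumulator register."""
    acc = 0
    for word in program:
        opcode, operand = word >> 4, word & 0b1111
        if opcode == OPCODES["LOAD"]:
            acc = operand
        elif opcode == OPCODES["ADD"]:
            acc += operand
        elif opcode == OPCODES["HALT"]:
            break
    return acc


assembly_source = """
LOAD 2
ADD 3
HALT
"""

machine_code = assemble(assembly_source)
print([format(word, "08b") for word in machine_code])  # first generation: raw bits
print(run(machine_code))                               # result computed by the toy machine
print(2 + 3)                                           # higher generations: one expression
```

The same addition appears three times: as a list of bit patterns, as three mnemonic lines, and as a single expression. Each step up hides more of the machine from the programmer, which is the sense of "abstraction" used in the classification above.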

In the field of scientific computation, Fortran (FORmula TRANslation), first documented in 1956 and first delivered as a working compiler in 1957, is arguably the most influential programming language in history and the oldest high-level programming language that is still in active everyday use. Here is a history of Fortran by its original developers:

A final, personal remark

Sometimes science acts like humans: a field finds a matching partner (a programming language), then they flourish together, make a love story, and occasionally decline together as well. The only difference is that polygamy is allowed in science, as in Fortran’s marriage with both Aerospace and Plasma Physics:

A plot illustrating the co-evolution of two of the most challenging computationally intensive fields of science (Aerospace and Plasma Physics) with the most popular high-performance scientific programming language in human history as of today. The vertical axis represents the relative word-usage frequency of the three keywords (Plasma, Aerospace, and Fortran) in the entire digital corpus, and the horizontal axis represents the year. Note that the downward slope of the curves at later times does NOT imply the decline of these fields of science or the decline of Fortran. It merely means that they reached the peak of their exponential growth in the mid-1980s and are now expanding steadily (linearly), whereas other, newer fields (such as bioinformatics) are referenced more and more frequently every year relative to the above three keywords in the entire digital corpus.

Many times throughout recent history, specific fields of science have boosted and popularized particular computer programming languages and vice versa, in a positive feedback loop. A younger, just-married couple seems to be Bioinformatics and Python, both of which seem to be thriving as of today:

A plot illustrating the co-evolution and thriving of the field of bioinformatics with the popular programming language Python. The vertical axis represents the relative word-usage frequency of the two keywords (Bioinformatics and Python) in the entire digital corpus, and the horizontal axis represents the year.

Sometimes, a programming language couples with a specific field of science and thrives for a few years, only to be soon replaced by a younger, more attractive programming language. This is probably what happened to Perl, which co-evolved with bioinformatics but was gradually replaced by Python at the beginning of the new millennium, at least in the field of bioinformatics.

A plot illustrating the evolution and steady expansion of the Perl programming language and its gradual replacement by Python in the field of bioinformatics. The vertical axis represents the relative word-usage frequency of the three keywords (Bioinformatics, Python, and Perl) in the entire digital corpus, and the horizontal axis represents the year.