We'll start with the assumption that you don't have a choice – you have to learn a programming language chosen by someone else. What do you nevertheless need to understand about the landscape of all programming languages, and how your language fits into it?


You may object: isn't understanding your language just learning to understand, and write, programs in it? Well: no, not only that. It's also useful to understand something about the decisions that were taken in designing it: how, and why, it differs from other languages. If someone else chose this language for you to learn, why did they choose this one? Why will learning this language, in particular, be valuable to you?


Questions you might ask (ask your instructor, ask your favourite search engine) include:


① What kind of task was this language developed for? When, and by whom?


② Who uses it now, and for what?


③ What kind of community is there of people who use this language?


④ Where do they hang out online?


⑤ Is your language compiled or interpreted?


⑥ What kind of type discipline does it impose?


⑦ What high-level structure do programs in your language have?


⑧ What conventions do people obey? You'd be surprised how important things like how words are capitalised can be, in terms of helping experts in the language quickly understand your program, and in terms of making you look like someone who knows the language! There are also conventions about many other things, from how long parts of the program tend to get before an expert would decide to split them up, to which libraries are used.


Let's discuss some of these questions. We'll start from the more concrete questions, and come to the more sociological ones later.



1 Compilation or Interpretation


A question which can seem silly to people who already know the answer – and to which, therefore, they may forget to tell you the answer – is: once I have written my program, how do I get it to run? There are two main answers:


① You just run it.


② You compile it, then run it.


This is a simplification, although a useful one, because the presence or absence of a compilation step tends to make a big difference to how it feels to program in a language. Let us give the simplified explanation first, before addressing the ways in which it's an over-simplification.


"You just run it" applies to languages which are interpreted, for example Python, JavaScript (NB nothing to do with Java, despite the similar name!), PHP and Perl. That means that there is some other program, called an interpreter, which reads your program and does what it says. If there is a problem somewhere in your program which means that part of the program cannot be interpreted, the interpreter will give some kind of error message, and stop, when it gets to that part. However, by then it may already have run the earlier parts of the program.


"You compile it, then run it" applies to languages which are compiled, for example Java, Haskell and all variants of C (C, C++, Objective-C, etc.). That means that there is some other program, called a compiler, which reads your program and translates it into a more primitive form. Some kinds of error in your program can be detected in the process of compilation. If no such errors are discovered, then you end up with a compiled program, saved as a separate file, which you can then run, as above.


Because some of the work has been done by the compiler, the compiled program usually runs faster than an interpreted program with the same functionality. What is often more important is that, because the compiler has checked for certain kinds of error, you get a guarantee: if your program compiles correctly, then you can be sure that that kind of error is absent. The main kind of error-checking the compiler does is called type-checking.
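The interpreted behaviour described above, where everything before the faulty line has already run by the time the error is reported, can be imitated in a few lines of Python. This is only a sketch: the function name run_steps and the log list are invented for illustration, and the error is caught so that the example can report what happened instead of crashing.

```python
def run_steps():
    """Run a few steps in order; the third one is deliberately faulty."""
    log = []
    divisor = "two"  # a string, so dividing by it makes no sense
    try:
        log.append("step 1 ran")
        log.append("step 2 ran")
        log.append(5 / divisor)    # TypeError is raised only when this line is reached
        log.append("step 4 ran")   # never reached
    except TypeError as err:
        log.append("stopped: " + type(err).__name__)
    return log


print(run_steps())  # ['step 1 ran', 'step 2 ran', 'stopped: TypeError']
```

Nothing warned us about the faulty division in advance; in a compiled, statically checked language, the corresponding program would have been rejected before any step ran.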


In 1978, Robin Milner published a theorem about the core of the programming language he defined, which is called ML; this language has influenced Haskell and many later programming languages. The theorem can be summarised as "well-typed programs cannot 'go wrong' ". That is, he proved that if the compiler accepted your program, then your program was definitely free of certain kinds of error. When I first worked with ML, some of my colleagues used to describe it as "the language of pure thought" and say that if your program compiled, there was no need to test it: it was certainly correct! Unfortunately this is an exaggeration: but still, it is remarkably useful to have a compiler that is good at noticing when you have made a mistake, even if it can be frustrating to be told so.


To find out exactly how you get from having a file containing a program in your language, to the result of running the program, you need a basic tutorial in the language.


In Python, for example, you can save your program in a file called myprogram.py, and then run it by typing, at a command line:


python myprogram.py

In Java, you define a class called MyProgram in a file called MyProgram.java, and then compile and run the program by typing first


javac MyProgram.java

to compile it, and then


java MyProgram

to run it.


Sometimes the lines between interpreted and compiled languages get blurred: I admitted to an over-simplification. Strictly speaking, whether a language is compiled or interpreted is a property of the implementation of the language, not of the language itself. Even in languages which are usually interpreted, like Python, it is often possible to compile a program into a form (a .pyc file) which can be loaded faster than the original and which has been checked for certain kinds of problem. And even languages which are compiled, like Haskell, can sometimes be used in interactive situations (e.g. the Haskell REPL) which feel very much like interpretation.
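To see that a compilation step exists even in Python, here is a small sketch using the built-in compile function, which translates source text to bytecode in memory without running it. The helper name compiles and the two sample strings are invented for illustration. Note which kind of problem is caught: malformed syntax is, but the nonsensical division in the first sample is not.

```python
def compiles(source):
    """Return True if Python can translate `source` to bytecode."""
    try:
        compile(source, "<example>", "exec")  # translate only; nothing is executed
        return True
    except SyntaxError:
        return False


# Syntactically fine, although dividing an int by a str would fail at run time.
well_formed = "x = 5\nz = 'Hello'\nprint(x / z)\n"
# Missing closing parenthesis: rejected before anything runs.
malformed = "print('Hello'\n"

print(compiles(well_formed))  # True
print(compiles(malformed))    # False
```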


Moreover, in some languages, like C and C++, another stage, called linking, is made explicit. This connects the compiled program with any libraries that must be available before it can be run. All programs have to get connected to the libraries they depend on at some stage, of course, but this isn't always something the programmer has to do deliberately. For example, in Java, linking is done by the Java Virtual Machine when the class is loaded: that is, it's part of what the run command, java MyProgram, causes to happen.


2 Types

If you have ever been reminded to "show the units" in your answer to a problem in a mathematics or science class, you have met types. Arguably, if you've ever watched a baby using a shape-sorter, you have, too! The type of a value in a program tells you something about what you can legitimately do with it. What do you need to know about a value in order to know that it makes sense to use it in a particular context?


Type-checking is the process of checking that the shapes of the pieces of a program fit together properly: for example, that a function that has been designed to accept only integers is never given strings as its input. If this is done as part of compilation, it is called static type-checking; if it is done at run-time, it is called dynamic type-checking. Many languages use a mixture of static and dynamic type-checking.
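As a rough illustration of dynamic type-checking, here is a Python function that checks the shape of its input at run-time. The function name double is invented for this sketch, and the explicit isinstance test is our own addition: left to itself, Python would happily compute 2 * "4" and return the string "44", which is probably not what the author of an integers-only function intended.

```python
def double(n):
    """Double an integer, refusing anything else."""
    if not isinstance(n, int):
        # The check happens at run-time, when the function is actually called.
        raise TypeError("double() expects an int, got " + type(n).__name__)
    return 2 * n


print(double(4))  # 8

try:
    double("4")
except TypeError as err:
    print(err)  # double() expects an int, got str
```

A static type-checker would aim to report the second call before the program ever ran.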


Almost every programming language has types of integers and strings, for example. You'll be familiar with integers from school mathematics; "string" is the computer science term for a piece of text, or sequence of characters. By long tradition, the first string we experiment with is "Hello, World!". A Hello World program in a language is a program which prints out "Hello, World!" when you run it. Our program (below) will do slightly more.


# Python example
x = 5
y = 2
z = "Hello, World!"
print(x)
print(y)
print(z)
print(x/y)
print(x/z)

No types are given explicitly in this program, but they are there: if you try running it, you will get an error at the last line, something like


TypeError: unsupported operand type(s) for /: 'int' and 'str'

Once you think about what the program is doing on that line, this is easy to understand, whether or not you “speak” Python. Variables x and y hold integers; variable z holds a string. We don’t have to say that: the language’s type inference works it out. Any of x, y and z can be printed. It makes sense to divide an integer by an integer (even though, note, the result is not an integer any longer). However, it does not make sense to divide an integer by a string. The interpreter does not even try: instead, it tells you that you have got something wrong.
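If you want to see the types that Python has attached to these values, you can ask for them. The sketch below uses the built-in type function; the name type_names is ours. It also confirms the remark about division: in Python 3, dividing one integer by another with / gives a floating-point number, while // gives integer division.

```python
x = 5
y = 2
z = "Hello, World!"

# Ask each value (and the result of x / y) for the name of its type.
type_names = [type(v).__name__ for v in (x, y, z, x / y)]

print(type_names)  # ['int', 'int', 'str', 'float']
print(x / y)       # 2.5
print(x // y)      # 2
```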


Languages differ in how they treat information about the types of values. If we write the same program in Java, it looks like this:


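// Java example
int x = 5;
int y = 2;
String z = "Hello, World!";
System.out.println(x);
System.out.println(y);
System.out.println(z);
System.out.println(x/y);
System.out.println(x/z);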

(as usual, we omit the lines that show this code placed inside a method inside a class). This is very similar to the Python example: compiling it will give an error at the final line, because you can't divide an integer by a string. Whereas in the Python case, the earlier, unproblematic print statements were carried out before the interpreter encountered the nonsensical instruction to print x/z, in the Java case, since compilation does not succeed, none of the instructions can be carried out until the problem is fixed and the program is recompiled.


Apart from the System.out.println verbiage, the biggest difference between this version and the Python one is that here we have to give the types of the variables x, y, z in the program text. (There is still some type inference going on, though: for example, we do not have to say what type the expression x/y has. By the way, if you're learning Java: what type does it have? Remove the last line, wrap this code in a method in a class, then compile and run it. Did it print what you expected?)


In Haskell, it is rather unidiomatic to write anything of the sort, but we can, if we insist:


-- Haskell example
f _ =
    do print x
       print y
       print z
       print (x/y)
       print (x/z)
    where x = 5
          y = 2
          z = "Hello, World!"

As with Java, we won’t be able to compile this – let alone invoke function f to run the code – until we get rid of the nonsensical line about x/z. Just as in the Python example, we did not have to write any types; they are inferred. Here, however, type inference is done as part of the compilation phase. We cannot execute any of the program until the types of all of it make sense.


We cannot really think without types: even programs written in apparently untyped languages have implicit type information. Even if, in your language, you are not forced to write down information about what you expect types to be, it is wise to clarify your expectations in your own head. Sometimes it is useful to write them down, even if you don’t have to: it can help you, and other readers of your program, understand what’s going on. One of the ways in which our Haskell example was unusual was that it did not specify the type of function f.
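In Python, for instance, you can write your expectations down as type annotations. This is a sketch, assuming Python 3.9 or later (for the list[float] notation); the function average is invented for illustration. Python itself does not enforce these annotations when the program runs: they are there for human readers and for optional external checking tools.

```python
def average(values: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


print(average([1.0, 2.0, 6.0]))  # 3.0

# The annotations are stored with the function, where tools (and you) can read them.
print(average.__annotations__["return"])  # <class 'float'>
```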


All the examples above used built-in types for strings and integers. All major languages have these types built in. To write real programs you also have to be able to define your own types, and languages differ in how you do that.
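As one illustration of defining your own type, here is a sketch in Python using the standard library's dataclass decorator. The type Temperature and its method are invented for this example; it takes the "show the units" advice literally by storing the unit alongside the number.

```python
from dataclasses import dataclass


@dataclass
class Temperature:
    """A number of degrees together with its unit, "C" or "F"."""
    degrees: float
    unit: str

    def to_fahrenheit(self):
        """Return an equivalent Temperature expressed in Fahrenheit."""
        if self.unit == "F":
            return self
        return Temperature(self.degrees * 9 / 5 + 32, "F")


boiling = Temperature(100.0, "C")
print(boiling.to_fahrenheit())  # Temperature(degrees=212.0, unit='F')
```

Other languages offer quite different mechanisms (classes in Java, data declarations in Haskell), so treat this as one point on the map and not as the general pattern.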


All the examples also demonstrated polymorphism: that is, we could use the same function to print things of several different types. Printing is the commonest situation where language designers feel obliged to provide polymorphism. Whether, and how, you can write your own polymorphic functions – that is, functions that work on arguments of several different types – is another axis on which programming languages differ. Indeed, it is a particularly interesting one, as there are different kinds of polymorphism. Try searching


3 Structure

In a beginners’ programming course, the way in which large programs are structured may be invisible to you. You will probably only write small programs to start with; you may, perhaps, write only a few lines of code, and be told where to put them.


All serious programs, though, have to have structure. They have to be split up into parts, so that teams of people can work on different parts of the program without getting in one another’s way. The structure of a program is what makes it possible to make a change to a program, without having to understand everything about the entire program. This helps with finding and fixing bugs quickly and confidently, for example.


Functions can be thought of as black-box machines that transform input into output. When you define a function (or method, or procedure) in your programming language, you are structuring the program so that the lines of code that define what this machine does (the body of the function) are together. While this section of code may not be completely self-contained – it may depend on other parts of the program, e.g. by calling other functions – the aim is that a reader can understand what the function will do, just by reading its body. This sounds very basic, but it could not always be taken for granted.


An intimately related issue is the scope of names.


Many things in programs – variables, functions and classes, for example – are given names. The scope of a name for a thing means where, in the program text, the name can be used to refer to the thing. If the name can be used anywhere in the program, it is said to have global scope.


Global scope may sound convenient, but there is an important disadvantage: if you need to understand the role this named thing plays – e.g. to work out whether a change you have in mind will break anything – you have to read the whole program. Therefore, programming languages allow named things to have smaller scopes. For example, a variable might be local to a function, so that it can only be referred to inside that function's definition. The details are subtle and vary between languages.
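Here is how the distinction looks in Python, as a minimal sketch; all the names are invented for illustration. The name greeting is defined at the top level of the file, so any function in the file can read it, whereas volume is local to shout and cannot be referred to from anywhere else.

```python
greeting = "hello"  # top-level name: visible throughout this file


def shout():
    volume = 3  # local name: exists only inside shout()
    return greeting.upper() + "!" * volume


def can_see_volume():
    """Check whether the name `volume` is visible outside shout()."""
    try:
        volume  # look the name up from a different function
        return True
    except NameError:
        return False


print(shout())           # HELLO!!!
print(can_see_volume())  # False
```

Because volume is local, you can be sure that changing it affects nothing outside shout; no such assurance is available for greeting.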


Your language may provide classes, modules, packages, or several of these. Very likely these higher-level structures will be used to provide libraries which make it easier for you to write programs.


A software library provides functionality designed to be used in many other programs. A standard library for a language is one that is maintained along with the basic software implementing the language, and distributed with it, so that it is always available to someone programming in the language.


Standard libraries provide things which are frequently needed, such as code for finding a pattern in a string, collections that can be sorted efficiently, user interface components, etc. If your language has a standard library, becoming familiar with it is an integral part of learning to program well in the language.
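For a taste of what this looks like in practice, the following Python sketch uses two pieces of the standard distribution: the re module, to find a pattern in a string, and the built-in sorted function. The sample sentence and variable names are made up.

```python
import re

text = "Room 12 is on floor 3, next to room 7."

# Find every run of digits in the text and convert each one to an integer.
numbers = [int(n) for n in re.findall(r"\d+", text)]

print(numbers)          # [12, 3, 7]
print(sorted(numbers))  # [3, 7, 12]
```

Neither the pattern matching nor the sorting algorithm had to be written by us, which is the point of a standard library.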


Many libraries, and much of the other software implementing major programming languages, are open-source.


Software is open-source when it is made available under licensing conditions that allow anyone to view the source code, modify it, and redistribute their modified version. Typically, there are conditions, such as that the modified software must itself be made available with the same licence.


4 History, Community and Motivation

How old is your language? Who designed it? What is it used for? If you are doing a beginners' programming course, one question is: are you using a language that is used mostly for teaching, or one that is also widely used by professional developers? You may have come across educational languages such as Scratch, or some language for turtle graphics based on Logo; you may now be learning Alice.


Similar questions apply to the tooling you are using: for example, you might be using the education-focused IDE BlueJ, for Java. The lines between categories do get blurred, and successful languages outgrow their niches: for example BASIC, the name of a language designed in the early 1960s, stands for Beginner's All-purpose Symbolic Instruction Code, but its Visual Basic dialect went on to be very widely used by experts as well as beginners.


Most likely, your language is used by some professional developers. To do what? Reading its Wikipedia page, or searching online, will find you some information (and possibly some examples of the language wars). In the process, you may find out something about the community surrounding your language. Perhaps your language is a scripting language, often used for automating sequences of tasks that would otherwise have to be done manually by humans.


Scripting languages are typically interpreted: Python is usually considered a scripting language, although these days it is also used for many other purposes. Or perhaps your language is mostly used for web services, or in AI, or data science, or embedded programming, or statistics.


5 Paradigms

We have left until last something which comparative discussions of programming languages sometimes take first. Traditionally, programming languages have been divided into groups according to the main way in which people writing in those languages tend to think – that is, according to paradigm. The four main paradigms usually identified are:


① Imperative. The program orders the computer to do one thing, then another thing. Data is stored in the form of mutable state, i.e. variables which have values that can be changed. Example language: C.


② Object-oriented. The program is organised in terms of objects. Each object wraps up (encapsulates) some data, and can respond to certain requests (messages), thereby fulfilling some responsibilities. Example language: Java.


③ Functional. The programmer thinks of functions not just as bits of code, but as concrete things in their own right – as data – which can be passed around the program. For example, a function can be passed as an argument to another function, just as an integer might be. (People sometimes say functions are "first-class citizens".) Mutable state is avoided. Example language: Haskell.


④ Logic. Writing a program involves specifying facts, and rules about how facts follow from other facts, and then asking a question. Example language: Prolog.


However, real life is nothing like as neat as this, and some people argue that it isn't useful to think in terms of paradigms. As you program in more than one language, you naturally import your favourite ways of thinking – influenced by your past programming experience – into each language you adopt. Some languages – Python is an example, in fact – have a mixture of features that makes them hard to classify. And sometimes a language that begins neatly in one paradigm may change, over time, to make it easier to program in a style that began elsewhere. For example, Java version 8 introduced new features that made it more practical to program in a functional style.
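Python's mixture of features means that one small task can be written in several of these styles. The sketch below sums the squares of a list of numbers three ways; every name in it is invented for illustration, and none of the versions is claimed to be the best way to do this in Python.

```python
from functools import reduce


def sum_squares_imperative(values):
    """Imperative: update a mutable variable step by step."""
    total = 0
    for v in values:
        total += v * v
    return total


class SquareAccumulator:
    """Object-oriented: an object encapsulates the running total."""

    def __init__(self):
        self._total = 0

    def add(self, value):
        self._total += value * value

    def total(self):
        return self._total


def sum_squares_oo(values):
    acc = SquareAccumulator()
    for v in values:
        acc.add(v)  # send the object a request; it looks after its own data
    return acc.total()


def sum_squares_functional(values):
    """Functional: pass a function as an argument; no variable is reassigned."""
    return reduce(lambda total, v: total + v * v, values, 0)


numbers = [1, 2, 3, 4]
print(sum_squares_imperative(numbers))  # 30
print(sum_squares_oo(numbers))          # 30
print(sum_squares_functional(numbers))  # 30
```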


Does this mean you can just pick your favourite way to program and then program that way in any language? To some extent you can, but it's unlikely to be the best approach. For example, you can write a C program in a functional style, but, because C doesn't support functional programming very well, your program is not likely to be good. It will be all too easy to make mistakes, and all too hard for any reader (including you, later) to understand the program. Try to go with the grain of your chosen language (whether or not it was chosen by you): learn from the way experts in that language typically write. That is, learn to write idiomatically in your language. At the same time, be alert to the good features of different programming styles you come across, and be ready to make use of them where appropriate.


To help you get a feel for what is considered good, idiomatic code in your language, find a fairly large, highly reputable body of code. Look at it and remember to come back to it at intervals as you learn the language. Don't worry if you can't understand it in detail at this stage. Consult it if you ever wonder about such things as "how long should a function be?", "how should I capitalise the name of a type?", etc.


Standard libraries, for example, are written by experts who expect their code to be inspected by many other experts, so they tend to be good – though not especially beginner-friendly – code.


Java: the OpenJDK version of the Java Development Kit has source code at http://hg.openjdk.java.net/jdk/jdk/. Look for the "browse" entry in the left-hand menu.


Haskell: if you use Hoogle (https://hoogle.haskell.org/) to look up a function, there is a link to its source code to the right of its name.


Python: if you use the documentation available from https://docs.python.org/3/library/ for the standard library, you will see links to source code near the top of most pages.


Your language-specific book or documentation should provide plenty of examples of simpler code.



