How has national wellbeing evolved over time?

Survey data on national wellbeing only go back a few decades, but text analysis makes it possible to observe changes over longer periods of time. The results reveal the big influences on peaks and troughs in human happiness: income, health, aspirations and avoidance of major conflicts.

National wellbeing is normally measured by surveying individuals and collating their responses. There are many well-known national and international sources, including the World Values Survey, the World Happiness Report and Eurobarometer.

Unfortunately, these measures only give us between ten and 50 years of data, which is not ideal for forming a long-run understanding of how national wellbeing has changed over time. We can supplement these measures with modern techniques from data science including text analysis, which make it possible to infer mood from the language that people use.

This technique allows us to roll back measures of national wellbeing to around 1800 and gives us considerable insight into how national wellbeing has evolved over time. In particular, we can see that income matters for wellbeing but perhaps not by as much as we might have thought. Aspirations matter too.

Health correlates well with wellbeing as we might expect, but perhaps the most important factor in keeping wellbeing levels high has been avoiding major conflicts. This analysis provides us with some understanding of the most striking peaks and troughs of human happiness over time.

Do governments have the data they need to understand wellbeing?

National wellbeing is far from a new concept, but it has become increasingly normalised as a potential policy objective for governments as data have become more readily available.

Watershed moments include the United Nations (UN) asking member countries in 2011 to measure happiness and use the data to guide policy, the publication of the first World Happiness Report in 2012 and the UN International Day of Happiness. This annual occasion was first celebrated in 2013 and has since become a global focus for all things related to happiness.

The World Values Survey has contained a question on happiness since 1981. This initially covered 11 countries but the number had risen to 100 in the 2017-22 wave. Other regional or national surveys provide data over slightly longer periods.

Eurobarometer – a public opinion survey in the European Union – is probably the best known of these surveys. It has data on life satisfaction going back to 1972 for a selection of European countries. The World Happiness Report also includes global data on wellbeing covering around ten years.

What this means is that we have a maximum of around 50 years of data for a small number of countries, and perhaps ten years for most others. This is not enough to enable us to understand fully how an important socio-economic variable changes over time. Neither does it allow us to analyse how wellbeing responds to major social or economic shifts, wars, famines, pandemics and many other big events that tend to occur relatively rarely.

To go back further, we have to move beyond traditional methods of data collection and rely on non-survey methods.

Our work explores how we can measure national wellbeing before the 1970s using text data from newspapers and books. The principle is that people’s mood can be extracted from the words that they use (Hills et al, 2019). This allows us to supplement traditional methods by constructing a long-run measure of national wellbeing going back 200 years.

How is national wellbeing measured?

Many national and international surveys measure reported wellbeing. For example, the World Values Survey includes a typical question: ‘Taking all things together would you say you were… very happy/quite happy/not very happy/not at all happy?’

This approach of offering a short ordered set of answers, known as a Likert scale, is also used in the World Happiness Report, though with an expanded range of zero to ten rather than just four possible responses. Respondents are asked to place their current level of wellbeing somewhere on this scale. This is known as the Cantril ladder, since respondents are asked to think of each number as a rung on a ladder.

Other national surveys also follow the Likert approach. For example, Eurobarometer’s life satisfaction measure asks respondents: ‘On the whole, how satisfied are you with the life you lead…very unsatisfied/not very satisfied/fairly satisfied/very satisfied?’

There is debate about how concepts such as happiness and life satisfaction differ. But most accept that life satisfaction is a longer-term measure, while current happiness is more vulnerable to short-term fluctuations.

Nevertheless, averaged across large numbers of respondents and across long periods, most measures that use words like wellbeing, satisfaction or happiness tend to be correlated. What all of these surveys have in common is the need to interview a large number of people, which is costly in terms of time and organisation. That explains why data tend to be annual. Unfortunately, this limits the speed with which we can build up a good supply of data.

To generate more data, especially from the past, we need to use non-survey methods. One approach is to make use of well-known results from psychology indicating that mood can be inferred from language. These insights have been used successfully at the individual level to pick up sentiment from social media posts and other sources (Dodds and Danforth, 2009).

To scale this to the national level, we need two things: a large body of text data (a corpus) and a way to translate text into numerical data (a norm). We use several examples of each, but to give a feel for how this works, Google have digitised millions of books published between 1500 and the present, allowing us access to billions of words. This is one of the core sources for our work.

The main norm used is Affective Norms for English Words – known as ANEW (Bradley and Lang, 1999). This converts words into numbers that measure happiness (‘text valence’) on a scale of one to nine. For example, the word ‘joy’ scores 8.21, while ‘stress’ scores only 1.79.

We then shrink the set of words down to a common 1,000 that appear widely across time and different languages. Finally, we construct a weighted average of implied happiness in text for a number of different languages and periods. For example, we take the weighted average text valence for each year in books and newspapers published in the UK from 1800 to 2009, and we call this the National Valence Index (NVI).

To see how this works, imagine two years in which the number of words is the same but there is a shift from words like ‘joy’ to words like ‘stress’. In this case, the weighted average text valence score would fall significantly.
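The two-year comparison above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the weights are simple word frequencies, which the text does not specify, and the word counts for the two years are hypothetical. Only the two valence scores come from the text.

```python
# Valence scores quoted in the text (ANEW scale of one to nine).
VALENCE = {"joy": 8.21, "stress": 1.79}


def weighted_valence(word_counts, norm=VALENCE):
    """Frequency-weighted mean valence over the words that appear in the norm.

    Assumption: each word is weighted by how often it occurs.
    """
    scored = {w: c for w, c in word_counts.items() if w in norm}
    total = sum(scored.values())
    if total == 0:
        raise ValueError("no scored words in this sample")
    return sum(norm[w] * c for w, c in scored.items()) / total


# Hypothetical counts: same number of words, but a shift from 'joy' to 'stress'.
year_a = {"joy": 80, "stress": 20}
year_b = {"joy": 20, "stress": 80}

score_a = weighted_valence(year_a)  # about 6.93
score_b = weighted_valence(year_b)  # about 3.07
print(round(score_a, 2), round(score_b, 2))
```

With the total word count held fixed, the index falls from roughly 6.9 to roughly 3.1 purely because the mix of words has changed.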

Validation is crucial: we need to be sure that our measure corresponds with survey measures. It is also necessary to recognise and control for changes in language over time and, of course, variations in literacy and the purpose of literature.

First, this measure is highly correlated with survey results. Further, the correlation is positive: when the nation is happy (according to survey data), the text we read and write tends to be happy (high valence). The reverse is true when the nation is sad.
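One simple way to check this kind of correspondence is a Pearson correlation between the text-based index and a survey series over the years where both exist. The sketch below assumes that approach; the text does not say which statistic was used, and the two short series are made-up numbers for illustration only, not results from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical yearly values, for illustration only.
text_valence = [5.8, 5.9, 6.1, 6.0, 6.3]   # text-based index
survey_score = [6.9, 7.0, 7.3, 7.1, 7.4]   # survey-based life satisfaction

r = pearson(text_valence, survey_score)
print(f"correlation: {r:.2f}")
```

A value of r close to one would indicate that years with happier text are also years with higher reported wellbeing.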

Second, the measure needs to control for language evolving over time. We do this by looking at the ‘neighbourhood’ around words. Specifically, if we see that a word is surrounded by different words over time, this tends to mean that the word has changed meaning. In this case, it is removed from the 1,000, and we go down to the 500 most ‘stable’ words – those that have the same words in a neighbourhood around them.

This study also includes controls for literacy. It is limited to the period post-1800 when literacy levels were high in the UK and when text data are mainly coming from novels (as opposed to a large share being religious texts or legal documents as in the 1600s).
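The neighbourhood check for word stability can be illustrated with a toy example. The sketch below is an assumption about one way to do it, not the method used in the study: it collects the words that appear within a small window around a target word in two periods and compares the two sets with a Jaccard overlap. The window size, the overlap measure and the two example sentences are all hypothetical.

```python
from collections import Counter


def neighbourhood(tokens, target, window=2, top_n=5):
    """Most frequent words within `window` positions of each use of `target`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == target:
            start = max(0, i - window)
            counts.update(tokens[start:i] + tokens[i + 1:i + 1 + window])
    return {word for word, _ in counts.most_common(top_n)}


def jaccard(a, b):
    """Share of neighbouring words common to both periods (1.0 = identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical snippets from an early and a late period.
early = "farmers broadcast seed across the field".split()
late = "stations broadcast news across the country".split()

overlap = jaccard(neighbourhood(early, "broadcast"),
                  neighbourhood(late, "broadcast"))
print(overlap)  # 0.2: only 'across' is shared among five distinct neighbours
```

A low overlap suggests the word is keeping different company over time, so under this sketch it would be dropped from the list of stable words.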

How has national wellbeing changed in the UK?

Using this text measure, we can document shifts in happiness over time. But we need to be careful when interpreting graphical data. First, comparisons are best made over short durations. In other words, comparing rates of change is always more valid than comparing long-run levels. Second, the quantity of data has risen over time, which makes more distant history more prone to error.

Figure 1: National valence index for the UK, 1800-2008

Source: Hills et al, 2019

Figure 1 shows a book-based NVI measure for the UK. It highlights huge falls and rises surrounding the two world wars in the 20th century.

This provides a clue as to the major force that has driven wellbeing in the past: avoiding major conflicts. Analysis that looks at how our measure changes alongside variations in other major socio-economic variables also sheds light on other key drivers.

National income does correlate with national wellbeing, but the effect sizes are small. In other words, it takes a very large rise in national income to produce a small increase in wellbeing. National health, traced using proxy measures such as life expectancy or child mortality, unsurprisingly correlates with national wellbeing.

The data also show how powerful aspirations seem to be. To highlight this, we can look at the later 20th century. We see a sharp rise from 1945 up to 1957 (when Harold Macmillan famously said that the country had ‘never had it so good’), but then there is a slow decline through to 1978-79 (the aptly named Winter of Discontent).

In line with current thinking on what influences wellbeing, this seems to reflect expectations. In the period following the Second World War, hopes were high. But it seems that they were not fully realised, pushing wellbeing down. This occurred even though there were significant increases in productivity and national income, and improvements in technology between the 1950s and the 1970s.

Crucially, people seem to be largely thinking about their wellbeing relative to where they thought they might be. As a result, the 1950s seemed good relative to the 1940s, but the 1970s did not satisfy hopes relative to the 1960s.

Previous research has also argued that aspirations play a role in determining reported wellbeing (Blanchflower and Oswald, 2004). It has even been stated that more realistic aspirations are part of the reason why happiness rises after middle age for many people (Blanchflower and Oswald, 2008).

Can we use language to measure happiness for the whole world?

To use language to measure happiness, we need books and newspapers to have been digitised and norms to be available, which restricts the number of countries we can analyse. One way around this is to use audio data.

Music is sometimes called a universal language and a language of the emotions. It can be sad, happy, exciting, dull, terrifying or calming – and these emotions can span different cultures and time periods.

Working with a group of computer scientists, we have developed a machine-learning algorithm that can recognise 190 different characteristics of sound and use these to estimate the happiness embodied in music (Benetos et al, 2022).

The algorithm first needed to be trained on sound samples where we already know the embodied happiness – this is the equivalent of using a norm for text. The equivalent of the corpus of text is the music itself, and to maximise the chances of measuring national mood, we focus on top-selling music.

This study finds that the mood embodied in a single popular song seems to be better at predicting survey-based mood than the vast amount of text data that we use. This seems remarkable until you remember that language contains a mixture of emotional content and information. This might explain why using music, which has a greater emotional content, could be a better way to capture wellbeing, especially for nations where text data are sparse.

Putting this all together, our hope is that as data science, computational power and behavioural science advance, our understanding of national wellbeing will continue to improve. This can only help policy-makers to develop a better understanding of how government policy or major shocks are likely to affect the wellbeing of the nation.