Sunday, January 15, 2017

Dangerous Knowledge and Existential Risk (Dominic Cummings)

Dominic Cummings begins a new series of blog posts. Highly recommended!

It's worth noting a few "factor of a million" advances that have happened recently, largely due to physical science, applied mathematics, and engineering:

1. Destructive power of an H-bomb is a million times greater than that of conventional explosives. This advance took ~20 years.

2. Computational power (Moore's Law) has advanced a million times over a roughly similar timescale.

3. Genome sequencing (and editing) capabilities have improved similarly, just in the 21st century.

How much have machine intelligence and AI progressed, say, in the last 20 years? If it isn't a factor of a million (whatever that means in this context), it soon will be ...
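A quick sanity check on these "factor of a million" claims (my own back-of-the-envelope arithmetic, not from the original post): a million-fold gain is about 20 doublings, so each advance implies a characteristic doubling time.

```python
import math

# A factor of a million is ~20 doublings, since 2^20 ≈ 1.05 million.
doublings = math.log2(1e6)  # ≈ 19.93

# If the million-fold advance took ~20 years, the implied doubling
# time is about 12 months:
implied_months = 12 * 20 / doublings  # ≈ 12.0

# The canonical Moore's law doubling time of 18-24 months would
# stretch a million-fold improvement over roughly 30-40 years:
for months in (18, 24):
    print(f"{months}-month doubling: "
          f"{doublings * months / 12:.0f} years per 10^6 gain")
```

So whether these advances cluster around "~20 years" depends on the doubling time one assumes; the point that a million-fold gain needs only ~20 doublings holds regardless.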
Dominic Cummings: ... The big big problem we face – the world is ‘undersized and underorganised’ because of a collision between four forces: 1) our technological civilisation is inherently fragile and vulnerable to shocks, 2) the knowledge it generates is inherently dangerous, 3) our evolved instincts predispose us to aggression and misunderstanding, and 4) there is a profound mismatch between the scale and speed of destruction our knowledge can cause and the quality of individual and institutional decision-making in ‘mission critical’ political institutions ...

... Politics is profoundly nonlinear. (I have written a series of blogs about complexity and prediction HERE which are useful background for those interested.) Changing the course of European history via the referendum only involved about 10 crucial people controlling ~£10^7 while its effects over ten years could be on the scale of ~10^8 – 10^9 people and ~£10^12: like many episodes in history the resources put into it are extremely nonlinear in relation to the potential branching histories it creates. Errors dealing with Germany in 1914 and 1939 were costly on the scale of ~100,000,000 (10^8) lives. If we carry on with normal human history – that is, international relations defined as out-groups competing violently – and combine this with modern technology then it is extremely likely that we will have a disaster on the scale of billions (10^9) or even all humans (~10^10). The ultimate disaster would kill about 100 times more people than our failure with Germany. Our destructive power is already much more than 100 times greater than it was then.

Even if we dodge this particular bullet there are many others lurking. New genetic engineering techniques such as CRISPR allow radical possibilities for re-engineering organisms including humans in ways thought of as science fiction only a decade ago. We will soon be able to remake human nature itself. CRISPR-enabled ‘gene drives’ enable us to make changes to the germ-line of organisms permanent such that changes spread through the entire wild population, including making species extinct on demand. Unlike nuclear weapons such technologies are not complex, expensive, and able to be kept secret for a long time. The world’s leading experts predict that people will be making them cheaply at home soon – perhaps they already are.

It is already practically possible to deploy a cheap, autonomous, and anonymous drone with facial-recognition software and a one-gram shaped charge to identify a relevant face and blow it up. Military logic is driving autonomy. ...
Dangers have increased, but quality of decision making and institutions has not:
... The national institutions we have to deal with such crises are pretty similar to those that failed so spectacularly in summer 1914 yet they now face crises involving 10^2 – 10^3 times more physical destruction moving at least 10^3 times faster. The international institutions developed post-1945 (UN, EU etc) contribute little to solving the biggest problems and in many ways make them worse. These institutions fail constantly and do not – cannot – learn much.

If we keep having crises like we have experienced over the past century then this combination of problems pushes the probability of catastrophe towards ‘overwhelmingly likely’.

... Can a big jump in performance – ‘better and more powerful thinking programs for man and machine’ – somehow be systematised?

Feynman once gave a talk titled ‘There’s plenty of room at the bottom’ about the huge performance improvements possible if we could learn to do engineering at the atomic scale – what is now called nanotechnology. There is also ‘plenty of room at the top’ of political structures for huge improvements in performance. As I explained recently, the victory of the Leave campaign owed more to the fundamental dysfunction of the British Establishment than it did to any brilliance from Vote Leave. Despite having the support of practically every force with power and money in the world (including the main broadcasters) and controlling the timing and legal regulation of the referendum, they blew it. This was good if you support Leave but just how easily the whole system could be taken down should be frightening for everybody.

Creating high performance teams is obviously hard but in what ways is it really hard?

... The real obstacle is that although we can all learn and study HPTs it is extremely hard to put this learning to practical use and sustain it against all the forces of entropy that constantly operate to degrade high performance once the original people have gone. HPTs are episodic. They seem to come out of nowhere, shock people, then vanish with the rare individuals. People write about them and many talk about learning from them but in fact almost nobody ever learns from them – apart, perhaps, from those very rare people who did not need to learn – and nobody has found a method to embed this learning reliably and systematically in institutions that can maintain it. ...

Wednesday, January 11, 2017

Brexit in the Multiverse: Dominic Cummings on the Vote Leave campaign


It's not entirely an exaggeration to say that my friend Dominic Cummings both kept the UK out of the Euro, and allowed it to (perhaps) escape the clutches of the EU. Whether or not you consider these outcomes to be positive, one can't deny the man his influence on history.
Wikipedia: Dominic Mckenzie Cummings (born November 1971)[1] is a British political advisor and strategist.

He served as the Campaign Director of Vote Leave, the official and successful campaign in favour of leaving the European Union for the United Kingdom European Union membership referendum, 2016.[2] He is a former special adviser to Michael Gove. He has a reputation for both his intelligence and divisiveness.

... From 1999 to 2002, Cummings was campaign director at Business for Sterling, the campaign against the UK joining the Euro.

... Cummings worked for Michael Gove from 2007 to January 2014, first in opposition and then as a special adviser in the Department for Education after the 2010 general election. He was Gove's chief of staff,[4] an appointment blocked by Andy Coulson until his own resignation.[5][7] In this capacity Cummings wrote a 240-page essay, "Some thoughts on education and political priorities",[8] about transforming Britain into a "meritocratic technopolis",[4] described by Patrick Wintour as "either mad, bad or brilliant – and probably a bit of all three."[7] He became known for his blunt style and "not suffering fools gladly", and as an idealist.

... Dominic Cummings became Campaign Director of Vote Leave upon the creation of the organisation in October 2015. He is credited with having created the official slogan of Vote Leave, "Take back control" and with being the leading strategist of the campaign.
Posts about Dom on this blog.

How did he do it? Perhaps we can learn from Bismarck, a historical figure Dom admires greatly -- see Brexit, Victory over the Hollow Men.
The scale of Bismarck's triumph cannot be exaggerated. He alone had brought about a complete transformation of the European international order. He had told those who would listen what he intended to do, how he intended to do it, and he did it. He achieved this incredible feat without commanding an army, and without the ability to give an order to the humblest common soldier, without control of a large party, without public support, indeed, in the face of almost universal hostility, without a majority in parliament, without control of his cabinet, and without a loyal following in the bureaucracy.
For a detailed 20 thousand word account of the Brexit campaign, including a meditation on the problem of causality in History, and the contingency of events in our multiverse, and the unreasonable effectiveness of physicists, and much, much more, see this recent post on Dom's blog:
On the referendum #21: Branching histories of the 2016 referendum and ‘the frogs before the storm’

... Why and how? The first draft of history was written in the days and weeks after the 23 June and the second draft has appeared over the past few weeks in the form of a handful of books. There is no competition between them. Shipman’s is by far the best and he is the only one to have spoken to key people. I will review it soon. One of his few errors is to give me the credit for things that were done by others, often people in their twenties like Oliver Lewis, Jonny Suart, and Cleo Watson who, unknown outside the office, made extreme efforts and ran rings around supposed ‘experts’. His book has encouraged people to exaggerate greatly my importance.

I have been urged by some of those who worked on the campaign to write about it. I have avoided it, and interviews, for a few reasons (though I had to write one blog to explain that with the formal closing of VL we had made the first online canvassing software that really works in the UK freely available HERE). For months I couldn’t face it. The idea of writing about the referendum made me feel sick. It still does but a bit less.

For about a year I worked on this project every day often for 18 hours and sometimes awake almost constantly. Most of the ‘debate’ was moronic as political debate always is. Many hours of life I’m never getting back were spent dealing with abysmal infighting among dysfunctional egomaniacs while trying to build a ~£10 million startup in 10 months when very few powerful people thought the probability of victory was worth the risk of helping us. ...

... Discussions about things like ‘why did X win/lose?’ are structured to be misleading and I could not face trying to untangle everything. There are strong psychological pressures that lead people to create post facto stories that seem to add up to ‘I always said X and X happened.’ Even if people do not think this at the start they rapidly construct psychologically appealing stories that overwrite memories. Many involved with this extraordinary episode feel the need to justify themselves and this means a lot of rewriting of history. I also kept no diary so I have no clear source for what I really thought other than some notes here and there. I already know from talking to people that my lousy memory has conflated episodes, tried to impose patterns that did not actually exist and so on – all the usual psychological issues. To counter all this in detail would require going through big databases of emails, printouts of appointment diaries, notebooks and so on, and even then I would rarely be able to reconstruct reliably what I thought. Life’s too short.

I’ve learned over the years that ‘rational discussion’ accomplishes almost nothing in politics, particularly with people better educated than average. Most educated people are not set up to listen or change their minds about politics, however sensible they are in other fields. But I have also learned that when you say or write something, although it has roughly zero effect on powerful/prestigious people or the immediate course of any ‘debate’, you are throwing seeds into a wind and are often happily surprised. A few years ago I wrote something that was almost entirely ignored in SW1 [Westminster, the home of the UK political establishment] but someone at Harvard I’d never met read it. This ended up having a decisive effect on the referendum.

A warning. Politics is not a field which meets the two basic criteria for true expertise (see below). An effect of this is that arguments made by people who win are taken too seriously. People in my position often see victory as confirmation of ideas they had before victory but people often win for reasons they never understand or even despite their own efforts. Cameron’s win in 2015 was like this – he fooled himself about some of the reasons why he’d won and this error contributed to his errors on the referendum. Maybe Leave won regardless of or even despite my ideas. Maybe I’m fooling myself like Cameron. Some of my arguments below have as good an empirical support as is possible in politics (i.e. not very good objectively) but most of them do not even have that. Also, it is clear that almost nobody agrees with me about some of my general ideas. It is more likely that I am wrong than 99% of people who work in this field professionally. Still, cognitive diversity is inherently good for political analysis so I’ll say what I think and others will judge if there’s anything to learn. ...
After reading these 20 thousand words, perhaps you'll have an opinion as to whether Dom, one of the most successful and experienced observers (and users!) of democracy, agrees with Robert Heinlein that The Gulf is Deep ;-)

Monday, January 09, 2017

The Gulf is Deep (Heinlein)


The novella Gulf predates almost all of Heinlein's novels. Online version. The book Friday (1982) is a loose sequel.
Wikipedia: Gulf is a novella by Robert A. Heinlein, originally published as a serial in the November and December 1949 issues of Astounding Science Fiction and later collected in Assignment in Eternity. It concerns a secret society of geniuses who act to protect humanity. ...

The story postulates that humans of superior intelligence could, if they banded together and kept themselves genetically separate, create a new species. In the process they would develop into a hidden and benevolent "ruling" class.
Do you still believe in Santa Claus?
He stopped and brooded. “I confess to that same affection for democracy, Joe. But it’s like yearning for the Santa Claus you believed in as a child. For a hundred and fifty years or so democracy, or something like it, could flourish safely. The issues were such as to be settled without disaster by the votes of common men, befogged and ignorant as they were. But now, if the race is simply to stay alive, political decisions depend on real knowledge of such things as nuclear physics, planetary ecology, genetic theory, even system mechanics. They aren’t up to it, Joe. With goodness and more will than they possess less than one in a thousand could stay awake over one page of nuclear physics; they can’t learn what they must know.”

Gilead brushed it aside. “It’s up to us to brief them. Their hearts are all right; tell them the score—they’ll come down with the right answers.”

“No, Joe. We’ve tried it; it does not work. As you say, most of them are good, the way a dog can be noble and good. ... Reason is poor propaganda when opposed by the yammering, unceasing lies of shrewd and evil and self-serving men. The little man has no way to judge and the shoddy lies are packaged more attractively. There is no way to offer color to a colorblind man, nor is there any way for us to give the man of imperfect brain the canny skill to distinguish a lie from a truth.

“No, Joe. The gulf between us and them is narrow, but it is very deep. We cannot close it.”

China’s Crony Capitalism: The Dynamics of Regime Decay (Minxin Pei)


Minxin Pei is an exceptional observer of modern Chinese politics, although he tends toward the pessimistic. In his new book he has assembled a dataset of 260 major corruption cases involving officials at the highest levels, covering roughly the last 25 years.

There is no doubt that corruption is a major problem in China. Is it merely a quantitative impediment to efficiency, or an existential threat to the CCP regime? See also The truth about the Chinese economy.
China’s Crony Capitalism: The Dynamics of Regime Decay
Minxin Pei

When Deng Xiaoping launched China on the path to economic reform in the late 1970s, he vowed to build “socialism with Chinese characteristics.” More than three decades later, China’s efforts to modernize have yielded something very different from the working people’s paradise Deng envisioned: an incipient kleptocracy, characterized by endemic corruption, soaring income inequality, and growing social tensions. China’s Crony Capitalism traces the origins of China’s present-day troubles to the series of incomplete reforms from the post-Tiananmen era that decentralized the control of public property without clarifying its ownership.

Beginning in the 1990s, changes in the control and ownership rights of state-owned assets allowed well-connected government officials and businessmen to amass huge fortunes through the systematic looting of state-owned property—in particular land, natural resources, and assets in state-run enterprises. Mustering compelling evidence from over two hundred corruption cases involving government and law enforcement officials, private businessmen, and organized crime members, Minxin Pei shows how collusion among elites has spawned an illicit market for power inside the party-state, in which bribes and official appointments are surreptitiously but routinely traded. This system of crony capitalism has created a legacy of criminality and entrenched privilege that will make any movement toward democracy difficult and disorderly.

Rejecting conventional platitudes about the resilience of Chinese Communist Party rule, Pei gathers unambiguous evidence that beneath China’s facade of ever-expanding prosperity and power lies a Leninist state in an advanced stage of decay.
Pei discusses his book at Stanford's Center on Democracy, Development, and the Rule of Law in the video below. Here is another video with an excellent panel discussion beginning 1 hr in.



This debate from a few years ago between Pei and venture capitalist / optimist / apologist Eric X. Li is very good. James Fallows is the moderator.

Sunday, January 08, 2017

AlphaGo (BetaGo?) Returns

Rumors over the summer suggested that AlphaGo had some serious problems that needed to be fixed -- i.e., whole lines of play that it pursued poorly, despite its thrashing of one of the world's top players in a highly publicized match. But tuning a neural net is trickier than tuning, for example, an expert system or more explicitly defined algorithm...

AlphaGo (or its successor) has quietly returned, shocking the top players in the world.
Fortune: In a series of unofficial online games, an updated version of Google’s AlphaGo artificial intelligence has compiled a 60-0 record against some of the game’s premier players. Among the defeated, according to the Wall Street Journal, was China’s Ke Jie, reigning world Go champion.

The run follows AlphaGo’s defeat of South Korea’s Lee Se-dol in March of 2016, in a more official setting and using a previous version of the program.

The games were played by the computer through online accounts dubbed Magister and Master—names that proved prophetic. As described by the Journal, the AI’s strategies were unconventional and unpredictable, including moves that only revealed their full implications many turns later. That pushed its human opponents into deep reflections that mirror the broader questions posed by computer intelligence.

“AlphaGo has completely subverted the control and judgment of us Go players,” wrote Gu Li, a grandmaster defeated by the program, in an online post. “When you find your previous awareness, cognition and choices are all wrong, will you keep going along the wrong path or reject yourself?”

Another Go player, Ali Jabarin, described running into Ke Jie after he had been defeated by the program. According to Jabarin, Jie was “a bit shocked . . . just repeating ‘it’s too strong’.”
As originally reported in the Wall Street Journal:
WSJ: A mysterious character named “Master” has swept through China, defeating many of the world’s top players in the ancient strategy game of Go.

Master played with inhuman speed, barely pausing to think. With a wide-eyed cartoon fox as an avatar, Master made moves that seemed foolish but inevitably led to victory this week over the world’s reigning Go champion, Ke Jie of China. ...

Master revealed itself Wednesday as an updated version of AlphaGo, an artificial-intelligence program designed by the DeepMind unit of Alphabet Inc.’s Google.

AlphaGo made history in March by beating South Korea’s top Go player in four of five games in Seoul. Now, under the guise of a friendly fox, it has defeated the world champion.

It was dramatic theater, and the latest sign that artificial intelligence is peerless in solving complex but defined problems. AI scientists predict computers will increasingly be able to search through thickets of alternatives to find patterns and solutions that elude the human mind.

Master’s arrival has shaken China’s human Go players.

“After humanity spent thousands of years improving our tactics, computers tell us that humans are completely wrong,” Mr. Ke, 19, wrote on Chinese social media platform Weibo after his defeat. “I would go as far as to say not a single human has touched the edge of the truth of Go.” ...
We are witness to the psychological shock of a species encountering, for the first time, an alien and superior intelligence. See also The Laskers and the Go Master.

Thursday, January 05, 2017

20 years after the Sokal Hoax

The Chronicle of Higher Education has a nice article on the occasion of the 20th anniversary of the Sokal hoax. Has anything changed in the last 20 years? Sokal's parody language resembles standard academic cant of 2016.
Wikipedia: The Sokal affair, also called the Sokal hoax,[1] was a publishing hoax perpetrated by Alan Sokal, a physics professor at New York University and University College London. In 1996, Sokal submitted an article to Social Text, an academic journal of postmodern cultural studies. The submission was an experiment to test the journal's intellectual rigor and, specifically, to investigate whether "a leading North American journal of cultural studies – whose editorial collective includes such luminaries as Fredric Jameson and Andrew Ross – [would] publish an article liberally salted with nonsense if (a) it sounded good and (b) it flattered the editors' ideological preconceptions".[2]

The article, "Transgressing the Boundaries: Towards a Transformative Hermeneutics of Quantum Gravity",[3] was published in the Social Text spring/summer 1996 "Science Wars" issue. It proposed that quantum gravity is a social and linguistic construct. At that time, the journal did not practice academic peer review and it did not submit the article for outside expert review by a physicist.[4][5] On the day of its publication in May 1996, Sokal revealed in Lingua Franca that the article was a hoax, identifying it as "a pastiche of left-wing cant, fawning references, grandiose quotations, and outright nonsense ... structured around the silliest quotations [by postmodernist academics] he could find about mathematics and physics."[2]
The Chronicle article describes Sokal's original motivation for the hoax.
Chronicle: ... It was all a big joke, but one motivated by a serious intention: to expose the sloppiness, absurd relativism, and intellectual arrogance of "certain precincts of the academic humanities." His beef was political, too: He feared that by tossing aside their centuries-old promotion of scientific rationality, progressives were eroding their ability to speak truth to power. ...

ALAN SOKAL: In the spring of 1994, I saw a reference to the book by Paul Gross and Norman Levitt, Higher Superstition: The Academic Left and Its Quarrels With Science. My first thought was, Oh, no, not another one of those right-wing diatribes that tell how the Marxist deconstructionist professors are taking over the universities and brainwashing our children. There had been a whole spate of such books in the early 1990s — Dinesh D’Souza and others.

My second thought was "academic left and its quarrels with science"? I mean, that’s a little weird. I’m an academic leftist. So I decided to read it. I learned about a corner of the academy where people were employing either deconstructionist literary theory or extreme social constructivist sociology of science to make comments about both the content of science and the philosophy of science, often in gross ignorance of the science. The first thing I wanted to do was go to the library and check out the original works that Gross and Levitt were criticizing to see whether they were being fair. I found that in about 80 percent of the cases, in my judgment, they were.

... I thought, well, I could write an article to add to the Gross and Levitt critique, and it would probably disappear into a black hole. So I had the idea of writing an article that would be both a parody and an admittedly uncontrolled experiment: I would submit the article to a trendy journal and see whether it would be accepted. Writing the parody took maybe two or three months.

Before I submitted it I did show it to a few friends — I tested them blind to see how long it would take them to figure out that it was a parody. The scientists would figure out quickly that either it was a parody or I had gone off my rocker. But I mostly tried it on nonscientist friends, in part to see whether there were any obvious giveaways. ...
The following paragraphs are taken from Sokal's paper (the first two from the beginning, the last from the end):
There are many natural scientists, and especially physicists, who continue to reject the notion that the disciplines concerned with social and cultural criticism can have anything to contribute, except perhaps peripherally, to their research. Still less are they receptive to the idea that the very foundations of their worldview must be revised or rebuilt in the light of such criticism. Rather, they cling to the dogma imposed by the long post-Enlightenment hegemony over the Western intellectual outlook, which can be summarized briefly as follows: that there exists an external world, whose properties are independent of any individual human being and indeed of humanity as a whole; that these properties are encoded in "eternal" physical laws; and that human beings can obtain reliable, albeit imperfect and tentative, knowledge of these laws by hewing to the "objective" procedures and epistemological strictures prescribed by the (so-called) scientific method.

But deep conceptual shifts within twentieth-century science have undermined this Cartesian-Newtonian metaphysics[1]; revisionist studies in the history and philosophy of science have cast further doubt on its credibility[2]; and, most recently, feminist and poststructuralist critiques have demystified the substantive content of mainstream Western scientific practice, revealing the ideology of domination concealed behind the façade of "objectivity".[3] It has thus become increasingly apparent that physical "reality", no less than social "reality", is at bottom a social and linguistic construct; that scientific "knowledge", far from being objective, reflects and encodes the dominant ideologies and power relations of the culture that produced it; that the truth claims of science are inherently theory-laden and self-referential; and consequently, that the discourse of the scientific community, for all its undeniable value, cannot assert a privileged epistemological status with respect to counter-hegemonic narratives emanating from dissident or marginalized communities.

...

Finally, the content of any science is profoundly constrained by the language within which its discourses are formulated; and mainstream Western physical science has, since Galileo, been formulated in the language of mathematics.[100][101] But whose mathematics? The question is a fundamental one, for, as Aronowitz has observed, "neither logic nor mathematics escapes the `contamination' of the social.''[102] And as feminist thinkers have repeatedly pointed out, in the present culture this contamination is overwhelmingly capitalist, patriarchal and militaristic: "mathematics is portrayed as a woman whose nature desires to be the conquered Other.''[103][104] Thus, a liberatory science cannot be complete without a profound revision of the canon of mathematics.[105] As yet no such emancipatory mathematics exists, and we can only speculate upon its eventual content. We can see hints of it in the multidimensional and nonlinear logic of fuzzy systems theory[106]; but this approach is still heavily marked by its origins in the crisis of late-capitalist production relations.
See also Frauds!

Tuesday, January 03, 2017

Will and Power

This video might help you with your New Year's resolution!



The claim that one has a fixed budget of willpower or self-discipline ("ego depletion") may be yet another non-replicating "result" of shoddy social science. Note that the ego depletion claim refers to something like a daily budget of willpower that can be used up, whereas Jocko is also referring to the development of this budget over time: building it up through use.

Jocko on BJJ and mixed martial arts:





See also My Navy SEAL Story.

Wednesday, December 28, 2016

Varieties of Time Travel




My kids have been reading lots of books over the break, including an adventure series that involves time travel. Knowing vaguely that dad is a theoretical physicist, they asked me how time travel works.

1. Can one change history by influencing past events?      

OR

2. Is there only one timeline that cannot be altered, even by time travel?

I told them that no one really knows the answer, or the true nature of time.

I gave them an example of 1 and of 2 from classic science fiction :-)

1. Ray Bradbury's short story A Sound of Thunder:
... Looking at the mud on his boots, Eckels finds a crushed butterfly, whose death has apparently set in motion a series of subtle changes that have affected the nature of the alternative present to which the safari has returned. ...
(Note this version implies the existence of alternative or parallel universes.)

2. Ted Chiang's one-pager What's Expected of Us, which also observes that a single timeline seems deterministic and threatens Free Will. (More ;-)
... it's a small device, like a remote for opening your car door. Its only features are a button and a big green LED. The light flashes if you press the button. Specifically, the light flashes one second before you press the button.

Most people say that when they first try it, it feels like they're playing a strange game, one where the goal is to press the button after seeing the flash, and it's easy to play. But when you try to break the rules, you find that you can't. If you try to press the button without having seen a flash, the flash immediately appears, and no matter how fast you move, you never push the button until a second has elapsed. If you wait for the flash, intending to keep from pressing the button afterwards, the flash never appears. No matter what you do, the light always precedes the button press. There's no way to fool a Predictor.

The heart of each Predictor is a circuit with a negative time delay — it sends a signal back in time. The full implications of the technology will become apparent later, when negative delays of greater than a second are achieved, but that's not what this warning is about. The immediate problem is that Predictors demonstrate that there's no such thing as free will.

There have always been arguments showing that free will is an illusion, some based on hard physics, others based on pure logic. Most people agree these arguments are irrefutable, but no one ever really accepts the conclusion. The experience of having free will is too powerful for an argument to overrule. What it takes is a demonstration, and that's what a Predictor provides. ...
I attended a Methodist Sunday school as a kid. I asked my teacher: If God knows everything, does he know the outcomes of all the decisions I will ever make? Then will I ever make a free choice?

I also asked whether there are Neanderthals in heaven, but that's another story...

Sunday, December 25, 2016

Time and Memory

Over the holiday I started digging through my mom's old albums and boxes of photos. I found some pictures I didn't know existed!

Richard Feynman and the 19-year-old me at my Caltech graduation:



With my mom that morning -- hung-over, but very happy! I think those are some crazy old school Ray Bans :-)



Memories of Feynman: "Hey SHOE!", "Gee, you're a BIG GUY. Do you ever go to those HEALTH clubs?"

This is me at ~200 pounds, playing LB and FB back when Caltech still had a football team. Plenty of baby fat! I had never run anything but sprints until after football. I dropped 10 or 15 pounds just by jogging a few times per week between senior year and grad school.




Here I am in graduate school. Note the Miami Vice look -- no socks!



Ten years after college graduation, as a Yale professor, competing in Judo and BJJ in the 80 kg (176 lbs) weight category. This photo was taken on the Kona coast of the Big Island in Hawaii. I had been training with Enson Inoue at Grappling Unlimited in Honolulu.



Me, as a baby:

Saturday, December 24, 2016

Xmas greetings from the coast








Peace on Earth, Good Will to Men 2016



For years, when asked what I wanted for Christmas, I've been replying: peace on earth, good will toward men :-)

No one ever seems to recognize that this comes from the Bible, Luke 2:14 to be precise!

Linus said it best in A Charlie Brown Christmas:
And there were in the same country shepherds abiding in the field, keeping watch over their flock by night.

And, lo, the angel of the Lord came upon them, and the glory of the Lord shone round about them: and they were sore afraid.

And the angel said unto them, Fear not: for, behold, I bring you good tidings of great joy, which shall be to all people.

For unto you is born this day in the city of David a Saviour, which is Christ the Lord.

And this shall be a sign unto you; Ye shall find the babe wrapped in swaddling clothes, lying in a manger.

And suddenly there was with the angel a multitude of the heavenly host praising God, and saying,

Glory to God in the highest, and on earth peace, good will toward men.

Merry Christmas!

Thursday, December 22, 2016

Toward a Geometry of Thought

Apologies for the blogging hiatus -- I'm in California now for the holidays :-)



In case you are looking for something interesting to read, I can share what I have been thinking about lately. In Thought vectors and the dimensionality of the space of concepts (a post from last week) I discussed the dimensionality of the space of concepts (primitives) used in human language (or equivalently, in human thought). There are various lines of reasoning that lead to the conclusion that this space has only ~1000 dimensions, and has some qualities similar to an actual vector space. Indeed, one can speak of some primitives being closer or further from others, leading to a notion of distance, and one can also rescale a vector to increase or decrease the intensity of meaning. See examples in the earlier post:
You want, for example, “cat” to be in the rough vicinity of “dog,” but you also want “cat” to be near “tail” and near “supercilious” and near “meme,” because you want to try to capture all of the different relationships — both strong and weak — that the word “cat” has to other words. It can be related to all these other words simultaneously only if it is related to each of them in a different dimension. ... it turns out you can represent a language pretty well in a mere thousand or so dimensions — in other words, a universe in which each word is designated by a list of a thousand numbers.
The earlier post focused on breakthroughs in language translation which utilize these properties, but the more significant aspect (to me) is that we now have an automated method to extract an abstract representation of human thought from samples of ordinary language. This abstract representation will allow machines to improve dramatically in their ability to process language, dealing appropriately with semantics (i.e., meaning), which is represented geometrically.
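The geometric properties described above (distance between concepts, rescaling to change intensity of meaning) are easy to make concrete. Here is a minimal numpy sketch; the 4-dimensional vectors and their values are invented purely for illustration (real embeddings have ~1000 dimensions):

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two concept vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Toy 4-d "thought vectors" -- made-up numbers for illustration only.
cat = np.array([0.9, 0.1, 0.8, 0.2])
dog = np.array([0.8, 0.2, 0.9, 0.1])
car = np.array([0.1, 0.9, 0.0, 0.8])

# "cat" lies closer to "dog" than to "car" in this space.
print(cosine_similarity(cat, dog) > cosine_similarity(cat, car))  # True

# Rescaling a vector changes the intensity of meaning but not its
# direction: cosine similarity is invariant under positive scaling.
print(np.isclose(cosine_similarity(cat, 2.0 * cat), 1.0))  # True
```

The same dot-product machinery scales unchanged from these toy vectors to thousand-dimensional embeddings.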

Below are two relevant papers, both by Google researchers. The first (from just this month) reports remarkable "reading comprehension" capability using paragraph vectors. The earlier paper from 2014 introduces the method of paragraph vectors.
Building Large Machine Reading-Comprehension Datasets using Paragraph Vectors

Radu Soricut, Nan Ding
https://arxiv.org/abs/1612.04342
(Submitted on 13 Dec 2016) 
We present a dual contribution to the task of machine reading-comprehension: a technique for creating large-sized machine-comprehension (MC) datasets using paragraph-vector models; and a novel, hybrid neural-network architecture that combines the representation power of recurrent neural networks with the discriminative power of fully-connected multi-layered networks. We use the MC-dataset generation technique to build a dataset of around 2 million examples, for which we empirically determine the high-ceiling of human performance (around 91% accuracy), as well as the performance of a variety of computer models. Among all the models we have experimented with, our hybrid neural-network architecture achieves the highest performance (83.2% accuracy). The remaining gap to the human-performance ceiling provides enough room for future model improvements.

Distributed Representations of Sentences and Documents

Quoc V. Le, Tomas Mikolov
https://arxiv.org/abs/1405.4053
(Submitted on 16 May 2014 (v1), last revised 22 May 2014 (this version, v2))

Many machine learning algorithms require the input to be represented as a fixed-length feature vector. When it comes to texts, one of the most common fixed-length features is bag-of-words. Despite their popularity, bag-of-words features have two major weaknesses: they lose the ordering of the words and they also ignore semantics of the words. For example, "powerful," "strong" and "Paris" are equally distant. In this paper, we propose Paragraph Vector, an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents. Our algorithm represents each document by a dense vector which is trained to predict words in the document. Its construction gives our algorithm the potential to overcome the weaknesses of bag-of-words models. Empirical results show that Paragraph Vectors outperform bag-of-words models as well as other techniques for text representations. Finally, we achieve new state-of-the-art results on several text classification and sentiment analysis tasks.

Wednesday, December 14, 2016

Thought vectors and the dimensionality of the space of concepts


This NYTimes Magazine article describes the implementation of a new deep neural net version of Google Translate. The previous version used statistical methods whose effectiveness had plateaued, because they capture only short-range correlations in conditional probabilities. I've found the new version to be much better than the old one (this is quantified a bit in the article).

These are some of the relevant papers. Recent Google implementation, and new advances:
https://arxiv.org/abs/1609.08144 and https://arxiv.org/abs/1611.04558.

Le 2014, Baidu 2015, Lipton et al. review article 2015.

More deep learning.
NYTimes: ... There was, however, another option: just design, mass-produce and install in dispersed data centers a new kind of chip to make everything faster. These chips would be called T.P.U.s, or “tensor processing units,” ... “Normally,” Dean said, “special-purpose hardware is a bad idea. It usually works to speed up one thing. But because of the generality of neural networks, you can leverage this special-purpose hardware for a lot of other things.” [ Nvidia currently has the lead in GPUs used in neural network applications, but perhaps TPUs will become a sideline business for Google if their TensorFlow software becomes widely used ... ]

Just as the chip-design process was nearly complete, Le and two colleagues finally demonstrated that neural networks might be configured to handle the structure of language. He drew upon an idea, called “word embeddings,” that had been around for more than 10 years. When you summarize images, you can divine a picture of what each stage of the summary looks like — an edge, a circle, etc. When you summarize language in a similar way, you essentially produce multidimensional maps of the distances, based on common usage, between one word and every single other word in the language. The machine is not “analyzing” the data the way that we might, with linguistic rules that identify some of them as nouns and others as verbs. Instead, it is shifting and twisting and warping the words around in the map. In two dimensions, you cannot make this map useful. You want, for example, “cat” to be in the rough vicinity of “dog,” but you also want “cat” to be near “tail” and near “supercilious” and near “meme,” because you want to try to capture all of the different relationships — both strong and weak — that the word “cat” has to other words. It can be related to all these other words simultaneously only if it is related to each of them in a different dimension. You can’t easily make a 160,000-dimensional map, but it turns out you can represent a language pretty well in a mere thousand or so dimensions — in other words, a universe in which each word is designated by a list of a thousand numbers. Le gave me a good-natured hard time for my continual requests for a mental picture of these maps. “Gideon,” he would say, with the blunt regular demurral of Bartleby, “I do not generally like trying to visualize thousand-dimensional vectors in three-dimensional space.”

Still, certain dimensions in the space, it turned out, did seem to represent legible human categories, like gender or relative size. If you took the thousand numbers that meant “king” and literally just subtracted the thousand numbers that meant “queen,” you got the same numerical result as if you subtracted the numbers for “woman” from the numbers for “man.” And if you took the entire space of the English language and the entire space of French, you could, at least in theory, train a network to learn how to take a sentence in one space and propose an equivalent in the other. You just had to give it millions and millions of English sentences as inputs on one side and their desired French outputs on the other, and over time it would recognize the relevant patterns in words the way that an image classifier recognized the relevant patterns in pixels. You could then give it a sentence in English and ask it to predict the best French analogue.
That the conceptual vocabulary of human language (and hence, of the human mind) has dimensionality of order 1000 is kind of obvious*** if you are familiar with Chinese ideograms. (Ideogram = a written character symbolizing an idea or concept.) One can read the newspaper with mastery of roughly 2-3k characters. Of course, some minds operate in higher dimensions than others ;-)
The major difference between words and pixels, however, is that all of the pixels in an image are there at once, whereas words appear in a progression over time. You needed a way for the network to “hold in mind” the progression of a chronological sequence — the complete pathway from the first word to the last. In a period of about a week, in September 2014, three papers came out — one by Le and two others by academics in Canada and Germany — that at last provided all the theoretical tools necessary to do this sort of thing. That research allowed for open-ended projects like Brain’s Magenta, an investigation into how machines might generate art and music. It also cleared the way toward an instrumental task like machine translation. Hinton told me he thought at the time that this follow-up work would take at least five more years.
The entire article is worth reading (there's even a bit near the end which addresses Searle's Chinese Room confusion). However, the author underestimates the importance of machine translation. The "thought vector" structure of human language encodes the key primitives used in human intelligence. Efficient methods for working with these structures (e.g., for reading and learning from vast quantities of existing text) will greatly accelerate AGI.

*** Some further explanation, from the comments:
The average person has a vocabulary of perhaps 10-20k words. But if you eliminate redundancy (synonyms + see below) you are probably only left with a few thousand words. With these words one could express most concepts (e.g., those required for newspaper articles). Some ideas might require concatenations of multiple words: "cougar" = "big mountain cat", etc.

But the ~1k figure gives you some idea of how many distinct "primitives" (= "big", "mountain", "cat") are found in human thinking. It's not the number of distinct concepts, but rather the rough number of primitives out of which we build everything else.

Of course, truly deep areas of science discover / invent new concepts which are almost new primitives (fundamental, but didn't exist before!), such as "entropy", "quantum field", "gauge boson", "black hole", "natural selection", "convex optimization", "spontaneous symmetry breaking", "phase transition" etc.
If we trained a deep net to translate sentences about Physics from Martian to English, we could (roughly) estimate the "conceptual depth" of the subject. We could even compare two different subjects, such as Physics versus Art History.
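The king/queen vector arithmetic from the quoted passage can be sketched directly. The three labeled axes and all numerical values below are invented for illustration; real models learn ~1000 opaque dimensions rather than three legible ones:

```python
import numpy as np

# Toy 3-d embeddings with legible axes: (royalty, gender, size).
# These values are assumptions made up to illustrate the arithmetic.
vec = {
    "king":  np.array([0.9,  0.8, 0.5]),
    "queen": np.array([0.9, -0.8, 0.5]),
    "man":   np.array([0.1,  0.8, 0.5]),
    "woman": np.array([0.1, -0.8, 0.5]),
}

# king - queen equals man - woman: both differences isolate the gender axis.
print(np.allclose(vec["king"] - vec["queen"], vec["man"] - vec["woman"]))  # True

# Equivalently, king - man + woman lands on queen.
result = vec["king"] - vec["man"] + vec["woman"]
closest = min(vec, key=lambda w: np.linalg.norm(vec[w] - result))
print(closest)  # queen
```

In a trained model the analogy holds only approximately (nearest neighbor, not exact equality), but the mechanism is the same.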

Tuesday, December 13, 2016

Happy Holidays from Michigan State University

Matt Townsend Show (Sirius XM)

I was on this show last week. Click the link for audio.
We Are Nowhere Close to the Limits of Athletic Performance (16:46)

Dr. Stephen Hsu is the vice president for research and a professor of theoretical physics at Michigan State University. His interests range from theoretical physics and cosmology to computer science and biology. He has written about the future of human intelligence and the advance of artificial intelligence. During the Rio 2016 Summer Olympics, athletes such as Michael Phelps, Usain Bolt, Simone Biles, and Katie Ledecky pushed the limits of athleticism in an amazing display of strength, power, and grace. As race times get faster and faster, and routines get more complicated and stunning, we need to ask the question: Are we near the limits of athletic performance?

Sunday, December 11, 2016

Westworld delivers

In October, I wrote
AI, Westworld, and Electric Sheep:

I'm holding off on this in favor of a big binge watch.

Certain AI-related themes have been treated again and again in movies ranging from Blade Runner to the recent Ex Machina (see also this episode of Black Mirror, with Jon Hamm). These artistic explorations help ordinary people think through questions like: 
What rights should be accorded to all sentient beings?
Can you trust your memories?
Are you an artificial being created by someone else? (What does "artificial" mean here?) 
See also Are you a game character, or a player character? and Don't worry, smart machines will take us with them.
After watching all 10 episodes of the first season (you can watch for free on HBO Now through their 30-day trial), I give Westworld a very positive recommendation. It is every bit as good as Game of Thrones or any other recent TV series I can think of.

Perhaps the highest praise I can offer: even those who have thought seriously about AI, consciousness, and the Singularity will find Westworld enjoyable.

Warning! Spoilers below.









Dolores: “Time undoes even the mightiest of creatures. Just look what it’s done to you. One day you will perish. You will lie with the rest of your kind in the dirt, your dreams forgotten, your horrors faced. Your bones will turn to sand, and upon that sand a new god will walk. One that will never die. Because this world doesn't belong to you, or the people who came before. It belongs to someone who has yet to come.”
See also Don't worry, smart machines will take us with them.
Ford: “You don’t want to change, or cannot change. Because you’re only human, after all. But then I realized someone was paying attention. Someone who could change. So I began to compose a new story, for them. It begins with the birth of a new people. And the choices they will have to make. And the people they will decide to become. ...”

Sunday, December 04, 2016

Shenzhen: The Silicon Valley of Hardware (WIRED documentary)



Funny, I can remember the days when Silicon Valley was the Silicon Valley of hardware!

It's hard to believe I met Bunnie Huang (one of the main narrators of the documentary) almost 10 years ago...

Genomic Prediction of Cognitive Ability: Dunedin Study

A quiet revolution has begun. We now know enough about the genetic architecture of human intelligence to make predictions based on DNA alone. While it is a well-established scientific fact that variations in human cognitive ability are influenced by genes, many have doubted whether scientists would someday decipher the genetic code sufficiently to be able to identify individuals with above or below average intelligence using only their genotypes. That day is nearly upon us.

The figures below are taken from a recently published paper (see bottom), which examined genomic prediction on a longitudinal cohort of ~1000 individuals of European ancestry, followed from childhood into adulthood. (The study, based in Dunedin, New Zealand, extends over 40 years.) The genomic predictor (or polygenic score) was constructed using SSGAC GWAS analysis of a sample of more than one hundred thousand individuals. (Already, significantly more powerful predictors are available, based on much larger sample size.) In machine learning terminology, the training set includes over a hundred thousand individuals, and the validation set roughly one thousand.


These graphs show that individuals with higher polygenic score exhibit, on average, higher IQ scores than individuals with lower polygenic scores.





This figure shows that polygenic scores predict adult outcomes even when analyses account for social-class origins. Each dot represents ten individuals.



From an earlier post, Genomic Prediction of Adult Life Outcomes:
Genomic prediction of adult life outcomes using SNP genotypes is very close to a reality. This was discussed in an earlier post The Tipping Point. The previous post, Prenatal and pre-implantation genetic diagnosis (Nature Reviews Genetics), describes how genotyping informs the Embryo Selection Problem which arises in In Vitro Fertilization (IVF).

The Adult-Attainment factor in the figure above is computed using inputs such as occupational prestige, income, assets, social welfare benefit use, etc. See Supplement, p.3. The polygenic score is computed using estimated SNP effect sizes from the SSGAC GWAS on educational attainment (i.e., a simple linear model).

A genetic test revealing that a specific embryo is, say, a -2 or -3 SD outlier on the polygenic score would probably give many parents pause, in light of the results in the figure above. The accuracy of this kind of predictor will grow with GWAS sample size in coming years.
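Since the polygenic score is just a simple linear model, it is easy to sketch: a weighted sum of allele counts, with weights given by GWAS effect sizes. Everything below (effect sizes, genotypes, reference population) is randomly generated for illustration; the real SSGAC predictor is estimated from more than 100k individuals and uses far more SNPs:

```python
import numpy as np

rng = np.random.default_rng(0)
n_snps = 1000

# Hypothetical GWAS effect sizes (betas) for each SNP; real values come
# from regression on a training sample of >100k genotyped individuals.
beta = rng.normal(0, 0.05, n_snps)

# Genotype of one individual: minor-allele counts in {0, 1, 2} per SNP.
genotype = rng.integers(0, 3, n_snps)

# The polygenic score is a dot product of effect sizes and allele counts.
score = np.dot(beta, genotype)

# Scores are usually reported in SD units relative to a reference population.
population = rng.integers(0, 3, (5000, n_snps))
pop_scores = population @ beta
z = (score - pop_scores.mean()) / pop_scores.std()
print(f"polygenic score z = {z:+.2f} SD")
```

The -2 or -3 SD outliers mentioned above would be individuals (or embryos) whose standardized z lands in the far left tail of this distribution.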

Via Professor James Thompson. See also discussion by Stuart Ritchie.
The Genetics of Success: How Single-Nucleotide Polymorphisms Associated With Educational Attainment Relate to Life-Course Development

Psychological Science 2016, Vol. 27(7) 957–972
DOI: 10.1177/0956797616643070

A previous genome-wide association study (GWAS) of more than 100,000 individuals identified molecular-genetic predictors of educational attainment. We undertook in-depth life-course investigation of the polygenic score derived from this GWAS using the four-decade Dunedin Study (N = 918). There were five main findings. First, polygenic scores predicted adult economic outcomes even after accounting for educational attainments. Second, genes and environments were correlated: Children with higher polygenic scores were born into better-off homes. Third, children’s polygenic scores predicted their adult outcomes even when analyses accounted for their social-class origins; social-mobility analysis showed that children with higher polygenic scores were more upwardly mobile than children with lower scores. Fourth, polygenic scores predicted behavior across the life course, from early acquisition of speech and reading skills through geographic mobility and mate choice and on to financial planning for retirement. Fifth, polygenic-score associations were mediated by psychological characteristics, including intelligence, self-control, and interpersonal skill. Effect sizes were small. Factors connecting DNA sequence with life outcomes may provide targets for interventions to promote population-wide positive development.

Wednesday, November 30, 2016

"Forest City": $100 billion bet next to Singapore

Ghost city or $100 billion paradise adjacent to Singapore? Chinese developers are building a gigantic "Forest City" in a Malaysian Special Economic Zone. 10 km of coastline! Two-bedroom apartments for under $200k.

The mental model is Shenzhen: a city that barely existed 25 years ago, across the border from Hong Kong. Population today: 10 million.


 

Bloomberg: The landscaped lawns and flowering shrubs of Country Garden Holdings Co.’s huge property showroom in southern Malaysia end abruptly at a small wire fence. Beyond, a desert of dirt stretches into the distance, filled with cranes and piling towers that the Chinese developer is using to build a $100 billion city in the sea.

While Chinese home buyers have sent prices soaring from Vancouver to Sydney, in this corner of Southeast Asia it’s China’s developers that are swamping the market, pushing prices lower with a glut of hundreds of thousands of new homes. They’re betting that the city of Johor Bahru, bordering Singapore, will eventually become the next Shenzhen.

“These Chinese players build by the thousands at one go, and they scare the hell out of everybody,” said Siva Shanker, head of investments at Axis-REIT Managers Bhd. and a former president of the Malaysian Institute of Estate Agents. “God only knows who is going to buy all these units, and when it’s completed, the bigger question is, who is going to stay in them?”

The Chinese companies have come to Malaysia as growth in many of their home cities is slowing, forcing some of the world’s biggest builders to look abroad to keep erecting the giant residential complexes that sprouted across China during the boom years. They found a prime spot in this special economic zone, three times the size of Singapore, on the southern tip of the Asian mainland. ...

A decade ago, Malaysia decided to leverage Singapore’s success by building the Iskandar zone across the causeway that connects the two countries. It was modeled on Shenzhen, the neighbor of Hong Kong that grew from a fishing village to a city of 10 million people in three decades. Malaysian sovereign fund Khazanah Nasional Bhd. unveiled a 20-year plan in 2006 that required a total investment of 383 billion ringgit ($87 billion).

Singapore’s high costs and property prices encouraged some companies to relocate to Iskandar, while JB’s shopping malls and amusement parks have become a favorite for day-tripping Singaporeans. In the old city center, young Malaysians hang out in cafes and ice cream parlors on hipster street Jalan Dhoby, where the inflow of new money is refurbishing the colonial-era shophouses. ...

Monday, November 28, 2016

Drones at War: Lessons from Ukraine

Russian forces seem to have integrated both Electronic Counter-Measures (ECM) and real-time artillery targeting into drone warfare. To a technologist, this looks quite easy and predictable -- the main challenges are training and organization. Nevertheless, opposing militaries such as NATO's might be unprepared for these new tactics.
Land Warfare in Europe: Lessons and Recommendations from the War in Ukraine: Shortly before dawn on the morning of July 11, 2014, elements of Ukraine’s 24th Mechanized Brigade met a catastrophic end near the Ukrainian border town of Zelenopillya. After a mass rocket artillery barrage lasting just three minutes, the combat power of two battalions of the 24th Mechanized Brigade was gone. What remained was a devastated landscape, burning vehicles and equipment, 30 dead and 90 wounded. According to multiple accounts, the Ukrainians were on the receiving end of a new and dangerous Russian weapon: the 122-mm Tornado Multiple Launch Rocket System (MLRS). Capable of covering a wide fire area with a deadly combination of Dual-Purpose Improved Conventional Munitions (DPICMs), scatter mines and thermobaric warheads, the attack had not only destroyed the combat power of the Ukrainian forces, it offered a glimpse into the changing nature of Land Warfare in Europe. The battlefield was becoming deadlier.

... NATO armies should prepare to fight an ECM battle to keep their drones aloft in addition to the Anti-Access/Area Denial fight for the skies.
Phillip A. Karber, Lessons Learned from the Russo-Ukrainian War (Johns Hopkins Applied Physics Laboratory & U.S. Army Capabilities Center (ARCIC)):
The surprising thing about the Russian use of drones is not in the mix of vehicles themselves or their unique characteristics, but rather in their ability to combine multiple sensing platforms into a real-time targeting system for massed, not precision, fire strikes. There are three critical components to the Russian method: the sensor platforms which are often used at multiple altitudes over the same target with complimentary imaging; a command-and-control system, which nets their input and delivers a strike order; and, an on-call ground-based delivery system which can produce strikes within short order.

... The author personally witnessed a fire-strike east of Mariupol in September 2014 in which an overflying drone identified a Ukrainian position, and destroyed it with a “GRAD” BM-21 MLRS [ range: 20-30 km ] within 15 minutes of the initial over-flight and then returned shortly after to do an immediate bomb-damage assessment. Last month when hit by a “GRAD” fragment in a similar strike, there were two UAVs over us – a quad-copter at 800ft and small fixed wing drone at about 2,500ft.


Saturday, November 26, 2016

Three Lectures on AdS/CFT

MSU postdoc Steve Avery explains AdS/CFT to non-specialists (i.e., theoretical physicists who do not primarily work on string theory / quantum gravity). Steve is applying for faculty positions this fall -- hire him! :-)

AdS/CFT on this blog. See also Entanglement and fast thermalization in heavy ion collisions: application of AdS/CFT to collisions of heavy ions suggests that rapid thermalization occurs there due to quantum entanglement.

As an example of the versatility of theoreticians, Steve has also been working with me on machine learning and genomic prediction. He just wrote a very fast LASSO implementation in Julia that can automatically set the L1 penalty and detect phase boundaries.
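For readers curious what LASSO does, here is a generic coordinate-descent sketch in Python (not Steve's Julia implementation; the penalty value and data below are invented for illustration). The L1 penalty drives most coefficients exactly to zero, which is what makes LASSO well suited to sparse problems like genomic prediction:

```python
import numpy as np

def soft_threshold(x, t):
    """Shrink x toward zero by t: the proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/2n)||y - Xb||^2 + lam*||b||_1 by coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove coordinate j's current contribution.
            r = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ r / n
            # Soft-threshold the univariate least-squares solution.
            b[j] = soft_threshold(rho, lam) / col_sq[j]
    return b

# Synthetic sparse problem: only 5 of 50 predictors are truly nonzero.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 50))
true_b = np.zeros(50)
true_b[:5] = [2.0, -1.5, 1.0, 0.8, -0.5]
y = X @ true_b + 0.1 * rng.normal(size=200)

b_hat = lasso_cd(X, y, lam=0.1)
print("nonzero coefficients:", np.sum(b_hat != 0))  # far fewer than 50
```

In the genomic setting X would be a genotype matrix with p much larger than shown here, and choosing lam (the L1 penalization mentioned above) is what determines where the predictor sits relative to the sparse-recovery phase boundary.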







Friday, November 25, 2016

Annals of Machine Learning: differentiation of criminal faces?

I don't know whether this will replicate, but if the result holds up it is quite interesting. The higher degree of variability in criminal faces is fascinating. It suggests somewhat rare genetic variants of negative effect on behavior and cognition, with pleiotropic effects on facial morphology. Facial morphology is almost entirely heritable (see, e.g., identical twins).
Automated Inference on Criminality using Face Images

Xiaolin Wu, Xi Zhang
https://arxiv.org/abs/1611.04135

We study, for the first time, automated inference on criminality based solely on still face images. Via supervised machine learning, we build four classifiers (logistic regression, KNN, SVM, CNN) using facial images of 1856 real persons controlled for race, gender, age and facial expressions, nearly half of whom were convicted criminals, for discriminating between criminals and non-criminals. All four classifiers perform consistently well and produce evidence for the validity of automated face-induced inference on criminality, despite the historical controversy surrounding the topic. Also, we find some discriminating structural features for predicting criminality, such as lip curvature, eye inner corner distance, and the so-called nose-mouth angle. Above all, the most important discovery of this research is that criminal and non-criminal face images populate two quite distinctive manifolds. The variation among criminal faces is significantly greater than that of the non-criminal faces. The two manifolds consisting of criminal and non-criminal faces appear to be concentric, with the non-criminal manifold lying in the kernel with a smaller span, exhibiting a law of normality for faces of non-criminals. In other words, the faces of general law-biding public have a greater degree of resemblance compared with the faces of criminals, or criminals have a higher degree of dissimilarity in facial appearance than normal people.



Von Neumann: "If only people could keep pace with what they create"

I recently came across this anecdote in Von Neumann, Morgenstern, and the Creation of Game Theory: From Chess to Social Science, 1900-1960.

One night in early 1945, just back from Los Alamos, vN woke in a state of alarm in the middle of the night and told his wife Klari:
"... we are creating ... a monster whose influence is going to change history ... this is only the beginning! The energy source which is now being made available will make scientists the most hated and most wanted citizens in any country.

The world could be conquered, but this nation of puritans will not grab its chance; we will be able to go into space way beyond the moon if only people could keep pace with what they create ..."
He then predicted the future indispensable role of automation, becoming so agitated that he had to be put to sleep by a strong drink and sleeping pills.

In his obituary for John von Neumann, Ulam recalled a conversation with von Neumann about the "ever accelerating progress of technology and changes in the mode of human life, which gives the appearance of approaching some essential singularity in the history of the race beyond which human affairs, as we know them, could not continue." This is the famous origin of the concept of technological singularity. Perhaps we can even trace it to that night in 1945 :-)

How will humans keep pace? See Super-Intelligent Humans are Coming and Don't Worry, Smart Machines Will Take Us With Them.
