Table of Contents

 

Preface

 

xi

 

 

 

 

 

Chapter 1

Introduction

1

 

 

1.1   This book and the ancillary material

3

 

 

1.2   Types of machine learning models

4

 

 

1.3   Validation and testing

6

 

 

1.4   Data cleaning

14

 

 

1.5   Bayes’ theorem

16

 

 

Summary

19

 

 

Short concept questions

20

 

 

Exercises

21

 

 

 

 

 

Chapter 2

Unsupervised Learning

23

 

 

2.1   Feature scaling

24

 

 

2.2   The k-means algorithm

25

 

 

2.3   Choosing k

28

 

 

2.4   The curse of dimensionality

31

 

 

2.5   Country risk

31

 

 

2.6   Alternative clustering algorithms

35

 

 

2.7   Principal components analysis

39

 

 

Summary

43

 

 

Short concept questions

44

 

 

Exercises

45

 

 

 

 

 

Chapter 3

Supervised Learning: Linear and Logistic Regression

47

 

 

3.1   Linear regression: one feature

48

 

 

3.2   Linear regression: multiple features

49

 

 

3.3   Categorical features

52

 

 

3.4   Regularization

53

 

 

3.5   Ridge regression

54

 

 

3.6   Lasso regression

58

 

 

3.7   Elastic Net regression

60

 

 

3.8   Results for house price data

62

 

 

3.9   Logistic regression

66

 

 

3.10 Decision criteria

69

 

 

3.11 Application to credit decisions

70

 

 

3.12 The k-nearest neighbor algorithm

76

 

 

Summary

76

 

 

Short concept questions

77

 

 

Exercises

78

 

 

 

 

 

Chapter 4

Supervised Learning: Decision Trees

81

 

 

4.1   Nature of decision trees

82

 

 

4.2   Information gain measures

83

 

 

4.3   Application to credit decisions

85

 

 

4.4   The naïve Bayes classifier

91

 

 

4.5   Continuous target variables

95

 

 

4.6   Ensemble learning

98

 

 

Summary

100

 

 

Short concept questions

101

 

 

Exercises

101

 

 

 

 

 

Chapter 5

Supervised Learning: SVMs

103

 

 

5.1   Linear SVM classification

103

 

 

5.2   Modification for soft margin

109

 

 

5.3   Non-linear separation

112

 

 

5.4   Predicting a continuous variable

114

 

 

Summary

118

 

 

Short concept questions

118

 

 

Exercises

119

 

 

 

 

 

Chapter 6

Supervised Learning: Neural Networks

121

 

 

6.1   Single layer ANNs

121

 

 

6.2   Multi-layer ANNs

125

 

 

6.3   Gradient descent algorithm

126

 

 

6.4   Variations on the basic method

131

 

 

6.5   The stopping rule

133

 

 

6.6   The Black-Scholes-Merton formula

133

 

 

6.7   Extensions

137

 

 

6.8   Autoencoders

138

 

 

6.9   Convolutional neural networks

140

 

 

6.10 Recurrent neural networks

142

 

 

Summary

143

 

 

Short concept questions

144

 

 

Exercises

144

 

 

 

 

Chapter 7

Reinforcement Learning

147

 

 

7.1   The multi-armed bandit problem

148

 

 

7.2   Changing environment

152

 

 

7.3   The game of Nim

154

 

 

7.4   Temporal difference learning

157

 

 

7.5   Deep Q-learning

159

 

 

7.6   Applications

159

 

 

Summary

161

 

 

Short concept questions

162

 

 

Exercises

163

 

 

 

 

 

Chapter 8

Natural Language Processing

165

 

 

8.1   Sources of data

168

 

 

8.2   Pre-processing

169

 

 

8.3   Bag of words model

170

 

 

8.4   Application of naïve Bayes classifier

172

 

 

8.5   Application of other algorithms

176

 

 

8.6   Information retrieval

177

 

 

8.7   Other NLP applications

178

 

 

Summary

180

 

 

Short concept questions

181

 

 

Exercises

181

 

 

Chapter 9

Model Interpretability

183

 

 

9.1   Linear regression

185

 

 

9.2   Logistic regression

189

 

 

9.3   Black-box models

192

 

 

9.4   Shapley values

193

 

 

9.5   LIME

196

 

 

Summary

196

 

 

Short concept questions

197

 

 

Exercises

198

 

 

 

 

 

Chapter 10

Applications in Finance

199

 

 

10.1  Derivatives

199

 

 

10.2  Delta

202

 

 

10.3  Volatility surfaces

203

 

 

10.4  Understanding volatility surface movements

204

 

 

10.5  Using reinforcement learning for hedging

208

 

 

10.6  Extensions

210

 

 

10.7  Other finance applications

212

 

 

Summary

213

 

 

Short concept questions

214

 

 

Exercises

214

 

 

 

 

 

Chapter 11

Issues for Society

217

 

 

11.1  Data privacy

218

 

 

11.2  Biases

219

 

 

11.3  Ethics

220

 

 

11.4  Transparency

221

 

 

11.5  Adversarial machine learning

221

 

 

11.6  Legal issues

222

 

 

11.7  Man vs. machine

223

 

 

 

 

 

Answers to End of Chapter Questions

225

Glossary of Terms

243

Index

253