#[fit] AI 1
##[fit] Let's Start
The make-you-dangerous workshop.
And then, the make-you-super-dangerous, rigorous full course.
(for those taking the full deal...)
- your machine
- Google Colab
- Binder (coming this week)
- Formation of groups for homework and fun
- Discussion Forum across college campuses
- Educational platform
- GPU-based custom compute (for the project)
- TA mentorship and office hours
- Professor office hours
##[fit] Do not feel shy
##[fit] to ask anything
##[fit] Learning a
##[fit] 3
We want a non-linearity, as otherwise combining linear regressions just gives a big honking linear regression.
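To see why, compose two linear layers with no nonlinearity in between; the algebra (a quick sketch) collapses them into a single linear map:

$$W_2 (W_1 x + b_1) + b_2 = (W_2 W_1)\, x + (W_2 b_1 + b_2)$$

so stacking linear layers buys us nothing new.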
THEOREM:
- any one-hidden-layer net can approximate any continuous function with finite support, with an appropriate choice of nonlinearity
- but it may need lots of units
- and it will learn the function it thinks the data has, not the one you think it has
- One hidden layer: 1 vs 2 neurons
- Two hidden layers: 4 vs 8 neurons
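Here is a minimal sketch (not from the original slides) of what those demos look like in Keras: a one-hidden-layer net fitting a wiggly 1D function. The target function, unit counts, and training settings are illustrative assumptions.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# a wiggly 1D target with (approximately) finite support on [-3, 3]
x = np.linspace(-3, 3, 500).reshape(-1, 1)
y = np.sin(2 * x) * np.exp(-x ** 2 / 4)

# one hidden layer; try hidden_units = 1, 2, 4, 8 and watch the fit improve
hidden_units = 8
net = Sequential()
net.add(Dense(hidden_units, activation='tanh', input_shape=(1,)))
net.add(Dense(1))
net.compile(loss='mse', optimizer='adam')
net.fit(x, y, epochs=500, verbose=0)

y_hat = net.predict(x)  # compare against y to see how good the approximation is
```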
##[fit] How
##[fit] do we learn?
- Automatic differentiation
- GPU
- Learning Recursive Representations
Something like:
and so on.
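The slide's "Something like: ... and so on." referred to material not reproduced here; as a hedged illustration of the bullets above (compose simple transformations recursively, and let automatic differentiation produce the gradients), here is a small sketch using TensorFlow's GradientTape. Shapes, values, and the learning rate are arbitrary examples.

```python
import tensorflow as tf

# a tiny two-layer net: h = relu(W1 x + b1), y_hat = W2 h + b2
W1 = tf.Variable(tf.random.normal((4, 3)))
b1 = tf.Variable(tf.zeros((4, 1)))
W2 = tf.Variable(tf.random.normal((1, 4)))
b2 = tf.Variable(tf.zeros((1, 1)))

x = tf.constant([[1.0], [2.0], [3.0]])   # one example with 3 features
y = tf.constant([[1.0]])                 # its target

with tf.GradientTape() as tape:
    h = tf.nn.relu(W1 @ x + b1)          # first learned representation
    y_hat = W2 @ h + b2                  # the next layer builds on it, and so on
    loss = tf.reduce_mean((y_hat - y) ** 2)

# automatic differentiation: gradients of the loss w.r.t. every parameter
grads = tape.gradient(loss, [W1, b1, W2, b2])

# one step of gradient descent
for param, grad in zip([W1, b1, W2, b2], grads):
    param.assign_sub(0.01 * grad)
```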
import wandb
from wandb.keras import WandbCallback
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.utils import np_utils

# hyperparameters come from the wandb run config (the defaults below are example values)
wandb.init(config={"hidden_nodes": 100, "optimizer": "adam", "epochs": 10})
config = wandb.config

# load data and scale pixel values to [0, 1]
(X_train, y_train), (X_test, y_test) = mnist.load_data()
img_width = X_train.shape[1]
img_height = X_train.shape[2]
X_train = X_train.astype('float32')
X_train /= 255.
X_test = X_test.astype('float32')
X_test /= 255.

# one-hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
labels = list(range(10))
num_classes = y_train.shape[1]

# create model: flatten the image, one hidden relu layer, softmax over the 10 digits
model = Sequential()
model.add(Flatten(input_shape=(img_width, img_height)))
model.add(Dense(config.hidden_nodes, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=config.optimizer,
              metrics=['accuracy'])
model.summary()

# fit the model, logging example predictions as images to wandb
model.fit(X_train, y_train, validation_data=(X_test, y_test),
          epochs=config.epochs,
          callbacks=[WandbCallback(data_type="image", labels=labels)])
- pay attention to the spatial locality of images
- this is done through the use of "filters"
- thus the representations learnt are spatial and bear a mapping to reality
- and are hierarchical: later layers learn features composed from those in previous layers
- perhaps even approximating what the visual cortex does.
##[fit] Convolutional Components
Fully connected layers, 1x1 convolution layers, and regularization layers like dropout
##[fit] Convolution looks for
##[fit] patterns
Move the filter over the original image and produce a new one.
- A convolution that shrinks the image is called a "valid" convolution.
- Keeping the image the same size by zero-padding is called a "same" convolution (see the sketch after this list).
- Do we then need to know every pattern we might want to find? NO! We learn the weights.
- Now we do this hierarchically, with filters at the next layer operating on the previous layer's feature maps,
- so we hope to learn representations built up from smaller-scale representations, and so on.
- here is an example: find the LHS face in the RHS image...
- Move layer 1 filters around
- max pool 27x27 to 9x9
- x means don't care about the value
- now apply second level filter to 9x9 image
- max pool again to 3x3 image
- apply level 3 filters and see if we activate
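A hedged numpy sketch (not from the original slides) of the mechanics above: slide a 3x3 filter over an image with no padding ("valid"), then max-pool the resulting feature map, reproducing the 27x27 to 9x9 step. The image and filter values are random placeholders; in a real CNN the filter weights are learned.

```python
import numpy as np

def conv2d_valid(image, filt):
    """Slide `filt` over `image` with no padding ("valid"): the output shrinks."""
    H, W = image.shape
    fh, fw = filt.shape
    out = np.zeros((H - fh + 1, W - fw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + fh, j:j + fw] * filt)
    return out

def max_pool(fmap, size=3):
    """Non-overlapping max pooling: keep the strongest activation in each block."""
    H, W = fmap.shape
    out = np.zeros((H // size, W // size))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = fmap[i * size:(i + 1) * size, j * size:(j + 1) * size].max()
    return out

image = np.random.rand(29, 29)     # a toy 29x29 "image"
filt = np.random.rand(3, 3)        # a 3x3 filter (learned, in a real network)

fmap = conv2d_valid(image, filt)   # "valid" convolution: 29x29 -> 27x27
pooled = max_pool(fmap, size=3)    # max pool: 27x27 -> 9x9
print(fmap.shape, pooled.shape)    # (27, 27) (9, 9)
```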
If the input is (say) 26x26x6, the filters MUST have 6 channels; Conv2D(4, (3, 3)) then gives us 4 new feature maps (shape check below).
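Here is a small shape check (illustrative, not from the slides) for that statement: 4 filters, each 3x3 and spanning all 6 input channels, applied to a 26x26x6 input.

```python
from keras.models import Sequential
from keras.layers import Conv2D

demo = Sequential()
# 4 filters, each of size 3x3x6 (one 3x3 patch per input channel); padding is "valid" by default
demo.add(Conv2D(4, (3, 3), input_shape=(26, 26, 6)))
demo.summary()
# output shape: (None, 24, 24, 4) -- four new feature maps, shrunk by the valid convolution
# parameters: (3 * 3 * 6 + 1) * 4 = 220
```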
import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions: 28x28 greyscale, so a single channel
img_rows, img_cols = 28, 28
input_shape = (img_rows, img_cols, 1)

# two convolution layers, then pooling, dropout, and a small dense classifier
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
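One possible way to fit this model on the MNIST arrays loaded earlier (a hedged sketch: the reshape adds the single channel dimension the Conv2D layers expect; variable names refer to the earlier snippet):

```python
# add the channel dimension: (num_images, 28, 28) -> (num_images, 28, 28, 1)
X_train_cnn = X_train.reshape(-1, img_rows, img_cols, 1)
X_test_cnn = X_test.reshape(-1, img_rows, img_cols, 1)

model.fit(X_train_cnn, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(X_test_cnn, y_test))
```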