Interview questions https://www.analyticsvidhya.com/blog/2017/01/must-know-questions-deep-learning/
PySpark ML support: http://spark.apache.org/docs/latest/api/python/pyspark.ml.html
BIO tagging (Beginning, Inside, Outside) is applicable to NER (named entity recognition)
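A minimal sketch of how BIO tags mark entity spans for NER. The sentence, the span positions, and the labels PER and LOC are made up for illustration; the helper name bio_tags is hypothetical and not from any library.

```python
def bio_tags(tokens, entities):
    """Return one BIO tag per token.

    entities is a list of (start, end, label) token spans, end exclusive.
    """
    tags = ["O"] * len(tokens)          # everything is Outside by default
    for start, end, label in entities:
        tags[start] = f"B-{label}"      # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"      # remaining tokens of the entity
    return tags


# Hypothetical example sentence and entity spans.
tokens = ["Barack", "Obama", "visited", "New", "York", "City"]
entities = [(0, 2, "PER"), (3, 6, "LOC")]
tags = bio_tags(tokens, entities)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

The B- prefix lets two adjacent entities of the same type be told apart, which a plain inside/outside scheme cannot do.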
Knowledge bases: WordNet, conceptual, wikitext
Which of the following statements is the best description of early stopping? = Simulate the network on a test dataset after every epoch of training, and stop training when the generalization error starts to increase
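A rough sketch of the early stopping rule described above. It assumes the per-epoch loss on the held-out set is already available as a list (in practice this is usually a validation set rather than the final test set). The loss values, the patience parameter, and the function name are assumptions for illustration.

```python
def early_stopping_epoch(heldout_losses, patience=2):
    """Return (stop_epoch, best_epoch) for a list of per-epoch held-out losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs; best_epoch is where the lowest loss was seen.
    """
    best_loss = float("inf")
    best_epoch = -1
    bad_epochs = 0
    for epoch, loss in enumerate(heldout_losses):
        if loss < best_loss:
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch, best_epoch    # generalization error keeps rising
    return len(heldout_losses) - 1, best_epoch  # never triggered


# Hypothetical losses: they improve until epoch 2, then start to increase.
losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.70]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=2)
print(stop_epoch, best_epoch)
```

A patience larger than 1 avoids stopping on a single noisy epoch; the weights from best_epoch are the ones normally kept.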
Vocabulary size of the above corpus: 15
Identify the following activation function: φ(V) = Z + 1 / (1 + exp(-X * V + Y)), where Z, X, Y are parameters
Sigmoid
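To see why this is a sigmoid, here is a small sketch of the function, assuming the formula reads φ(V) = Z + 1 / (1 + exp(-X * V + Y)). With X = 1, Y = 0, Z = 0 it reduces to the standard logistic function; the default parameter values are my assumption.

```python
import math


def phi(v, x=1.0, y=0.0, z=0.0):
    """Parameterized sigmoid: x scales the slope, y shifts it, z offsets the output."""
    return z + 1.0 / (1.0 + math.exp(-x * v + y))


# With the default parameters this is the plain logistic function.
for v in (-10.0, 0.0, 10.0):
    print(v, round(phi(v), 4))
```

The output is squashed into the interval (Z, Z + 1), which is the S-shaped behaviour that identifies it as a sigmoid.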
Which of the following is not a step for word normalization: run sentences
Sequence of tasks in perceptron: 1, 4, 3, 2
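The numbered options for this question are not reproduced in these notes, so the sketch below only assumes the usual perceptron training order: initialize weights randomly, compute the output for a sample, change the weights if the prediction is wrong, then move on to the next sample. The AND dataset, learning rate, and seed are made up for illustration.

```python
import random


def predict(weights, bias, x):
    """Step activation on the weighted sum."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_perceptron(samples, lr=1.0, max_epochs=1000, seed=0):
    rng = random.Random(seed)
    # Initialize weights (and bias) randomly.
    weights = [rng.uniform(-0.5, 0.5) for _ in samples[0][0]]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in samples:               # go to the next sample
            output = predict(weights, bias, x)  # compute an output
            if output != target:                # wrong prediction: change weights
                error = target - output
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
                mistakes += 1
        if mistakes == 0:                       # a full pass with no errors
            break
    return weights, bias


# Hypothetical linearly separable data: logical AND.
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_data)
predictions = [predict(weights, bias, x) for x, _ in and_data]
print(predictions)
```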
How many times will "I" appear with "love" in the co-occurrence matrix: 2
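The corpus for this question is not in these notes, so the sketch below uses a made-up three-sentence corpus just to show how co-occurrence counts are built. The window size of 1 (adjacent words only) and whitespace tokenization are assumptions; a different window would give different counts.

```python
from collections import Counter


def cooccurrence_counts(sentences, window=1):
    """Count unordered word pairs that appear within `window` tokens of each other."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                pair = tuple(sorted((word, tokens[j])))  # unordered pair
                counts[pair] += 1
    return counts


# Hypothetical corpus, not the one from the original question.
corpus = ["I love NLP", "I love deep learning", "I enjoy flying"]
counts = cooccurrence_counts(corpus, window=1)
i_love = counts[tuple(sorted(("I", "love")))]
print(i_love)
```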
Which of these are built-in models/functions in Gensim: all options
Spell correction assumption: false
While creating a machine learning model on text data, you created a document-term matrix of the input data of 100K documents. Which of the following remedies can be used to reduce the dimensions of the data?
1. Latent Dirichlet Allocation
2. Latent Semantic Indexing
3. Keyword Normalization
Ans: 1, 2, 3
Social media platforms are the most intuitive form of text data. You are given a corpus of complete social media data of tweets. How can you create a model that suggests the hashtags?
A) Perform Topic Models to obtain the most significant words of the corpus
B) Train a Bag of Ngrams model to capture top n-grams (words and their combinations)
C) Train a word2vector model to learn repeating contexts in the sentences
D) All of these
Solution: (D)
While working with context extraction from text data, you encountered two different sentences: "The tank is full of soldiers." and "The tank is full of nitrogen." Which of the following measures can be used to solve the problem of word sense disambiguation in the sentences?
A) Compare the dictionary definition of an ambiguous word with the terms contained in its neighborhood
B) Co-reference resolution, in which one resolves the meaning of an ambiguous word with the proper noun present in the previous sentence
C) Use dependency parsing of the sentence to understand the meanings
Solution: (A)
From https://www.analyticsvidhya.com/blog/2017/07/30-questions-test-data-scientist-natural-language-processing-solution-skilltest-nlp/
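Option A is the idea behind the Lesk algorithm. Below is a simplified sketch of it for the two "tank" sentences. The two-sense dictionary, its glosses, and the stopword list are invented for illustration; a real system would use a resource such as WordNet.

```python
# Hypothetical mini dictionary: sense name -> gloss (definition).
SENSES = {
    "tank": {
        "military vehicle": "armored military vehicle carrying soldiers and a gun",
        "container": "large container for storing liquid or gas such as water or nitrogen",
    }
}
STOPWORDS = {"the", "is", "of", "a", "and", "or", "for", "as", "such", "full"}


def simplified_lesk(word, sentence):
    """Pick the sense whose gloss shares the most words with the sentence context."""
    context = {t.strip(".,").lower() for t in sentence.split()}
    context -= STOPWORDS | {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


sense_soldiers = simplified_lesk("tank", "The tank is full of soldiers.")
sense_nitrogen = simplified_lesk("tank", "The tank is full of nitrogen.")
print(sense_soldiers)
print(sense_nitrogen)
```

The neighborhood words "soldiers" and "nitrogen" each overlap with only one gloss, which is what resolves the ambiguity.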
While working with text data obtained from news sentences, which are structured in nature, which of the grammar-based text parsing techniques can be used for noun phrase detection, verb phrase detection, subject detection and object detection?
A) Part of speech tagging
B) Dependency Parsing and Constituency Parsing
C) Skip Gram and N-Gram extraction
D) Continuous Bag of Words
Solution: (B) Dependency and constituency parsing extract these relations from the text
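Types of parsing: deep parsing and shallow parsing
A ReLU unit in a neural network never gets saturated: True (it does not saturate for positive inputs, although its output and gradient are zero for negative inputs)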
When does a neural network become a deep learning model: when more hidden layers are added and the depth of the network increases
For k-fold cross-validation, a smaller k value implies less variance: true
One fast-growing language for building semantic web applications is RSS.
CRF is discriminative and HMM is generative
Hyperparameters in a neural network: learning rate, batch size and number of epochs
In Gradient Descent (Batch Gradient Descent) we use the whole training set for each parameter update, so there is one update per epoch. In Stochastic Gradient Descent we use a single training example per update, and Mini-batch Gradient Descent lies between these two extremes: each update uses a mini-batch (a small portion) of the training data. A rule of thumb is to pick a mini-batch size that is a power of 2, such as 32, 64 or 128.
Can be computationally efficient
Can be used to improve convergence
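A small sketch showing that the three gradient descent variants differ only in how many examples feed each update. It fits y = w * x on made-up data with a squared-error loss; the data, learning rate, epoch count, and function name are all assumptions for illustration. A batch size of 4 stands in for the usual 32, 64, 128 because the toy dataset has only 8 points.

```python
import random


def gradient_descent(xs, ys, batch_size, epochs=200, lr=0.01, seed=0):
    """Fit y ~ w * x by minimizing mean squared error. Returns (w, number_of_updates)."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    w, updates = 0.0, 0
    for _ in range(epochs):
        rng.shuffle(indices)                      # new example order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the mean squared error over this batch only.
            grad = sum(2 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
            w -= lr * grad
            updates += 1
    return w, updates


# Hypothetical noise-free data generated with w = 3.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [3.0 * x for x in xs]

w_batch, n_batch = gradient_descent(xs, ys, batch_size=len(xs))  # batch GD
w_mini, n_mini = gradient_descent(xs, ys, batch_size=4)          # mini-batch GD
w_sgd, n_sgd = gradient_descent(xs, ys, batch_size=1)            # stochastic GD
print(n_batch, n_mini, n_sgd)
```

Over the same number of epochs, batch gradient descent makes one update per epoch, stochastic gradient descent makes one per training example, and mini-batch sits in between.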
https://www.kdnuggets.com/2020/01/intro-guide-nlp-data-scientists.html
https://www.mygreatlearning.com/blog/nlp-interview-questions/
https://compsciedu.com/Neural-Networks/UGC-NET-computer-science-question-paper/discussion/7522