library(ngram)

text <- scan("ca10.txt", what = "char", sep = "\n") # ca10.txt is a file in the Brown corpus
text <- tolower(text)
text <- gsub("[^a-z- ]", "", text, perl = TRUE) # keep only lowercase letters, hyphens and spaces
quad <- get.phrasetable(ngram(text, n = 4))
The last line fails with the error message. I don't understand why it reports nwords=3, which is obviously untrue. My guess is that one line in the file contains only three tokens. How can I work around this issue? (BTW, I am using R 3.6.3 on Linux Mint 19.3.)
ca10.txt
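One way to test the three-token guess, and a possible workaround if it holds, is sketched below. It is only a sketch under assumptions: the sample lines are a hypothetical stand-in for the cleaned contents of ca10.txt, token counts are computed with base R rather than with anything from the ngram package, and whether `ngram()` accepts a multi-element character vector may depend on the installed package version.

```r
library(ngram)

# Hypothetical stand-in for the cleaned lines of ca10.txt
text <- c("the fulton county grand jury said friday",
          "only three tokens",
          "")

n <- 4

# Count whitespace-separated tokens per line using base R only
n_words <- lengths(strsplit(trimws(text), " +"))

# Indices of lines that are too short to yield a single n-gram
print(which(n_words < n))

# Option 1: drop the short lines, so n-grams never cross line boundaries
long_enough <- text[n_words >= n]
quad <- get.phrasetable(ngram(long_enough, n = n))

# Option 2: join all non-empty lines into one string; note that
# n-grams will then span what used to be line breaks
joined <- paste(text[n_words > 0], collapse = " ")
quad_joined <- get.phrasetable(ngram(joined, n = n))
```

Option 1 discards the short lines entirely, while option 2 keeps every token at the cost of producing n-grams that straddle line boundaries, so the choice depends on whether line breaks in the corpus file are meaningful.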