This is what I hate about this kind of hyped paper (well-known authors from top institutions/companies): they omit important details, and their explanations of the paper's key points are naive.
They say: "The contexts are fixed-length and sampled from a sliding window over the paragraph. The paragraph vector is shared across all contexts generated from the same paragraph but not across paragraphs". I understood from Figure 2 that the context is represented as word vectors (in the example, "the", "cat", and "sat"), but they don't say a word about how the paragraph matrix is built.
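As far as I can tell (this is my own reading, not spelled out in the paper), the paragraph matrix D is just a lookup table with one learned column per paragraph id, trained alongside the word matrix W. A toy numpy sketch of the PV-DM forward pass under that assumption (all names and sizes are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3}
V, dim, n_paragraphs = len(vocab), 8, 2

# W: one column per word; D: one column per paragraph (my assumption:
# D is an id -> vector lookup table, trained just like W).
W = rng.normal(scale=0.1, size=(dim, V))
D = rng.normal(scale=0.1, size=(dim, n_paragraphs))
U = rng.normal(scale=0.1, size=(V, dim))  # softmax weights
b = np.zeros(V)

def forward(paragraph_id, context_words):
    # Average the paragraph vector with the context word vectors
    # (the paper also mentions concatenation; averaging keeps this simple).
    ctx = W[:, [vocab[w] for w in context_words]].sum(axis=1)
    h = (D[:, paragraph_id] + ctx) / (1 + len(context_words))
    logits = U @ h + b
    p = np.exp(logits - logits.max())
    return p / p.sum()  # predicted distribution over the next word

probs = forward(0, ["the", "cat", "sat"])
print(probs.shape)  # -> (4,)
```

The key point, if my reading is right, is that the same column of D is reused for every sliding-window context drawn from that paragraph, which is what "shared across all contexts generated from the same paragraph" would mean.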
In the last paragraph of Section 2.2 they say: "'the inference stage' to get paragraph vectors D for new paragraphs (never seen before) by adding more columns in D and gradient descending on D while holding W, U, b fixed. We use D to make a prediction about some particular labels using a standard classifier, e.g., logistic regression." But how are new columns added to D? And why do they apply gradient descent to build the matrix D at the inference stage?
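From what I can piece together (my own reconstruction, not the authors' code), "adding a column" just means allocating a fresh random vector for the new paragraph, and gradient descent is needed because that vector is the only unknown: W, U, b stay frozen, and the new column is fit so the frozen model predicts the paragraph's words well. A rough numpy sketch of that inference loop:

```python
import numpy as np

rng = np.random.default_rng(1)
V, dim = 4, 8  # toy vocabulary size and embedding size

# Frozen parameters learned at training time (random here, for the sketch).
W = rng.normal(scale=0.1, size=(dim, V))
U = rng.normal(scale=0.1, size=(V, dim))
b = np.zeros(V)

def infer_paragraph_vector(contexts, targets, lr=0.5, steps=200):
    """contexts: list of lists of context word ids; targets: next-word ids."""
    d = rng.normal(scale=0.1, size=dim)  # the "new column" of D
    for _ in range(steps):
        for ctx, tgt in zip(contexts, targets):
            h = (d + W[:, ctx].sum(axis=1)) / (1 + len(ctx))
            logits = U @ h + b
            p = np.exp(logits - logits.max())
            p /= p.sum()
            grad_logits = p.copy()
            grad_logits[tgt] -= 1.0        # softmax cross-entropy gradient
            grad_h = U.T @ grad_logits
            d -= lr * grad_h / (1 + len(ctx))  # update ONLY d; W, U, b fixed
    return d

d_new = infer_paragraph_vector([[0, 1, 2]], [3])
print(d_new.shape)  # -> (8,)
```

Once fitted, `d_new` is the feature vector that gets fed to the downstream classifier (e.g., logistic regression), which is presumably why inference itself has to be an optimization step rather than a single lookup.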
After reading Section 2.3, I understood that they build a kind of bag-of-words vector to represent the paragraph matrix, but I didn't understand the PV-DM concept that they say is explained in Section 2.2.
My assessment is that this paper introduces an interesting idea (paragraph vector representations combined with word vector representations), but the paper itself isn't good enough.
Sure! Count on me!
I agree with jdry1729 regarding the Sentiment Analysis problem. You must first ensure that your training set has a vocabulary that sufficiently covers the future streaming reviews. You should also log the proportion of unknown words your system receives in the stream, so you can decide when it's a good time to retrain your models.
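To be concrete about logging the unknown-word proportion, here is a minimal sketch (the function name and threshold are mine, not from any library):

```python
def oov_rate(review_tokens, vocabulary):
    """Fraction of tokens in a review that the trained model has never seen."""
    if not review_tokens:
        return 0.0
    unknown = sum(1 for tok in review_tokens if tok not in vocabulary)
    return unknown / len(review_tokens)

# Toy example: vocabulary from training time vs. an incoming review.
vocab = {"great", "terrible", "battery", "screen"}
review = ["great", "battery", "but", "awful", "firmware"]
rate = oov_rate(review, vocab)
print(rate)  # -> 0.6
```

You would track a moving average of this rate over the stream; when it drifts above whatever threshold you pick, that's your signal to retrain.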
Regarding the second problem: I'd simply compute the centroids of your clusters and assign each streaming review to the cluster whose centroid is nearest. As with the first problem, I would track which clusters the streaming reviews are assigned to most often, in order to decide when to recompute your clusters.
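A minimal sketch of that centroid assignment, assuming your reviews are already embedded as vectors (the helper name and the 2-D toy data are mine):

```python
import numpy as np

def assign_to_nearest_centroid(x, centroids):
    """Return the index of the centroid closest (Euclidean) to vector x."""
    distances = np.linalg.norm(centroids - x, axis=1)
    return int(distances.argmin())

# Toy 2-D example: two cluster centroids and one incoming review vector.
centroids = np.array([[0.0, 0.0],
                      [5.0, 5.0]])
new_review = np.array([4.2, 4.8])
cluster = assign_to_nearest_centroid(new_review, centroids)
print(cluster)  # -> 1
```

Counting how many stream reviews land in each cluster (and watching those counts drift from the training-time proportions) gives you the recompute signal I mentioned.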
I don't know your system's details, but if you have the opportunity to receive feedback from users on the sentiment analysis or the clustering, I'd use it to improve the models. There is a lot of literature on active learning that may help you here.
Hi there, I'm Fernando.
I'm about to finish my PhD in computer science at the University of Sevilla. I'm specialised in NLP and Software Engineering, and I'm here to learn as much as possible from all of you. As for my non-work life, I have a daughter and two dogs that occupy all my time :)
I'm happy to join this great community. It's a cool idea to share knowledge in a fun way.